RE: How to implement a proximity search using LINES as slop

2011-02-08 Thread Livia Hauser
Hi Pierre, many thanks for your idea. I had a look at Payloads ... should it be possible to store the line number as a payload? Best Regards, Livia -Original Message- From: "Pierre GOSSE" Sent: 08.02.2011 09:37:53 To: "java-user@lucene.apache.org" Subject: RE: How to implement

boosting on a sint results in high cpu spikes and ultimately hangs solr

2011-02-08 Thread Ben VandenBos
Hi, We've been using solr for several years now with great success. Recently, we modified a boost query to reference a dynamic sint field defined as follows: We index a handful of these fields per document. Their field names are of the form: __specialty_percent_i Examples: _5

HA Configuration / Best Practices

2011-02-08 Thread BrightMinds Dev
Hi, We are developing a site with a 4-tier design (RP, UI, WS, DB) and on the WS tier are looking at how we would set up Lucene in an HA configuration, i.e. so there is no single point of failure. The initial deployment will involve pairs of servers at each tier. As there are at least 2 server

Lucene Questions about query and highlighter~^^

2011-02-08 Thread Gong Li
Hi, I am coding a *local pdf search engine* in Java. (If someone did it before, could you please give some tips?) So I need query parsing. Assume I want to search for "hello user" in the document. *Q1*. I have 4 kinds of queries in my program. They are: 1. Match exact words or phrases. e.g. "hell

[ANNOUNCEMENT] NLP-based Analyzer library for Lucene

2011-02-08 Thread Lars Buitinck
Dear all, For anyone wanting to add some NLP abilities to Lucene, I've released a small library at https://github.com/larsmans/lucene-stanford-lemmatizer . This library performs part-of-speech tagging (determining word categories such as noun, verb), filtering based on part-of-speech and lemmatizi

Re: Extending org.apache.lucene.analysis.br.BrazilianAnalyzer to discard numeric tokens

2011-02-08 Thread Georger Araujo
2011/2/7 Robert Muir
> On Sun, Feb 6, 2011 at 3:28 PM, Georger Araujo wrote:
> > Hi,
> > I started using Lucene a few weeks ago, and I must say I'm amazed. Hats off
> > to the developers and the community!
> > I'd like to write a custom analyzer whose only difference to
> > org.apache.lucene

Re: Works on Windows, crashes on Linux

2011-02-08 Thread Ian Lea
That stack trace appears to be triggered in IndexWriter.addDocument() rather than .open() which leads one to speculate that the directory is fine, or was at any rate, but something is messing with the contents. Evidently you are running this in tomcat - have you got multiple threads writing to the

RE: How to implement a proximity search using LINES as slop

2011-02-08 Thread Pierre GOSSE
Hi Livia, One way of doing this line slop would be to implement a custom tokenizer that tokenizes on newlines and then splits each line token into the words it contains, i.e. each word of a line would be seen as being at the same position (and having the same offset and length as the complete line).
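
To illustrate the idea of giving every word on a line the same position, here is a minimal sketch in plain Java. It deliberately does not use the Lucene API: the class name LineProximity and the methods indexByLine and withinLines are hypothetical names chosen for this example only. It records the line number as the "position" of each word, so a slop of N means "at most N lines apart". In a real Lucene tokenizer the same effect would presumably come from emitting a position increment of 0 for words on the same line and 1 at each newline, but that part is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: treats the line number as the
// position of every word on that line.
public class LineProximity {

    // Maps each lower-cased word to the 0-based line numbers it occurs on.
    static Map<String, List<Integer>> indexByLine(String text) {
        Map<String, List<Integer>> positions = new HashMap<>();
        String[] lines = text.split("\\r?\\n", -1);
        for (int line = 0; line < lines.length; line++) {
            for (String word : lines[line].split("\\W+")) {
                if (word.isEmpty()) {
                    continue; // split() can yield an empty leading token
                }
                positions
                    .computeIfAbsent(word.toLowerCase(Locale.ROOT), k -> new ArrayList<>())
                    .add(line);
            }
        }
        return positions;
    }

    // True if some occurrence of a and some occurrence of b are at most
    // `slop` lines apart (slop 0 means "on the same line").
    static boolean withinLines(Map<String, List<Integer>> positions,
                               String a, String b, int slop) {
        List<Integer> linesA =
            positions.getOrDefault(a.toLowerCase(Locale.ROOT), Collections.emptyList());
        List<Integer> linesB =
            positions.getOrDefault(b.toLowerCase(Locale.ROOT), Collections.emptyList());
        for (int x : linesA) {
            for (int y : linesB) {
                if (Math.abs(x - y) <= slop) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

With the text "hello world\nfoo bar\nbaz user", the words "hello" and "user" sit on lines 0 and 2, so they match with a slop of 2 lines but not with a slop of 1.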