RE: querying multi-value fields

2009-10-12 Thread Angel, Eric
22 ccc23 So, you really don't care about the slop, since you can set it to less than the magic number you return from PositionIncrementGap. BTW, slop indicates holes, not total terms. So with a slop of 0 all the words need to be next to each other, regardless of whether there are
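A rough way to see why a slop smaller than the position increment gap cannot match across values is to simulate the token positions by hand. The sketch below is plain Java and does not use Lucene. The class name `PositionGapSketch` and its methods `index` and `matches` are hypothetical, and the matcher is a deliberate simplification that only counts the holes between two in-order terms, which is not Lucene's full sloppy-phrase scoring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: mimics how positions grow across the
// values of a multi-valued field when a position increment gap is applied.
final class PositionGapSketch {

    // Assigns a position to each whitespace-separated token. Before every
    // value except the first, the position counter is advanced by `gap`.
    static Map<String, List<Integer>> index(List<String> values, int gap) {
        Map<String, List<Integer>> positions = new HashMap<>();
        int pos = -1;
        boolean firstValue = true;
        for (String value : values) {
            if (!firstValue) {
                pos += gap;
            }
            firstValue = false;
            for (String token : value.trim().split("\\s+")) {
                pos++;
                positions.computeIfAbsent(token, k -> new ArrayList<>()).add(pos);
            }
        }
        return positions;
    }

    // Simplified two-term phrase check: `second` must follow `first` with at
    // most `slop` empty positions (holes) between them.
    static boolean matches(Map<String, List<Integer>> positions,
                           String first, String second, int slop) {
        List<Integer> none = Collections.emptyList();
        for (int p1 : positions.getOrDefault(first, none)) {
            for (int p2 : positions.getOrDefault(second, none)) {
                int holes = p2 - p1 - 1;
                if (holes >= 0 && holes <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("value1 aaa", "value2 bbb", "value3 ccc");
        // No gap: "aaa value2" looks like an ordinary adjacent phrase.
        System.out.println(matches(index(values, 0), "aaa", "value2", 0));
        // Gap of 100: the same phrase no longer matches with a small slop.
        System.out.println(matches(index(values, 100), "aaa", "value2", 0));
        // Terms inside one value are still adjacent.
        System.out.println(matches(index(values, 100), "value1", "aaa", 0));
    }
}
```

With a gap of 100 in this sketch, 100 holes separate the last token of one value from the first token of the next, so any slop below 100 keeps a phrase inside a single value.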

RE: querying multi-value fields

2009-10-12 Thread Angel, Eric
> To achieve what you want, do not tokenize the values you query/add to this field.
>
> On Mon, Oct 12, 2009 at 4:05 PM, Angel, Eric wrote:
>
> > I have documents that store multiple values in some fields (using the document.add(new Field()) with the same field name)

querying multi-value fields

2009-10-12 Thread Angel, Eric
I have documents that store multiple values in some fields (using the document.add(new Field()) with the same field name). Here's what a typical document looks like:

doc.option="value1 aaa"
doc.option="value2 bbb"
doc.option="value3 ccc"

I want my queries to only match individual values,

RE: Realtime & distributed

2009-10-11 Thread Angel, Eric
>> s less than ideal because query performance soon degrades (similar to an unoptimized index).
>>
>> Hopefully in the future we can offer searching over IndexWriter's RAM buffer where indexing and search speed would be roughly what it is today. That combined

Realtime & distributed

2009-10-08 Thread Angel, Eric
Does anyone have any recommendations? I've looked at Katta, but it doesn't seem to support realtime searching. It also uses HDFS, which I've heard can be slow. I'm looking to serve 40 GB of indexes and support about 1 million updates per day. Thx ---

RE: 2.9: TopScoreDocCollector

2009-10-08 Thread Angel, Eric
dSearcherMethod();
Weight weight = query.weight(searcher);
boolean allowOutOfOrder = weight.scoresDocsOutOfOrder();
TopScoreDocCollector coll = TopScoreDocCollector.create(numHits, allowOutOfOrder);
searcher.search(weight, (Filter) null, coll);

-jake

On Wed, Oct 7, 2009 at 7:26 PM, Ang

2.9: TopScoreDocCollector

2009-10-07 Thread Angel, Eric
According to the documentation for 2.9, TopScoreDocCollector.create(numHits, boolean), the second parameter is whether documents are scored in order by the input - How do I choose? In other words, how would I know if the documents are scored in order or not? Eric

RE: Distributed Lucene Questions

2009-06-01 Thread Angel, Eric
Has anyone used Katta in production? It looks very interesting and feature-rich, but I'm wondering how stable it is and whether or not it can support fine-grained queries, for example constant score queries, MultiSearcher, etc.

-----Original Message-----
From: Ken Krugler [mailto:kkrugler_li...

RE: Indexing and Searching Web Application

2009-01-20 Thread Angel, Eric
There's a reopen() method in the IndexReader class. You can use that.

-----Original Message-----
From: Amin Mohammed-Coleman [mailto:ami...@gmail.com]
Sent: Tuesday, January 20, 2009 5:02 AM
To: java-user@lucene.apache.org
Subject: Re: Indexing and Searching Web Application

Am I supposed to clo

RE: clustering with compass & terracotta

2009-01-16 Thread Angel, Eric
om/2008/11/software-announcement-lusql-database-to.html
http://zzzoot.blogspot.com/2008/09/katta-released-lucene-on-grid.html
http://zzzoot.blogspot.com/2008/06/lucene-concurrent-search-performance.html
http://zzzoot.blogspot.com/2008/06/simultaneous-threaded-query-lucene.html

2009/1/15 Angel, Eric : >

RE: Lucene index updation and performance

2009-01-16 Thread Angel, Eric
You can simply call IndexWriter.addDocument() for new jobs and IndexWriter.updateDocument

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/index/IndexWriter.html

Also, don't forget to optimize your index. Depending on your volume, you might want to optimize during slow traffic.

Eric A

clustering with compass & terracotta

2009-01-15 Thread Angel, Eric
I just ran into this http://www.compass-project.org/docs/2.0.0/reference/html/needle-terracotta.html and was wondering if any of you had tried anything like this and, if so, what your experience was like.

Eric

RE: Google finance-like suggestible search field

2009-01-14 Thread Angel, Eric
Peter,

Why don't you put all your "autocompletable" values into a single document field and just query a single field? Google seems to only use two fields for autocomplete: symbol and company name.

Eric

-----Original Message-----
From: Hayes, Peter [mailto:peter.ha...@fmr.com]
Sent: Wednesday

RE: ShingleMatrixFilter for synonyms

2009-01-13 Thread Angel, Eric
pache/lucene/analysis/shingle/ShingleFilterTest.java

As for multi-word tokens, you just have to make sure they don't get injected before something that would remove any portion of them.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message
> From:

ShingleMatrixFilter for synonyms

2009-01-13 Thread Angel, Eric
Does anyone have an example using this? I have a SynonymEngine that returns an array list of strings, some of which may be multiple words. How can I incorporate this with my SynonymEngine at index time? Also, the javadoc for the ShingleMatrixFilter class says: Without a spac