Re: Example settings for TieredMergePolicy : Lucene 4.0

2013-02-01 Thread Michael McCandless
You shouldn't need to call forceMerge: Lucene will periodically do "natural" merges, which will keep the file count down. Try indexing millions of documents and watch how the files change. Or turn on IndexWriter's infoStream to see when merges are done. Mike McCandless http://blog.mikemccan
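To make the point about natural merges concrete, here is a toy simulation in plain Java. It is not Lucene's TieredMergePolicy and uses no Lucene classes; it assumes a much simpler rule (merge whenever the newest `mergeFactor` segments have the same size), and the class name, method name and parameters are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT how TieredMergePolicy selects merges.
// It only illustrates why periodic merging keeps the segment count small.
final class NaturalMergeSketch {

    /** Simulates a number of flushes and returns the surviving segment sizes (in docs). */
    public static List<Integer> simulate(int flushes, int docsPerFlush, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<Integer> segments = new ArrayList<>();
        for (int i = 0; i < flushes; i++) {
            // Each flush writes one new small segment.
            segments.add(docsPerFlush);
            // Cascade: keep merging while the newest mergeFactor segments are equal in size.
            while (tailIsMergeable(segments, mergeFactor)) {
                int merged = 0;
                for (int k = 0; k < mergeFactor; k++) {
                    merged += segments.remove(segments.size() - 1);
                }
                segments.add(merged);
            }
        }
        return segments;
    }

    private static boolean tailIsMergeable(List<Integer> segments, int mergeFactor) {
        int n = segments.size();
        if (n < mergeFactor) {
            return false;
        }
        int size = segments.get(n - 1);
        for (int k = 2; k <= mergeFactor; k++) {
            if (segments.get(n - k) != size) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> segments = simulate(999, 100, 10);
        System.out.println(segments.size() + " segments: " + segments);
    }
}
```

In this toy model, 999 flushes of 100 documents each leave 27 segments rather than 999, and the 1000th flush collapses them into a single segment, all without any explicit forceMerge-style call.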

Re: Example settings for TieredMergePolicy : Lucene 4.0

2013-02-01 Thread saisantoshi
Thanks. I read this (and also tried it out in my code) and understand that forceMerge(1) is not advisable for performance reasons. My question is: if we don't have a way to compress these files, indexing will produce an enormous number of files, which will lead to file system issues (such as excee

Re: Example settings for TieredMergePolicy : Lucene 4.0

2013-02-01 Thread Adrien Grand
Hi, On Fri, Feb 1, 2013 at 6:51 PM, saisantoshi wrote: > Prior to 4.0, there was an optimize() in the IndexWriter which was merging > the index files. Is there any settings that can be done on the > TieredMergePolicy so that I want to limit the number of files produced > during the indexing. Seg

Survey to measure the utility of Lucene's projects

2013-02-01 Thread jorge.ruiz
I'm doing a piece of research for my master's thesis. The project consists of ranking the utility of Lucene-related projects according to their relevance, to improve their sorting in the future. I'd appreciate it if you could indicate the relevance of these projects according to your experience and know

Example settings for TieredMergePolicy : Lucene 4.0

2013-02-01 Thread saisantoshi
I am using the TieredMergePolicy and using the compound index: TieredMergePolicy mergePolicy = new TieredMergePolicy(); indexWriterConfig.setMergePolicy(mergePolicy.setNoCFSRatio(1.0d)); Prior to 4.0, IndexWriter had an optimize() method that merged the index files. Is there any sett

Re: IndexWriterConfig.OpenMode.CREATE vs OpenMode.APPEND (index files)

2013-02-01 Thread saisantoshi
>>Are you closing or committing your IndexWriter after each added document? Because if you add 100 docs you should not see 100 versions of these files, only one set of files in the end (many docs are written to one segment). No. What I meant to say here is if 100 users have updated the document

Re: Getting the number of all hits for the SpanQuery

2013-02-01 Thread Igor Shalyminov
Hi again! So far I think that the easiest way to get all span matches is indeed this method (Lucene v4.1 code): public Spans getSpans(final AtomicReaderContext context, Bits acceptDocs, Map<Term,TermContext> termContexts) But there is no annotation for this code except 'for internal use only', and the input pa

Re: How to get field names and types from an IndexSearcher

2013-02-01 Thread Rolf Veen
On Fri, Feb 1, 2013 at 12:43 PM, Michael McCandless wrote: > There is actually one way to check if a field was indexed numerically: > you can seek to the first term in the field, and attempt to parse it > as a long/float/etc., and if that throws a NumberFormatException, it > was indexed numerical

Re: IndexWriterConfig.OpenMode.CREATE vs OpenMode.APPEND (index files)

2013-02-01 Thread Michael McCandless
It is by design, and 2.4 works the same way. Are you closing or committing your IndexWriter after each added document? Because if you add 100 docs you should not see 100 versions of these files, only one set of files in the end (many docs are written to one segment). Each segment holds the docum

Re: How to get field names and types from an IndexSearcher

2013-02-01 Thread Michael McCandless
On Fri, Feb 1, 2013 at 3:17 AM, Rolf Veen wrote: > On Thu, Jan 31, 2013 at 9:55 PM, Michael McCandless > wrote: > >> But are you wanting to, eg, make a NumericRangeQuery if you detect the >> field was indexed numerically, and otherwise a TermRangeQuery, or >> something...? (Not easy) > > This is

Re: what's the difference of facet and group search ??

2013-02-01 Thread Michael McCandless
Facets are the fields with values and counts you see on the left here: http://www.amazon.com/s/ref=sr_nr_scat_1292115011_ln?rh=n%3A1292115011%2Ck%3Alcd+monitor&keywords=lcd+monitor&ie=UTF8&qid=1359718274&scn=1292115011&h=5ad599a40cf84af588b7564b59277b44e2dc1f2e While grouping (I don't have an exa
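As a rough illustration of the distinction, the plain-Java sketch below computes both views over the same result list. It uses no Lucene classes; the `Hit` type, the `brand` field and the sample data are hypothetical. The grouping half reflects the usual meaning of result grouping (collapsing hits by a field value and keeping the top few per group), which is an assumption here because the message above is cut off before it describes grouping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: no Lucene APIs are used, and all names and data are made up.
final class FacetVsGroupSketch {

    static final class Hit {
        final String title;
        final String brand;

        Hit(String title, String brand) {
            this.title = title;
            this.brand = brand;
        }
    }

    /** Hypothetical search results, assumed to be already sorted by relevance. */
    public static List<Hit> sampleHits() {
        return Arrays.asList(
                new Hit("24in LCD monitor", "Acme"),
                new Hit("27in LCD monitor", "Acme"),
                new Hit("22in LCD monitor", "Globex"),
                new Hit("32in LCD monitor", "Acme"));
    }

    /** Faceting: for each value of a field, count how many hits carry that value. */
    public static Map<String, Integer> facetCounts(List<Hit> hits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Hit hit : hits) {
            counts.merge(hit.brand, 1, Integer::sum);
        }
        return counts;
    }

    /** Grouping: bucket the hits themselves by field value, keeping the top few per bucket. */
    public static Map<String, List<String>> groupByBrand(List<Hit> hits, int topPerGroup) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Hit hit : hits) {
            List<String> titles = groups.computeIfAbsent(hit.brand, k -> new ArrayList<>());
            if (titles.size() < topPerGroup) {
                titles.add(hit.title);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Hit> hits = sampleHits();
        System.out.println("facets: " + facetCounts(hits));
        System.out.println("groups: " + groupByBrand(hits, 2));
    }
}
```

The facet view returns only values and counts, which is what a sidebar of filters needs; the grouped view returns the documents themselves, arranged under each value.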

what's the difference of facet and group search ??

2013-02-01 Thread wgggfiy
As the title says: I'm totally puzzled. Can anyone explain it with an example? Thanks.

Re: Multiple faceting in lucene

2013-02-01 Thread Shai Erera
I'm glad to hear it helped you, Ramprakash. Don't hesitate to post questions to the list if you need further assistance! Shai On Fri, Feb 1, 2013 at 9:12 AM, Ramprakash Ramamoorthy < youngestachie...@gmail.com> wrote: > On Fri, Jan 25, 2013 at 6:23 PM, Shai Erera wrote: > > > Hi > > > > Are t

Re: How to properly use updatedocument in lucene.

2013-02-01 Thread Ian Lea
There is no way to update without reindexing the entire document. It might be less confusing if the IndexWriter.updateDocument() methods were called something like replaceDocument(), but they're not. It would also help if Lucene could reject attempts to pass a Document read from the index to these methods
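The "replace, not patch" behaviour can be sketched with a toy in-memory index in plain Java. This is not Lucene code: the class, its string-keyed `updateDocument` method and the sample fields are invented for illustration, whereas the real method identifies the document to delete by a Term.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for an index; it only mimics the delete-then-add semantics.
final class ReplaceSemanticsSketch {

    private final Map<String, Map<String, String>> docsById = new HashMap<>();

    /** Deletes whatever is stored under id, then adds newDoc in its place. */
    public void updateDocument(String id, Map<String, String> newDoc) {
        docsById.remove(id);                      // step 1: delete the old document
        docsById.put(id, new HashMap<>(newDoc));  // step 2: add the new one as given
    }

    public Map<String, String> get(String id) {
        return docsById.get(id);
    }

    /** Indexes a two-field document, then "updates" it with only one field. */
    public static Map<String, String> demo() {
        ReplaceSemanticsSketch index = new ReplaceSemanticsSketch();

        Map<String, String> original = new HashMap<>();
        original.put("title", "Hello");
        original.put("body", "Full text of the document");
        index.updateDocument("doc-1", original);

        Map<String, String> titleOnly = new HashMap<>();
        titleOnly.put("title", "Hello, revised");
        index.updateDocument("doc-1", titleOnly);

        return index.get("doc-1");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

After the second call the `body` field is gone, because nothing from the old document is carried over. To change one field, the caller has to rebuild the whole document from its original source and pass that in.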

Re: How to get field names and types from an IndexSearcher

2013-02-01 Thread Rolf Veen
On Thu, Jan 31, 2013 at 9:55 PM, Michael McCandless wrote: > But are you wanting to, eg, make a NumericRangeQuery if you detect the > field was indexed numerically, and otherwise a TermRangeQuery, or > something...? (Not easy) This is what I want, yes. But I begin to understand that this is not