Re: Hunspell low level interface in Lucene 4.8

2014-06-16 Thread Michal Lopuszynski
Hi Robert, thank you for your answer! Hmmm... I need a plain stemmer, i.e. functionality that takes a word and returns a list of stems. Wrapping every word in a TokenStream, which does a lot of things I do not need, seems like overkill and a waste of resources... Is there any problem with

Re: [lucene 4.6] NPE when calling IndexReader#openIfChanged

2014-06-16 Thread Michael McCandless
Wait, in fst.ByteStore I see only 5'485'824 -- does this mean 5485824 bytes, or ~5.2 MB? This is probably correct, meaning this is the RAM to hold the terms index. But I can't see from your heap dump output where the other ~51.3 MB is being used by StandardDirectoryReader. Mike McCandless

Re: Hunspell low level interface in Lucene 4.8

2014-06-16 Thread Robert Muir
You don't have to wrap every word in a TokenStream; they can be reused! Sorry, but I think this is really the best API if you want to use Lucene's analyzers. You can use the TokenStream API with 4.8 and benchmark it against using that stemmer API with 4.7 :) On Mon, Jun 16, 2014 at 4:16 AM,

RE: Lucene Upgrade from 2.9.x to 4.7.x

2014-06-16 Thread Buddhavarapu, Suresh
I was trying the Demo application from 4.7.2 on an index created by 2.9.1. I get an org.apache.lucene.index.IndexFormatTooOldException. I tried the upgrader tool. Same exception again. Is there an upgrader tool that can work with a 2.9.1 index? Or do I have to build one? Any guidelines

RE: Lucene Upgrade from 2.9.x to 4.7.x

2014-06-16 Thread Uwe Schindler
Hi, You must first download the 3.6.2 Lucene version and upgrade using the upgrade tool from the lucene-core-3.6.2.jar. After this, your index is in Lucene 3.6 format, which can be read with Lucene 4. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail:

Index Not Finding Results some times

2014-06-16 Thread Andrew Norman
Hi, I am using Lucene 3.6.2 (I cannot upgrade due to 3rd party dependencies). I have written the code below to illustrate the problem. I create a single document, add three fields, and put it into the index. When I attempt to find the document using exact matches I can find the document 2

Lucene 4.8.1 - Taxonomy

2014-06-16 Thread Mrugesh Patel
Hi, I would like to open taxonomy indices in a tool (like Luke). Please could you help? Currently I am able to open other lucene indices in Luke 4.8.1 but unable to open taxonomy indices. When I try to open taxonomy indices in Luke 4.8.1 then it shows

Re: Facets in Lucene 4.7.2

2014-06-16 Thread Sandeep Khanzode
Hi Shai, Thanks for the response. Appreciated! I understand that this particular use case has to be handled in a different way. Can you please help me with the below questions?  1.] Is there any API that gives me the count of a specific dimension from FacetCollector in response to a search

Re: Lucene 4.8.1 - Taxonomy

2014-06-16 Thread Shai Erera
Err ... are you sure there's an index in the directory that you point Luke at? I see that the exception points to "." which suggests the local directory from where Luke was run. There's nothing special about the taxonomy index, as far as Luke is concerned. However, note that I do not recommend

Re: Facets in Lucene 4.7.2

2014-06-16 Thread Sandeep Khanzode
Correction on [4] below. I do get doc/pos/tim/tip/dvd/dvm files in either case. What I meant was that the number of those files appears different in both cases. Also, does commit() stop the world and behave serially to flush the contents? --- Thanks n Regards, Sandeep Ramesh

RE: Index Not Finding Results some times

2014-06-16 Thread Allison, Timothy B.
The problem is that you are using an analyzer at index time but then not at search time. StandardAnalyzer will convert Name1 to name1 at index time. At search time, because you aren't using a query parser (which would by default lowercase your terms) you are literally searching for Name1 which
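The mismatch described above can be shown with a toy model. This is only a sketch in plain Java, not Lucene code: `AnalyzerMismatchDemo`, `analyze`, `buildIndex`, `rawLookup` and `analyzedLookup` are hypothetical names, and `analyze` merely assumes that the analyzer's relevant effect here is lowercasing.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class AnalyzerMismatchDemo {
    // Toy stand-in for an analyzer: it only lowercases the term.
    public static String analyze(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    // Toy index: maps each analyzed term to the position of the input it came from.
    public static Map<String, Integer> buildIndex(String... terms) {
        Map<String, Integer> index = new HashMap<String, Integer>();
        for (int docId = 0; docId < terms.length; docId++) {
            index.put(analyze(terms[docId]), docId);
        }
        return index;
    }

    // Search without analysis: the query term is used exactly as typed.
    public static boolean rawLookup(Map<String, Integer> index, String queryTerm) {
        return index.containsKey(queryTerm);
    }

    // Search with the same analysis that was applied at index time.
    public static boolean analyzedLookup(Map<String, Integer> index, String queryTerm) {
        return index.containsKey(analyze(queryTerm));
    }

    public static void main(String[] args) {
        Map<String, Integer> index = buildIndex("Name1");
        System.out.println(rawLookup(index, "Name1"));      // false: the index holds "name1"
        System.out.println(analyzedLookup(index, "Name1")); // true
    }
}
```

In real code the same idea would mean running the search term through the analyzer used at index time, or indexing the field without analysis if exact matching is what you want.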

Re: Index Not Finding Results some times

2014-06-16 Thread Andrew Norman
Ah, that now makes sense. Changed the code and now it works. Thanks for the help. On Monday, 16 June 2014, Allison, Timothy B. talli...@mitre.org wrote: The problem is that you are using an analyzer at index time but then not at search time. StandardAnalyzer will convert Name1 to name1 at

SortingMergePolicy for already sorted segments

2014-06-16 Thread Ravikumar Govindarajan
I am planning to use SortingMergePolicy where all the merge-participating segments are already sorted... I understand that I need to define a DocMap with old-new doc-id mappings. Is it possible to optimize the eager loading of DocMap and make it kind of lazy load on-demand? Ex: Pass

Re: ShingleAnalyzerWrapper question

2014-06-16 Thread Manjula Wijewickrema
Dear Steve, It works. Thanks. On Wed, Jun 11, 2014 at 6:18 PM, Steve Rowe sar...@gmail.com wrote: You should give sw rather than analyzer in the IndexWriter ctor. Steve www.lucidworks.com On Jun 11, 2014 2:24 AM, Manjula Wijewickrema manjul...@gmail.com wrote: Hi, In my

Re: SortingMergePolicy for already sorted segments

2014-06-16 Thread Shai Erera
I'm not sure that I follow ... where do you see DocMap being loaded up front? Specifically, Sorter.sort may return null if the readers are already sorted ... I think we already optimized for the case where the readers are sorted. Shai On Tue, Jun 17, 2014 at 4:04 AM, Ravikumar Govindarajan
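The "return null when already sorted" idea can be sketched in plain Java. This is a toy model, not Lucene's Sorter: `SortedCheckDemo` and `sortOrNull` are hypothetical names, and the sort keys are assumed to be one long value per document.

```java
import java.util.Arrays;
import java.util.Comparator;

public class SortedCheckDemo {
    /**
     * Returns null if keys are already in ascending order; otherwise returns
     * an old-to-new position map, where result[oldDoc] is the new position.
     */
    public static int[] sortOrNull(final long[] keys) {
        // Cheap linear scan first: no map is allocated for sorted input.
        boolean sorted = true;
        for (int i = 1; i < keys.length; i++) {
            if (keys[i - 1] > keys[i]) {
                sorted = false;
                break;
            }
        }
        if (sorted) {
            return null;
        }

        // Build the new-to-old ordering by sorting document positions by key.
        Integer[] newToOld = new Integer[keys.length];
        for (int i = 0; i < keys.length; i++) {
            newToOld[i] = i;
        }
        Arrays.sort(newToOld, new Comparator<Integer>() {
            @Override
            public int compare(Integer a, Integer b) {
                return Long.compare(keys[a], keys[b]);
            }
        });

        // Invert it to get the old-to-new mapping.
        int[] oldToNew = new int[keys.length];
        for (int newPos = 0; newPos < newToOld.length; newPos++) {
            oldToNew[newToOld[newPos]] = newPos;
        }
        return oldToNew;
    }

    public static void main(String[] args) {
        System.out.println(sortOrNull(new long[] {1, 2, 3}) == null);             // true
        System.out.println(Arrays.toString(sortOrNull(new long[] {30, 10, 20}))); // [2, 0, 1]
    }
}
```

A caller treats a null result as "nothing to remap", so the cost for already sorted input is a single pass over the keys.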

RE: Lucene Upgrade from 2.9.x to 4.7.x

2014-06-16 Thread Buddhavarapu, Suresh
Thanks Uwe. I tried this path and I do not find any .cfs files. All that I see in my index directory after running the upgrader are the following files:
-rw--- 1 root root 245 Jun 16 22:38 _1.fdt
-rw--- 1 root root 45 Jun 16 22:38 _1.fdx
-rw--- 1 root root 2809 Jun 16 22:38 _1.fnm