AW: [lucene 4.6] NPE when calling IndexReader#openIfChanged

2014-06-13 Thread Clemens Wyss DEV
> limit how many fields have norms enabled We have one index for approx. 7000 PDFs (24GB). Of course no content is STOREd (but ANALYZEd). This very index occupies 4GB on disk and the corresponding IndexReader is 60MB. Are norms enabled by default for org.apache.lucene.document.TextField? > use dis

searching in hierarchical structures

2014-06-13 Thread Sascha Janz
We use Lucene to search in hierarchical structures, like a folder structure in a filesystem. The documents have an extra field that specifies the location of the document, so if you want to search documents under a specific folder you have to query a prefix in this field. but if the docume

Facets in Lucene 4.7.2

2014-06-13 Thread Sandeep Khanzode
Hi,   I am evaluating Lucene Facets for a project. Since there is a lot of change in 4.7.2 for Facets, I am relying on UTs for reference. Please let me know if there are other sources of information.  I have a couple of questions: 1.] All categories in my application are flat, not hierarchical.

Re: [lucene 4.6] NPE when calling IndexReader#openIfChanged

2014-06-13 Thread Michael McCandless
On Fri, Jun 13, 2014 at 3:02 AM, Clemens Wyss DEV wrote: >> limit how many fields have norms enabled > We have one index for approx 7000 pdfs (24GB). Of course no content is STOREd > (but ANALYZEd). This very index occupies 4GB on disk and the corresponding > IndexReader is 60MB. > Are norms per

fuzzy/case insensitive AnalyzingSuggester

2014-06-13 Thread Clemens Wyss DEV
Looking for an AnalyzingSuggester which supports - fuzziness - case insensitivity - small (in memory) footprint (*) (*)Just tried to "hand" my big IndexReader (see other post " [lucene 4.6] NPE when calling IndexReader#openIfChanged") into JaspellLookup. Got an OOM. Is there any (Jaspell)Lookup im

AW: [lucene 4.6] NPE when calling IndexReader#openIfChanged

2014-06-13 Thread Clemens Wyss DEV
Thanks a lot! >"large text fields" What is a good limit (in characters) to switch from StringField to TextField? Do Analyzers (e.g. GermanAnalyzer) help a lot in reducing the size of an Index? > Add XXXDocValuesField instead of e.g. StringField. Does this apply only for StringFields? Or for Tex

Re: [lucene 4.6] NPE when calling IndexReader#openIfChanged

2014-06-13 Thread Michael McCandless
On Fri, Jun 13, 2014 at 8:53 AM, Clemens Wyss DEV wrote: > Thanks a lot! >>"large text fields" > What is a good limit (in characters) to switch from StringField to TextField? > Do Analyzers (e.g. GermanAnalyzer) help a lot in reducing the size > of an Index? It's more based on your app's requi

Re: Facets in Lucene 4.7.2

2014-06-13 Thread Shai Erera
Hi You can check the demo code here: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_8/lucene/demo/src/java/org/apache/lucene/demo/facet/. This code is updated with each release, so you always get working code examples, even when the API changes. If you don't mind managing th

JTRES 2014: Deadline extended to June 23

2014-06-13 Thread w...@dtu.dk
(Apologies if you receive multiple copies of this message.) DEADLINE EXTENDED TO JUNE 23, 2014 The 12th International Workshop on Java Technologies for Real-time and Embedded Systems - JTRES 2014 Oc

Re: Facets in Lucene 4.7.2

2014-06-13 Thread Sandeep Khanzode
Hi Shai,   Thanks so much for the clear explanation. I agree on the first question. Taxonomy Writer with a separate index would probably be my approach too. For the second question: I am a little new to the Facets API so I will try to figure out the approach that you outlined below. However, t

Indexing size increase 20% after switching from lucene 4.4 to 4.5 or 4.8 with BinaryDocValuesField

2014-06-13 Thread Zhao, Gang
I used Lucene 4.4 to create an index for some documents. One of the indexed fields is a BinaryDocValuesField. After I changed the dependency to Lucene 4.5, the index size for 1 million documents increased from 293MB to 357MB. If I did not use BinaryDocValuesField, the index size increases only about