Hi Bruce,
Thanks for investigating! Can you open a bug report on
https://issues.apache.org/jira/browse/LUCENE ?
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -----Original Message-----
> From: Bruce Karsh [mailto:bruceka...@gmail.
Here it fails because -verbose is not set:
$ java -cp ./lucene-core-4.4-SNAPSHOT.jar
org.apache.lucene.index.IndexUpgrader ./INDEX
Exception in thread "main" java.lang.IllegalArgumentException: printStream
must not be null
at
org.apache.lucene.index.IndexWriterConfig.setInfoStream(IndexWriterConf
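For reference, a minimal sketch of the call the trace points at, assuming
the same 4.4-SNAPSHOT jar (the analyzer choice here is arbitrary): when
-verbose is absent, a null PrintStream appears to end up being handed to
IndexWriterConfig.setInfoStream, which rejects it.

import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.util.Version;

public class SetInfoStreamRepro {
    public static void main(String[] args) {
        IndexWriterConfig iwc =
            new IndexWriterConfig(Version.LUCENE_44, new KeywordAnalyzer());
        // What running IndexUpgrader without -verbose amounts to: no stream.
        java.io.PrintStream ps = null;
        // Throws IllegalArgumentException: printStream must not be null
        iwc.setInfoStream(ps);
    }
}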
Hi,
I am getting an exception while indexing files. I tried debugging but
couldn't figure out the problem.
I have a custom analyzer which creates the token stream. I am indexing
around 15k files, and some time after I start the indexing I get this
exception:
java.lang.IllegalArgumentException:
That would be great!
On Mon, Sep 16, 2013 at 1:41 PM, Benson Margulies wrote:
> Thanks, I might pitch in.
>
>
> On Mon, Sep 16, 2013 at 12:58 PM, Robert Muir wrote:
>
>> Mostly because our tokenizers like StandardTokenizer will tokenize the
>> same way regardless of normalization form or whether
Can anyone shed light as to why this is a token filter and not a char
filter? I'm wishing for one of these _upstream_ of a tokenizer, so that the
tokenizer's lookups in its dictionaries are seeing normalized contents.
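For context, a minimal sketch of the current arrangement being asked about,
assuming the filter in question is ICUNormalizer2Filter from the
analyzers-icu module (Lucene 4.x API): normalization runs per token, only
after StandardTokenizer has already split the raw text, whereas a char
filter would normalize the Reader before the tokenizer consults its
dictionaries.

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.icu.ICUNormalizer2Filter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;

public class NormalizeAfterTokenizing extends Analyzer {
    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        // The tokenizer sees un-normalized text; NFKC/case folding happens afterwards.
        Tokenizer source = new StandardTokenizer(Version.LUCENE_44, reader);
        return new TokenStreamComponents(source, new ICUNormalizer2Filter(source));
    }
}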
Thanks, I might pitch in.
On Mon, Sep 16, 2013 at 12:58 PM, Robert Muir wrote:
> Mostly because our tokenizers like StandardTokenizer will tokenize the
> same way regardless of normalization form or whether it's normalized at
> all?
>
> But for other tokenizers, such a charfilter should be usefu
> Is Luke showing you stored fields? If so, this makes no sense ...
> Field.Store.NO (single or multiple calls) should have resulted in no
> stored fields.
It shows the field but shows the content as
--
Alan Burlison
--
Have you considered storing your indexes server-side? I haven't used
compression but usually the trade-off of compression is CPU usage which
will also be a drain on battery life. Or maybe consider how important the
highlighter is to your users - is it worth the trade-off of either disk
space or bat
org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper in
analyzers-common is what you need. There's an example in the
javadocs. Build and use the wrapper instance in place of
StandardAnalyzer or whatever you are using now.
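A minimal sketch of that, assuming Lucene 4.x and a hypothetical field
name "keyword" that should bypass the default analysis:

import java.util.Collections;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;

public final class Analyzers {
    // StandardAnalyzer for every field except "keyword", which is left untokenized.
    public static Analyzer build() {
        return new PerFieldAnalyzerWrapper(
            new StandardAnalyzer(Version.LUCENE_44),
            Collections.<String, Analyzer>singletonMap("keyword", new KeywordAnalyzer()));
    }
}

The same wrapper instance should be passed both at index time
(IndexWriterConfig) and at query time (QueryParser) so the two sides agree.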
--
Ian.
On Mon, Sep 16, 2013 at 5:36 PM, Scott Smith wrote
I want to be sure I understand this correctly. Suppose I have a search that
I'm going to run through the query parser that looks like:
body:"some phrase" AND keyword:"my-keyword"
clearly "body" and "keyword" are field names. However, the additional
information is that the "body" field is anal
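A minimal sketch of how such a query behaves with a per-field analyzer like
the wrapper sketched above, assuming Lucene 4.x and that "keyword" should be
kept as a single untokenized term:

import java.util.Collections;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

public class PerFieldParseSketch {
    public static void main(String[] args) throws Exception {
        Analyzer analyzer = new PerFieldAnalyzerWrapper(
            new StandardAnalyzer(Version.LUCENE_44),
            Collections.<String, Analyzer>singletonMap("keyword", new KeywordAnalyzer()));
        QueryParser parser = new QueryParser(Version.LUCENE_44, "body", analyzer);
        Query q = parser.parse("body:\"some phrase\" AND keyword:\"my-keyword\"");
        // "body" is analyzed into a phrase query; "keyword" stays one literal term,
        // e.g. +body:"some phrase" +keyword:my-keyword
        System.out.println(q);
    }
}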
Mostly because our tokenizers like StandardTokenizer will tokenize the
same way regardless of normalization form or whether it's normalized at
all?
But for other tokenizers, such a charfilter should be useful: there is
a JIRA for it, but it has some unresolved issues
https://issues.apache.org/jira
On Mon, Sep 16, 2013 at 9:52 AM, Alan Burlison wrote:
> On 16 September 2013 12:40, Michael McCandless
> wrote:
>
>> If you use Field.Store.NO for all fields for a given document then no
>> field should have been stored. Can you boil this down to a small test
>> case?
>
> repeated calls to
>
> d
On 16 September 2013 12:40, Michael McCandless
wrote:
> If you use Field.Store.NO for all fields for a given document then no
> field should have been stored. Can you boil this down to a small test
> case?
repeated calls to
doc.add(new TextField("content", c, Field.Store.NO))
result in a sin
On 16 September 2013 11:47, Ian Lea wrote:
> Not exactly dumb, and I can't tell you exactly what is happening here,
> but lucene stores some info at the index level rather than the field
> level, and things can get confusing if you don't use the same Field
> definition consistently for a field.
>
That is strange.
If you use Field.Store.NO for all fields for a given document then no
field should have been stored. Can you boil this down to a small test
case?
Mike McCandless
http://blog.mikemccandless.com
On Mon, Sep 16, 2013 at 6:33 AM, Alan Burlison wrote:
> I'm creating multiple inst
Not exactly dumb, and I can't tell you exactly what is happening here,
but lucene stores some info at the index level rather than the field
level, and things can get confusing if you don't use the same Field
definition consistently for a field.
From the javadocs for org.apache.lucene.document.Fie
I'm creating multiple instances of a field, some with Field.Store.YES
and some with Field.Store.NO, with Lucene 4.4. If Field.Store.YES is
set then I see multiple instances of the field in the documents in the
resulting index, if I use Field.Store.NO then I only see a single
field. Is that expected
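A minimal sketch of the kind of small test case being asked for, assuming
Lucene 4.4 (RAMDirectory and StandardAnalyzer only for brevity): with
Field.Store.NO no stored values come back at all, even though both copies
of the field are still indexed and searchable.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class MultiValuedStoreTest {
    public static void main(String[] args) throws Exception {
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(
            Version.LUCENE_44, new StandardAnalyzer(Version.LUCENE_44)));

        Document doc = new Document();
        // Two instances of the same field; flip Store.NO to Store.YES to compare.
        doc.add(new TextField("content", "first value", Field.Store.NO));
        doc.add(new TextField("content", "second value", Field.Store.NO));
        writer.addDocument(doc);
        writer.close();

        DirectoryReader reader = DirectoryReader.open(dir);
        // Prints 0 with Store.NO, 2 with Store.YES.
        System.out.println(reader.document(0).getValues("content").length);
        reader.close();
    }
}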
Hi John,
I just had a look at Mike's benchmarks [1][2], which don't show any
performance difference over approximately the last year. But they only
test a conjunction of two terms, so it might still be that latency has
worsened for more complex queries.
[1] http://people.apache.org/~mikemccand/lucenebench/AndHigh
I am using Apache Lucene on Android. I have around 1 GB of text documents
(logs). When I index these text documents using
*new Field(ContentIndex.KEY_TEXTCONTENT, contents, Field.Store.YES,
Field.Index.ANALYZED, TermVector.WITH_POSITIONS_OFFSETS)*, the index
directory ends up consuming 1.59 GB of storage
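On the disk-space side, a minimal sketch assuming the 3.x-style Field API
shown above (the field name "textcontent" is a stand-in for
ContentIndex.KEY_TEXTCONTENT): dropping Field.Store.YES stops the index
from keeping a second verbatim copy of the log text, at the cost of having
to re-read the original file whenever the text itself must be displayed;
the term vectors kept for highlighting still add size of their own.

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public final class LogDocs {
    public static Document forLogText(String contents) {
        Document doc = new Document();
        // Analyzed and term-vectored as before, but the raw text is not stored.
        doc.add(new Field("textcontent", contents,
                Field.Store.NO, Field.Index.ANALYZED,
                Field.TermVector.WITH_POSITIONS_OFFSETS));
        return doc;
    }
}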