RE: High CPU usage during index and search

2007-08-15 Thread Chew Yee Chuang
Greetings,

I have tested with MySQL; grouping is fine when there are not many records in 
the table, but when I perform grouping on a table with 3 million records, it 
really takes a very long time to finish. Thus, I'm looking at Lucene and hoping 
it can help.

Thank you
eChuang, Chew

-Original Message-
From: testn [mailto:[EMAIL PROTECTED] 
Sent: Monday, August 13, 2007 4:34 PM
To: java-user@lucene.apache.org
Subject: RE: High CPU usage during index and search


To me, it looks like what you are trying to achieve is better suited to a
database, which can help you with grouping, sorting, etc. But if you still
want to achieve it using Lucene, you might want to post some code so that I
can go through it and see why it uses so much resource.


Chew Yee Chuang wrote:
> 
> Hi testn,
> 
> I have tested Filter; it is pretty fast, but it still takes a lot of CPU.
> Maybe that is due to the number of filters I run.
> 
> Thank you
> eChuang, Chew
> 
> 
> -Original Message-
> From: testn [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, August 07, 2007 10:37 PM
> To: java-user@lucene.apache.org
> Subject: RE: High CPU usage during index and search
> 
> 
> Check out the Filter class. You can create a separate filter for each field
> and then chain them together using ChainFilter. If you cache the filters,
> they will be pretty fast. 
> 
> 
> Chew Yee Chuang wrote:
>> 
>> Greetings,
>> 
>> Yes, processing a little bit and then pausing for a while really reduces
>> the CPU usage, but I need to find a balance so that indexing and searching
>> are not delayed too much.
>> 
>> I execute 20,000 queries at a time because the process generates the
>> aggregated data for reporting.
>> E.g. Gender (M, F), Department (Accounting, R&D, Financial, ...etc):
>> 1Q - Gender:M AND Department:Accounting
>> 2Q - Gender:M AND Department:R&D
>> 3Q - Gender:M AND Department:Financial
>> 4Q - Gender:F AND Department:Accounting
>> 5Q - 
>> Thus, the more combinations, the more queries need to run. For now, I
>> still can't see how to reduce it; I'm just thinking maybe there is a
>> different way to index the data so that I can get the results easily.
>> 
>> Any help would be appreciated.
>> 
>> Thanks
>> eChuang, Chew
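
The enumeration Chew describes can be sketched in plain Java (using only the
category values from his example; the class and method names are made up for
illustration). The number of queries is simply the product of the category
cardinalities:

```java
import java.util.ArrayList;
import java.util.List;

public class QueryCombinations {
    // Build one query string per (gender, department) combination.
    static List<String> combos(String[] genders, String[] departments) {
        List<String> queries = new ArrayList<String>();
        for (String g : genders) {
            for (String d : departments) {
                queries.add("Gender:" + g + " AND Department:" + d);
            }
        }
        return queries;
    }

    public static void main(String[] args) {
        String[] genders = { "M", "F" };
        String[] departments = { "Accounting", "R&D", "Financial" };
        List<String> queries = combos(genders, departments);
        // 2 genders x 3 departments = 6 queries
        System.out.println(queries.size());
        System.out.println(queries.get(0));
    }
}
```

With more categories the product grows quickly, which is why reducing the
number of distinct queries (rather than speeding each one up) pays off most.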
>> 
>> -Original Message-
>> From: karl wettin [mailto:[EMAIL PROTECTED] 
>> Sent: Thursday, August 02, 2007 7:11 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: High CPU usage during index and search
>> 
>> It sounds like you have a fairly busy system, perhaps 100% load on the
>> process is not that strange, at least not during short periods of time.
>> 
>> A simpler solution would be to nice the process a little bit in order to
>> give your background jobs some more time to think.
>> 
>> Running a profiler is still the best advice I can think of. It should
>> clearly show you what is going on when you run out of CPU.
>> 
>> --  
>> karl
>> 
>> On 1 Aug 2007, at 04:29, Chew Yee Chuang wrote:
>> 
>>> Hi,
>>>
>>> Thanks for the links provided; I actually went through those articles when
>>> developing the index and search functions for my application. I haven't
>>> tried a profiler yet, but I monitored CPU usage and noticed that whenever
>>> indexing or searching is performed, CPU usage rises to 100%. Below I will
>>> try to elaborate on what my application is doing and how I index and
>>> search.
>>>
>>> There are many concurrent processes running. First, the application writes
>>> the records it receives into a text file, with each field separated by a
>>> tab. The application points to a new file every 10 minutes and starts
>>> writing to it, so every file contains only 10 minutes of records,
>>> approximately 600,000 records per file. The indexing process then checks
>>> whether there is a text file to be indexed; if there is, the thread wakes
>>> up and starts indexing.
>>>
>>> The indexing process first adds documents to a RAMDirectory, then adds the
>>> RAMDirectory into the FSDirectory by calling addIndexesNoOptimize() when
>>> there are 100,000 documents (32 fields per doc) in the RAMDirectory. Only
>>> one IndexWriter (FSDir) is created, but a few IndexWriters (RAMDir) are
>>> created during the whole process. Below is the configuration for the
>>> IndexWriters I mentioned:
>>>
>>> IndexWriter (RAMDir)
>>> - SimpleAnalyzer
>>> - setMaxBufferedDocs(1)
>>> - Field.Store.YES
>>> - Field.Index.NO_NORMS
>>>
>>> IndexWriter (FSDir)
>>> - SimpleAnalyzer
>>> - setMergeFactor(20)
>>> - addIndexesNoOptimize()
>>>
>>> For the searching: there are many queries (20,000) that run continuously
>>> to generate the aggregate table for reporting purposes. All these queries
>>> run in a nested loop, and only one Searcher is created. I tried a plain
>>> searcher and a filter as well; the filter gives me better results, but
>>> both utilize lots of CPU resources.
>>>
>>> Hope this info will help and so

How to search over all fields in a clean way?

2007-08-15 Thread Ridwan Habbal
Hello all, 
 
when we search over the docs in an index, we use code such as:
 
Analyzer analyzer = new StandardAnalyzer();
String defaultSearchField = "all";
QueryParser parser = new QueryParser(defaultSearchField, analyzer);
IndexSearcher indexSearcher = new IndexSearcher(this.indexDirectory);
Hits hits = indexSearcher.search(parser.parse(query));
 
The problem is when you want to search over ALL fields in each doc. What's more, 
the fields are created dynamically; in other words, the number and identifiers 
of the fields vary from doc to doc, so it's impractical to type all the field 
names, and furthermore I don't know them. 
I thought of a primitive solution: copy all fields into one field. However, 
this doubles the index size, and it might conflict with some field names since 
the fields of the docs are dynamic. 
 
could some one help me? 
 
Thanks in advance. 
Ridwan

Re: How to search over all fields in a clean way?

2007-08-15 Thread Erik Hatcher
copying all fields to a single searchable field is quite reasonable, and won't
double your index size if you set the new field to be unstored.
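
A small sketch of the pre-indexing side of that approach, assuming the dynamic
fields arrive as a name-to-value map (the field names and the helper below are
invented for illustration): the combined text is what you would feed into the
unstored, tokenized "all" field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CatchAllField {
    // Concatenate every dynamic field value into one catch-all string.
    static String buildCatchAll(Map<String, String> fields) {
        StringBuilder all = new StringBuilder();
        for (String value : fields.values()) {
            if (all.length() > 0) all.append(' ');
            all.append(value);
        }
        return all.toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<String, String>();
        doc.put("title", "Lucene in Action");
        doc.put("customField42", "dynamic value");
        System.out.println(buildCatchAll(doc));
    }
}
```

Because the catch-all lives under its own reserved field name (e.g. "all"), it
also sidesteps the name-conflict worry, as long as no dynamic field is allowed
to use that reserved name.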


Erik





-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: query question

2007-08-15 Thread karl wettin


On 15 Aug 2007, at 07:18, Mohammad Norouzi wrote:

I am using WhitespaceAnalyzer and the query is " icdCode:H* ", but there is
no result, although I know there are many documents with field values such
as H20, H20.5, etc. This field is tokenized and indexed. What is wrong with
this?
When I test this query with Luke, it returns no result as well.


Can you also use Luke to inspect documents you know should contain these
terms, and make sure they really are in there?

--
karl




Re: Indexing correctly?

2007-08-15 Thread John Paul Sondag
It worked!  My indexing time went from over 6 hours to 592 seconds!  Thank
you guys so much!

--JP

On 8/14/07, karl wettin <[EMAIL PROTECTED]> wrote:
>
>
> On 14 Aug 2007, at 21:34, John Paul Sondag wrote:
>
> > What exactly is a RAMDirectory? I didn't see it mentioned on that page.
> > Is there example code for using it? Do I just create a RAMDirectory and
> > then use it like a normal directory?
>
> Yes, it is just like FSDirectory, but resides in RAM and is not
> persistent.
>
> 
>
> --
> karl
>
>
>
>
>
>
>


Re: formalizing a query

2007-08-15 Thread Sagar Naik

Hey,

I think you can try:

MultiFieldQueryParser.parse(String[] queries, String[] fields,
BooleanClause.Occur[] flags, Analyzer analyzer)

The flags array will get you ORs and ANDs in the places you need.

- Sagar Naik

Abu Abdulla alhanbali wrote:

Thanks for the help,

please provide the code to do that.

I tried with this one but it didn't work:

Query filterQuery = MultiFieldQueryParser.parse(new String[]{query1, query2,
query3, query4, ...}, new String[]{field1, field2, field1, field2, ...},
new KeywordAnalyzer());

this results in:

field1:query1 OR field2:query2 OR
field1:query3 OR field2:query4 ... etc

and NOT:

(field1:query1 AND field2:query2) OR
(field1:query3 AND field2:query4) ... etc

please help.
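
One alternative sketch (an assumption on my part, not part of the
MultiFieldQueryParser suggestion): since the target query uses only standard
query syntax, you could assemble the grouped query string yourself and hand
it to a plain QueryParser. The string assembly is pure Java; the field names
field1/field2 are the ones from the example above:

```java
import java.util.Arrays;
import java.util.List;

public class GroupedQueryString {
    // Each pair is {field1Value, field2Value}; emits
    // (field1:q1 AND field2:q2) OR (field1:q3 AND field2:q4) ...
    static String build(List<String[]> pairs) {
        StringBuilder sb = new StringBuilder();
        for (String[] p : pairs) {
            if (sb.length() > 0) sb.append(" OR ");
            sb.append("(field1:").append(p[0])
              .append(" AND field2:").append(p[1]).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String[]> pairs = Arrays.asList(
            new String[]{"query1", "query2"},
            new String[]{"query3", "query4"});
        System.out.println(build(pairs));
        // (field1:query1 AND field2:query2) OR (field1:query3 AND field2:query4)
    }
}
```

The parentheses give you the grouping that a flat MultiFieldQueryParser call
cannot express; the resulting string can be parsed with whatever analyzer you
already use.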


On 8/10/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
I *strongly* suggest you get a copy of Luke. It'll allow you to form
queries
and see the results and you can then answer this kind of question as well
as many others.

Meanwhile, please see
http://lucene.apache.org/java/docs/queryparsersyntax.html

Erick

On 8/10/07, Abu Abdulla alhanbali <[EMAIL PROTECTED]> wrote:


Hi,

I need your help in formalizing this query:

(field1:query1 AND field2:query2) OR
(field1:query3 AND field2:query4) OR
(field1:query5 AND field2:query6) OR
(field1:query7 AND field2:query8) ... etc

Please give the code since I'm new to lucene
how we can use MultiFieldQueryParser or any parser to do the job

greatly appreciated









Question about highlighting returning nothing

2007-08-15 Thread Donna L Gresh
I'm working on refining my stopwords by looking at the highest scoring 
document returned for each search, and using the highlighter to show which 
terms were significant in choosing that document. This has been extremely 
helpful in improving my searches. I've noticed though that sometimes the 
highlighter returns nothing, even though there is a non-zero score for the 
match. Could someone explain why this is so? A recent discussion I found 
spoke about the usefulness of returning, say, the first bit of text when 
this happens, but there wasn't any discussion of *why* this occurs--
Thanks

Donna



Seeking Advice

2007-08-15 Thread Michael Bell
We are writing a mail archiving program. Each piece of the message (e.g. each 
attachment) is stored separately.

I'll try to keep this short and sweet :)

Currently we index the main header fields, like

subject
sender
recipients (space delimited)

etc.

This stuff is really only needed once per e-mail

We also index the attachment info:

attachment size (changed to a range like "large", "medium", etc)
attachment name
full text index
etc.

This stuff is needed to be distinct for each attachment in the e-mail

Our current algorithm is wasteful, but I see no better way to do it.

In a loop, for each attachment (and once if we have none), we add all the main 
header stuff and the attachment stuff, as a separate Document per attachment. 
This is wasteful, because the main header stuff is needlessly repeated.

Now, it would seem better and more efficient to have one Document for the whole 
e-mail, storing the main header stuff only once, and storing the Attachment 
stuff as multiple instances of the same field. Lucene supports this.


The problem is that a search on attachment fields will then return cross-field 
(cartesian) matches.

Example

Say I have 2 attachments, one named A.doc and one named B.doc; A.doc contains 
the full text "turnip" and B.doc contains the text "dog".


Now if the user enters a search requesting email that contains Attachment name 
A.Doc, and contents dog, the results will be

For the Per-Document storage:

no results found (correct I'd argue)

For the Single Document storage:

1 result found (because the full text and names of both are stored in the same 
Document albeit different Field instances)
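
The difference can be reproduced with a plain-Java model, no Lucene involved
(the attachment names and terms are the ones from the example; the class is a
made-up illustration):

```java
import java.util.Arrays;
import java.util.List;

public class AttachmentMatch {
    static class Attachment {
        final String name, text;
        Attachment(String name, String text) { this.name = name; this.text = text; }
    }

    // Per-Document storage: name and term must match within ONE attachment.
    static boolean perDocumentMatch(List<Attachment> atts, String name, String term) {
        for (Attachment a : atts) {
            if (a.name.equals(name) && a.text.contains(term)) return true;
        }
        return false;
    }

    // Single-Document storage: name and term may match DIFFERENT attachments.
    static boolean singleDocumentMatch(List<Attachment> atts, String name, String term) {
        boolean nameHit = false, termHit = false;
        for (Attachment a : atts) {
            nameHit |= a.name.equals(name);
            termHit |= a.text.contains(term);
        }
        return nameHit && termHit;
    }

    public static void main(String[] args) {
        List<Attachment> email = Arrays.asList(
            new Attachment("A.doc", "turnip"),
            new Attachment("B.doc", "dog"));
        System.out.println(perDocumentMatch(email, "A.doc", "dog"));     // false
        System.out.println(singleDocumentMatch(email, "A.doc", "dog"));  // true
    }
}
```

The two booleans make the trade-off explicit: per-document semantics scope the
AND to one attachment, single-document semantics scope it to the whole e-mail.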

While tempted by the siren call of the Single Document method, it seems like 
this would return unexpected results from the user's point of view (although 
one could argue otherwise: searching the e-mail holistically as a whole, it is 
returning the "right" results).

What do you folks think? Any ideas for a better way to approach this?

Thanks

Mike






   




LUCENE-423: thread pool implementation of parallel queries

2007-08-15 Thread Renaud Waldura
Could someone who understands Lucene internals help me port
https://issues.apache.org/jira/browse/LUCENE-423 to Lucene 2.0? I have beefy
hardware (32 cores) and want to try this out, but it won't compile.
 
There are 2 issues: 
1- maxScore
On line 412 TopFieldDocs constructor now needs a maxScore. Same on line 328,
TopDocs constructor.
 
2- Weight vs. Query
Lines 202 and 206, the Searchable interface now wants a Weight instead of a
Query.
 
I understand the syntax changes I would have to make -- I can edit the file
and make the compiler errors go away -- but not the implications of these
changes in a multi-threaded context.
 
--Renaud
 


Re: Seeking Advice

2007-08-15 Thread Michael J. Prichard

Hey Michael,

Are you writing this software for yourself or for resale? We built an email 
archiving service and we use Lucene as our search engine; we approach this a 
little differently.

But I don't think it is wasteful to index the header information with the 
attachment. Just don't store the fields.


-Michael









Re: Question about highlighting returning nothing

2007-08-15 Thread Donna L Gresh
Well, in my case the highlighting was returning nothing because of (my 
favorite acronym) PBCAK--

I don't store the text in the index, so I have to retrieve it separately 
(from a database) for the highlighting, and my database was not in sync 
with the index, so in a few cases the document in the index had been 
deleted from the database--thus a score, but no document text.

But I guess my original question remains; under what conditions would the 
highlighter return nothing? Only if no terms matched?

Donna 


Re: Question about highlighting returning nothing

2007-08-15 Thread Lukas Vlcek
Donna,

I have been investigating highlighters in Lucene recently. The humble
experience I've gained so far is that highlighting is a completely different
task from the indexing/searching tandem, and this simple fact is not obvious
to a lot of people. In your particular case it would be helpful if you could
post more technical details about your setup. Not only is it important whether
the field to be highlighted is stored, but also whether you allow query
rewriting and what kind of queries you are using (Prefix, Wildcard, etc.).

Just my 2 cents.

Lukas



RE: High CPU usage during index and search

2007-08-15 Thread Steinert, Fabian
Hi Chew,
 
with Lucene you could try the following:
 
Make one query for each single value in each category (each Term):
1Q - Gender:M
2Q - Department:Accounting
3Q - Department:R&D
4Q - ...
 
with a custom HitCollector, like the following example taken from the 
org.apache.lucene.search.HitCollector API spec:
 
   Searcher searcher = new IndexSearcher(indexReader);
   final BitSet bits = new BitSet(indexReader.maxDoc());
   searcher.search(query, new HitCollector() {
       public void collect(int doc, float score) {
           bits.set(doc);
       }
   });
 
Thus you'll get one BitSet for each term, with bits set at the doc IDs 
containing that term.
 
Then do the combinatorial part on these BitSets, like:
 
  BitSet FemaleAndRnD = ((BitSet) RnD.clone()).andNot(Male);
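
The combination step can be sketched end-to-end with plain java.util.BitSet
(the doc IDs below are invented; in practice each set would come from a
HitCollector as in the example above):

```java
import java.util.BitSet;

public class BitSetCombos {
    public static void main(String[] args) {
        // Pretend per-term BitSets collected from the index:
        BitSet male = new BitSet();   // docs 0, 1, 2 match Gender:M
        male.set(0); male.set(1); male.set(2);
        BitSet rnd = new BitSet();    // docs 1, 2, 5 match Department:R&D
        rnd.set(1); rnd.set(2); rnd.set(5);

        // Gender:M AND Department:R&D -- intersect the two sets.
        BitSet maleAndRnD = (BitSet) rnd.clone();
        maleAndRnD.and(male);

        // Gender:F AND Department:R&D -- R&D docs that are not male.
        BitSet femaleAndRnD = (BitSet) rnd.clone();
        femaleAndRnD.andNot(male);

        System.out.println(maleAndRnD.cardinality());   // 2 (docs 1 and 2)
        System.out.println(femaleAndRnD.cardinality()); // 1 (doc 5)
    }
}
```

With one BitSet per term, every Gender x Department combination becomes a
cheap in-memory and()/andNot(), so the 20,000 Lucene queries collapse into a
handful of term queries plus bit arithmetic.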

Cheers,
Fabian
 



From: Chew Yee Chuang [mailto:[EMAIL PROTECTED]]
Sent: Wed 15.08.2007 10:31
To: java-user@lucene.apache.org
Subject: RE: High CPU usage during index and search







out of order

2007-08-15 Thread testn

Using Lucene 2.2.0, I still sporadically get the "docs out of order" error. I
index all of my stuff in one thread. Do you have any idea why it happens?

Thanks!
-- 
View this message in context: 
http://www.nabble.com/out-of-order-tf4276385.html#a12172277
Sent from the Lucene - Java Users mailing list archive at Nabble.com.





Re: out of order

2007-08-15 Thread Michael McCandless

"testn" <[EMAIL PROTECTED]> wrote:
> 
> Using Lucene 2.2.0, I still sporadically got doc out of order error. I
> indexed all of my stuff in one thread. Do you have any idea why it
> happens?

Hm, that is not good.  I thought we had finally fixed this with
LUCENE-140.  Though un-corrected disk errors could in theory lead to
this too.

Are you able to easily reproduce it?  Can you post the full exception?

Mike




Re: Indexing correctly?

2007-08-15 Thread Erick Erickson
OK, what worked? Using a RAMDir?

Erick



Re: Seeking Advice

2007-08-15 Thread Erick Erickson
Rather than use efficiency arguments to drive the behavior of the
app, I'd recommend that you define the expected behavior and
make that behavior happen as necessary.

What would you estimate is the ratio of meta-data to attachments?
And what is the ratio of documents that have multiple attachments?
I actually suspect that the number of e-mails that have multiple
attachments is small enough that storing the meta-data with each
document would result in a minuscule size increase,
but you'll only find that by gathering some statistics 

Erick



Re: Seeking Advice

2007-08-15 Thread Michael J. Prichard
I actually know from experience: around 20% (plus or minus 5%) of emails will 
have attachments, if that helps. Again, I say index as much info as you can, 
and store what you think is necessary.







Re: out of order

2007-08-15 Thread testn

I use RAMDirectory, and the error often shows low numbers. Last time it
happened with the message "7<=7". Next time it happens, I will try to capture
the stack trace.




-- 
View this message in context: 
http://www.nabble.com/out-of-order-tf4276385.html#a12173705
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

