Re: Compound file format file size question

2004-06-18 Thread James Dunn
Otis,

Thanks for the response.

Yeah, I was copying the file to a brand-new hard drive
and it was formatted to FAT32 by default, which is
probably why it couldn't handle the 13GB file (FAT32
caps individual files at 4GB).

I'm converting the drive to NTFS now, which should get
me through temporarily.  In the future, though, I may
break the index up into smaller sub-indexes so that I
can distribute them across separate physical disks for
better disk I/O.

Thanks for your help!

Jim
--- Otis Gospodnetic [EMAIL PROTECTED]
wrote:
 Hello,
 
 --- James Dunn [EMAIL PROTECTED] wrote:
  Hello all,
  
  I have an index that's about 13GB on disk.  I'm using
  1.4 rc3 which uses the compound file format by
  default.
  
  Once I run optimize on my index, it creates one 13GB
  .cfs file.  This isn't a problem on Linux (yet), but
  I'm having some trouble copying the file over to my
  Windows XP box.
 
 What is the exact problem?  The sheer size of it, or
 something else?  Just curious...
 
  Is there some way using the compound file format to
  set the maximum file size and have Lucene break the
  index into multiple files once it hits that limit?
 
 Can't be done with Lucene, but I seem to recall some
 discussion about it.  Nothing concrete, though.
 
  Or do I need to go back to using the non-compound
  file format?
 
 The total size should be (about) the same, but you
 could certainly do that, if having more smaller files
 is better for you.
 
 Otis
 
  Another solution, I suppose, would be to break up my
  index into separate smaller indexes.  This would be
  my second choice, however.
  
  Thanks a lot,
  
  Jim
 
 


Re: Memory usage

2004-05-27 Thread James Dunn
Otis,

My app does run within Tomcat.  But when I started
getting these OutOfMemoryErrors I wrote a little unit
test to watch the memory usage without Tomcat in the
middle, and I still see the same memory growth.

Thanks,

Jim
--- Otis Gospodnetic [EMAIL PROTECTED]
wrote:
 Sorry if I'm stating the obvious.  Is this happening
 in some stand-alone unit tests, or are you running
 things from some application and in some environment,
 like Tomcat, Jetty or some non-web app?
 
 Your queries are pretty big (although I recall some
 people using even bigger ones... but it all depends on
 the hardware they had), but are you sure running out
 of memory is due to Lucene, or could it be a leak in
 the app from which you are running queries?
 
 Otis
 
 
 --- James Dunn [EMAIL PROTECTED] wrote:
  Doug,
  
  We only search on analyzed text fields.  There are a
  couple of additional fields in the index, like
  OBJECT_ID, that are keywords, but we don't search
  against those; we only use them once we get a result
  back to find the thing that document represents.
  
  Thanks,
  
  Jim
  
  --- Doug Cutting [EMAIL PROTECTED] wrote:
   It is cached by the IndexReader and lives until the
   index reader is garbage collected.  50-70 searchable
   fields is a *lot*.  How many are analyzed text, and
   how many are simply keywords?
   
   Doug
   



Memory usage

2004-05-26 Thread James Dunn
Hello,

I was wondering if anyone has had problems with memory
usage and MultiSearcher.

My index is composed of two sub-indexes that I search
with a MultiSearcher.  The total size of the index is
about 3.7GB with the larger sub-index being 3.6GB and
the smaller being 117MB.
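
For reference, the two sub-indexes are opened roughly
like this (a sketch; the paths are made up and the
query is built elsewhere):

    Searchable[] subSearchers = {
        new IndexSearcher("/indexes/large"),  // ~3.6GB sub-index
        new IndexSearcher("/indexes/small")   // ~117MB sub-index
    };
    Searcher searcher = new MultiSearcher(subSearchers);
    Hits hits = searcher.search(query);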

I am using Lucene 1.3 Final with the compound file
format.

Also I search across about 50 fields but I don't use
wildcard or range queries. 

Doing repeated searches in this way seems to
eventually chew up about 500MB of memory, which seems
excessive to me.

Does anyone have any ideas where I could look to
reduce the memory my queries consume?

Thanks,

Jim







RE: Memory usage

2004-05-26 Thread James Dunn
Will,

Thanks for your response.  It may be an object leak. 
I will look into that.

I just ran some more tests, and this time I created a
20GB index by repeatedly merging my large index into
itself.

When I ran my test query against that index I got an
OutOfMemoryError on the very first query.  I have my
heap set to 512MB.  Should a query against a 20GB
index require that much memory?  I page through the
results 100 at a time, so I should never have more
than 100 Document objects in memory.  
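
(For what it's worth, the paging loop is essentially
this sketch; "start" is the first result of the page,
and "searcher" and "query" are built elsewhere:)

    Hits hits = searcher.search(query);
    int end = Math.min(start + 100, hits.length());
    for (int i = start; i < end; i++) {
        Document doc = hits.doc(i);  // fetch stored fields for this page only
        // ... render doc ...
    }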

Any help would be appreciated, thanks!

Jim
--- [EMAIL PROTECTED] wrote:
 This sounds like a memory leakage situation.  If you
 are using Tomcat I would suggest you make sure you
 are on a recent version, as it is known to have some
 memory leaks in version 4.  It doesn't make sense
 that repeated queries would use more memory than the
 most demanding query unless objects are not getting
 freed from memory.
 
 -Will
 
 -Original Message-
 From: James Dunn [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, May 26, 2004 3:02 PM
 To: [EMAIL PROTECTED]
 Subject: Memory usage
 
 
 Hello,
 
 I was wondering if anyone has had problems with memory
 usage and MultiSearcher.
 
 My index is composed of two sub-indexes that I search
 with a MultiSearcher.  The total size of the index is
 about 3.7GB with the larger sub-index being 3.6GB and
 the smaller being 117MB.
 
 I am using Lucene 1.3 Final with the compound file
 format.
 
 Also I search across about 50 fields but I don't use
 wildcard or range queries.
 
 Doing repeated searches in this way seems to
 eventually chew up about 500MB of memory, which seems
 excessive to me.
 
 Does anyone have any ideas where I could look to
 reduce the memory my queries consume?
 
 Thanks,
 
 Jim
 
 
   
   



Re: Problem Indexing Large Document Field

2004-05-26 Thread James Dunn
Gilberto,

Look at the IndexWriter class.  It has a public field,
maxFieldLength, which you can set to determine the
maximum number of terms indexed for a single field.
The default is 10,000 terms, which is why only the
first part of your document is searchable.

http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html
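
For example, something like this should raise the
limit (a rough sketch against the 1.3/1.4 API; the
path and the new limit are made up):

    IndexWriter writer =
        new IndexWriter("/path/to/index", new StandardAnalyzer(), true);
    // maxFieldLength is a public field; the 10,000-term default is
    // why only the front of a ~90,000 character document gets indexed
    writer.maxFieldLength = 1000000;
    // ... add documents as usual ...
    writer.close();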

Jim

--- Gilberto Rodriguez
[EMAIL PROTECTED] wrote:
 I am trying to index a field in a Lucene document
 with about 90,000 characters.  The problem is that it
 only indexes part of the document.  It seems to only
 index about 65,000 characters.  So, if I search on
 terms that are at the beginning of the text, the
 search works, but it fails for terms that are at the
 end of the document.
 
 Is there a limitation on how many characters can be
 stored in a document field?  Any help would be
 appreciated, thanks
 
 
 Gilberto Rodriguez
 Software Engineer
 370 CenterPointe Circle, Suite 1178
 Altamonte Springs, FL 32701-3451
 407.339.1177 (Ext. 112) • phone
 407.339.6704 • fax
 [EMAIL PROTECTED] • email
 www.conviveon.com • web
  
 




Re: Memory usage

2004-05-26 Thread James Dunn
Erik,

Thanks for the response.  

My actual documents are fairly small.  Most docs only
have about 10 fields.  Some of those fields are
stored, however, like the OBJECT_ID, NAME and DESC
fields.  The stored fields are pretty small as well. 
None should be more than 4KB and very few will
approach that limit.

I'm also using the default maxFieldLength value of
10,000.

I'm not caching hits, either.

Could it be my query?  I have about 80 total unique
fields in the index, although no document has all 80.
My query ends up looking like this:

+(F1:test F2:test ... F80:test)

From previous mails that doesn't look like an enormous
number of fields to be searching against.  Is there
some formula for the amount of memory required for a
query based on the number of clauses and terms?
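
Going by Doug's one-byte-per-document-per-searched-field
figure for the norms (quoted elsewhere in this thread),
a back-of-the-envelope estimate for the norms alone
would be:

    norms RAM = numDocs x searchedFields bytes
              = e.g. 5,000,000 docs x 80 fields = ~400MB

(the 5M document count is made up for illustration),
plus whatever the query, Hits and stored fields
themselves allocate.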

Jim



--- Erik Hatcher [EMAIL PROTECTED] wrote:
 How big are your actual Documents?  Are you caching
 Hits?  It stores, internally, up to 200 documents.
 
   Erik
 
 
 On May 26, 2004, at 4:08 PM, James Dunn wrote:
 
  Will,
 
  Thanks for your response.  It may be an object leak.
  I will look into that.
 
  I just ran some more tests, and this time I created a
  20GB index by repeatedly merging my large index into
  itself.
 
  When I ran my test query against that index I got an
  OutOfMemoryError on the very first query.  I have my
  heap set to 512MB.  Should a query against a 20GB
  index require that much memory?  I page through the
  results 100 at a time, so I should never have more
  than 100 Document objects in memory.
 
  Any help would be appreciated, thanks!
 
  Jim
  --- [EMAIL PROTECTED] wrote:
  This sounds like a memory leakage situation.  If you
  are using Tomcat I would suggest you make sure you
  are on a recent version, as it is known to have some
  memory leaks in version 4.  It doesn't make sense
  that repeated queries would use more memory than the
  most demanding query unless objects are not getting
  freed from memory.
 
  -Will
 
  -Original Message-
  From: James Dunn [mailto:[EMAIL PROTECTED]
  Sent: Wednesday, May 26, 2004 3:02 PM
  To: [EMAIL PROTECTED]
  Subject: Memory usage
 
  Hello,
 
  I was wondering if anyone has had problems with
  memory usage and MultiSearcher.
 
  My index is composed of two sub-indexes that I
  search with a MultiSearcher.  The total size of the
  index is about 3.7GB with the larger sub-index being
  3.6GB and the smaller being 117MB.
 
  I am using Lucene 1.3 Final with the compound file
  format.
 
  Also I search across about 50 fields but I don't use
  wildcard or range queries.
 
  Doing repeated searches in this way seems to
  eventually chew up about 500MB of memory, which
  seems excessive to me.
 
  Does anyone have any ideas where I could look to
  reduce the memory my queries consume?
 
  Thanks,
 
  Jim
 
 




Re: Memory usage

2004-05-26 Thread James Dunn
Doug,

Thanks!  

I just asked a question regarding how to calculate the
memory requirements for a search.  Does this memory
get used only during the search operation itself, or
is it referenced by the Hits object or anything else
after the actual search completes?

Thanks again,

Jim


--- Doug Cutting [EMAIL PROTECTED] wrote:
 James Dunn wrote:
  Also I search across about 50 fields but I don't
  use wildcard or range queries.
 
 Lucene uses one byte of RAM per document per searched
 field, to hold the normalization values.  So if you
 search a 10M document collection with 50 fields, then
 you'll end up using 500MB of RAM.
 
 If you're using unanalyzed fields, then an easy
 workaround to reduce the number of fields is to
 combine many in a single field.  So, instead of,
 e.g., using an "f1" field with value "abc", and an
 "f2" field with value "efg", use a single field named
 "f" with values "1_abc" and "2_efg".
 
 We could optimize this in Lucene.  If no values of an
 indexed field are analyzed, then we could store no
 norms for the field and hence read none into memory.
 This wouldn't be too hard to implement...
 
 Doug
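
In code, Doug's workaround looks roughly like this (a
sketch; the field names and values are his examples):

    Document doc = new Document();
    // instead of one keyword field per value:
    //   doc.add(Field.Keyword("f1", "abc"));
    //   doc.add(Field.Keyword("f2", "efg"));
    // fold them into a single field, prefixing each value:
    doc.add(Field.Keyword("f", "1_abc"));
    doc.add(Field.Keyword("f", "2_efg"));
    // and query f:1_abc where you would have queried f1:abc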
 




Re: Memory usage

2004-05-26 Thread James Dunn
Doug,

We only search on analyzed text fields.  There are a
couple of additional fields in the index, like
OBJECT_ID, that are keywords, but we don't search
against those; we only use them once we get a result
back to find the thing that document represents.

Thanks,

Jim

--- Doug Cutting [EMAIL PROTECTED] wrote:
 It is cached by the IndexReader and lives until the
 index reader is garbage collected.  50-70 searchable
 fields is a *lot*.  How many are analyzed text, and
 how many are simply keywords?
 
 Doug
 
 








Re: Preventing duplicate document insertion during optimize

2004-04-30 Thread James Dunn
Kevin,

I have a similar issue.  The only solution I have been
able to come up with is, after the merge, to open an
IndexReader against the merged index, iterate over all
the docs, and delete duplicate docs based on my
primary key field, as in the sketch below.

Jim
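
Roughly (a sketch against the 1.3 IndexReader API; the
path and the "primary_key" field name are made up):

    IndexReader reader = IndexReader.open("/path/to/merged-index");
    Set seen = new HashSet();
    for (int i = 0; i < reader.maxDoc(); i++) {
        if (reader.isDeleted(i)) continue;
        String key = reader.document(i).get("primary_key");
        // add() returns false if the key was seen before, so the
        // first copy is kept and later copies are deleted
        if (!seen.add(key)) reader.delete(i);
    }
    reader.close();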

--- Kevin A. Burton [EMAIL PROTECTED] wrote:
 Let's say you have two indexes each with the same
 document literal.  All the fields hash the same and
 the document is a binary duplicate of a different
 document in the second index.
 
 What happens when you do a merge to create a 3rd
 index from the first two?  I assume you now have two
 documents that are identical in one index.  Is there
 any way to prevent this?
 
 It would be nice to figure out if there's a way to
 flag a field as a primary key so that if a document
 with that key has already been added, it would just
 be skipped.
 
 Kevin
 









Re: Problems From the Word Go

2004-04-29 Thread James Dunn
Alex,

Could you send along whatever error messages you are
receiving?

Thanks,

Jim
--- Alex Wybraniec [EMAIL PROTECTED]
wrote:
 I'm sorry if this is not the correct place to post
 this, but I'm very confused, and getting towards the
 end of my tether.
 
 I need to install/compile and run Lucene on a Windows
 XP Pro based machine, running J2SE 1.4.2, with ANT.
 
 I downloaded both the source code and the pre-compiled
 versions, and as yet have not been able to get either
 running.  I've been through the documentation, and
 still I can find little to help me set it up properly.
 
 All I want to do (to start with) is compile and run
 the demo version.
 
 I'm sorry to ask such a newbie question, but I'm
 really stuck.
 
 So if anyone can point me to an idiot's guide, or
 offer me some help, I would be most grateful.
 
 Once I get past this stage, I'll have all sorts of
 juicier questions for you, but at the minute, I can't
 even get past stage 1.
 
 Thank you in advance
 Alex
 
 




RE: ArrayIndexOutOfBoundsException

2004-04-28 Thread James Dunn
Philippe, thanks for the reply.  I didn't FTP my index
anywhere, but your response does make it seem that my
index is in fact corrupted somehow.

Does anyone know of a tool that can verify the
validity of a Lucene index, and/or possibly repair it?
If not, does anyone have any idea how difficult it
would be to write one?
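
For what it's worth, a crude checker wouldn't be much
more than touching every stored document and every
posting and watching for exceptions; a sketch (1.3
API, path made up):

    IndexReader reader = IndexReader.open("/path/to/index");
    // read every live stored document
    for (int i = 0; i < reader.maxDoc(); i++) {
        if (!reader.isDeleted(i)) reader.document(i);
    }
    // walk every term and its postings
    TermEnum terms = reader.terms();
    while (terms.next()) {
        TermDocs docs = reader.termDocs(terms.term());
        while (docs.next()) {
            docs.doc();   // touching the posting forces it to be read
            docs.freq();
        }
        docs.close();
    }
    terms.close();
    reader.close();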

Thanks,

Jim 

--- Phil brunet [EMAIL PROTECTED] wrote:
 
 Hi.
 
 I had this problem when I transferred a Lucene index
 by FTP in ASCII mode.  Using binary mode, I never had
 such a problem.
 
 Philippe
 
 From: James Dunn [EMAIL PROTECTED]
 Reply-To: Lucene Users List [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: ArrayIndexOutOfBoundsException
 Date: Mon, 26 Apr 2004 12:15:39 -0700 (PDT)
 
 Hello all,
 
 I have a web site whose search is driven by Lucene
 1.3.  I've been doing some load testing using JMeter,
 and occasionally I will see the exception below when
 the search page is under heavy load.
 
 Has anyone seen similar errors during load testing?
 
 I've seen some posts with similar exceptions, and the
 general consensus is that this error means that the
 index is corrupt.  I'm not sure my index is corrupt,
 however.  I can run all the queries I use for load
 testing under normal load and I don't appear to get
 this error.
 
 Is there any way to verify that a Lucene index is
 corrupt or not?
 
 Thanks,
 
 Jim
 
 java.lang.ArrayIndexOutOfBoundsException: 53 >= 52
         at java.util.Vector.elementAt(Vector.java:431)
         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:135)
         at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:103)
         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:275)
         at org.apache.lucene.index.SegmentsReader.document(SegmentsReader.java:112)
         at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:107)
         at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:100)
         at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:100)
         at org.apache.lucene.search.Hits.doc(Hits.java:130)
 
 
 
 
 
 
 

 
 




Re: 'Lock obtain timed out' even though NO locks exist...

2004-04-28 Thread James Dunn
Which version of Lucene are you using?  In 1.2, I
believe the lock file was located in the index
directory itself.  In 1.3, it's in your system's tmp
folder.

Perhaps it's a permission problem on either one of
those folders.  Maybe your process doesn't have write
access to the correct folder and is thus unable to
create the lock file?  

You can also pass Lucene a system property to increase
the lock timeout interval, like so:

-Dorg.apache.lucene.commitLockTimeout=60000

or

-Dorg.apache.lucene.writeLockTimeout=60000

The above sets the timeout to one minute.
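
Or, equivalently, from code (a sketch; like the -D
flags, this assumes the properties are set before
Lucene's index classes first load):

    System.setProperty("org.apache.lucene.commitLockTimeout", "60000");
    System.setProperty("org.apache.lucene.writeLockTimeout", "60000");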

Hope this helps,

Jim

--- Kevin A. Burton [EMAIL PROTECTED] wrote:
 I've noticed this really strange problem on one of
 our boxes.  It's happened twice already.
 
 We have indexes where, when Lucene starts, it says
 'Lock obtain timed out' ... however NO locks exist
 for the directory.
 
 There are no other processes present and no locks in
 the index dir or /tmp.
 
 Is there any way to figure out what's going on here?
 
 Looking at the index it seems just fine... but this
 is only a brief glance.  I was hoping that if it was
 corrupt (which I don't think it is) that Lucene would
 give me a better error than 'Lock obtain timed out'.
 
 Kevin
 



ArrayIndexOutOfBoundsException

2004-04-26 Thread James Dunn
Hello all,

I have a web site whose search is driven by Lucene
1.3.  I've been doing some load testing using JMeter,
and occasionally I will see the exception below when
the search page is under heavy load.

Has anyone seen similar errors during load testing?

I've seen some posts with similar exceptions, and the
general consensus is that this error means that the
index is corrupt.  I'm not sure my index is corrupt,
however.  I can run all the queries I use for load
testing under normal load and I don't appear to get
this error.

Is there any way to verify that a Lucene index is
corrupt or not?

Thanks,

Jim

java.lang.ArrayIndexOutOfBoundsException: 53 >= 52
        at java.util.Vector.elementAt(Vector.java:431)
        at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:135)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:103)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:275)
        at org.apache.lucene.index.SegmentsReader.document(SegmentsReader.java:112)
        at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:107)
        at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:100)
        at org.apache.lucene.search.MultiSearcher.doc(MultiSearcher.java:100)
        at org.apache.lucene.search.Hits.doc(Hits.java:130)




