Is there a specific reason you write your text in this way? I mean, indentation instead of line breaks? It makes it very hard to read, if you ask me.
Just my 2 cents. :)

/Jimi

mogul | jimi hullegård | system developer | hudiksvallsgatan 4, 113 30 stockholm sweden | +46 8 506 66 172 | +46 765 27 19 55 | [EMAIL PROTECTED] | www.mogul.com

> -----Original Message-----
> From: Marcelo Ochoa [mailto:[EMAIL PROTECTED]
> Sent: 26 September 2008 20:54
> To: java-user@lucene.apache.org
> Subject: [SPAM] - Re: Caused by: java.io.IOException: read past EOF on Slave - Found word(s) list error in the Text body
>
> Mike:
>   Actually there are more issues at first glance with the OJVMDirectory integration.
>   Note this: I am creating an index with two simple documents:
>
> INFO: Performing: SELECT /*+ DYNAMIC_SAMPLING(0) RULE NOCACHE(T1) */ T1.rowid,F1,extractValue(F2,'/emp/name/text()') "name",extractValue(F2,'/emp/@id') "id" FROM LUCENE.T1 for update nowait
> Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
> FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAA> indexed,tokenized<F1:001> indexed,tokenized<name:ravi> indexed,tokenized<id:01>>
> Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
> FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAB> indexed,tokenized<F1:003> indexed,tokenized<name:murthy> indexed,tokenized<id:03>>
> IW 10 [Root Thread]: flush: segment=_0 docStoreSegment=_0 docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false numDocs=2 numBufDelTerms=0
> IW 10 [Root Thread]: index before flush
> IW 10 [Root Thread]: DW: flush postings as segment _0 numDocs=2
> IW 10 [Root Thread]: DW: oldRAMSize=111616 newFlushedSize=166 docs/MB=12,633.446 new/old=0.149%
> IFD [Root Thread]: now checkpoint "segments_1" [1 segments ; isCommit = false]
> IW 10 [Root Thread]: LMP: findMerges: 1 segments
> IW 10 [Root Thread]: LMP: level -1.0 to 2.2741578: 1 segments
> IW 10 [Root Thread]: CMS: now merge
> IW 10 [Root Thread]: CMS: index: _0:C2->_0
> IW 10 [Root Thread]: CMS: no more merges pending; now return
> IW 10 [Root Thread]: now flush at close
> IW 10 [Root Thread]: flush: segment=null docStoreSegment=_0 docStoreOffset=2 flushDocs=false flushDeletes=true flushDocStores=true numDocs=0 numBufDelTerms=0
> IW 10 [Root Thread]: index before flush _0:C2->_0
> IW 10 [Root Thread]: flush shared docStore segment _0
> IW 10 [Root Thread]: DW: closeDocStore: 2 files to flush to segment _0 numDocs=2
> IW 10 [Root Thread]: CMS: now merge
> IW 10 [Root Thread]: CMS: index: _0:C2->_0
> IW 10 [Root Thread]: CMS: no more merges pending; now return
> IW 10 [Root Thread]: now call final commit()
> IW 10 [Root Thread]: startCommit(): start sizeInBytes=0
> IW 10 [Root Thread]: startCommit index=_0:C2->_0 changeCount=2
> IW 10 [Root Thread]: now sync _0.fnm
> IW 10 [Root Thread]: now sync _0.frq
> IW 10 [Root Thread]: now sync _0.prx
> IW 10 [Root Thread]: now sync _0.tis
> IW 10 [Root Thread]: now sync _0.tii
> IW 10 [Root Thread]: now sync _0.nrm
> IW 10 [Root Thread]: now sync _0.fdx
> IW 10 [Root Thread]: now sync _0.fdt
> IW 10 [Root Thread]: done all syncs
> IW 10 [Root Thread]: commit: pendingCommit != null
> IFD [Root Thread]: now checkpoint "segments_2" [1 segments ; isCommit = true]
> IFD [Root Thread]: deleteCommits: now decRef commit "segments_1"
> IFD [Root Thread]: delete "segments_1"
> IW 10 [Root Thread]: commit: done
> IW 10 [Root Thread]: at close: _0:C2->_0
> Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIIndexCreate
> FINER: RETURN 0
>
> Index created.
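For reference, the same two-document index can be built and queried with plain Lucene outside the OJVM in a handful of lines, and something along these lines would also serve as the standalone test Mike asks about further down. This is only a sketch against the stock 2.4 API: RAMDirectory, StandardAnalyzer, the class name TwoDocTest and the stored/analyzed field settings are illustrative assumptions, not the OJVMDirectory code.

  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;
  import org.apache.lucene.index.IndexWriter;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.IndexSearcher;
  import org.apache.lucene.search.TermQuery;
  import org.apache.lucene.search.TopDocs;
  import org.apache.lucene.store.RAMDirectory;

  public class TwoDocTest {
    public static void main(String[] args) throws Exception {
      RAMDirectory dir = new RAMDirectory();
      IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true,
          IndexWriter.MaxFieldLength.UNLIMITED);

      // same field values as in the FINE log above
      Document d1 = new Document();
      d1.add(new Field("F1", "001", Field.Store.YES, Field.Index.ANALYZED));
      d1.add(new Field("name", "ravi", Field.Store.YES, Field.Index.ANALYZED));
      d1.add(new Field("id", "01", Field.Store.YES, Field.Index.ANALYZED));
      writer.addDocument(d1);

      Document d2 = new Document();
      d2.add(new Field("F1", "003", Field.Store.YES, Field.Index.ANALYZED));
      d2.add(new Field("name", "murthy", Field.Store.YES, Field.Index.ANALYZED));
      d2.add(new Field("id", "03", Field.Store.YES, Field.Index.ANALYZED));
      writer.addDocument(d2);

      writer.close();  // commits and syncs the segment files

      IndexSearcher searcher = new IndexSearcher(dir);
      TopDocs hits = searcher.search(new TermQuery(new Term("name", "ravi")), null, 10);
      System.out.println("hits for name:ravi = " + hits.totalHits);
      searcher.close();
    }
  }

If this passes on a RAMDirectory but the same document stream fails through the BLOB-backed store, that points at the Directory implementation rather than at the indexing chain.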
> And when I am trying to read the index I got:
>
> INFO: Analyzer: [EMAIL PROTECTED]
> Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
> INFO: qryStr: DESC(name:ravi)
> Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
> INFO: storing cachingFilter: -1378376940 and searcher: 781713581 qryStr: DESC(name:ravi)
> Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex getSort
> INFO: using sort: <score>,<doc>
> Exception in thread "Root Thread" java.lang.IndexOutOfBoundsException: Index: 6, Size: 4
>         at java.util.ArrayList.RangeCheck(ArrayList.java)
>         at java.util.ArrayList.get(ArrayList.java)
>         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java)
>         at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java)
>         at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
>         at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
>         at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>         at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java)
>         at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java)
>         at org.apache.lucene.search.Similarity.idf(Similarity.java)
>         at org.apache.lucene.search.TermQuery$TermWeight.<init>(TermQuery.java)
>         at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java)
>         at org.apache.lucene.search.Query.weight(Query.java)
>         at org.apache.lucene.search.Hits.<init>(Hits.java:85)
>         at org.apache.lucene.search.Searcher.search(Searcher.java)
>         at org.apache.lucene.indexer.LuceneDomainIndex.ODCIStart(LuceneDomainIndex.java)
>
> Which definitely means that something is not being saved correctly in the OJVM directory BLOB storage. :(
> These are my files:
>
> SQL> select file_size,name from it1$t;
>
>  FILE_SIZE NAME
> ---------- ------------------------------
>         10 parameters
>          1 updateCount
>         28 segments_1
>         20 segments.gen
>          8 _0.frq
>          8 _0.prx
>        103 _0.tis
>         35 _0.tii
>         12 _0.nrm
>         22 _0.fnm
>         48 _0.fdt
>         20 _0.fdx
>         62 segments_2
>
> I'll add some debugging information to my classes that save/load the buffers, to see how many calls are made and with which arguments.
>   Marcelo.
>
> On Fri, Sep 26, 2008 at 1:41 PM, Michael McCandless <[EMAIL PROTECTED]> wrote:
> >
> > This one looks spooky!
> >
> > Is it easily repeated?  If you could print out which 2 terms you had tried
> > to delete, and then zip up the index just before deleting those docs (after
> > closing the writer) and send it to me, I can try to understand what's wrong
> > with the index.  It looks as if the *.tis file for one of the segments is
> > truncated.
> >
> > If you capture the series of add/update/delete documents, can you get a
> > standalone Java test to show this?
> >
> > Does this test create an entirely new index?
> >
> > We did change the index format in 2.4 to use "true" UTF8 encoding for all
> > text content; not sure that this applies here (to BufferedIndexInput it's
> > all bytes) but it may.
> >
> > BufferedIndexInput in general can do random IO, especially when reading the
> > term dict file (*.tis), when you
> >
> > Mike
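As Mike notes, the term dictionary is not read purely sequentially, so a quick way to see exactly which seeks and reads hit the BLOB-backed storage is to wrap the store's IndexInput in a logging delegate before instrumenting the save/load classes themselves. A rough sketch follows; the class name LoggingIndexInput and the idea of wiring it into OJVMDirectory are hypothetical, and only the IndexInput methods are stock Lucene API.

  import java.io.IOException;
  import org.apache.lucene.store.IndexInput;

  // Prints every read and seek against the wrapped IndexInput, so you can see
  // whether access to a given file is sequential or random.
  public class LoggingIndexInput extends IndexInput {
    private IndexInput delegate;
    private final String name;

    public LoggingIndexInput(String name, IndexInput delegate) {
      this.name = name;
      this.delegate = delegate;
    }

    public byte readByte() throws IOException {
      System.out.println(name + ": readByte @ " + delegate.getFilePointer());
      return delegate.readByte();
    }

    public void readBytes(byte[] b, int offset, int len) throws IOException {
      System.out.println(name + ": readBytes len=" + len + " @ " + delegate.getFilePointer());
      delegate.readBytes(b, offset, len);
    }

    public void seek(long pos) throws IOException {
      System.out.println(name + ": seek " + pos);
      delegate.seek(pos);
    }

    public long getFilePointer() { return delegate.getFilePointer(); }

    public long length() { return delegate.length(); }

    public void close() throws IOException { delegate.close(); }

    // TermInfosReader clones its input for random access, so a clone must
    // get its own clone of the delegate too.
    public Object clone() {
      LoggingIndexInput copy = (LoggingIndexInput) super.clone();
      copy.delegate = (IndexInput) delegate.clone();
      return copy;
    }
  }

Returning one of these from the Directory's openInput() instead of the raw input is usually enough to show whether a read runs past the length the store reports.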
> > Marcelo Ochoa wrote:
> >
> >> Michael:
> >>   I just started testing 2.4rc2 running inside the OJVM.
> >>   I found a similar stack trace during indexing:
> >>
> >> IW 3 [Root Thread]: flush: segment=_3 docStoreSegment=_3 docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false numDocs=2 numBufDelTerms=2
> >> IW 3 [Root Thread]: index before flush _1:C2->_1 _2:C2->_2
> >> IW 3 [Root Thread]: DW: flush postings as segment _3 numDocs=2
> >> IW 3 [Root Thread]: DW: oldRAMSize=111616 newFlushedSize=264 docs/MB=7,943.758 new/old=0.237%
> >> IW 3 [Root Thread]: DW: apply 2 buffered deleted terms and 0 deleted docIDs and 0 deleted queries on 3 segments.
> >> IW 3 [Root Thread]: hit exception flushing deletes
> >> Exception in thread "Root Thread" java.io.IOException: read past EOF
> >>         at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java)
> >>         at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
> >>         at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
> >>         at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
> >>         at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
> >>         at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
> >>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
> >>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
> >>         at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java)
> >>         at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java)
> >>         at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java)
> >>         at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:918)
> >>         at org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java)
> >>         at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java)
> >>         at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
> >>         at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
> >>         at org.apache.lucene.indexer.LuceneDomainIndex.sync(LuceneDomainIndex.java:1308)
> >>
> >> I'll reinstall with full debug info so I can see all the line numbers in the Lucene Java code.
> >> Is there a list of semantic changes to the BufferedIndexInput code?
> >> I mean, whether it now does sequential or random reads, for example.
> >> Anyway, I just compiled with the latest code and ran my test suites; I'll investigate the problem a bit more.
> >> Best regards, Marcelo.
> >>
> >> On Fri, Sep 26, 2008 at 7:32 AM, Michael McCandless <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Can you describe the sequence of steps that your replication process goes through?
> >>>
> >>> Also, which filesystem is the index being accessed through?
> >>>
> >>> Mike
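For replication setups like the one described below, one way to keep the slave-side "read past EOF" out of searches is to verify that a freshly copied index actually opens before promoting it, and to keep serving the previous reader otherwise. The sketch below is generic and makes assumptions: the copy is synced into its own directory, and the class name SafeReaderSwap and the swap logic are invented for illustration; only the Lucene calls (FSDirectory, IndexReader.open, IndexSearcher) are real API.

  import java.io.File;
  import java.io.IOException;
  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.search.IndexSearcher;
  import org.apache.lucene.store.Directory;
  import org.apache.lucene.store.FSDirectory;

  public class SafeReaderSwap {
    private volatile IndexSearcher current;  // the searcher query threads use

    // Called after each sync from the master has finished.
    public void maybeSwap(String newCopyPath) {
      try {
        Directory dir = FSDirectory.getDirectory(new File(newCopyPath));
        IndexReader newReader = IndexReader.open(dir);  // typically fails fast if the copy is incomplete
        IndexSearcher old = current;
        current = new IndexSearcher(newReader);
        if (old != null) {
          old.close();  // in real code, wait for in-flight searches to finish first
        }
      } catch (IOException e) {
        // keep serving the previous searcher; this copy is not usable yet
        System.err.println("new index copy not usable: " + e);
      }
    }

    public IndexSearcher getSearcher() {
      return current;
    }
  }

For a deeper consistency check of a copied index, Lucene 2.3+ also ships org.apache.lucene.index.CheckIndex, which can be pointed at the copied directory.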
> >>> rahul_k123 wrote:
> >>>
> >>>> First of all, thanks to all the people who helped me get the Lucene
> >>>> replication setup working; right now it's live in our production. :-)
> >>>>
> >>>> Everything is working fine, except that I am seeing some exceptions on the slaves.
> >>>>
> >>>> The following is the one that occurs most often on the slaves:
> >>>>
> >>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> >>>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> >>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> >>>>         at java.lang.Thread.run(Thread.java:619)
> >>>> Caused by: com.IndexingException: [SYSTEM_ERROR] Cannot access index [data_dir/index]: [read past EOF]
> >>>>         at com.lucene.LuceneSearchService.getSearchResults(LuceneSearchService.java:964)
> >>>>         ... 12 more
> >>>> Caused by: java.io.IOException: read past EOF
> >>>>         at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146)
> >>>>         at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
> >>>>         at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:66)
> >>>>         at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:89)
> >>>>         at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:147)
> >>>>         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659)
> >>>>         at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
> >>>>         at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
> >>>>
> >>>> and the second one is:
> >>>>
> >>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> >>>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> >>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> >>>>         at java.lang.Thread.run(Thread.java:619)
> >>>> Caused by: java.lang.IllegalArgumentException: attempt to access a deleted document
> >>>>         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:657)
> >>>>         at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
> >>>>         at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
> >>>>
> >>>> This one is on the master index.
> >>>>
> >>>> Any help is appreciated.
> >>>>
> >>>> Thanks.
> >>>>
> >>>> --
> >>>> View this message in context:
> >>>> http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19682684.html
> >>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
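The second trace, "attempt to access a deleted document", is what SegmentReader.document() throws when stored fields are requested for a doc ID that the reader sees as deleted; it usually means the doc ID was obtained from an older reader (or cached) and reused after deletes were applied. A generic guard along these lines (not the com.lucene.LuceneSearchService code) avoids the exception:

  import java.io.IOException;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.index.IndexReader;

  public class LiveDocHelper {
    // Only load stored fields for doc IDs that are live in *this* reader;
    // IDs carried over from an older reader may point at deleted documents.
    public static Document loadIfLive(IndexReader reader, int docId) throws IOException {
      if (reader.isDeleted(docId)) {
        return null;
      }
      return reader.document(docId);
    }
  }

The longer-term fix is usually to re-run the query against the current reader rather than caching doc IDs, since Lucene doc IDs are only stable for the lifetime of a single reader.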
> >>
> >> --
> >> Marcelo F. Ochoa
> >> http://marceloochoa.blogspot.com/
> >> http://marcelo.ochoa.googlepages.com/home
> >> ______________
> >> Do you Know DBPrism? Look @ DB Prism's Web Site
> >> http://www.dbprism.com.ar/index.html
> >> More info?
> >> Chapter 17 of the book "Programming the Oracle Database using Java & Web Services"
> >> http://www.amazon.com/gp/product/1555583296/
> >> Chapter 21 of the book "Professional XML Databases" - Wrox Press
> >> http://www.amazon.com/gp/product/1861003587/
> >> Chapter 8 of the book "Oracle & Open Source" - O'Reilly
> >> http://www.oreilly.com/catalog/oracleopen/
>
> --
> Marcelo F. Ochoa
> http://marceloochoa.blogspot.com/
> http://marcelo.ochoa.googlepages.com/home
> ______________
> Want to integrate Lucene and Oracle?
> http://marceloochoa.blogspot.com/2007/09/running-lucene-inside-your-oracle-jvm.html
> Is Oracle 11g REST ready?
> http://marceloochoa.blogspot.com/2008/02/is-oracle-11g-rest-ready.html

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]