I was just able to index all of wikipedia, using StandardAnalyzer, with assertions enabled, without hitting that exception. Which analyzer are you using (besides your payload field)?
Mike Michael McCandless <luc...@mikemccandless.com> wrote: > Hmmmm. > > Jason is this easily/compactly repeated? EG, try to index the N docs > before that one. > > If you remove the SinglePayloadTokenStream field, does the exception > still happen? > > Mike > > Jason Rutherglen <jason.rutherg...@gmail.com> wrote: >> While indexing using >> contrib/org.apache.lucene.benchmark.byTask.feeds.EnwikiDocMaker. The >> asserion error is from TermsHashPerField.comparePostings(RawPostingList p1, >> RawPostingList p2). A Payload is added to the document representing a UID. >> Only 1-2 out of 1 million documents indexed generates this error. >> >> java.lang.AssertionError >> problem adding >> doc:Document<stored/uncompressed,indexed,tokenized<body:[[Image:Croatia, >> Washington.JPG|right|250px|thumb|The Croatian embassy]] The '''Croatian >> Embassy in Washington''' is the [[embassy]] of [[Croatia]] in [[Washington, >> D.C.]] It is located on [[Embassy Row]] at 2343 [[Massachusetts Avenue >> (Washington, DC)|Massachusetts Avenue]], [[Washington DC >> (northwest)|Northwest]] near [[Dupont Circle]]. Previously the building had >> been home to the [[Austrian Embassy in Washington|Austrian embassy]], but >> they left for larger quarters and sold the structure to Croatia in 1993. >> The purchase and renovation of the building was largely paid for by the >> [[Croatian-American]] community. In front of the embassy is a large >> sculpture of [[St. Jerome]] by Croatian sculptor [[Ivan Me?trovi?]]. >> ==External link== *[http://www.croatiaemb.org/ Official site] >> [[Category:Embassies in Washington|Croatia]] [[Category:Foreign relations of >> Croatia]]> stored/uncompressed,indexed,tokenized<doctitle:Embassy of Croatia >> in Washington> stored/uncompressed,indexed,tokenized<docdate:29-JUN-2006 >> 07:27:44.000> stored/uncompressed,indexed,omitNorms<docid:1703107> >> indexed,tokenized<_ID:proj.zoie.api.zoieindexreader$singlepayloadtokenstr...@e7b3cf> >> indexed<id:667162>> ex: java.lang.AssertionError >> at >> org.apache.lucene.index.TermsHashPerField.comparePostings(TermsHashPerField.java:228) >> at >> org.apache.lucene.index.TermsHashPerField.quickSort(TermsHashPerField.java:144) >> at >> org.apache.lucene.index.TermsHashPerField.sortPostings(TermsHashPerField.java:136) >> at >> org.apache.lucene.index.FreqProxFieldMergeState.<init>(FreqProxFieldMergeState.java:51) >> at >> org.apache.lucene.index.FreqProxTermsWriter.appendPostings(FreqProxTermsWriter.java:202) >> at >> org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:132) >> at org.apache.lucene.index.TermsHash.flush(TermsHash.java:145) >> at org.apache.lucene.index.DocInverter.flush(DocInverter.java:74) >> at >> org.apache.lucene.index.DocFieldConsumers.flush(DocFieldConsumers.java:75) >> at >> org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:60) >> at >> org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:574) >> at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3533) >> at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3442) >> at >> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1922) >> at >> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1880) >> > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org