Re: What does "found existing value for PerFieldPostingsFormat.format" mean?
We already have CheckIndex, which verifies that Fields.iterator() returns a sorted iterator, so I think we should improve the javadocs of Fields.iterator() to make this explicit.

On Tue, Oct 18, 2016 at 05:15, Trejkaza wrote:

> Continuation, found a bug but I'm not sure whether it's in Lucene or
> Lucene's Javadoc.
>
> In MultiFields:
>
>     @SuppressWarnings({"unchecked","rawtypes"})
>     @Override
>     public Iterator<String> iterator() {
>       Iterator<String> subIterators[] = new Iterator[subs.length];
>       for(int i=0;i<subs.length;i++) {
>         subIterators[i] = subs[i].iterator();
>       }
>       return new MergedIterator<>(subIterators);
>     }
>
> MergedIterator says in the Javadoc:
>
>     "The behavior is undefined if the iterators are not actually sorted."
>
> And indeed, the iterators are _not_ actually sorted. So I look at
> where they come from, Fields#iterator(), which is documented fairly
> tersely:
>
>     "Returns an iterator that will step through all fields names.
>     This will not return null."
>
> Which doesn't say anything about the names being in order. So I assume
> that either:
>
> (a) Fields#iterator() is actually supposed to be sorted and the
>     documentation should specify it but doesn't, or
>
> (b) Fields#iterator() is not supposed to be sorted, but either
>     MultiFields#iterator() or MergedIterator is supposed to be handling
>     this better.
>
> Either way, I think it's a bug in Lucene. But since I don't know which
> direction it's in, and I don't have a reproducible test case I can
> just hand over, I can't easily file it. :/
>
> TX
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
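To make the ordering requirement concrete, here is a minimal sketch of the kind of check being described: walk an iterator of field names and confirm each name is strictly greater than the previous one. This is not CheckIndex's code. The class and method names are hypothetical, and comparing with String.compareTo is an assumption about how the order is defined.

```java
import java.util.Iterator;

// Hypothetical helper, not part of Lucene: reports whether an iterator of
// field names yields strictly increasing values (no duplicates, no regressions).
final class FieldOrderCheck {
    static boolean isStrictlySorted(Iterator<String> names) {
        String previous = null;
        while (names.hasNext()) {
            String current = names.next();
            // Assumption: ordering is defined by String.compareTo.
            if (previous != null && previous.compareTo(current) >= 0) {
                return false;
            }
            previous = current;
        }
        return true;
    }
}
```

A custom reader whose field iterator fails a check like this would violate the contract that the merging code relies on, even though the current Javadoc does not state that contract.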
Re: What does "found existing value for PerFieldPostingsFormat.format" mean?
Continuation, found a bug but I'm not sure whether it's in Lucene or Lucene's Javadoc.

In MultiFields:

    @SuppressWarnings({"unchecked","rawtypes"})
    @Override
    public Iterator<String> iterator() {
      Iterator<String> subIterators[] = new Iterator[subs.length];
      for(int i=0;i<subs.length;i++) {
        subIterators[i] = subs[i].iterator();
      }
      return new MergedIterator<>(subIterators);
    }

MergedIterator says in the Javadoc:

    "The behavior is undefined if the iterators are not actually sorted."

And indeed, the iterators are _not_ actually sorted. So I look at where they come from, Fields#iterator(), which is documented fairly tersely:

    "Returns an iterator that will step through all fields names.
    This will not return null."

Which doesn't say anything about the names being in order. So I assume that either:

(a) Fields#iterator() is actually supposed to be sorted and the documentation should specify it but doesn't, or

(b) Fields#iterator() is not supposed to be sorted, but either MultiFields#iterator() or MergedIterator is supposed to be handling this better.

Either way, I think it's a bug in Lucene. But since I don't know which direction it's in, and I don't have a reproducible test case I can just hand over, I can't easily file it. :/

TX
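To illustrate why a merging iterator depends on sorted inputs, here is a small standalone sketch of a k-way merge that removes duplicates by comparing only against the last value emitted. It is a stand-in written for this thread, not Lucene's MergedIterator; the class name MergeSketch and its structure are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for a merging iterator; NOT Lucene's MergedIterator.
final class MergeSketch {
    // One cursor per input iterator, holding that iterator's current head.
    private static final class Cursor {
        final Iterator<String> it;
        String head;
        Cursor(Iterator<String> it) { this.it = it; this.head = it.next(); }
    }

    static List<String> merge(List<Iterator<String>> inputs) {
        PriorityQueue<Cursor> queue =
            new PriorityQueue<Cursor>(Comparator.comparing((Cursor c) -> c.head));
        for (Iterator<String> it : inputs) {
            if (it.hasNext()) queue.add(new Cursor(it));
        }
        List<String> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor c = queue.poll();
            // Duplicates are detected only against the last emitted value,
            // which is sufficient only when every input is sorted.
            if (out.isEmpty() || !out.get(out.size() - 1).equals(c.head)) {
                out.add(c.head);
            }
            if (c.it.hasNext()) {
                c.head = c.it.next();
                queue.add(c);
            }
        }
        return out;
    }
}
```

With sorted inputs such as ("a", "c") and ("b", "c"), this yields a, b, c. With an unsorted input such as ("c", "a") merged with ("b"), it yields b, c, a: the output is no longer ordered, so anything downstream that assumes ordered, unique field names can misbehave.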
Re: What does "found existing value for PerFieldPostingsFormat.format" mean?
Additional investigation:

The index has two segments. Both segments have this "path-position" in the FieldInfo only once. The settings look the same:

FieldInfo in first sub-reader:

    name = "path-position"
    number = 6
    docValuesType = NONE
    storeTermVector = false
    omitNorms = true
    indexOptions = DOCS_AND_FREQS_AND_POSITIONS
    storePayloads = false
    attributes =
        "PerFieldPostingsFormat.format" -> "Lucene50"
        "PerFieldPostingsFormat.suffix" -> "0"
    dvGen = -1

FieldInfo in second sub-reader:

    name = "path-position"
    number = 6
    docValuesType = NONE
    storeTermVector = false
    omitNorms = true
    indexOptions = DOCS_AND_FREQS_AND_POSITIONS
    storePayloads = false
    attributes =
        "PerFieldPostingsFormat.format" -> "Lucene50"
        "PerFieldPostingsFormat.suffix" -> "0"
    dvGen = -1

So I'm confused. addIndexes, I thought, merged the data from the given readers into the destination writer. And here I have two fields with the same name, number and every other setting, and somehow it's failing to merge them: when it gets to the second one, it fails because the first one existed already... which to me seems like the point of merging, but maybe that's just me.

TX
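The wording of the exception (old=Lucene50, new=Lucene50) suggests a guard that refuses to set an attribute that is already present on the field, even when the new value is identical. The sketch below is a simplified standalone model of such a guard, not Lucene's code; the class AttributeGuardSketch, the FieldAttrs type and the recordFormat method are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a "set once" attribute guard; NOT Lucene's implementation.
final class AttributeGuardSketch {
    static final String FORMAT_KEY = "PerFieldPostingsFormat.format";

    // Stand-in for a field's metadata: a name plus a string attribute map.
    static final class FieldAttrs {
        final String name;
        final Map<String, String> attributes = new HashMap<>();
        FieldAttrs(String name) { this.name = name; }
    }

    // Records the format name, failing if any value was already present,
    // even one equal to the new value.
    static void recordFormat(FieldAttrs field, String formatName) {
        String previous = field.attributes.put(FORMAT_KEY, formatName);
        if (previous != null) {
            throw new IllegalStateException("found existing value for " + FORMAT_KEY
                + ", field=" + field.name + ", old=" + previous + ", new=" + formatName);
        }
    }
}
```

Under this model, calling recordFormat twice on the same field object fails on the second call regardless of the value, which would match a situation where the same field is presented to the writer more than once during a single merge.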
What does "found existing value for PerFieldPostingsFormat.format" mean?
Hi all.

Does anyone know what this error message means?

    found existing value for PerFieldPostingsFormat.format, field=path-position, old=Lucene50, new=Lucene50
    java.lang.IllegalStateException: found existing value for PerFieldPostingsFormat.format, field=path-position, old=Lucene50, new=Lucene50
        at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.write(PerFieldPostingsFormat.java:170)
        at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
        at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:193)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:95)
        at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:2629)

We're doing fancy migrations which perform index changes by overriding FilterCodecReader and copying into a new index, but in this particular case the migration is only *deleting* values from the index, so it seems odd that I'd get this particular error.

TX