[ https://issues.apache.org/jira/browse/LUCENE-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe Schindler updated LUCENE-2514:
----------------------------------

    Attachment: LUCENE-2514.patch

Here is Robert's patch with MTQ changed. It currently still uses placeholderTerms so that it does not need to intern every time. If we remove string interning from Term, we can replace this with a simple new Term() in MTQ.

I delayed cloning of the BytesRef until it is put into a TermQuery or PQ, or whenever it is set aside. It is no longer cloned if, for example, the term is never accepted by the PQ. Also, the PQ reuses its ScoreTerm instances, so the term bytes are simply copied over :-)

I also removed a Java 1.6 interface override - the Generics Policeman gives a ticket! I don't understand where those come from; Java 1.6 should also fail to compile, as the ant build uses -source 1.5...?

> Change Term to use bytes
> ------------------------
>
>                 Key: LUCENE-2514
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2514
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: Search
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>         Attachments: LUCENE-2514-surrogates-dance.patch, LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch
>
>
> in LUCENE-2426, the sort order was changed to codepoint order.
> unfortunately, Term is still using string internally, and more importantly its compareTo() uses the wrong order [utf-16].
> So MultiTermQuery, etc (especially its priority queues) are currently wrong.
> By changing Term to use bytes, we can also support terms encoded as bytes such as numerics, instead of using strange string encodings.
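
To illustrate the ordering problem described in the issue: String.compareTo() compares UTF-16 code units, while comparing the UTF-8 bytes of two terms as unsigned values yields codepoint order. The two disagree whenever a supplementary character (encoded as a surrogate pair) is compared against a BMP character above the surrogate range. The following is only a minimal sketch of that mismatch, not Lucene code: the class TermOrderDemo and its method names are hypothetical, and it uses StandardCharsets (Java 7+) for brevity rather than the 1.5 source level of the build.

```java
import java.nio.charset.StandardCharsets;

public class TermOrderDemo {

    /** Compares by UTF-16 code units, which is what String.compareTo() does. */
    public static int compareUtf16(String a, String b) {
        return a.compareTo(b);
    }

    /** Compares the UTF-8 encodings byte by byte, treating each byte as unsigned. */
    public static int compareUtf8Bytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Mask to 0..255 so that bytes >= 0x80 are not treated as negative.
            int diff = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // A proper prefix sorts before the longer term.
        return x.length - y.length;
    }

    public static void main(String[] args) {
        String bmp = "\uFF61";                 // U+FF61, one UTF-16 code unit
        String supplementary = "\uD800\uDC00"; // U+10000, a surrogate pair

        // UTF-16 order: 0xFF61 > 0xD800, so the BMP term sorts AFTER U+10000.
        System.out.println(compareUtf16(bmp, supplementary) > 0);     // prints "true"

        // UTF-8 byte order: EF BD A1 < F0 90 80 80, matching codepoint order.
        System.out.println(compareUtf8Bytes(bmp, supplementary) < 0); // prints "true"
    }
}
```

A priority queue ordered by the first comparison would therefore rank these two terms in the opposite order from an index sorted by the second.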