[ https://issues.apache.org/jira/browse/LUCENE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744092#action_12744092 ]
DM Smith commented on LUCENE-1813:
----------------------------------

I like the idea of a constant, presented as the default. I suggest that other candidate marker characters be listed in the JavaDoc.

Some of my texts use Private Use Area (PUA) code points until Unicode includes the characters they need (e.g. Myanmar text), so I'm glad that allowing a choice of marker avoids a potential conflict there. I think the PUA should be left to the text author.

As my texts are all derived from XML, I like the use of a character that is not allowed in XML. I think U+0001 is just fine, even if it is not ideal from a purity perspective.

Some of my texts have BIDI markers, and while these will be stripped by filters, I don't think this use is analogous.

> Add option to ReverseStringFilter to mark reversed tokens
> ---------------------------------------------------------
>
>                 Key: LUCENE-1813
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1813
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>    Affects Versions: 2.9
>            Reporter: Andrzej Bialecki
>            Assignee: Robert Muir
>             Fix For: 2.9
>
>         Attachments: LUCENE-1813.patch, reverseMark-2.patch, reverseMark.patch
>
>
> This patch implements additional functionality in the filter to "mark"
> reversed tokens with a special marker character (Unicode 0001). This is
> useful when indexing both straight and reversed tokens (e.g. to implement
> efficient leading wildcards search).
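
To illustrate the marking idea from the issue description, here is a minimal sketch in plain Java. It does not use the Lucene API and is not the code from the attached patches; the class name ReverseMarkSketch, the constant MARKER, and both helper methods are hypothetical names chosen for this example. It assumes the marker is placed in front of the reversed token, so that a leading-wildcard pattern such as "*ing" can be rewritten as a prefix pattern that only marked (reversed) tokens can match.

```java
public class ReverseMarkSketch {
    // Hypothetical constant: U+0001, the default marker proposed in the issue.
    public static final char MARKER = '\u0001';

    // Reverse a token and put the marker in front of it (assumed placement).
    public static String markReversed(String token) {
        return MARKER + new StringBuilder(token).reverse().toString();
    }

    // Rewrite a leading-wildcard pattern ("*ing") as a prefix pattern
    // over marked reversed tokens (marker + "gni" + "*").
    public static String toPrefixPattern(String leadingWildcard) {
        if (!leadingWildcard.startsWith("*")) {
            throw new IllegalArgumentException("expected a leading '*'");
        }
        return markReversed(leadingWildcard.substring(1)) + "*";
    }

    public static void main(String[] args) {
        String reversedToken = markReversed("running"); // indexed next to "running"
        String pattern = toPrefixPattern("*ing");
        String prefix = pattern.substring(0, pattern.length() - 1);

        // The marked reversed token matches the prefix.
        System.out.println(reversedToken.startsWith(prefix));
        // A straight (unmarked) token that happens to start with "gni" does not.
        System.out.println("gnite".startsWith(prefix));
    }
}
```

Because the marker never occurs in ordinary text (U+0001 is not a legal XML 1.0 character, as noted in the comment above), straight and reversed tokens can share a field without a prefix search on one accidentally matching the other.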