Hi,

Maybe look at the factory class to see how the types argument is handled?
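
To illustrate the idea, here is a rough, untested-against-Lucene sketch of building a full char-type table. As far as I can tell, the factory first fills the table with the default type of every character and only then applies the custom mappings; a table that sets only a few entries would leave everything else at 0, which may be why a hand-built table appears to have no effect. The class and method names (CharTypeTableSketch, defaultType, buildTable) are made up for this example, the constants are assumed to mirror the ones on WordDelimiterFilter (use those in real code), and defaultType only approximates Lucene's own default classification.

```java
// Sketch only: builds a char-type table of the shape WordDelimiterFilter takes.
final class CharTypeTableSketch {

    // Assumption: these mirror WordDelimiterFilter.LOWER, UPPER, DIGIT,
    // SUBWORD_DELIM and ALPHA. In real code, reference the Lucene constants.
    static final byte LOWER = 0x01;
    static final byte UPPER = 0x02;
    static final byte DIGIT = 0x04;
    static final byte SUBWORD_DELIM = 0x08;
    static final byte ALPHA = 0x03; // LOWER | UPPER

    // Hypothetical helper: approximates the default classification so that
    // every slot in the table gets a sensible type, not just the overrides.
    static byte defaultType(int ch) {
        if (Character.isLowerCase(ch)) return LOWER;
        if (Character.isUpperCase(ch)) return UPPER;
        if (Character.isDigit(ch)) return DIGIT;
        return SUBWORD_DELIM; // everything else splits words by default
    }

    // Fill all 256 slots with defaults first, then mark the characters in
    // keepAsLetters as ALPHA so they are no longer treated as delimiters.
    static byte[] buildTable(String keepAsLetters) {
        byte[] table = new byte[256];
        for (int ch = 0; ch < table.length; ch++) {
            table[ch] = defaultType(ch);
        }
        for (int i = 0; i < keepAsLetters.length(); i++) {
            char c = keepAsLetters.charAt(i);
            if (c < table.length) {
                table[c] = ALPHA;
            }
        }
        return table;
    }

    public static void main(String[] args) {
        byte[] table = buildTable("+");
        System.out.println("'+' -> " + table['+']); // ALPHA, so 'C++' should stay together
        System.out.println("'-' -> " + table['-']); // still a delimiter, so 'e-mail' splits
        System.out.println("'e' -> " + table['e']); // ordinary lowercase letter
    }
}
```

The resulting array would then go into the 'byte[] charTypeTable' parameter of the WordDelimiterFilter constructor, together with whatever flags you need for the 'email' / 'e mail' / 'e-mail' variants. I have not run this against 4.4.0, so please check the constant values and the default handling against the factory source.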

Ahmet


On Friday, March 17, 2017 11:05 PM, "pha...@mailbox.org" <pha...@mailbox.org> 
wrote:



Hi,


I am trying to index words like 'e-mail' as 'email', 'e mail' and 'e-mail' with 
Lucene 4.4.0.


Lucene's WordDelimiterFilter should be ideal for this. However, it treats 
every(?) non-alphanumeric character as a delimiter. So, terms like 'C++' are 
transformed to 'C', which is not what I want.


Apparently, Solr allows custom delimiters to be specified. But how can I do 
this in plain Lucene?


I have looked into the documentation, and the 'byte[] charTypeTable' parameter 
of the constructor looked promising. But specifying some delimiters in a 
charTypeTable seems to have no effect.


Thank you!


---------------------------------------------------------------------

To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org

For additional commands, e-mail: java-user-h...@lucene.apache.org
