I just tried out the jena-text indexing and query capabilities of Jena 2.11. Great stuff, but the property values I indexed contain part numbers that frequently contain hyphens. Apparently Lucene's StandardAnalyzer splits text into separate tokens at hyphens, so each part number seems to be indexed as several fragments rather than as one term, and my initial search results were quite puzzling.
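To make the suspected effect concrete, here is a rough sketch in plain Java. It does not call Lucene at all: wordSplitTokens and keywordTokens are hypothetical stand-ins that only approximate what a word-splitting analyzer and a keyword-style analyzer would do, and the part number is made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class HyphenTokenSketch {

    // Stand-in for a word-splitting analyzer: lowercases the text and breaks
    // it at every run of characters that are not letters or digits.
    // Assumption: this only approximates the effect, it is not Lucene's code.
    public static List<String> wordSplitTokens(String text) {
        List<String> tokens = new ArrayList<>();
        String lowered = text.toLowerCase(Locale.ROOT);
        for (String part : lowered.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Stand-in for a keyword-style analyzer: the whole value is kept,
    // unchanged, as a single token.
    public static List<String> keywordTokens(String text) {
        return Collections.singletonList(text);
    }

    public static void main(String[] args) {
        String partNumber = "AB-1234-X"; // hypothetical part number
        System.out.println(wordSplitTokens(partNumber)); // prints [ab, 1234, x]
        System.out.println(keywordTokens(partNumber));   // prints [AB-1234-X]
    }
}
```

If the index really holds fragments like the first list, a search for the full part number has no single term to match, which would explain the puzzling results. With the whole value stored as one token, an exact or prefix match on the complete part number becomes possible.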
However, even with the limited results, I can see that the text queries are much faster than strstarts() or regex() filters on the same property values. So I would like to try indexing the property values with Lucene's KeywordAnalyzer instead. I think I can see in the code how this could easily be done.

Has anyone else encountered this problem? Have I missed some other way to improve response time for a filtered string search, or overestimated the possible performance improvement? (I'm new to Lucene.) Would the developers consider an enhancement to make the choice of analyzer configurable in the text assembler?

Regards,
--Paul
