Hi All

I have around 20 million company names that I want to index.
Currently I tokenize each name, apply Metaphone 3 to each token, and store 
each encoded token in HBase.
When a new query (a company name to match) arrives, I tokenize it and apply 
Metaphone 3 in the same way as at indexing time.
Then I query HBase once per token and collate the results.
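
To make the flow above concrete, here is a minimal Python sketch of it. Everything in it is a stand-in chosen for illustration: `phonetic_key` is a crude placeholder and NOT Metaphone 3, and `TokenIndex` is an in-memory dict standing in for the HBase table. The names are hypothetical and not taken from my actual code.

```python
import re
from collections import defaultdict


def tokenize(name):
    """Lowercase the name and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", name.lower())


def phonetic_key(token):
    """Crude placeholder for Metaphone 3 (NOT the real algorithm):
    keep the first character, drop later vowels, collapse repeated characters."""
    out = token[0]
    for ch in token[1:]:
        if ch in "aeiou":
            continue
        if ch != out[-1]:
            out += ch
    return out


class TokenIndex:
    """In-memory stand-in for the HBase table: phonetic key -> set of company ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.names = {}

    def add(self, company_id, name):
        """Index one company name under the phonetic key of each of its tokens."""
        self.names[company_id] = name
        for tok in tokenize(name):
            self.postings[phonetic_key(tok)].add(company_id)

    def search(self, query):
        """One lookup per distinct query key, then rank by number of matched keys."""
        hits = defaultdict(int)
        for key in {phonetic_key(t) for t in tokenize(query)}:
            for cid in self.postings.get(key, ()):
                hits[cid] += 1
        return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


index = TokenIndex()
index.add(1, "Microsoft Corporation")
index.add(2, "Apple Inc")
results = index.search("Micrsoft Corporation")
```

Each query token costs one lookup, and the collation step has to merge the per-token result sets on the client side, which is the part that feels inefficient at 20 million names.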

This seems inefficient and still has some issues, even after implementing the 
functionality of 
WordDelimiterFilterFactory<http://stackoverflow.com/questions/17707733/worddelimiterfilterfactory-not-including-all-permutations>
 and ShingleFilterFactory.
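
For reference, a rough Python sketch of the kind of token expansion meant here. It assumes the second filter is a shingle filter, and it only approximates the idea of the Solr filters (splitting on delimiters, case changes and letter/digit boundaries, then joining adjacent tokens); it does not reproduce their actual behavior or options. The function names are hypothetical.

```python
import re


def split_word_parts(token):
    """Split on non-alphanumerics, case changes, and letter/digit boundaries."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", token)


def shingles(tokens, max_size=2):
    """Return runs of 2..max_size adjacent tokens, each joined by a space."""
    out = []
    for size in range(2, max_size + 1):
        for i in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[i:i + size]))
    return out


parts = [p.lower() for p in split_word_parts("PowerShot-SD500")]
pairs = shingles(parts)
```

Every extra part and shingle becomes another key to store and another lookup at query time, so the expansion multiplies the per-token cost described above.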

Would it make sense to index these company names in Solr instead, since all of 
this functionality is already there?

Is there support for Spark?
