I have a requirement to match company names, and I am thinking of solving it with a clustering technique.
Can anybody suggest which algorithm I should use in Spark, and how to evaluate the running time and accuracy for this particular problem? I looked at K-means and it looks good. Any ideas or suggestions?