Hi,

I have an RDD that stores the titles of some articles:
1. "About Spark Streaming"
2. "About Spark MLlib"
3. "About Spark SQL"
4. "About Spark Installation"
5. "Kafka Streaming"
6. "Kafka Setup"
7. ....

I need to build a model that finds titles by similarity.
For example, given "About Spark", I hope to get:

"About Spark Installation", 0.98622 (where 0.98622 is the similarity
score, ranging from 0 to 1)
"About Spark MLlib", 0.95394
"About Spark Streaming", 0.94332
"About Spark SQL", 0.9111
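
To make the request concrete, here is a rough sketch in plain Python (no
Spark, no RDD) of the kind of scoring I mean. It assumes a simple cosine
similarity over word counts, which is only my guess at a reasonable
measure. The function names (tokenize, cosine_similarity, most_similar)
are made up for illustration, and the scores above are just examples, so
this does not reproduce them.

```python
import math
from collections import Counter


def tokenize(title):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return title.lower().split()


def cosine_similarity(a, b):
    # Cosine similarity between the word-count vectors of two titles.
    # Counts are non-negative, so the result lies between 0 and 1.
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, titles, top_n=4):
    # Score every title against the query and keep the best top_n.
    scored = [(t, cosine_similarity(query, t)) for t in titles]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


titles = [
    "About Spark Streaming",
    "About Spark MLlib",
    "About Spark SQL",
    "About Spark Installation",
    "Kafka Streaming",
    "Kafka Setup",
]

for title, score in most_similar("About Spark", titles):
    print(title, round(score, 5))
```

With plain word counts, all four "About Spark ..." titles tie at about
0.8165 and both Kafka titles score 0, so I would need a finer measure
(and something that scales on an RDD) to get a ranking like the one above.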

Any ideas or references on how to do this?

Thanks
Ascot





