Clustering unstructured text data

lovely kasi Sun, 13 Oct 2013 07:43:34 -0700

Hi,

I have gone through the Solr tutorial, but I could only find examples of
indexing JSON data.
I want to index and cluster unstructured text data.
For example, I have a folder containing 10 text files. Each text file
contains 10 lines of text, which is a conversation between a customer and
an executive.
I want each file (i.e. all 10 lines) to be treated as a single document
and indexed as one.
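
To make the "one file = one document" requirement concrete, here is a
minimal sketch in plain Python. It does not use Solr at all. The function
name load_documents and the *.txt pattern are assumptions made only for
illustration.

```python
from pathlib import Path


def load_documents(folder):
    """Return {file name: full text}, with one entry per .txt file.

    Hypothetical helper: all lines of a file are joined into a single
    string, so each file is treated as exactly one document.
    """
    documents = {}
    for path in sorted(Path(folder).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        # Drop empty lines and join the rest with single spaces.
        documents[path.name] = " ".join(
            line.strip() for line in lines if line.strip()
        )
    return documents
```

Each value in the returned dictionary is the text that would then be sent
to the indexer as the body of a single document.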



For example:

I have input text documents with data like the following:

Document1: This is the first document of selling information.
Document2: This is the second document of gathering information.

I also have a separate lookup file with data like the following:
selling:CatA
gathering:CatB
information:CatC

Now I would like to cluster the documents, with the output generated as:
Document1:CatA,CatC
Document2:CatB,CatC
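
The mapping I expect can be sketched in plain Python as below. This is
only an illustration of the desired result, not a Solr solution, and it is
really keyword-based tagging rather than statistical clustering. The names
parse_lookup and categorize are hypothetical, and the sketch assumes exact
whole-word, case-insensitive matches against the lookup file.

```python
import re

# Sample data copied from the example in this message.
LOOKUP_LINES = ["selling:CatA", "gathering:CatB", "information:CatC"]
DOCUMENTS = {
    "Document1": "This is the first document of selling information.",
    "Document2": "This is the second document of gathering information.",
}


def parse_lookup(lines):
    """Turn 'keyword:Category' lines into a {keyword: category} dict."""
    lookup = {}
    for line in lines:
        keyword, sep, category = line.partition(":")
        if sep:  # skip lines that have no colon
            # Tolerate stray whitespace and a trailing period.
            lookup[keyword.strip().lower()] = category.strip().rstrip(".")
    return lookup


def categorize(text, lookup):
    """Return the categories of all lookup keywords found in text,
    in order of first appearance and without duplicates."""
    categories = []
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        category = lookup.get(word)
        if category and category not in categories:
            categories.append(category)
    return categories


lookup = parse_lookup(LOOKUP_LINES)
for name, text in DOCUMENTS.items():
    print(f"{name}:{','.join(categorize(text, lookup))}")
# prints:
# Document1:CatA,CatC
# Document2:CatB,CatC
```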

Please let me know how to achieve this.
