Hello, I'm not sure whether 'cluster' is the right term, but here is what I would like to do. I have a small set of categories, and I have manually defined some keywords for each category, e.g.:
- spielberg: ET, munich, indiana jones
- sport: football, basket, volley, etc.

I also have a fairly large archive of documents (HTML, PDF, DOC; about 5000 and still growing), and I want to assign each document to those categories, possibly using Lucene (if it can help!). What approach could I adopt?

Thanks,
Valerio

--
To Iterate is Human, to Recurse, Divine
James O. Coplien, Bell Labs
(how good it is to be human indeed)
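
To make the question concrete, here is a minimal sketch in plain Java of the kind of keyword-based assignment described above. It is only an illustration under several assumptions: the text has already been extracted from each HTML, PDF, or DOC file; matching is case-insensitive and on whole words or phrases; and a document belongs to every category for which at least one keyword occurs. The class and method names (`KeywordCategorizer`, `score`, `categorize`) are hypothetical, and the sketch does not use Lucene. With Lucene, the same idea could presumably be expressed by indexing the documents once and running one query per category.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: assigns a text to categories by counting keyword hits.
final class KeywordCategorizer {

    // Lowercase the input and collapse every run of non-alphanumeric
    // characters into a single space, so "Indiana  Jones," becomes "indiana jones".
    public static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                .trim();
    }

    // Count whole-word (or whole-phrase) occurrences of the keywords in the text.
    public static int score(String text, List<String> keywords) {
        String haystack = " " + normalize(text) + " ";
        int hits = 0;
        for (String keyword : keywords) {
            String normalizedKeyword = normalize(keyword);
            if (normalizedKeyword.isEmpty()) {
                continue; // ignore keywords with no letters or digits
            }
            String needle = " " + normalizedKeyword + " ";
            int from = 0;
            while ((from = haystack.indexOf(needle, from)) >= 0) {
                hits++;
                // Step back over the trailing space so that an immediately
                // following occurrence can reuse it as its leading space.
                from += needle.length() - 1;
            }
        }
        return hits;
    }

    // Return every category with at least one keyword hit, with its hit count,
    // in the order in which the categories were defined.
    public static Map<String, Integer> categorize(
            String text, Map<String, List<String>> categories) {
        Map<String, Integer> scores = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : categories.entrySet()) {
            int hits = score(text, entry.getValue());
            if (hits > 0) {
                scores.put(entry.getKey(), hits);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, List<String>> categories = new LinkedHashMap<>();
        categories.put("spielberg", Arrays.asList("ET", "munich", "indiana jones"));
        categories.put("sport", Arrays.asList("football", "basket", "volley"));

        // Made-up sample text standing in for the extracted content of one document.
        String text = "A review of Indiana Jones, shot partly near Munich.";
        System.out.println(categorize(text, categories)); // prints {spielberg=2}
    }
}
```

A raw hit count like this favours long documents and treats every keyword as equally important, which is one reason a search engine's relevance scoring might be a better fit for an archive of this size.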