On Thu, 17 Sep 2009 09:36:50 +0200 Aleksander Stensby <[email protected]> wrote:
> Or do I have to use the KMeansDriver.runJob and read input from
> serialized vectors files?

I'd say this is the recommended way currently, though we are open to changes to the API that would make your life easier. At least during the experimentation phase, serializing the processed vectors to disk has the advantage that you can rerun clustering with varied parameters (number of clusters, distance measure) or even try out one of the other algorithms.

Isabel
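
For illustration only, here is a rough sketch of the workflow described above: write the processed vectors to disk once, then reload them for each clustering run with a different number of clusters or distance measure. This is plain Python, not Mahout code. The helper names (save_vectors, load_vectors, kmeans), the JSON file format and the toy data are all assumptions made for the example and say nothing about how KMeansDriver.runJob reads its input.

```python
import json
import math
import random
import tempfile
from pathlib import Path


def save_vectors(vectors, path):
    """Serialize the processed vectors once (hypothetical JSON format)."""
    Path(path).write_text(json.dumps(vectors))


def load_vectors(path):
    """Read the serialized vectors back for a clustering run."""
    return json.loads(Path(path).read_text())


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest(vector, centroids, distance):
    """Index of the centroid closest to the vector under the given measure."""
    return min(range(len(centroids)), key=lambda c: distance(vector, centroids[c]))


def kmeans(vectors, k, distance, iterations=20, seed=0):
    """Toy k-means with a fixed iteration count; stands in for a real clustering job."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        assignment = [nearest(v, centroids, distance) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment against the last set of centroids.
    assignment = [nearest(v, centroids, distance) for v in vectors]
    return centroids, assignment


if __name__ == "__main__":
    # Made-up 2-D data standing in for the expensive preprocessing step.
    rng = random.Random(42)
    vectors = [
        [rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
        for cx, cy in [(0, 0), (5, 5), (0, 5)]
        for _ in range(30)
    ]
    path = Path(tempfile.gettempdir()) / "vectors.json"
    save_vectors(vectors, path)  # done once

    # Rerun clustering from the same file with varied parameters.
    for k in (2, 3, 4):
        for name, dist in (("euclidean", euclidean), ("manhattan", manhattan)):
            centroids, assignment = kmeans(load_vectors(path), k, dist)
            sizes = [assignment.count(c) for c in range(k)]
            print(f"k={k} distance={name} cluster sizes={sizes}")
```

The preprocessing cost is paid once in save_vectors; every later experiment starts from the file, so changing k or the distance measure does not require processing the raw input again.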
