Right, it's just a clustering example on publicly available data from the URL Ted mentioned. Nothing magic going on here: it uses an InputDriver to convert the data into SequenceFile format, regular Canopy (usually) to find the initial clusters, and then an appropriate clustering job to cluster the points. There is very little example-specific code in it, just the input conversion. It uses the cluster dumper to output the results, and I will likely use both the ClusterEvaluator and CDbwEvaluator to evaluate the quality of the clustering for 0.5.
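For anyone who wants the shape of that flow without reading the job code, here is a rough standalone sketch in plain Python. It is not the Mahout implementation and calls no Mahout APIs: the function names, the T1/T2 thresholds, and the four sample rows are all made up for illustration. It assumes the input is one whitespace-separated row of numbers per line, seeds the clusters with a simple canopy pass, and then refines them with a few k-means iterations.

```python
import math

def parse_line(line):
    """Input conversion: one whitespace-separated row of numbers -> vector."""
    return [float(tok) for tok in line.split()]

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def canopy_centers(points, t1, t2):
    """Canopy pass (t1 > t2): returns one centroid per canopy found."""
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        members = [center]
        survivors = []
        for p in remaining:
            d = distance(center, p)
            if d < t1:
                members.append(p)    # loosely bound to this canopy
            if d >= t2:
                survivors.append(p)  # not tightly bound, may seed another
        remaining = survivors
        centers.append(mean(members))
    return centers

def kmeans(points, centers, iterations=10):
    """Refine the seed centers; returns (centers, groups of points)."""
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: distance(p, centers[i]))
            groups[nearest].append(p)
        # Keep the old center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical sample rows standing in for the real data file.
lines = ["1.0 1.1 0.9", "1.2 1.0 1.1", "9.0 9.2 8.8", "9.1 8.9 9.0"]
points = [parse_line(line) for line in lines]
seeds = canopy_centers(points, t1=3.0, t2=1.5)  # thresholds are arbitrary
centers, groups = kmeans(points, seeds)
for center, group in zip(centers, groups):
    print(len(group), "points around", [round(x, 2) for x in center])
```

The only part that depends on the data set is parse_line, which mirrors the point above that the input conversion is the only example-specific piece.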

On 9/26/10 1:12 PM, Sean Owen wrote:
That must be why -- it's not an algorithm, I see.

On Sun, Sep 26, 2010 at 5:23 PM, Ted Dunning wrote:
Correct me if I am wrong (and I am definitely unsure here), but doesn't
"synthetic control" refer to the synthetic control chart clustering task of
great antiquity?
http://archive.ics.uci.edu/ml/datasets/Synthetic+Control+Chart+Time+Series

As such, why should a synthetic data set be in core?

On Sun, Sep 26, 2010 at 6:56 AM, Sean Owen wrote:

Dumb but simple question -- why is the synthetic control
implementation in examples and not core?

