Right, it's just a clustering example on publicly-available data from
the URL Ted mentioned. Nothing magic going on here: it just uses an
InputDriver to convert the data into SequenceFile format, regular Canopy
(usually) to find initial clusters, and then an appropriate clustering
job to cluster the points. Very little example-specific code in it; just
the input conversion. It uses the cluster dumper to output the data and
I will likely use both the ClusterEvaluator and CDbwEvaluator to
evaluate the clustering quality for the 0.5 release.
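
For anyone who wants the shape of that pipeline without reading the job
code, here is a minimal sketch in plain Python. It is not Mahout's
implementation and uses none of its classes: the function names
(canopy_centers, kmeans, assign), the thresholds t1 and t2, and the toy
2-D points are all made up for illustration, and k-means is only one
possible choice for the "appropriate clustering job". The real data set
is time series, not 2-D points.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def canopy_centers(points, t1, t2):
    """One canopy pass (hypothetical helper, t1 > t2).

    Take the first remaining point as a seed, average everything within
    t1 of it into a center, then drop everything within t2 of the seed.
    """
    assert t1 > t2
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining[0]
        members = [p for p in remaining if dist(seed, p) < t1]
        centers.append(mean(members))
        # The seed is at distance 0 from itself, so it is always removed
        # and the loop terminates.
        remaining = [p for p in remaining if dist(seed, p) >= t2]
    return centers


def assign(points, centers):
    """Group each point under the index of its nearest center."""
    groups = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        groups[nearest].append(p)
    return groups


def kmeans(points, centers, iterations=10):
    """Plain k-means seeded with the canopy centers."""
    for _ in range(iterations):
        groups = assign(points, centers)
        # Keep the old center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, assign(points, centers)


# Toy input standing in for the converted data: two tight blobs.
points = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
points += [(10 + 0.1 * i, 10 + 0.1 * j) for i in range(3) for j in range(3)]

initial = canopy_centers(points, t1=3.0, t2=2.0)  # step 1: initial clusters
centers, groups = kmeans(points, initial)         # step 2: cluster the points

# Step 3: a crude stand-in for dumping the clusters.
for center, group in zip(centers, groups):
    print(center, len(group))
```

The only point of the sketch is the ordering: a cheap canopy pass picks
the starting centers, and the iterative job then refines them.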
On 9/26/10 1:12 PM, Sean Owen wrote:
That must be why -- it's not an algorithm, I see.
On Sun, Sep 26, 2010 at 5:23 PM, Ted Dunning wrote:
Correct me if I am wrong (and I am definitely unsure here), but doesn't the
"synthetic control" refer to the synthetic control chart clustering task of
great antiquity?
http://archive.ics.uci.edu/ml/datasets/Synthetic+Control+Chart+Time+Series
As such, why should a synthetic data set be in core?
On Sun, Sep 26, 2010 at 6:56 AM, Sean Owen wrote:
Dumb but simple question -- why is the synthetic control
implementation in examples and not core?