Team, I've started documenting the first example (clustering of synthetic control data) from the quick start page. The TOC is as follows:
1. Introduction
2. Problem description
3. Pre-Prep (I'm thinking of moving this to a separate page and referencing it from all the example pages)
4. Perform Clustering
5. Read / Analyze Output

Does this TOC look OK? Can we standardize on a TOC so that all examples (current and future) present information in the same structure? If someone could glance through the documentation at https://cwiki.apache.org/confluence/display/MAHOUT/Clustering+of+synthetic+control+data and let me know of any feedback, that would be great. It will help me document the other examples.

A few questions:

1. How does an end user read the output data in the HDFS output directory? I tried reading /output/clusteredPoints/part-m-00000 and it doesn't look readable (it doesn't contain the control data numbers in a readable form). See the sketch in the P.S. below for what I'm guessing is needed.
2. How can someone validate the accuracy of the clusters generated from the control data?

regards,
Joe
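P.S. To make question 1 concrete, here is the kind of standalone reader I'm guessing is needed, assuming the part file under clusteredPoints is a Hadoop SequenceFile. This is only a minimal sketch (the class name is mine, the path is the one from question 1, and the key/value types are obtained via reflection rather than assumed). Is this the right approach for the docs, or is there a built-in dump utility (e.g. "hadoop fs -text") that readers should use instead?

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

// Minimal sketch: dump the clustered points to stdout, one record per line.
public class ClusteredPointsDumper {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Path from question 1; adjust to wherever the clustering job wrote its output.
    Path path = new Path("/output/clusteredPoints/part-m-00000");

    SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
    try {
      // Instantiate whatever key/value classes the job actually wrote.
      Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
      Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
      // Each record should pair a cluster id (key) with a clustered point (value);
      // toString() gives a human-readable rendering of both.
      while (reader.next(key, value)) {
        System.out.println(key + "\t" + value);
      }
    } finally {
      reader.close();
    }
  }
}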
