[ 
https://issues.apache.org/jira/browse/MAHOUT-520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920848#action_12920848
 ] 

Joe Prasanna Kumar commented on MAHOUT-520:
-------------------------------------------

Jeff,
Thanks for pointing it out. Sorry that I didn't read the entire code and spoke 
too soon.

Drew,
I can modify the code to specify the parameters. Currently I have a single 
line of code 
(org.apache.mahout.clustering.syntheticcontrol."${algorithm[$choice-1]}".Job) 
to call any of the clustering algorithms, since the only variable in the 
command is the algorithm name. But if we specify the input and output 
parameters, we need to call each algorithm separately, as each of them has 
different parameters. For example, meanshift needs a convergencedelta while 
dirichlet doesn't need one, and so on.
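
As a rough sketch of what that per-algorithm dispatch might look like (this 
is not the code in the patch), a case statement could append the extra 
parameters while the class name is still built from the array. The algorithm 
list, the "mahout" launcher name, the --input, --output and 
--convergenceDelta flag spellings, and the 0.5 value are all assumptions made 
for illustration. The function only prints the command, it does not run it.

```shell
#!/bin/bash
# Sketch only: the algorithm list, flag names, and values are assumptions.
algorithm=(canopy kmeans fuzzykmeans dirichlet meanshift)

# Print (rather than run) the command for a 1-based menu choice.
build_cmd() {
  local choice=$1 input=$2 output=$3
  local name=${algorithm[$choice-1]}
  local extra=""
  case "$name" in
    meanshift) extra="--convergenceDelta 0.5" ;;  # hypothetical value
    dirichlet) extra="" ;;                        # takes no convergence delta
  esac
  # ${extra:+ $extra} adds a leading space only when extra is non-empty.
  echo "mahout org.apache.mahout.clustering.syntheticcontrol.${name}.Job --input ${input} --output ${output}${extra:+ $extra}"
}
```

Calling build_cmd 5 in out would then print the meanshift command with the 
extra flag appended, while build_cmd 4 in out prints the dirichlet command 
with only the input and output.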

Should we also modify the various synthetic control clustering jobs to specify 
default values and make the remaining parameters optional? For example, in 
kmeans (or in all algorithms), the input and output would be the only 
mandatory parameters. Would it make sense to do that? From the script's 
perspective it would be great, since we could stick to one line of code for 
calling all clustering algorithms (I am trying to keep the script DRY). 
Depending on this decision, I'll modify the script and re-post the patch.

Also, in the non-interactive mode I am currently invoking only canopy 
clustering. Should we leave it that way, or should we call each of the 
clustering algorithms so that Hudson can verify all of them?
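
If we go with the second option, the non-interactive branch could loop over 
the same array and stop at the first failure, so that the build sees a 
non-zero exit status. This is only a sketch: run_job is a hypothetical 
stand-in for the real job invocation, and the algorithm list is an 
assumption.

```shell
#!/bin/bash
# Sketch only: the algorithm list is an assumption.
algorithm=(canopy kmeans fuzzykmeans dirichlet meanshift)

# Hypothetical stand-in for the real job invocation; it only echoes.
run_job() {
  echo "running org.apache.mahout.clustering.syntheticcontrol.$1.Job"
}

# Run every algorithm in turn; return 1 as soon as one of them fails.
run_all() {
  local name
  for name in "${algorithm[@]}"; do
    run_job "$name" || { echo "FAILED: $name" >&2; return 1; }
  done
  echo "all ${#algorithm[@]} jobs passed"
}
```

The trade-off is run time: every nightly build would run all of the 
clustering jobs instead of just canopy.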

regards
Joe.


> Add example scripts / integration tests for various algorithms.
> ---------------------------------------------------------------
>
>                 Key: MAHOUT-520
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-520
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>    Affects Versions: 0.4
>            Reporter: Drew Farris
>            Assignee: Drew Farris
>            Priority: Minor
>         Attachments: MAHOUT-520-syntheticcontrol.patch, MAHOUT-520.patch, 
> MAHOUT-520.patch
>
>
> Scripts like build-reuters.sh are useful in that they both demonstrate 
> typical usage of Mahout from the command line and serve as integration 
> tests. We should add additional scripts that drive the algorithms so new 
> users can quickly run the examples. 
> Perhaps these can also be run from Hudson as part of the nightly builds and 
> can serve as integration tests.
> As a start towards this goal, provide a build-20news-bayes.sh example (in the 
> same vein as build-reuters.sh) that follows 
> https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
