--input could be misleading if we don't specify what format the input
file is in. For example, DictionaryVectorizer needs Text,Text, while
Kmeans needs Text,VW.

--input is OK if we can create an input tester that checks the files
and throws an error if they are not in the required format. That is
cheaper than launching a map/reduce job across a cluster and watching
the maps fail.
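
To make the input tester idea concrete, here is a rough sketch of what I
mean. It is written in Python only to keep it short; the real thing would
live in our Java code. It assumes the input is a Hadoop SequenceFile whose
header starts with the bytes "SEQ", one version byte, and then the key and
value class names, each as a one-byte length followed by UTF-8 text (which
I believe is what recent header versions do for class names under 128
bytes). The names InputFormatError, read_key_value_classes and check_input
are made up for this example and are not existing APIs.

```python
import io


class InputFormatError(Exception):
    """Raised when the input does not look like the expected format."""


def _read_short_string(stream):
    # Assumption: the length prefix is a single-byte vint (0..127),
    # which covers ordinary fully qualified class names.
    raw = stream.read(1)
    if len(raw) != 1:
        raise InputFormatError("truncated header")
    length = raw[0]
    if length > 127:
        raise InputFormatError("unsupported length prefix in header")
    data = stream.read(length)
    if len(data) != length:
        raise InputFormatError("truncated header")
    return data.decode("utf-8")


def read_key_value_classes(stream):
    """Return (key_class, value_class) read from a SequenceFile header."""
    magic = stream.read(4)  # b"SEQ" plus one version byte
    if len(magic) != 4 or magic[:3] != b"SEQ":
        raise InputFormatError("not a SequenceFile (missing SEQ magic)")
    key_class = _read_short_string(stream)
    value_class = _read_short_string(stream)
    return key_class, value_class


def check_input(stream, expected_key, expected_value):
    """Fail fast, before any job is launched, if the classes do not match."""
    key_class, value_class = read_key_value_classes(stream)
    if (key_class, value_class) != (expected_key, expected_value):
        raise InputFormatError(
            "expected %s,%s but found %s,%s"
            % (expected_key, expected_value, key_class, value_class)
        )
```

Each job would call check_input on the first file under --input with the
key and value classes it requires. Only the first few bytes of the file
are read, so the check costs almost nothing compared to a failed job.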

Robin
