--input could be misleading if we don't specify what format the input file is in. For example, DictionaryVectorizer needs Text,Text and KMeans needs Text,VW. --input is OK if we can create an input tester that checks the files and throws an error if they are not in the required format. That is cheaper than launching a map/reduce job across a cluster and watching the maps fail.
Robin
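
A minimal sketch of such an input tester, written without Hadoop on the classpath so that it stays self-contained. It assumes the inputs are SequenceFiles with the usual header layout (the bytes "SEQ", a version byte, then the key and value class names as length-prefixed UTF-8 strings), and that each class name is shorter than 128 bytes so its length fits in one byte. The class InputFormatTester and its check method are hypothetical names, not existing Mahout code. With Hadoop available, SequenceFile.Reader's getKeyClassName() and getValueClassName() would probably be the more robust way to get the same information.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads only the file header, so no job is launched.
final class InputFormatTester {

    // Reads one length-prefixed class name.
    // Assumption: the length is a single non-negative byte (name < 128 bytes).
    private static String readClassName(DataInputStream in) throws IOException {
        int len = in.readByte();
        if (len < 0) {
            throw new IOException("multi-byte length not handled in this sketch");
        }
        byte[] buf = new byte[len];
        in.readFully(buf);
        return new String(buf, StandardCharsets.UTF_8);
    }

    // Throws IllegalArgumentException if the stream is not a SequenceFile
    // whose header declares the expected key and value classes.
    static void check(InputStream raw, String expectedKey, String expectedValue) {
        try {
            DataInputStream in = new DataInputStream(raw);
            byte[] magic = new byte[3];
            in.readFully(magic);
            if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
                throw new IllegalArgumentException("not a SequenceFile");
            }
            in.readByte(); // version byte, ignored here
            String key = readClassName(in);
            String value = readClassName(in);
            if (!key.equals(expectedKey) || !value.equals(expectedValue)) {
                throw new IllegalArgumentException(
                    "expected " + expectedKey + "," + expectedValue
                    + " but found " + key + "," + value);
            }
        } catch (IOException e) {
            // A truncated or unreadable header also counts as the wrong format.
            throw new IllegalArgumentException("unreadable header", e);
        }
    }
}
```

A driver could call something like InputFormatTester.check(stream, "org.apache.hadoop.io.Text", "org.apache.hadoop.io.Text") on each --input file before submitting the job, and fail fast on the client if the check throws.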
