+1 I would love to see that feature. Sadly, I am no Java guy myself.

On 09.01.2012 at 22:59, Jeff Eastman <[email protected]> wrote:

> Even better, you might figure out how to pass the desired delimiter into the 
> InputDriver as an argument and submit a patch to make that a permanent Mahout 
> feature. It should be straightforward, and it would start you down the path to 
> becoming a committer.
> 
> 
> On 1/9/12 2:52 PM, Jeff Eastman wrote:
>> The Synthetic Control examples use a similar (but space-delimited) input 
>> format, and there is an InputDriver in integration/ which can convert those 
>> files into Mahout Vector sequence files. You could easily modify the 
>> InputMapper to be comma-delimited or modify your own file formats to use 
>> spaces.
>> 
>> On 1/9/12 12:50 PM, Daniel Quach wrote:
>>> I have a file of vectors in CSV format, and I want to use 
>>> Mahout to perform k-means clustering on the vectors in this file.
>>> 
>>> However, it seems Mahout expects the input data to be in 
>>> SequenceFile format, and I'm not sure whether there's an easy way to do this 
>>> (are there existing tools?)
>>> 
>> 
> 
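
For anyone picking up the delimiter idea from the quoted messages, here is a minimal sketch of the parsing step only, in plain Java with no Mahout or Hadoop dependencies. The class name DelimitedVectorParser and its parse method are hypothetical and are not part of Mahout. The sketch assumes the mapper's job is to turn one text line into an array of doubles; wrapping the result in a Mahout Vector and writing it to a sequence file is left out, and how the delimiter would actually reach the InputDriver as an argument is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not a Mahout class: parses one line of delimited numbers.
final class DelimitedVectorParser {

    private DelimitedVectorParser() {
    }

    // Splits the line on a literal delimiter and converts each token to a double.
    // Empty tokens are skipped (an assumption that suits repeated spaces; for
    // strict CSV you may prefer to reject them instead).
    // Throws NumberFormatException if a non-empty token is not a number.
    static double[] parse(String line, String delimiter) {
        // Pattern.quote makes the delimiter literal, so "|" or "." work as expected.
        String[] tokens = line.split(Pattern.quote(delimiter));
        List<Double> values = new ArrayList<>();
        for (String token : tokens) {
            String trimmed = token.trim();
            if (!trimmed.isEmpty()) {
                values.add(Double.valueOf(trimmed));
            }
        }
        double[] result = new double[values.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // The same line content, once comma-delimited and once space-delimited.
        System.out.println(Arrays.toString(parse("1.0,2.5,3.0", ",")));
        System.out.println(Arrays.toString(parse("1.0 2.5  3.0", " ")));
    }
}
```

Keeping the delimiter as a method parameter means the space-delimited Synthetic Control format and a comma-delimited file can share one code path, which is roughly what a configurable InputDriver option would need.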
