Hi Sebastian,

I just noticed that. My mistake, I will open a ticket.

Thanks,
Alex
-----Original Message-----
From: Sebastian Schelter [mailto:[email protected]]
Sent: Thursday, July 18, 2013 1:33 PM
To: [email protected]
Subject: Re: classifier.sgd.CsvRecordFactory incorrect CSV parsing

Hello Alex,

thank you for being willing to contribute. Unfortunately you cannot send attachments via this list. Could you open a JIRA ticket at https://issues.apache.org/jira/browse/MAHOUT and upload your patch there?

-sebastian

2013/7/18 Alexander Franchuk <[email protected]>

> Hi All,
>
> I’ve been working with Mahout for an internship this summer, and in
> the process I noticed that the CsvRecordFactory class parses CSV files
> incorrectly. So I made a fix for this, which is in the attached
> patch file.
>
> It’s not a huge change or anything, but I thought it would be helpful
> for people. This will also keep the demo programs in the Mahout
> distribution from failing due to incorrect parsing of CSV files. For
> instance, if you have a double-quoted field with a comma in it, the
> demo programs will incorrectly divide the field into two, which in
> some cases causes parsing problems, and even if the program doesn’t
> fail, it will of course cause incorrect results.
>
> This patch causes the class to use the solr-commons-csv.jar file,
> which I noticed was included in the Mahout distribution.
>
> Hope this helps! And thanks for all your work, my experience with
> Mahout has been great so far.
>
> Alex Franchuk
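
For readers who want to see the failure mode described in the quoted message (a double-quoted field containing a comma being divided into two), here is a minimal sketch. It is not the attached patch and does not use solr-commons-csv; the class CsvSplitDemo, its two methods, and the sample line are hypothetical names made up for illustration. The quote-aware method is a small hand-written stand-in that only handles single-line records with double-quoted fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Mahout and not the patch from this thread.
public class CsvSplitDemo {

    // Naive approach: split on every comma and ignore quoting entirely.
    static List<String> naiveSplit(String line) {
        List<String> fields = new ArrayList<>();
        for (String field : line.split(",", -1)) {
            fields.add(field);
        }
        return fields;
    }

    // Quote-aware approach: a comma inside double quotes stays in the field,
    // and a doubled quote ("") inside a quoted field is a literal quote.
    static List<String> quoteAwareSplit(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "1,\"Smith, John\",42";
        System.out.println(naiveSplit(line).size());      // prints 4
        System.out.println(quoteAwareSplit(line).size()); // prints 3
    }
}
```

With the sample line, the naive split yields four fields because the quoted name is cut at its comma, while the quote-aware split yields the intended three. In practice a maintained CSV library, as the patch proposes, is a better choice than hand-rolled parsing.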
