Pat Ferrel created MAHOUT-1568:
----------------------------------

Summary: Build an I/O model that can replace sequence files for import/export
Key: MAHOUT-1568
URL: https://issues.apache.org/jira/browse/MAHOUT-1568
Project: Mahout
Issue Type: New Feature
Components: CLI
Environment: Scala, Spark
Reporter: Pat Ferrel
Assignee: Pat Ferrel

Implement mechanisms to read and write data from/to flexible stores. These will support tuple streams and DRMs, with extensions that preserve user-defined values for IDs. In some sense the mechanism can replace Sequence Files for import/export, and it will make the operation much easier for users, whose input files can in many cases be consumed directly.

Start with text-delimited files for input/output in the Spark version of ItemSimilarity.

A proposal is running with ItemSimilarity on Spark and is documented on the GitHub wiki here: https://github.com/pferrel/harness/wiki

Comments are appreciated.
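
To make the idea of keeping user-defined IDs concrete, here is a minimal sketch of a round trip through text-delimited input. It is written in plain Python rather than the Scala/Spark of the proposal, and it is not the Mahout implementation: the function names `read_delimited` and `write_delimited`, the two-column `rowID,columnID` layout, and the in-memory dictionaries are all assumptions made for illustration.

```python
from collections import defaultdict


def read_delimited(lines, delimiter=","):
    """Parse 'rowID<delimiter>columnID' lines (an assumed layout).

    Returns an index-keyed sparse structure plus the two dictionaries
    that map the user's original IDs to integer indices.
    """
    row_ids, col_ids = {}, {}      # external ID -> integer index
    rows = defaultdict(set)        # row index -> set of column indices
    for line in lines:
        fields = line.strip().split(delimiter)
        if len(fields) < 2:
            continue               # skip blank or malformed lines
        row_key, col_key = fields[0], fields[1]
        # Assign the next free index the first time an ID is seen.
        r = row_ids.setdefault(row_key, len(row_ids))
        c = col_ids.setdefault(col_key, len(col_ids))
        rows[r].add(c)
    return rows, row_ids, col_ids


def write_delimited(rows, row_ids, col_ids, delimiter=","):
    """Export the index-keyed structure using the original IDs."""
    row_names = {index: key for key, index in row_ids.items()}
    col_names = {index: key for key, index in col_ids.items()}
    out = []
    for r in sorted(rows):
        for c in sorted(rows[r]):
            out.append(f"{row_names[r]}{delimiter}{col_names[c]}")
    return out


if __name__ == "__main__":
    sample = ["u1,iphone", "u1,ipad", "u2,iphone"]
    rows, row_ids, col_ids = read_delimited(sample)
    for line in write_delimited(rows, row_ids, col_ids):
        print(line)
```

The point of the design is that the computation only ever sees integer indices, while the dictionaries travel alongside so that exported results carry the IDs the user supplied. A distributed version would need to build and share those dictionaries across workers, which this sketch does not attempt.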