jay vyas created MAHOUT-1421:
--------------------------------
Summary: Adapter package for all mahout tools
Key: MAHOUT-1421
URL: https://issues.apache.org/jira/browse/MAHOUT-1421
Project: Mahout
Issue Type: Improvement
Reporter: jay vyas
Hi mahout. I'd like to create an umbrella JIRA for allowing more runtime
flexibility for reading different types of input formats for all mahout tasks.
Specifically, I'd like to start with a FreeTextRecommenderAdapter. Running a
recommender on free-text data typically requires:
1) Hashing text entries into numbers
2) Saving the large transformed file on disk
3) Feeding it into the classifier
Instead, we could build adapters into the classifier itself, so that the user
1) Specifies an input file for the recommender
2) Specifies a transformation class which converts each input record to the
3-column recommender format
3) Runs the internal mahout recommender directly against the data
And thus the user could easily run mahout against existing data without having
to munge it too much.
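
To make the idea concrete, here is a rough sketch of what such an adapter
could look like. Nothing in it is existing mahout API: RecordAdapter,
FreeTextAdapter, Preference and adaptAll are hypothetical names made up for
illustration, and I'm assuming the 3 columns are user, item and preference
value. It also assumes comma-separated input and uses CRC32 as a stand-in hash,
which can collide on large data sets.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch only: none of these types exist in mahout today.
class AdapterSketch {

    /** One row of the assumed 3-column format: user, item, preference value. */
    static final class Preference {
        final long userId;
        final long itemId;
        final float value;

        Preference(long userId, long itemId, float value) {
            this.userId = userId;
            this.itemId = itemId;
            this.value = value;
        }
    }

    /** Hypothetical adapter: converts one raw input record into a Preference. */
    interface RecordAdapter {
        Preference adapt(String rawRecord);
    }

    /** Hashes free-text user and item fields into numeric IDs on the fly. */
    static final class FreeTextAdapter implements RecordAdapter {
        @Override
        public Preference adapt(String rawRecord) {
            // Assumes "user,item,preference" with no commas inside the fields.
            String[] fields = rawRecord.split(",");
            if (fields.length != 3) {
                throw new IllegalArgumentException("Expected 3 fields: " + rawRecord);
            }
            return new Preference(
                hash(fields[0].trim()),
                hash(fields[1].trim()),
                Float.parseFloat(fields[2].trim()));
        }

        /** Stand-in hash (CRC32); the same text always maps to the same ID. */
        static long hash(String text) {
            CRC32 crc = new CRC32();
            crc.update(text.getBytes(StandardCharsets.UTF_8));
            return crc.getValue();
        }
    }

    /** Applies the adapter record by record, with no intermediate file on disk. */
    static List<Preference> adaptAll(Iterable<String> records, RecordAdapter adapter) {
        List<Preference> result = new ArrayList<>();
        for (String record : records) {
            result.add(adapter.adapt(record));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> raw = List.of("alice, The Matrix, 4.5", "bob, The Matrix, 3.0");
        for (Preference p : adaptAll(raw, new FreeTextAdapter())) {
            System.out.println(p.userId + "," + p.itemId + "," + p.value);
        }
    }
}
```

The user would pass the name of a class like FreeTextAdapter at runtime, and
the recommender would consume the adapted records directly instead of reading
a pre-hashed file.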
This package might be called something like "org.apache.mahout.adapters", and
would over time provide flexible adapters to the core mahout algorithm
implementations, so that folks wouldn't have to worry so much about vectors,
CSV transformers, etc.
Any thoughts on this? If the feedback is positive, I can submit an initial
patch to get things started.