Nothing's fixed yet. I'm trying to revive the discussion around this to see whether it leads to anything tangible or whether this needs to be resolved as 'Won't Fix'.
Sent from my iPhone

> On Mar 2, 2014, at 5:18 PM, peng <[email protected]> wrote:
>
> Wow, waiting this for a long time, finally fixed.
>
>> On Sun 02 Mar 2014 05:01:26 PM EST, Suneel Marthi (JIRA) wrote:
>>
>> [
>> https://issues.apache.org/jira/browse/MAHOUT-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>> ]
>>
>> Suneel Marthi updated MAHOUT-1178:
>> ----------------------------------
>>
>> Fix Version/s: (was: Backlog)
>>                1.0
>>
>>> GSOC 2013: Improve Lucene support in Mahout
>>> -------------------------------------------
>>>
>>> Key: MAHOUT-1178
>>> URL: https://issues.apache.org/jira/browse/MAHOUT-1178
>>> Project: Mahout
>>> Issue Type: New Feature
>>> Reporter: Dan Filimon
>>> Assignee: Gokhan Capan
>>> Labels: gsoc2013, mentor
>>> Fix For: 1.0
>>>
>>> Attachments: MAHOUT-1178-TEST.patch, MAHOUT-1178.patch
>>>
>>>
>>> [via Ted Dunning]
>>> It should be possible to view a Lucene index as a matrix. This would
>>> require that we standardize on a way to convert documents to rows. There
>>> are many choices, the discussion of which should be deferred to the actual
>>> work on the project, but there are a few obvious constraints:
>>> a) it should be possible to get the same result as dumping the term vectors
>>> for each document each to a line and converting that result using standard
>>> Mahout methods.
>>> b) numeric fields ought to work somehow.
>>> c) if there are multiple text fields that ought to work sensibly as well.
>>> Two options include dumping multiple matrices or to convert the fields
>>> into a single row of a single matrix.
>>> d) it should be possible to refer back from a row of the matrix to find the
>>> correct document. This might be because we remember the Lucene doc number
>>> or because a field is named as holding a unique id.
>>> e) named vectors and matrices should be used if plausible.
>>
>>
>>
>> --
>> This message was sent by Atlassian JIRA
>> (v6.2#6252)
