What is a good regression test for this? Not a unit test, but something that demonstrates the algorithms in action at the amount of data where they become useful?
Preferably from a small dataset.

On 5/21/11, Ted Dunning <ted.dunn...@gmail.com> wrote:
> On Sat, May 21, 2011 at 4:25 PM, Hector Yee <hector....@gmail.com> wrote:
>
>> Sure, or I can wait till you submit patches before working on the next
>> one?
>
> I think that submit == commit.
>
> But in any case, don't wait for anything. Find ways forward. We are in the
> middle of a release cycle right now so nothing new is going to be committed
> for a little while (another week, possibly).
>
>> How would the github repo work? I just clone the apache git version and
>> check it in there?
>
> Yes. Exactly. And if you want me to help rebasing to track trunk, give me
> a committer bit. That won't be very necessary, of course, while trunk is
> frozen.
>
> Then periodically, you can use [git diff --no-prefix trunk] to dump a patch
> that can be added to the JIRA. That will allow non-git users to track
> progress as well.
>
>> On Sun, May 22, 2011 at 3:41 AM, Ted Dunning <ted.dunn...@gmail.com>
>> wrote:
>>
>> > Hector,
>> >
>> > You are working on a variety of things here that have interdependencies.
>> >
>> > What would you think about a github repo where you can keep track of
>> > them with multiple branches and we can all avoid problems with patches
>> > not applying.
>> >
>> > If you like, I can help out keeping your branches up to date relative
>> > to trunk.
>> >
>> > On Sat, May 21, 2011 at 1:54 AM, Hector Yee (JIRA) <j...@apache.org>
>> > wrote:
>> >
>> > > [ https://issues.apache.org/jira/browse/MAHOUT-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037289#comment-13037289 ]
>> > >
>> > > Hector Yee commented on MAHOUT-703:
>> > > -----------------------------------
>> > >
>> > > Note: This patch requires 702 for the OnlineBaseTest.
>> > >
>> > > > Implement Gradient machine
>> > > > --------------------------
>> > > >
>> > > >                 Key: MAHOUT-703
>> > > >                 URL: https://issues.apache.org/jira/browse/MAHOUT-703
>> > > >             Project: Mahout
>> > > >          Issue Type: New Feature
>> > > >          Components: Classification
>> > > >    Affects Versions: 0.6
>> > > >            Reporter: Hector Yee
>> > > >            Priority: Minor
>> > > >              Labels: features
>> > > >             Fix For: 0.6
>> > > >
>> > > >         Attachments: MAHOUT-703.patch
>> > > >
>> > > >   Original Estimate: 72h
>> > > >  Remaining Estimate: 72h
>> > > >
>> > > > Implement a gradient machine (aka 'neural network) that can be used
>> > > > for classification or auto-encoding.
>> > > > It will just have an input layer, identity, sigmoid or tanh hidden
>> > > > layer and an output layer.
>> > > > Training done by stochastic gradient descent (possibly mini-batch
>> > > > later).
>> > > > Sparsity will be optionally enforced by tweaking the bias in the
>> > > > hidden unit.
>> > > > For now it will go in classifier/sgd and the auto-encoder will wrap
>> > > > it in the filter unit later on.
>> > >
>> > > --
>> > > This message is automatically generated by JIRA.
>> > > For more information on JIRA, see:
>> > > http://www.atlassian.com/software/jira
>>
>> --
>> Yee Yang Li Hector
>> http://hectorgon.blogspot.com/ (tech + travel)
>> http://hectorgon.com (book reviews)

--
Lance Norskog
goks...@gmail.com