[jira] Updated: (MAHOUT-392) Test cases for logGamma, Distribution.normal and Distribution.beta, fix for Distribution.normal

2010-05-08 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-392: --- Status: Patch Available (was: Open) Test cases for logGamma, Distribution.normal and

[jira] Updated: (MAHOUT-392) Test cases for logGamma, Distribution.normal and Distribution.beta, fix for Distribution.normal

2010-05-08 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-392: --- Attachment: MAHOUT-392.patch Test cases for logGamma, Distribution.normal and Distribution.beta,

[jira] Updated: (MAHOUT-376) Implement Map-reduce version of stochastic SVD

2010-05-08 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-376: --- Attachment: MAHOUT-376.patch Here is a work-in-progress patch that illustrates how I plan to do the

[jira] Commented: (MAHOUT-392) Test cases for logGamma, Distribution.normal and Distribution.beta, fix for Distribution.normal

2010-05-08 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12865427#action_12865427 ] Ted Dunning commented on MAHOUT-392: Regarding the constants, since they are single

[jira] Created: (MAHOUT-392) Test cases for logGamma, Distribution.normal and Distribution.beta, fix for Distribution.normal

2010-05-07 Thread Ted Dunning (JIRA)
Test cases for logGamma, Distribution.normal and Distribution.beta, fix for Distribution.normal --- Key: MAHOUT-392 URL: https://issues.apache.org/jira/browse/MAHOUT-392

[jira] Commented: (MAHOUT-302) Change tests to use temp directories instead of output, testdata

2010-05-04 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12864069#action_12864069 ] Ted Dunning commented on MAHOUT-302: I read through about 20% of the patch. For this

[jira] Commented: (MAHOUT-305) Combine both cooccurrence-based CF M/R jobs

2010-04-26 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12861171#action_12861171 ] Ted Dunning commented on MAHOUT-305: {quote} Ted says he ... doesn't like throwing out

[jira] Commented: (MAHOUT-236) Cluster Evaluation Tools

2010-04-20 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12859040#action_12859040 ] Ted Dunning commented on MAHOUT-236: Typically any place where you have an algorithm

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-13 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12856698#action_12856698 ] Ted Dunning commented on MAHOUT-364: {quote} By the way is GPL3 Apache 2 compatible?

[jira] Created: (MAHOUT-376) Implement Map-reduce version of stochastic SVD

2010-04-11 Thread Ted Dunning (JIRA)
Implement Map-reduce version of stochastic SVD -- Key: MAHOUT-376 URL: https://issues.apache.org/jira/browse/MAHOUT-376 Project: Mahout Issue Type: Bug Reporter: Ted Dunning See

[jira] Updated: (MAHOUT-376) Implement Map-reduce version of stochastic SVD

2010-04-11 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-376: --- Attachment: sd.tex sd-bib.bib sd.pdf Algorithm details. Implement

[jira] Commented: (MAHOUT-369) Issues with DistributedLanczosSolver output

2010-04-08 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854971#action_12854971 ] Ted Dunning commented on MAHOUT-369: Can you create a suggested patch? Issues with

[jira] Commented: (MAHOUT-363) Proposal for GSoC 2010 (EigenCuts clustering algorithm for Mahout)

2010-04-08 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12855068#action_12855068 ] Ted Dunning commented on MAHOUT-363: {quote} Are there any other suggestions for making

[jira] Commented: (MAHOUT-334) Proposal for GSoC2010 (Linear SVM for Mahout)

2010-04-08 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12855168#action_12855168 ] Ted Dunning commented on MAHOUT-334: There is little to say. This is an excellent

[jira] Commented: (MAHOUT-364) [GSOC] Proposal to implement Neural Network with backpropagation learning on Hadoop

2010-04-06 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854349#action_12854349 ] Ted Dunning commented on MAHOUT-364: This is a very nicely written proposal. One

[jira] Commented: (MAHOUT-357) Implement a clustering algorithm on mapreduce

2010-04-02 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852749#action_12852749 ] Ted Dunning commented on MAHOUT-357: The idea is for you to come up with something new

[jira] Commented: (MAHOUT-357) Implement a clustering algorithm on mapreduce

2010-04-01 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852575#action_12852575 ] Ted Dunning commented on MAHOUT-357: The algorithm you describe is pretty much how

[jira] Commented: (MAHOUT-342) [GSOC] Implement Map/Reduce Enabled Neural Networks

2010-03-19 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12847506#action_12847506 ] Ted Dunning commented on MAHOUT-342: The SGD implementation (not yet committed) should

[jira] Commented: (MAHOUT-322) DistributedRowMatrix should live in SequenceFileWritable,VectorWritable instead of SequenceFileIntWritable,VectorWritable

2010-03-04 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841408#action_12841408 ] Ted Dunning commented on MAHOUT-322: I think that not contiguous might be a better way

[jira] Commented: (MAHOUT-305) Combine both cooccurrence-based CF M/R jobs

2010-02-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837378#action_12837378 ] Ted Dunning commented on MAHOUT-305: My own experience is that all that counts in

[jira] Commented: (MAHOUT-305) Combine both cooccurrence-based CF M/R jobs

2010-02-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837377#action_12837377 ] Ted Dunning commented on MAHOUT-305: My own experience is that all that counts in

[jira] Commented: (MAHOUT-305) Combine both cooccurrence-based CF M/R jobs

2010-02-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12837528#action_12837528 ] Ted Dunning commented on MAHOUT-305: {quote} Yeah in this context there's no choice but

[jira] Commented: (MAHOUT-300) Solve performance issues with Vector Implementations

2010-02-22 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836817#action_12836817 ] Ted Dunning commented on MAHOUT-300: These are getting respectable! As a quick hack,

[jira] Commented: (MAHOUT-300) Solve performance issues with Vector Implementations

2010-02-21 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836433#action_12836433 ] Ted Dunning commented on MAHOUT-300: I think that this is a cleaner style for the merge

[jira] Issue Comment Edited: (MAHOUT-300) Solve performance issues with Vector Implementations

2010-02-21 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836433#action_12836433 ] Ted Dunning edited comment on MAHOUT-300 at 2/21/10 7:52 PM: - I

[jira] Commented: (MAHOUT-300) Solve performance issues with Vector Implementations

2010-02-21 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836452#action_12836452 ] Ted Dunning commented on MAHOUT-300: Huh some of those times are a little

[jira] Commented: (MAHOUT-300) Solve performance issues with Vector Implementations

2010-02-20 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836238#action_12836238 ] Ted Dunning commented on MAHOUT-300: {quote} I don't know what to do in the edge case of

[jira] Commented: (MAHOUT-299) Collocations: improve performance by making Gram BinaryComparable

2010-02-20 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836246#action_12836246 ] Ted Dunning commented on MAHOUT-299: {quote} Just wanted to check on this - I think the

[jira] Commented: (MAHOUT-301) Improve command-line shell script by allowing default properties files

2010-02-20 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836247#action_12836247 ] Ted Dunning commented on MAHOUT-301: This also helps non-command-line usage, actually.

[jira] Commented: (MAHOUT-300) Solve performance issues with Vector Implementations

2010-02-19 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12836053#action_12836053 ] Ted Dunning commented on MAHOUT-300: I think that the min and max functions need to

[jira] Commented: (MAHOUT-260) An alternative approach to RNG management

2010-02-17 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12834987#action_12834987 ] Ted Dunning commented on MAHOUT-260: Sean, Why did you use a map for storing the

[jira] Commented: (MAHOUT-279) Make RandomSeedGenerator a M/R Job

2010-02-14 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12833641#action_12833641 ] Ted Dunning commented on MAHOUT-279: Is this overlapping with the k-means++ stuff?

[jira] Commented: (MAHOUT-227) Parallel SVM

2010-02-10 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12832089#action_12832089 ] Ted Dunning commented on MAHOUT-227: Zhao, My thought is that having a good

[jira] Created: (MAHOUT-286) Need to be able to run classifiers from non-text input (such as ARFF data)

2010-02-09 Thread Ted Dunning (JIRA)
Need to be able to run classifiers from non-text input (such as ARFF data) -- Key: MAHOUT-286 URL: https://issues.apache.org/jira/browse/MAHOUT-286 Project: Mahout

[jira] Updated: (MAHOUT-286) Need to be able to run classifiers from non-text input (such as ARFF data)

2010-02-09 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-286: --- Attachment: weka.log mahout.log Here are the original attachments Martin sent.

[jira] Commented: (MAHOUT-153) Implement kmeans++ for initial cluster selection in kmeans

2010-02-09 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831622#action_12831622 ] Ted Dunning commented on MAHOUT-153: I have been thinking about this problem a bit,

[jira] Commented: (MAHOUT-227) Parallel SVM

2010-02-09 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831648#action_12831648 ] Ted Dunning commented on MAHOUT-227: Is this going to be complete this week or next?

[jira] Commented: (MAHOUT-274) Use avro for serialization of structured documents.

2010-02-06 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12830669#action_12830669 ] Ted Dunning commented on MAHOUT-274: Those discussions seem pretty future tense and

[jira] Commented: (MAHOUT-237) Map/Reduce Implementation of Document Vectorizer

2010-02-02 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12828763#action_12828763 ] Ted Dunning commented on MAHOUT-237: {quote} Seems like the Text field Vector Class

[jira] Commented: (MAHOUT-269) Vector.maxValue() returns Double.MIN_VALUE for vectors with all negative entries.

2010-01-27 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805628#action_12805628 ] Ted Dunning commented on MAHOUT-269: This code looks like it uses Math.max to

[jira] Commented: (MAHOUT-269) Vector.maxValue() returns Double.MIN_VALUE for vectors with all negative entries.

2010-01-27 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805636#action_12805636 ] Ted Dunning commented on MAHOUT-269: I don't understand your point. If there are any

[jira] Commented: (MAHOUT-269) Vector.maxValue() returns Double.MIN_VALUE for vectors with all negative entries.

2010-01-27 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805653#action_12805653 ] Ted Dunning commented on MAHOUT-269: Thank you. Yes. That was my thought.

[jira] Commented: (MAHOUT-209) Add aggregate() methods for Vector

2010-01-25 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12804663#action_12804663 ] Ted Dunning commented on MAHOUT-209: These look nearly good enough to commit as they

[jira] Commented: (MAHOUT-242) LLR Collocation Identifier

2010-01-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12804191#action_12804191 ] Ted Dunning commented on MAHOUT-242: {quote} Each of the existing attributes are

[jira] Commented: (MAHOUT-242) LLR Collocation Identifier

2010-01-22 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12803979#action_12803979 ] Ted Dunning commented on MAHOUT-242: Drew, I think that what we really need is a

[jira] Commented: (MAHOUT-242) LLR Collocation Identifier

2010-01-21 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12803361#action_12803361 ] Ted Dunning commented on MAHOUT-242: {quote} Are we really worried about individual

[jira] Commented: (MAHOUT-263) Matrix interface should extend IterableVector for better integration with distributed storage

2010-01-20 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802960#action_12802960 ] Ted Dunning commented on MAHOUT-263: The idea of iterating through inputs sequentially

[jira] Commented: (MAHOUT-153) Implement kmeans++ for initial cluster selection in kmeans

2010-01-18 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801960#action_12801960 ] Ted Dunning commented on MAHOUT-153: +1 to what Grant said. Go ahead and post a patch

[jira] Commented: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2010-01-18 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802106#action_12802106 ] Ted Dunning commented on MAHOUT-228: {quote} make sure that L1 is sparsity inducing my

[jira] Commented: (MAHOUT-260) An alternative approach to RNG management

2010-01-17 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801676#action_12801676 ] Ted Dunning commented on MAHOUT-260: Pretty fancy. I would prefer direct injection of

[jira] Commented: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2010-01-14 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12800248#action_12800248 ] Ted Dunning commented on MAHOUT-228: We need a few things: - a few functions should

[jira] Commented: (MAHOUT-232) Implementation of sequential SVM solver based on Pegasos

2010-01-07 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12797741#action_12797741 ] Ted Dunning commented on MAHOUT-232: zhaozhendong, Nice results so far. I would

[jira] Commented: (MAHOUT-185) Add mahout shell script for easy launching of various algorithms

2010-01-06 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12797277#action_12797277 ] Ted Dunning commented on MAHOUT-185: Regarding the properties file idea, I have had

[jira] Commented: (MAHOUT-153) Implement kmeans++ for initial cluster selection in kmeans

2010-01-04 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796323#action_12796323 ] Ted Dunning commented on MAHOUT-153: {quote} On Mon, Jan 4, 2010 at 4:03 AM, Palleti,

[jira] Commented: (MAHOUT-173) Implement clustering of massive-domain attributes

2010-01-04 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796538#action_12796538 ] Ted Dunning commented on MAHOUT-173: It seems that this algorithm is a combination of

[jira] Commented: (MAHOUT-106) PLSI/EM in pig based on hofmann's ACM 04 paper.

2010-01-03 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12796007#action_12796007 ] Ted Dunning commented on MAHOUT-106: Pig programs are a pain in the * because Pig has

[jira] Commented: (MAHOUT-232) Implementation of sequential SVM solver based on Pegasos

2010-01-01 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795813#action_12795813 ] Ted Dunning commented on MAHOUT-232: The 0.1 patch compiles for me, but the 0.2 patch

[jira] Commented: (MAHOUT-232) Implementation of sequential SVM solver based on Pegasos

2010-01-01 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795839#action_12795839 ] Ted Dunning commented on MAHOUT-232: I had only a few minutes just now to look at this

[jira] Commented: (MAHOUT-235) GenericSorting.java also needs replacing

2009-12-31 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795704#action_12795704 ] Ted Dunning commented on MAHOUT-235: I just looked at this. Didn't find the problem,

[jira] Issue Comment Edited: (MAHOUT-235) GenericSorting.java also needs replacing

2009-12-31 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795704#action_12795704 ] Ted Dunning edited comment on MAHOUT-235 at 12/31/09 11:24 PM:

[jira] Commented: (MAHOUT-220) Mahout Bayes Code cleanup

2009-12-30 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795551#action_12795551 ] Ted Dunning commented on MAHOUT-220: {quote} FWIW, I'd say stuff that converts text,

[jira] Commented: (MAHOUT-220) Mahout Bayes Code cleanup

2009-12-29 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795122#action_12795122 ] Ted Dunning commented on MAHOUT-220: Anil, See classifier.sgd.TermRandomizer (and

[jira] Commented: (MAHOUT-220) Mahout Bayes Code cleanup

2009-12-29 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795135#action_12795135 ] Ted Dunning commented on MAHOUT-220: {quote} Robin: I am not very clear what is

[jira] Commented: (MAHOUT-220) Mahout Bayes Code cleanup

2009-12-29 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795136#action_12795136 ] Ted Dunning commented on MAHOUT-220: {quote} For sgd algorithm. I suggest you define

[jira] Commented: (MAHOUT-220) Mahout Bayes Code cleanup

2009-12-28 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12794976#action_12794976 ] Ted Dunning commented on MAHOUT-220: Robin, I was just looking at some of the code

[jira] Commented: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2009-12-26 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12794661#action_12794661 ] Ted Dunning commented on MAHOUT-228: The original code was very nearly correct as it

[jira] Updated: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2009-12-25 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-228: --- Attachment: r.csv logP.csv sgd.csv I have been doing some testing

[jira] Updated: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2009-12-25 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-228: --- Attachment: sgd-derivation.tex sgd-derivation.pdf Here are the derivations of the

[jira] Updated: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2009-12-25 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-228: --- Attachment: (was: sgd-derivation.pdf) Need sequential logistic regression implementation using

[jira] Updated: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2009-12-25 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-228: --- Attachment: (was: MAHOUT-228-1.patch) Need sequential logistic regression implementation using

[jira] Updated: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2009-12-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-228: --- Attachment: MAHOUT-228-1.patch Here is the actual patch file. Need sequential logistic regression

[jira] Commented: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2009-12-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12794230#action_12794230 ] Ted Dunning commented on MAHOUT-228: This implementation is purely logistic

[jira] Commented: (MAHOUT-173) Implement clustering of massive-domain attributes

2009-12-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12794320#action_12794320 ] Ted Dunning commented on MAHOUT-173: Go right ahead and implement it or at least scope

[jira] Updated: (MAHOUT-228) Need sequential logistic regression implementation using SGD techniques

2009-12-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-228: --- Attachment: MAHOUT-228-2.patch Updated to avoid Google's Guava libraries. Need sequential logistic

[jira] Commented: (MAHOUT-227) Parallel SVM

2009-12-21 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793487#action_12793487 ] Ted Dunning commented on MAHOUT-227: {quote} I understand this concern. Actually, if we

[jira] Issue Comment Edited: (MAHOUT-227) Parallel SVM

2009-12-21 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793497#action_12793497 ] Ted Dunning edited comment on MAHOUT-227 at 12/22/09 4:42 AM: --

[jira] Commented: (MAHOUT-226) Velocity-based code generation support to support more primitive type collections

2009-12-20 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792971#action_12792971 ] Ted Dunning commented on MAHOUT-226: Benson, I am completely down with you modifying

[jira] Commented: (MAHOUT-227) Parallel SVM

2009-12-20 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793088#action_12793088 ] Ted Dunning commented on MAHOUT-227: Here are a few formatting suggestions: a) when

[jira] Commented: (MAHOUT-226) Velocity-based code generation support to support more primitive type collections

2009-12-18 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792762#action_12792762 ] Ted Dunning commented on MAHOUT-226: This bit looks a bit odd: {noformat} public

[jira] Commented: (MAHOUT-219) Unit test for GenericSorting

2009-12-11 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789437#action_12789437 ] Ted Dunning commented on MAHOUT-219: This looks like a fine patch (just a test and a

[jira] Commented: (MAHOUT-116) Decode matrix methods

2009-12-11 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789438#action_12789438 ] Ted Dunning commented on MAHOUT-116: Hasn't this been subsumed by other work?

[jira] Commented: (MAHOUT-168) Need integer compression routines

2009-12-11 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789539#action_12789539 ] Ted Dunning commented on MAHOUT-168: The intent for this was for improving the storage

[jira] Commented: (MAHOUT-212) Need random sampler for use in reducers

2009-12-10 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12788839#action_12788839 ] Ted Dunning commented on MAHOUT-212: Awesome. Thanks. I have no dependencies. I

[jira] Commented: (MAHOUT-208) Vector.getLengthSquared() is dangerously optimized

2009-12-10 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12788854#action_12788854 ] Ted Dunning commented on MAHOUT-208: This caching can be a really major win so I would

[jira] Commented: (MAHOUT-208) Vector.getLengthSquared() is dangerously optimized

2009-12-10 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12788862#action_12788862 ] Ted Dunning commented on MAHOUT-208: (ted speaking from far down the slippery slope)

[jira] Commented: (MAHOUT-216) Improve the results of MAHOUT-145 by uniformly distributing the classes in the partitioned data

2009-12-10 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12788961#action_12788961 ] Ted Dunning commented on MAHOUT-216: Couldn't you just resort the data using random

[jira] Commented: (MAHOUT-212) Need random sampler for use in reducers

2009-12-09 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12788262#action_12788262 ] Ted Dunning commented on MAHOUT-212: I am snowed under and won't get to this for at

[jira] Commented: (MAHOUT-212) Need random sampler for use in reducers

2009-12-07 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12786820#action_12786820 ] Ted Dunning commented on MAHOUT-212: Kinda existed, but SamplingIterator takes a

[jira] Commented: (MAHOUT-212) Need random sampler for use in reducers

2009-12-07 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787020#action_12787020 ] Ted Dunning commented on MAHOUT-212: bq. I had suggested we not use both

[jira] Commented: (MAHOUT-212) Need random sampler for use in reducers

2009-12-07 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12787143#action_12787143 ] Ted Dunning commented on MAHOUT-212: Here is another patch. I moved SamplingIterator

[jira] Updated: (MAHOUT-212) Need random sampler for use in reducers

2009-12-07 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-212: --- Attachment: MAHOUT-212-b.patch Here is the actual patch. Need random sampler for use in reducers

[jira] Created: (MAHOUT-212) Need random sampler for use in reducers

2009-12-06 Thread Ted Dunning (JIRA)
Need random sampler for use in reducers --- Key: MAHOUT-212 URL: https://issues.apache.org/jira/browse/MAHOUT-212 Project: Mahout Issue Type: Bug Components: Utils Affects Versions: 0.2

[jira] Assigned: (MAHOUT-212) Need random sampler for use in reducers

2009-12-06 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning reassigned MAHOUT-212: -- Assignee: Ted Dunning Need random sampler for use in reducers

[jira] Updated: (MAHOUT-212) Need random sampler for use in reducers

2009-12-06 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-212: --- Status: Patch Available (was: Open) Code plus test cases. Ready for use. I think. Need random

[jira] Assigned: (MAHOUT-212) Need random sampler for use in reducers

2009-12-06 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning reassigned MAHOUT-212: -- Assignee: Sean Owen (was: Ted Dunning) Need random sampler for use in reducers

[jira] Updated: (MAHOUT-212) Need random sampler for use in reducers

2009-12-06 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated MAHOUT-212: --- Attachment: MAHOUT-212.patch Hmm... didn't get asked for where the patch file was when marking the

[jira] Commented: (MAHOUT-207) AbstractVector.hashCode() should not care about the order of iteration over elements

2009-11-24 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12782051#action_12782051 ] Ted Dunning commented on MAHOUT-207: I think that 159 is superseded by this work.

[jira] Commented: (MAHOUT-165) Using better primitives hash for sparse vector for performance gains

2009-11-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12781451#action_12781451 ] Ted Dunning commented on MAHOUT-165: bq. From there, refactoring Vector to not have a

[jira] Commented: (MAHOUT-204) Better integration of Mahout matrix capabilities with Colt Matrix additions

2009-11-23 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12781787#action_12781787 ] Ted Dunning commented on MAHOUT-204: It would be great to have a ginormous patch right

[jira] Commented: (MAHOUT-45) Matrix QR decomposition

2009-11-19 Thread Ted Dunning (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12780209#action_12780209 ] Ted Dunning commented on MAHOUT-45: --- This is largely superseded by Jake's recent work.
