[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270236#comment-13270236 ] Sebastian Schelter commented on MAHOUT-979: --- Patch looks very good. I have one

[jira] [Created] (MAHOUT-1008) Remove link analysis package

2012-05-08 Thread Sebastian Schelter (JIRA)
Sebastian Schelter created MAHOUT-1008: -- Summary: Remove link analysis package Key: MAHOUT-1008 URL: https://issues.apache.org/jira/browse/MAHOUT-1008 Project: Mahout Issue Type: Task

[jira] [Updated] (MAHOUT-1008) Remove link analysis package

2012-05-08 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Schelter updated MAHOUT-1008: --- Attachment: MAHOUT-1008.patch I will remove the linkanalysis package in 2 days if

[jira] [Commented] (MAHOUT-944) LuceneIndexToSequenceFiles (lucene2seq) utility

2012-05-08 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270367#comment-13270367 ] Grant Ingersoll commented on MAHOUT-944: I'll try to get to this patch this week.

[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270393#comment-13270393 ] Suneel Marthi commented on MAHOUT-979: -- Uploaded patch with fixes based on

[jira] [Updated] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suneel Marthi updated MAHOUT-979: - Attachment: Mahout-979.patch RowSimilarityJob should be able to infer the number of columns

[jira] [Updated] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suneel Marthi updated MAHOUT-979: - Attachment: (was: Mahout-979.patch) RowSimilarityJob should be able to infer the number

[jira] [Commented] (MAHOUT-1007) Performance improvement in recommenditembased by splitting long records

2012-05-08 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270441#comment-13270441 ] Sebastian Schelter commented on MAHOUT-1007: Unsymmetrify taking very long is

[jira] [Issue Comment Edited] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270052#comment-13270052 ] Suneel Marthi edited comment on MAHOUT-979 at 5/8/12 2:19 PM: --

Making Mahout Leaner

2012-05-08 Thread Robin Anil
Based on some discussion on the private group about where Mahout is faltering in the real world, a stream of thought bubbled up: make Mahout leaner, i.e. push the best stuff we have to the top and prune out algorithms that are underperforming. The main issue here is that Iterative nature of many

Re: Making Mahout Leaner

2012-05-08 Thread Ted Dunning
To add to this and to save the breath of the participants in the formerly private discussion, it seems like there is rough consensus about removing cruft, but there has also been quite a bit of desire to be very sensitive to the needs of current and planned production users. Somewhat less

Re: Making Mahout Leaner

2012-05-08 Thread Ted Dunning
This is frustrating to consider losing Bayes, but I would consider keeping it if only to decrease the number of questions on the list about why the examples from the book don't work. On Tue, May 8, 2012 at 8:11 AM, Robin Anil robin.a...@gmail.com wrote: - Bayes + Random Forest - Seems a shame

Re: Making Mahout Leaner

2012-05-08 Thread Jake Mannix
On Tue, May 8, 2012 at 9:31 AM, Ted Dunning ted.dunn...@gmail.com wrote: This is frustrating to consider losing Bayes, but I would consider keeping it if only to decrease the number of questions on the list about why the examples from the book don't work. Could maybe someone just sit down

[jira] [Created] (MAHOUT-1009) Remove old LDA implementation from codebase

2012-05-08 Thread Jake Mannix (JIRA)
Jake Mannix created MAHOUT-1009: --- Summary: Remove old LDA implementation from codebase Key: MAHOUT-1009 URL: https://issues.apache.org/jira/browse/MAHOUT-1009 Project: Mahout Issue Type:

[jira] [Assigned] (MAHOUT-1009) Remove old LDA implementation from codebase

2012-05-08 Thread Jake Mannix (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jake Mannix reassigned MAHOUT-1009: --- Assignee: Jake Mannix Remove old LDA implementation from codebase

[jira] [Created] (MAHOUT-1010) Remove the old naive bayes implementation (org.apache.mahout.classifier.bayes) from the codebase

2012-05-08 Thread Sebastian Schelter (JIRA)
Sebastian Schelter created MAHOUT-1010: -- Summary: Remove the old naive bayes implementation (org.apache.mahout.classifier.bayes) from the codebase Key: MAHOUT-1010 URL:

[jira] [Reopened] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suneel Marthi reopened MAHOUT-979: -- RowSimilarityJob should be able to infer the number of columns from the input matrix if not

[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270687#comment-13270687 ] Suneel Marthi commented on MAHOUT-979: -- Reopening the issue as the submitted patch is

[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270700#comment-13270700 ] Sebastian Schelter commented on MAHOUT-979: --- I'm currently working in your

[jira] [Updated] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suneel Marthi updated MAHOUT-979: - Status: Patch Available (was: Reopened) RowSimilarityJob should be able to infer the

[jira] [Updated] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suneel Marthi updated MAHOUT-979: - Attachment: Mahout-979.patch RowSimilarityJob should be able to infer the number of columns

[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270736#comment-13270736 ] Sebastian Schelter commented on MAHOUT-979: --- What did you change in the latest

Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #126

2012-05-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/126/ -- [...truncated 6079 lines...] 12/05/08 19:26:55 INFO mapred.LocalJobRunner: 12/05/08 19:26:55 INFO mapred.Task: Task 'attempt_local_0003_m_00_0' done. 12/05/08 19:26:55 INFO

[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270751#comment-13270751 ] Suneel Marthi commented on MAHOUT-979: -- Modified getDimensions() to take the

[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270784#comment-13270784 ] Sebastian Schelter commented on MAHOUT-979: --- I already changed that :)

[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270786#comment-13270786 ] Suneel Marthi commented on MAHOUT-979: -- Uploading a patch again, may be the final

[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Sebastian Schelter (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270785#comment-13270785 ] Sebastian Schelter commented on MAHOUT-979: --- Patch committed. Thanks for the

[jira] [Issue Comment Edited] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270786#comment-13270786 ] Suneel Marthi edited comment on MAHOUT-979 at 5/8/12 8:13 PM: --

[jira] [Commented] (MAHOUT-979) RowSimilarityJob should be able to infer the number of columns from the input matrix if not specified

2012-05-08 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270970#comment-13270970 ] Hudson commented on MAHOUT-979: --- Integrated in Mahout-Quality #1467 (See

Jenkins build is still unstable: Mahout-Quality #1468

2012-05-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/Mahout-Quality/changes