[jira] [Created] (SPARK-3839) Reimplement HashOuterJoin to construct hash table of only one relation

2014-10-07 Thread Liquan Pei (JIRA)
Liquan Pei created SPARK-3839: Summary: Reimplement HashOuterJoin to construct hash table of only one relation Key: SPARK-3839 URL: https://issues.apache.org/jira/browse/SPARK-3839 Project: Spark

[jira] [Comment Edited] (SPARK-3828) Spark returns inconsistent results when building with different Hadoop version

2014-10-07 Thread Liquan Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162374#comment-14162374 ] Liquan Pei edited comment on SPARK-3828 at 10/7/14 7:33 PM: It

[jira] [Commented] (SPARK-3828) Spark returns inconsistent results when building with different Hadoop version

2014-10-07 Thread Liquan Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162374#comment-14162374 ] Liquan Pei commented on SPARK-3828: It seems that this is a bug in LineRecordReader. Fo

[jira] [Updated] (SPARK-3828) Spark returns inconsistent results when building with different Hadoop version

2014-10-06 Thread Liquan Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liquan Pei updated SPARK-3828: Summary: Spark returns inconsistent results when building with different Hadoop version (was: Spark re

[jira] [Updated] (SPARK-3828) Spark returns inconsistent results when building with different HADOOP version

2014-10-06 Thread Liquan Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liquan Pei updated SPARK-3828: Summary: Spark returns inconsistent results when building with different HADOOP version (was: Spark re

[jira] [Created] (SPARK-3828) Spark returns inconsistent result when compiling with different HADOOP version

2014-10-06 Thread Liquan Pei (JIRA)
Liquan Pei created SPARK-3828: Summary: Spark returns inconsistent result when compiling with different HADOOP version Key: SPARK-3828 URL: https://issues.apache.org/jira/browse/SPARK-3828 Project: Spar

[jira] [Commented] (SPARK-3614) Filter on minimum occurrences of a term in IDF

2014-09-22 Thread Liquan Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143887#comment-14143887 ] Liquan Pei commented on SPARK-3614: To me, the fewer documents a term appears in,

[jira] [Updated] (SPARK-3486) [MLlib]Add PySpark support for Word2Vec

2014-09-11 Thread Liquan Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liquan Pei updated SPARK-3486: Summary: [MLlib]Add PySpark support for Word2Vec (was: Add PySpark support for Word2Vec) > [MLlib]Add

[jira] [Created] (SPARK-3486) Add PySpark support for Word2Vec

2014-09-11 Thread Liquan Pei (JIRA)
Liquan Pei created SPARK-3486: Summary: Add PySpark support for Word2Vec Key: SPARK-3486 URL: https://issues.apache.org/jira/browse/SPARK-3486 Project: Spark Issue Type: New Feature Co

[jira] [Created] (SPARK-3436) [MLlib]Streaming SVM

2014-09-08 Thread Liquan Pei (JIRA)
Liquan Pei created SPARK-3436: Summary: [MLlib]Streaming SVM Key: SPARK-3436 URL: https://issues.apache.org/jira/browse/SPARK-3436 Project: Spark Issue Type: New Feature Components: M

[jira] [Created] (SPARK-3097) Word2Vec Performance Improvement

2014-08-17 Thread Liquan Pei (JIRA)
Liquan Pei created SPARK-3097: Summary: Word2Vec Performance Improvement Key: SPARK-3097 URL: https://issues.apache.org/jira/browse/SPARK-3097 Project: Spark Issue Type: Improvement C

[jira] [Created] (SPARK-2907) Use mutable.HashMap to represent Model in Word2Vec

2014-08-07 Thread Liquan Pei (JIRA)
Liquan Pei created SPARK-2907: Summary: Use mutable.HashMap to represent Model in Word2Vec Key: SPARK-2907 URL: https://issues.apache.org/jira/browse/SPARK-2907 Project: Spark Issue Type: Improv

[jira] [Updated] (SPARK-2510) word2vec: Distributed Representation of Words

2014-08-01 Thread Liquan Pei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liquan Pei updated SPARK-2510: Description: We would like to add a parallel implementation of word2vec to MLlib. word2vec finds distribu

[jira] [Created] (SPARK-2510) word2vec: Distributed Representation of Words

2014-07-15 Thread Liquan Pei (JIRA)
Liquan Pei created SPARK-2510: Summary: word2vec: Distributed Representation of Words Key: SPARK-2510 URL: https://issues.apache.org/jira/browse/SPARK-2510 Project: Spark Issue Type: New Feature