[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark application

2015-01-26 Thread Robert Stupp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291818#comment-14291818 ] Robert Stupp commented on SPARK-2389: - [~srowen] yes, the problem is that drivers

[jira] [Closed] (SPARK-5303) applySchema returns NullPointerException

2015-01-26 Thread Mauro Pirrone (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mauro Pirrone closed SPARK-5303. Resolution: Not a Problem applySchema returns NullPointerException

[jira] [Created] (SPARK-5409) Broken link in documentation

2015-01-26 Thread Mauro Pirrone (JIRA)
Mauro Pirrone created SPARK-5409: Summary: Broken link in documentation Key: SPARK-5409 URL: https://issues.apache.org/jira/browse/SPARK-5409 Project: Spark Issue Type: Documentation

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark application

2015-01-26 Thread Robert Stupp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291797#comment-14291797 ] Robert Stupp commented on SPARK-2389: - bq. That aside, why doesn't it scale? Simply

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark application

2015-01-26 Thread Murat Eken (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291795#comment-14291795 ] Murat Eken commented on SPARK-2389: --- [~sowen], I think Robert is talking about fault

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark application

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291788#comment-14291788 ] Sean Owen commented on SPARK-2389: -- Yes, the SPOF problem makes sense. It doesn't seem to

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark application

2015-01-26 Thread Robert Stupp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291798#comment-14291798 ] Robert Stupp commented on SPARK-2389: - bq. fault tolerance when he mentions

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark application

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291804#comment-14291804 ] Sean Owen commented on SPARK-2389: -- Yes, makes sense. Maxing out one driver isn't an

[jira] [Commented] (SPARK-5409) Broken link in documentation

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291823#comment-14291823 ] Sean Owen commented on SPARK-5409: -- Should just be

[jira] [Closed] (SPARK-5407) No 1.2 AMI available for ec2

2015-01-26 Thread Håkan Jonsson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Håkan Jonsson closed SPARK-5407. Resolution: Invalid Error on my side. No 1.2 AMI available for ec2

[jira] [Created] (SPARK-5410) Error parsing scientific notation in a select statement

2015-01-26 Thread Hugo Ferrira (JIRA)
Hugo Ferrira created SPARK-5410: --- Summary: Error parsing scientific notation in a select statement Key: SPARK-5410 URL: https://issues.apache.org/jira/browse/SPARK-5410 Project: Spark Issue

[jira] [Resolved] (SPARK-3852) Document spark.driver.extra* configs

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3852. -- Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Sean Owen Target

[jira] [Resolved] (SPARK-4430) Apache RAT Checks fail spuriously on test files

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4430. -- Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Sean Owen Apache RAT Checks fail

[jira] [Commented] (SPARK-595) Document local-cluster mode

2015-01-26 Thread Vladimir Grigor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292052#comment-14292052 ] Vladimir Grigor commented on SPARK-595: --- +1 for reopen Document local-cluster mode

[jira] [Commented] (SPARK-5324) Results of describe can't be queried

2015-01-26 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14291987#comment-14291987 ] Yanbo Liang commented on SPARK-5324: [~marmbrus] I have pull a request for this issue

[jira] [Commented] (SPARK-5324) Results of describe can't be queried

2015-01-26 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292008#comment-14292008 ] Yanbo Liang commented on SPARK-5324: https://github.com/apache/spark/pull/4207

[jira] [Commented] (SPARK-5355) SparkConf is not thread-safe

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292048#comment-14292048 ] Apache Spark commented on SPARK-5355: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-794) Remove sleep() in ClusterScheduler.stop

2015-01-26 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292303#comment-14292303 ] Brennon York commented on SPARK-794: [~joshrosen] How is this PR holding up? I haven't

[jira] [Reopened] (SPARK-595) Document local-cluster mode

2015-01-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reopened SPARK-595: -- I've re-opened this issue. Folks are using the API in the wild and we're not going to break compatibility

[jira] [Commented] (SPARK-5395) Large number of Python workers causing resource depletion

2015-01-26 Thread Mark Khaitman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292121#comment-14292121 ] Mark Khaitman commented on SPARK-5395: -- Having the same issue in standalone

[jira] [Commented] (SPARK-5400) Rename GaussianMixtureEM to GaussianMixture

2015-01-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292176#comment-14292176 ] Joseph K. Bradley commented on SPARK-5400: -- I agree this could be done either

[jira] [Commented] (SPARK-5162) Python yarn-cluster mode

2015-01-26 Thread Vladimir Grigor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292069#comment-14292069 ] Vladimir Grigor commented on SPARK-5162: I second [~jared.holmb...@orchestro.com]

[jira] [Updated] (SPARK-5236) java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.MutableAny cannot be cast to org.apache.spark.sql.catalyst.expressions.MutableInt

2015-01-26 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-5236: -- Description: {code} 15/01/14 05:39:27 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 18.0 (TID

[jira] [Commented] (SPARK-2688) Need a way to run multiple data pipeline concurrently

2015-01-26 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292448#comment-14292448 ] Xuefu Zhang commented on SPARK-2688: Yeah. We don't need a syntactic suger, but a

[jira] [Commented] (SPARK-3644) REST API for Spark application info (jobs / stages / tasks / storage info)

2015-01-26 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292370#comment-14292370 ] Imran Rashid commented on SPARK-3644: - [~joshrosen] Hi Josh, I've got time to

[jira] [Resolved] (SPARK-5339) build/mvn doesn't work because of invalid URL for maven's tgz.

2015-01-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5339. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Kousuke Saruta build/mvn

[jira] [Commented] (SPARK-3789) Python bindings for GraphX

2015-01-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292477#comment-14292477 ] Reynold Xin commented on SPARK-3789: Unfortunately this is not going to make it into

[jira] [Commented] (SPARK-5411) Allow SparkListeners to be specified in SparkConf and loaded when creating SparkContext

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292494#comment-14292494 ] Apache Spark commented on SPARK-5411: - User 'JoshRosen' has created a pull request for

[jira] [Created] (SPARK-5411) Allow SparkListeners to be specified in SparkConf and loaded when creating SparkContext

2015-01-26 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-5411: - Summary: Allow SparkListeners to be specified in SparkConf and loaded when creating SparkContext Key: SPARK-5411 URL: https://issues.apache.org/jira/browse/SPARK-5411

[jira] [Commented] (SPARK-3789) Python bindings for GraphX

2015-01-26 Thread Kushal Datta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292499#comment-14292499 ] Kushal Datta commented on SPARK-3789: - Sure, i will write up the design document. @

[jira] [Commented] (SPARK-2688) Need a way to run multiple data pipeline concurrently

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292431#comment-14292431 ] Sean Owen commented on SPARK-2688: -- As [~irashid] says, #1 is just syntactic sugar on

[jira] [Commented] (SPARK-5226) Add DBSCAN Clustering Algorithm to MLlib

2015-01-26 Thread Dmitriy Lyubimov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292399#comment-14292399 ] Dmitriy Lyubimov commented on SPARK-5226: - All attempts to parallelize dbscan in

[jira] [Commented] (SPARK-2688) Need a way to run multiple data pipeline concurrently

2015-01-26 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292415#comment-14292415 ] Xuefu Zhang commented on SPARK-2688: #1 above is exactly what Hive needs badly. Need

[jira] [Commented] (SPARK-3789) Python bindings for GraphX

2015-01-26 Thread Kushal Datta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292468#comment-14292468 ] Kushal Datta commented on SPARK-3789: - Hi Ameet, Sorry for asking this question

[jira] [Commented] (SPARK-5416) Initialize Executor.threadPool before ExecutorSource

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292784#comment-14292784 ] Apache Spark commented on SPARK-5416: - User 'ryan-williams' has created a pull request

[jira] [Commented] (SPARK-3562) Periodic cleanup event logs

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292818#comment-14292818 ] Apache Spark commented on SPARK-3562: - User 'viper-kun' has created a pull request for

[jira] [Updated] (SPARK-3880) HBase as data source to SparkSQL

2015-01-26 Thread Yan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan updated SPARK-3880: --- Attachment: SparkSQLOnHBase_v2.0.docx HBase as data source to SparkSQL

[jira] [Updated] (SPARK-3880) HBase as data source to SparkSQL

2015-01-26 Thread Yan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan updated SPARK-3880: --- Attachment: (was: SparkSQLOnHBase_v2.docx) HBase as data source to SparkSQL

[jira] [Commented] (SPARK-5388) Provide a stable application submission gateway

2015-01-26 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292874#comment-14292874 ] Andrew Or commented on SPARK-5388: -- Hi Dale, thank you for your comments. Yes, in the

[jira] [Comment Edited] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292853#comment-14292853 ] Joseph Tang edited comment on SPARK-4846 at 1/27/15 2:46 AM: -

[jira] [Commented] (SPARK-5422) Support sending to Graphite via UDP

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292925#comment-14292925 ] Apache Spark commented on SPARK-5422: - User 'ryan-williams' has created a pull request

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292926#comment-14292926 ] Joseph Tang commented on SPARK-4846: I've added some code at

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2015-01-26 Thread Luca Morandini (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292998#comment-14292998 ] Luca Morandini commented on SPARK-1405: --- Indeed, I have a couple students whose

[jira] [Created] (SPARK-5417) Remove redundant executor-ID set() call

2015-01-26 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-5417: Summary: Remove redundant executor-ID set() call Key: SPARK-5417 URL: https://issues.apache.org/jira/browse/SPARK-5417 Project: Spark Issue Type:

[jira] [Updated] (SPARK-5119) java.lang.ArrayIndexOutOfBoundsException on trying to train decision tree model

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5119: - Assignee: Kai Sasaki java.lang.ArrayIndexOutOfBoundsException on trying to train decision tree

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292853#comment-14292853 ] Joseph Tang commented on SPARK-4846: Sorry about the procrastination. I'm still

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292855#comment-14292855 ] Joseph Tang commented on SPARK-4846: Sorry about the procrastination. I'm still

[jira] [Updated] (SPARK-4979) Add streaming logistic regression

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4979: - Assignee: Jeremy Freeman Add streaming logistic regression -

[jira] [Updated] (SPARK-5421) SparkSql throw OOM at shuffle

2015-01-26 Thread Hong Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Shen updated SPARK-5421: - Description: ExternalAppendOnlyMap if only for the spark job that aggregator isDefined, but sparkSQL's

[jira] [Updated] (SPARK-5421) SparkSql throw OOM at shuffle

2015-01-26 Thread Hong Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Shen updated SPARK-5421: - Description: ExternalAppendOnlyMap if only for the spark job that aggregator isDefined, but sparkSQL's

[jira] [Updated] (SPARK-3726) RandomForest: Support for bootstrap options

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3726: - Target Version/s: 1.3.0 RandomForest: Support for bootstrap options

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2015-01-26 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293001#comment-14293001 ] Joseph K. Bradley commented on SPARK-1405: -- It has not yet been merged into Spark

[jira] [Created] (SPARK-5416) Initialize Executor.threadPool before ExecutorSource

2015-01-26 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-5416: Summary: Initialize Executor.threadPool before ExecutorSource Key: SPARK-5416 URL: https://issues.apache.org/jira/browse/SPARK-5416 Project: Spark Issue

[jira] [Commented] (SPARK-5341) Support maven coordinates in spark-shell and spark-submit

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292826#comment-14292826 ] Apache Spark commented on SPARK-5341: - User 'brkyvz' has created a pull request for

[jira] [Updated] (SPARK-5119) java.lang.ArrayIndexOutOfBoundsException on trying to train decision tree model

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5119: - Target Version/s: 1.3.0 java.lang.ArrayIndexOutOfBoundsException on trying to train decision

[jira] [Created] (SPARK-5418) Output directory for shuffle should consider left space of each directory set in conf

2015-01-26 Thread ding (JIRA)
ding created SPARK-5418: --- Summary: Output directory for shuffle should consider left space of each directory set in conf Key: SPARK-5418 URL: https://issues.apache.org/jira/browse/SPARK-5418 Project: Spark

[jira] [Resolved] (SPARK-5119) java.lang.ArrayIndexOutOfBoundsException on trying to train decision tree model

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5119. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3975

[jira] [Commented] (SPARK-5388) Provide a stable application submission gateway

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292866#comment-14292866 ] Apache Spark commented on SPARK-5388: - User 'andrewor14' has created a pull request

[jira] [Commented] (SPARK-2243) Support multiple SparkContexts in the same JVM

2015-01-26 Thread Aniket Bhatnagar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293019#comment-14293019 ] Aniket Bhatnagar commented on SPARK-2243: - I am also interested in having this

[jira] [Updated] (SPARK-5420) Cross-langauge load/store functions for creating and saving DataFrames

2015-01-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5420: --- Summary: Cross-langauge load/store functions for creating and saving DataFrames (was: Create

[jira] [Issue Comment Deleted] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Tang updated SPARK-4846: --- Comment: was deleted (was: Sorry about the procrastination. I'm still working on this. Regarding

[jira] [Comment Edited] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292853#comment-14292853 ] Joseph Tang edited comment on SPARK-4846 at 1/27/15 2:44 AM: -

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292886#comment-14292886 ] Joseph Tang commented on SPARK-4846: Hi Xiangrui, here is a problem. PR #3693 that

[jira] [Created] (SPARK-5420) Create cross-langauge load/store functions for creating and saving DataFrames

2015-01-26 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-5420: -- Summary: Create cross-langauge load/store functions for creating and saving DataFrames Key: SPARK-5420 URL: https://issues.apache.org/jira/browse/SPARK-5420

[jira] [Commented] (SPARK-5395) Large number of Python workers causing resource depletion

2015-01-26 Thread Sven Krasser (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292913#comment-14292913 ] Sven Krasser commented on SPARK-5395: - Some additional findings from my side: I've

[jira] [Resolved] (SPARK-5052) com.google.common.base.Optional binary has a wrong method signatures

2015-01-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5052. Resolution: Fixed Fix Version/s: 1.3.0 com.google.common.base.Optional binary has a

[jira] [Updated] (SPARK-5052) com.google.common.base.Optional binary has a wrong method signatures

2015-01-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5052: --- Assignee: Elmer Garduno com.google.common.base.Optional binary has a wrong method signatures

[jira] [Commented] (SPARK-5261) In some cases ,The value of word's vector representation is too big

2015-01-26 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292823#comment-14292823 ] Guoqiang Li commented on SPARK-5261: [~lewuathe] {code} normalize_text() { awk

[jira] [Commented] (SPARK-5206) Accumulators are not re-registered during recovering from checkpoint

2015-01-26 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292896#comment-14292896 ] Saisai Shao commented on SPARK-5206: IMHO I think this is a general problem in Spark

[jira] [Comment Edited] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Joseph Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292926#comment-14292926 ] Joseph Tang edited comment on SPARK-4846 at 1/27/15 3:42 AM: -

[jira] [Commented] (SPARK-5417) Remove redundant executor-ID set() call

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292790#comment-14292790 ] Apache Spark commented on SPARK-5417: - User 'ryan-williams' has created a pull request

[jira] [Commented] (SPARK-5419) Fix the logic in Vectors.sqdist

2015-01-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292898#comment-14292898 ] Apache Spark commented on SPARK-5419: - User 'viirya' has created a pull request for

[jira] [Created] (SPARK-5422) Support sending to Graphite via UDP

2015-01-26 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-5422: Summary: Support sending to Graphite via UDP Key: SPARK-5422 URL: https://issues.apache.org/jira/browse/SPARK-5422 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-5384) Vectors.sqdist return inconsistent result for sparse/dense vectors when the vectors have different lengths

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5384: - Priority: Minor (was: Critical) Vectors.sqdist return inconsistent result for sparse/dense

[jira] [Commented] (SPARK-3439) Add Canopy Clustering Algorithm

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292688#comment-14292688 ] Xiangrui Meng commented on SPARK-3439: -- [~angellandros] The public API and the

[jira] [Updated] (SPARK-4587) Model export/import

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4587: - Assignee: Joseph K. Bradley Model export/import --- Key:

[jira] [Updated] (SPARK-1856) Standardize MLlib interfaces

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1856: - Target Version/s: (was: 1.3.0) Standardize MLlib interfaces

[jira] [Updated] (SPARK-1856) Standardize MLlib interfaces

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1856: - Priority: Critical (was: Blocker) Standardize MLlib interfaces

[jira] [Updated] (SPARK-1486) Support multi-model training in MLlib

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1486: - Target Version/s: (was: 1.3.0) Support multi-model training in MLlib

[jira] [Updated] (SPARK-3717) DecisionTree, RandomForest: Partition by feature

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3717: - Target Version/s: (was: 1.3.0) DecisionTree, RandomForest: Partition by feature

[jira] [Updated] (SPARK-5321) Add transpose() method to Matrix

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5321: - Assignee: Burak Yavuz Add transpose() method to Matrix

[jira] [Updated] (SPARK-5114) Should Evaluator be a PipelineStage

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5114: - Summary: Should Evaluator be a PipelineStage (was: Should Evaluator by a PipelineStage) Should

[jira] [Updated] (SPARK-5094) Python API for gradient-boosted trees

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5094: - Assignee: Kazuki Taniguchi Python API for gradient-boosted trees

[jira] [Updated] (SPARK-5413) Upgrade metrics dependency to 3.1.0

2015-01-26 Thread Ryan Williams (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Williams updated SPARK-5413: - Description: Spark currently uses Coda Hale's metrics library version {{3.0.0}}. Version

[jira] [Updated] (SPARK-5413) Upgrade metrics dependency to 3.1.0

2015-01-26 Thread Ryan Williams (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Williams updated SPARK-5413: - Description: Spark currently uses Coda Hale's metrics library version {{3.0.0}}. Version

[jira] [Created] (SPARK-5413) Upgrade metrics dependency to 3.1.0

2015-01-26 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-5413: Summary: Upgrade metrics dependency to 3.1.0 Key: SPARK-5413 URL: https://issues.apache.org/jira/browse/SPARK-5413 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292718#comment-14292718 ] Xiangrui Meng commented on SPARK-4846: -- [~josephtang] Are you working on this issue?

[jira] [Updated] (SPARK-5414) Add SparkListener implementation that allows users to receive all listener events in one method

2015-01-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-5414: -- Component/s: Spark Core Add SparkListener implementation that allows users to receive all listener

[jira] [Created] (SPARK-5415) Upgrade sbt to 0.13.7

2015-01-26 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-5415: Summary: Upgrade sbt to 0.13.7 Key: SPARK-5415 URL: https://issues.apache.org/jira/browse/SPARK-5415 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-5261) In some cases ,The value of word's vector representation is too big

2015-01-26 Thread Kai Sasaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292750#comment-14292750 ] Kai Sasaki commented on SPARK-5261: --- [~gq] Can you provide us data set? I tried with

[jira] [Resolved] (SPARK-5409) Broken link in documentation

2015-01-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5409. -- Resolution: Duplicate Actually this was already fixed Broken link in documentation

[jira] [Commented] (SPARK-5395) Large number of Python workers causing resource depletion

2015-01-26 Thread Mark Khaitman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292769#comment-14292769 ] Mark Khaitman commented on SPARK-5395: -- This may prove to be useful... I'm watching

[jira] [Created] (SPARK-5419) Fix the logic in Vectors.sqdist

2015-01-26 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5419: Summary: Fix the logic in Vectors.sqdist Key: SPARK-5419 URL: https://issues.apache.org/jira/browse/SPARK-5419 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-5421) SparkSql throw OOM at shuffle

2015-01-26 Thread Hong Shen (JIRA)
Hong Shen created SPARK-5421: Summary: SparkSql throw OOM at shuffle Key: SPARK-5421 URL: https://issues.apache.org/jira/browse/SPARK-5421 Project: Spark Issue Type: Bug Components:

[jira] [Resolved] (SPARK-3726) RandomForest: Support for bootstrap options

2015-01-26 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-3726. -- Resolution: Fixed Fix Version/s: (was: 1.2.0) 1.3.0 Issue

[jira] [Commented] (SPARK-5267) Add a streaming module to ingest Apache Camel Messages from a configured endpoints

2015-01-26 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293043#comment-14293043 ] Tathagata Das commented on SPARK-5267: -- Hey this is a great initiative! However, we

[jira] [Commented] (SPARK-4964) Exactly-once + WAL-free Kafka Support in Spark Streaming

2015-01-26 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293140#comment-14293140 ] Tathagata Das commented on SPARK-4964: --

[jira] [Updated] (SPARK-4964) Exactly-once + WAL-free Kafka Support in Spark Streaming

2015-01-26 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-4964: - Summary: Exactly-once + WAL-free Kafka Support in Spark Streaming (was: Exactly-once semantics

[jira] [Commented] (SPARK-4964) Exactly-once semantics for Kafka

2015-01-26 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293129#comment-14293129 ] Tathagata Das commented on SPARK-4964: -- I am renaming this JIRA to Native Kafka

[jira] [Updated] (SPARK-4964) Exactly-once + WAL-free Kafka Support in Spark Streaming

2015-01-26 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-4964: - Description: There are two issues with the current Kafka support - Use of Write Ahead Logs in
