[jira] [Created] (SPARK-4484) Treat maxResultSize as unlimited when set to 0; improve error message

2014-11-19 Thread Nishkam Ravi (JIRA)
Nishkam Ravi created SPARK-4484: --- Summary: Treat maxResultSize as unlimited when set to 0; improve error message Key: SPARK-4484 URL: https://issues.apache.org/jira/browse/SPARK-4484 Project: Spark

[jira] [Commented] (SPARK-4484) Treat maxResultSize as unlimited when set to 0; improve error message

2014-11-19 Thread Nishkam Ravi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217577#comment-14217577 ] Nishkam Ravi commented on SPARK-4484: - PR: https://github.com/apache/spark/pull/3360/

[jira] [Commented] (SPARK-4484) Treat maxResultSize as unlimited when set to 0; improve error message

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217580#comment-14217580 ] Apache Spark commented on SPARK-4484: - User 'nishkamravi2' has created a pull request

[jira] [Commented] (SPARK-4484) Treat maxResultSize as unlimited when set to 0; improve error message

2014-11-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217581#comment-14217581 ] Sean Owen commented on SPARK-4484: -- This is causing a number of benchmarks to fail, no?

[jira] [Created] (SPARK-4485) Add broadcast outer join to optimize left outer join and right outer join

2014-11-19 Thread XiaoJing wang (JIRA)
XiaoJing wang created SPARK-4485: Summary: Add broadcast outer join to optimize left outer join and right outer join Key: SPARK-4485 URL: https://issues.apache.org/jira/browse/SPARK-4485 Project:

[jira] [Commented] (SPARK-948) Move Classpath Entries in WebUI

2014-11-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217599#comment-14217599 ] Sean Owen commented on SPARK-948: - I'll close as WontFix unless someone agrees with the

[jira] [Created] (SPARK-4486) Improve GradientBoosting APIs and doc

2014-11-19 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-4486: Summary: Improve GradientBoosting APIs and doc Key: SPARK-4486 URL: https://issues.apache.org/jira/browse/SPARK-4486 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-4355) OnlineSummarizer doesn't merge mean correctly

2014-11-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4355: - Fix Version/s: 1.2.0 OnlineSummarizer doesn't merge mean correctly

[jira] [Commented] (SPARK-4484) Treat maxResultSize as unlimited when set to 0; improve error message

2014-11-19 Thread Nishkam Ravi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217653#comment-14217653 ] Nishkam Ravi commented on SPARK-4484: - Yeah, most workloads break with a large enough

[jira] [Updated] (SPARK-4484) Treat maxResultSize as unlimited when set to 0; improve error message

2014-11-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4484: - Priority: Critical (was: Major) I'll be so bold as to raise the priority as I see this actually stopped

[jira] [Updated] (SPARK-4484) [CORE] Treat maxResultSize as unlimited when set to 0; improve error message

2014-11-19 Thread Nishkam Ravi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nishkam Ravi updated SPARK-4484: Summary: [CORE] Treat maxResultSize as unlimited when set to 0; improve error message (was: Treat

[jira] [Created] (SPARK-4487) Fix attribute reference resolution error when using ORDER BY.

2014-11-19 Thread Kousuke Saruta (JIRA)
Kousuke Saruta created SPARK-4487: - Summary: Fix attribute reference resolution error when using ORDER BY. Key: SPARK-4487 URL: https://issues.apache.org/jira/browse/SPARK-4487 Project: Spark

[jira] [Commented] (SPARK-4485) Add broadcast outer join to optimize left outer join and right outer join

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217687#comment-14217687 ] Apache Spark commented on SPARK-4485: - User 'wangxiaojing' has created a pull request

[jira] [Commented] (SPARK-4487) Fix attribute reference resolution error when using ORDER BY.

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217732#comment-14217732 ] Apache Spark commented on SPARK-4487: - User 'sarutak' has created a pull request for

[jira] [Created] (SPARK-4488) Add control over map-side aggregation

2014-11-19 Thread uncleGen (JIRA)
uncleGen created SPARK-4488: --- Summary: Add control over map-side aggregation Key: SPARK-4488 URL: https://issues.apache.org/jira/browse/SPARK-4488 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-4488) Add control over map-side aggregation

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217755#comment-14217755 ] Apache Spark commented on SPARK-4488: - User 'uncleGen' has created a pull request for

[jira] [Created] (SPARK-4489) JavaPairRDD.collectAsMap from checkpoint RDD may fail with ClassCastException

2014-11-19 Thread Christopher Ng (JIRA)
Christopher Ng created SPARK-4489: - Summary: JavaPairRDD.collectAsMap from checkpoint RDD may fail with ClassCastException Key: SPARK-4489 URL: https://issues.apache.org/jira/browse/SPARK-4489

[jira] [Commented] (SPARK-4489) JavaPairRDD.collectAsMap from checkpoint RDD may fail with ClassCastException

2014-11-19 Thread Christopher Ng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217806#comment-14217806 ] Christopher Ng commented on SPARK-4489: --- I should add there's a rather nasty looking

[jira] [Created] (SPARK-4490) Not found RandomGenerator through spark-shell

2014-11-19 Thread Kai Sasaki (JIRA)
Kai Sasaki created SPARK-4490: - Summary: Not found RandomGenerator through spark-shell Key: SPARK-4490 URL: https://issues.apache.org/jira/browse/SPARK-4490 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-4490) Not found RandomGenerator through spark-shell

2014-11-19 Thread Kai Sasaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated SPARK-4490: -- Priority: Major (was: Critical) Not found RandomGenerator through spark-shell

[jira] [Updated] (SPARK-4490) Not found RandomGenerator through spark-shell

2014-11-19 Thread Kai Sasaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated SPARK-4490: -- Description: In spark-1.1.0, an exception is thrown whenever RandomGenerator of commons-math3 is used.

[jira] [Created] (SPARK-4491) Using sbt assembly with spark as dep requires Phd in sbt

2014-11-19 Thread sam (JIRA)
sam created SPARK-4491: -- Summary: Using sbt assembly with spark as dep requires Phd in sbt Key: SPARK-4491 URL: https://issues.apache.org/jira/browse/SPARK-4491 Project: Spark Issue Type: Question

[jira] [Updated] (SPARK-3373) Filtering operations should optionally rebuild routing tables

2014-11-19 Thread uncleGen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] uncleGen updated SPARK-3373: Target Version/s: 1.1.1, 1.2.0 (was: 1.1.0, 1.0.3) Filtering operations should optionally rebuild routing

[jira] [Created] (SPARK-4492) Exception when following SimpleApp tutorial java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.YarnSparkHadoopUtil

2014-11-19 Thread sam (JIRA)
sam created SPARK-4492: -- Summary: Exception when following SimpleApp tutorial java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.YarnSparkHadoopUtil Key: SPARK-4492 URL:

[jira] [Updated] (SPARK-4492) Exception when following SimpleApp tutorial java.lang.ClassNotFoundException: org.apache.spark.deploy.yarn.YarnSparkHadoopUtil

2014-11-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam updated SPARK-4492: --- Description: When I follow the example here https://spark.apache.org/docs/1.0.2/quick-start.html and run with java

[jira] [Created] (SPARK-4493) Don't pushdown Eq, NotEq, Lt, LtEq, Gt and GtEq predicates with nulls for Parquet

2014-11-19 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-4493: - Summary: Don't pushdown Eq, NotEq, Lt, LtEq, Gt and GtEq predicates with nulls for Parquet Key: SPARK-4493 URL: https://issues.apache.org/jira/browse/SPARK-4493 Project:

[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector

2014-11-19 Thread Jean-Philippe Quemener (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Philippe Quemener updated SPARK-4494: -- Summary: IDFModel.transform() add support for single vector (was:

[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector

2014-11-19 Thread Jean-Philippe Quemener (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Philippe Quemener updated SPARK-4494: -- Description: For now when using the tfidf implementation of mllib you have no

[jira] [Created] (SPARK-4494) IDFModel.transform() add support for single vectors

2014-11-19 Thread Jean-Philippe Quemener (JIRA)
Jean-Philippe Quemener created SPARK-4494: - Summary: IDFModel.transform() add support for single vectors Key: SPARK-4494 URL: https://issues.apache.org/jira/browse/SPARK-4494 Project: Spark

[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector

2014-11-19 Thread Jean-Philippe Quemener (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Philippe Quemener updated SPARK-4494: -- Description: For now when using the tfidf implementation of mllib you have no

[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector

2014-11-19 Thread Jean-Philippe Quemener (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Philippe Quemener updated SPARK-4494: -- Description: For now when using the tfidf implementation of mllib you have no

[jira] [Closed] (SPARK-4473) [Core] StageInfo should have ActiveJob's group ID as a field

2014-11-19 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hamstra closed SPARK-4473. --- Resolution: Duplicate [Core] StageInfo should have ActiveJob's group ID as a field

[jira] [Commented] (SPARK-4473) [Core] StageInfo should have ActiveJob's group ID as a field

2014-11-19 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218057#comment-14218057 ] Mark Hamstra commented on SPARK-4473: - This is already covered by

[jira] [Updated] (SPARK-4493) Don't pushdown Eq, NotEq, Lt, LtEq, Gt and GtEq predicates with nulls for Parquet

2014-11-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-4493: -- Description: Predicates like {{a = NULL}} and {{a < NULL}} can't be pushed down since Parquet {{Lt}},

[jira] [Updated] (SPARK-4493) Don't pushdown Eq, NotEq, Lt, LtEq, Gt and GtEq predicates with nulls for Parquet

2014-11-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-4493: -- Description: Predicates like {{a = NULL}} and {{a < NULL}} can't be pushed down since Parquet {{Lt}},

[jira] [Commented] (SPARK-2321) Design a proper progress reporting event listener API

2014-11-19 Thread Aniket Bhatnagar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218063#comment-14218063 ] Aniket Bhatnagar commented on SPARK-2321: - Just a quick question, will the API be

[jira] [Commented] (SPARK-4493) Don't pushdown Eq, NotEq, Lt, LtEq, Gt and GtEq predicates with nulls for Parquet

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218070#comment-14218070 ] Apache Spark commented on SPARK-4493: - User 'liancheng' has created a pull request for

[jira] [Updated] (SPARK-4495) Memory leak in JobProgressListener due to `spark.ui.retainedJobs` not being used

2014-11-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4495: -- Summary: Memory leak in JobProgressListener due to `spark.ui.retainedJobs` not being used (was: Memory

[jira] [Created] (SPARK-4495) Memory leak in JobProgressListener due to `spark.ui.retainedJobs` not being used properly

2014-11-19 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-4495: - Summary: Memory leak in JobProgressListener due to `spark.ui.retainedJobs` not being used properly Key: SPARK-4495 URL: https://issues.apache.org/jira/browse/SPARK-4495

[jira] [Commented] (SPARK-4478) totalRegisteredExecutors not updated properly

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218200#comment-14218200 ] Apache Spark commented on SPARK-4478: - User 'coolfrood' has created a pull request for

[jira] [Closed] (SPARK-4467) Number of elements read is never reset in ExternalSorter

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4467. Resolution: Fixed Fix Version/s: 1.2.0 Number of elements read is never reset in ExternalSorter

[jira] [Commented] (SPARK-3928) Support wildcard matches on Parquet files

2014-11-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218287#comment-14218287 ] sam commented on SPARK-3928: CSV doesn't work for me. I'm using spark-submit and it appears to

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-19 Thread Tianshuo Deng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218289#comment-14218289 ] Tianshuo Deng commented on SPARK-4452: -- Hi, While I'm working on this ticket, I have

[jira] [Updated] (SPARK-4494) IDFModel.transform() add support for single vector

2014-11-19 Thread Jean-Philippe Quemener (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Philippe Quemener updated SPARK-4494: -- Shepherd: Xiangrui Meng IDFModel.transform() add support for single vector

[jira] [Created] (SPARK-4496) smallint (16 bit value) is being send as a 32 bit value in the thrift interface.

2014-11-19 Thread Chip Sands (JIRA)
Chip Sands created SPARK-4496: - Summary: smallint (16 bit value) is being send as a 32 bit value in the thrift interface. Key: SPARK-4496 URL: https://issues.apache.org/jira/browse/SPARK-4496 Project:

[jira] [Resolved] (SPARK-4296) Throw Expression not in GROUP BY when using same expression in group by clause and select clause

2014-11-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4296. - Resolution: Duplicate I think this was fixed by the linked issue. Please reopen if I am

[jira] [Resolved] (SPARK-4410) Support for external sort

2014-11-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4410. - Resolution: Fixed Fix Version/s: 1.2.0 Target Version/s: 1.2.0 Support

[jira] [Commented] (SPARK-4452) Shuffle data structures can starve others on the same thread for memory

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218318#comment-14218318 ] Andrew Or commented on SPARK-4452: -- [~tianshuo] That is a correct assumption for

[jira] [Created] (SPARK-4497) HiveThriftServer2 does not exit properly on failure

2014-11-19 Thread Yana Kadiyska (JIRA)
Yana Kadiyska created SPARK-4497: Summary: HiveThriftServer2 does not exit properly on failure Key: SPARK-4497 URL: https://issues.apache.org/jira/browse/SPARK-4497 Project: Spark Issue

[jira] [Commented] (SPARK-4480) Avoid many small spills in external data structures

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218387#comment-14218387 ] Apache Spark commented on SPARK-4480: - User 'andrewor14' has created a pull request

[jira] [Updated] (SPARK-4470) SparkContext accepts local[0] as a master URL

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4470: - Assignee: Kenichi Maehashi SparkContext accepts local[0] as a master URL

[jira] [Closed] (SPARK-4470) SparkContext accepts local[0] as a master URL

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4470. Resolution: Fixed Fix Version/s: 1.2.0 Target Version/s: 1.2.0 SparkContext accepts

[jira] [Updated] (SPARK-4498) Standalone Master can fail to recognize completed/failed applications

2014-11-19 Thread Harry Brundage (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harry Brundage updated SPARK-4498: -- Attachment: one-applications-master-logs.txt These are the logs from the standalone master for

[jira] [Updated] (SPARK-4498) Standalone Master can fail to recognize completed/failed applications

2014-11-19 Thread Harry Brundage (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harry Brundage updated SPARK-4498: -- Description: We observe the spark standalone master not detecting that a driver application

[jira] [Created] (SPARK-4498) Standalone Master can fail to recognize completed/failed applications

2014-11-19 Thread Harry Brundage (JIRA)
Harry Brundage created SPARK-4498: - Summary: Standalone Master can fail to recognize completed/failed applications Key: SPARK-4498 URL: https://issues.apache.org/jira/browse/SPARK-4498 Project: Spark

[jira] [Updated] (SPARK-4498) Standalone Master can fail to recognize completed/failed applications

2014-11-19 Thread Harry Brundage (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harry Brundage updated SPARK-4498: -- Attachment: all-master-logs-around-blip.txt These are all the master logs around a recent blip

[jira] [Updated] (SPARK-4471) blockManagerIdFromJson function throws exception while BlockManagerId be null in MetadataFetchFailedException

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4471: - Assignee: SuYan blockManagerIdFromJson function throws exception while BlockManagerId be null in

[jira] [Commented] (SPARK-4498) Standalone Master can fail to recognize completed/failed applications

2014-11-19 Thread Harry Brundage (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218491#comment-14218491 ] Harry Brundage commented on SPARK-4498: --- For the simple canary spark application

[jira] [Commented] (SPARK-2321) Design a proper progress reporting event listener API

2014-11-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218502#comment-14218502 ] Patrick Wendell commented on SPARK-2321: Currently this is a programmatic API for

[jira] [Resolved] (SPARK-4482) ReceivedBlockTracker's write ahead log is enabled by default

2014-11-19 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-4482. -- Resolution: Fixed Fix Version/s: 1.2.0 ReceivedBlockTracker's write ahead log is

[jira] [Created] (SPARK-4499) Problems building and running 1.2 with Hive support

2014-11-19 Thread George Kyriacou (JIRA)
George Kyriacou created SPARK-4499: -- Summary: Problems building and running 1.2 with Hive support Key: SPARK-4499 URL: https://issues.apache.org/jira/browse/SPARK-4499 Project: Spark Issue

[jira] [Updated] (SPARK-4481) Some comments for `updateStateByKey` are wrong

2014-11-19 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-4481: - Fix Version/s: 1.3.0 Some comments for `updateStateByKey` are wrong

[jira] [Created] (SPARK-4500) Improve exact stratified sampling implementation

2014-11-19 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-4500: Summary: Improve exact stratified sampling implementation Key: SPARK-4500 URL: https://issues.apache.org/jira/browse/SPARK-4500 Project: Spark Issue

[jira] [Updated] (SPARK-4500) Improve exact stratified sampling implementation

2014-11-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4500: - Description: The current implementation for exact stratified sampling (sampleByKeyExact)

[jira] [Updated] (SPARK-4500) Improve exact stratified sampling implementation

2014-11-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-4500: - Description: The current implementation for exact stratified sampling (sampleByKeyExact)

[jira] [Commented] (SPARK-1358) Continuous integrated test should be involved in Spark ecosystem

2014-11-19 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218579#comment-14218579 ] shane knapp commented on SPARK-1358: ok, so i can dedicate 4 servers to this by EOY.

[jira] [Resolved] (SPARK-3962) Mark spark dependency as provided in external libraries

2014-11-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3962. Resolution: Fixed Fix Version/s: 1.2.0 Mark spark dependency as provided in

[jira] [Updated] (SPARK-4480) Avoid many small spills in external data structures

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4480: - Fix Version/s: 1.1.1 Avoid many small spills in external data structures

[jira] [Resolved] (SPARK-4429) Build for Scala 2.11 using sbt fails.

2014-11-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4429. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Takuya Ueshin

[jira] [Commented] (SPARK-4495) Memory leak in JobProgressListener due to `spark.ui.retainedJobs` not being used

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218653#comment-14218653 ] Apache Spark commented on SPARK-4495: - User 'JoshRosen' has created a pull request for

[jira] [Commented] (SPARK-4478) totalRegisteredExecutors not updated properly

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218654#comment-14218654 ] Apache Spark commented on SPARK-4478: - User 'coolfrood' has created a pull request for

[jira] [Updated] (SPARK-4497) HiveThriftServer2 does not exit properly on failure

2014-11-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4497: Priority: Critical (was: Major) Target Version/s: 1.3.0 HiveThriftServer2

[jira] [Comment Edited] (SPARK-3066) Support recommendAll in matrix factorization model

2014-11-19 Thread Debasish Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218667#comment-14218667 ] Debasish Das edited comment on SPARK-3066 at 11/19/14 10:59 PM:

[jira] [Commented] (SPARK-3066) Support recommendAll in matrix factorization model

2014-11-19 Thread Debasish Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218667#comment-14218667 ] Debasish Das commented on SPARK-3066: - @mengxr as per our discussions, I added APIs

[jira] [Updated] (SPARK-4498) Standalone Master can fail to recognize completed/failed applications

2014-11-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4498: -- Component/s: Deploy Standalone Master can fail to recognize completed/failed applications

[jira] [Updated] (SPARK-4501) Create build/mvn to automatically download maven/zinc/scalac

2014-11-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4501: --- Assignee: Prashant Sharma Create build/mvn to automatically download maven/zinc/scalac

[jira] [Commented] (SPARK-4442) Move common unit test utilities into their own package / module

2014-11-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218704#comment-14218704 ] Josh Rosen commented on SPARK-4442: --- There's a few generally useful test utility

[jira] [Updated] (SPARK-4376) Put external modules behind build profiles

2014-11-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4376: --- Target Version/s: 1.3.0 (was: 1.2.0) Put external modules behind build profiles

[jira] [Updated] (SPARK-4355) OnlineSummarizer doesn't merge mean correctly

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4355: - Fix Version/s: (was: 1.1.2) 1.1.1 OnlineSummarizer doesn't merge mean correctly

[jira] [Updated] (SPARK-4355) OnlineSummarizer doesn't merge mean correctly

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4355: - Target Version/s: 1.1.1, 1.2.0, 1.0.3 (was: 1.2.0, 1.0.3, 1.1.2) OnlineSummarizer doesn't merge mean

[jira] [Updated] (SPARK-2630) Input data size of CoalescedRDD is incorrect

2014-11-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2630: --- Fix Version/s: (was: 1.2.0) Input data size of CoalescedRDD is incorrect

[jira] [Updated] (SPARK-4479) Avoid unnecessary defensive copies when Sort based shuffle is on

2014-11-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4479: Assignee: Cheng Lian Avoid unnecessary defensive copies when Sort based shuffle is on

[jira] [Closed] (SPARK-4385) DataSource DDL Parser can't handle table names with '_'

2014-11-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-4385. --- Resolution: Cannot Reproduce DataSource DDL Parser can't handle table names with '_'

[jira] [Updated] (SPARK-4384) Too many open files during sort in pyspark

2014-11-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4384: --- Priority: Blocker (was: Critical) Too many open files during sort in pyspark

[jira] [Updated] (SPARK-4384) Too many open files during sort in pyspark

2014-11-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4384: -- Affects Version/s: (was: 1.1.0) Too many open files during sort in pyspark

[jira] [Updated] (SPARK-4384) Too many open files during sort in pyspark

2014-11-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4384: -- Assignee: Davies Liu Too many open files during sort in pyspark

[jira] [Resolved] (SPARK-4384) Too many open files during sort in pyspark

2014-11-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-4384. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3252

[jira] [Resolved] (SPARK-4294) UnionDStream stream should express the requirements in the same way as TransformedDStream

2014-11-19 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-4294. -- Resolution: Fixed UnionDStream stream should express the requirements in the same way as

[jira] [Commented] (SPARK-4486) Improve GradientBoosting APIs and doc

2014-11-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218791#comment-14218791 ] Apache Spark commented on SPARK-4486: - User 'mengxr' has created a pull request for

[jira] [Resolved] (SPARK-4495) Memory leak in JobProgressListener due to `spark.ui.retainedJobs` not being used

2014-11-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-4495. --- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3372

[jira] [Created] (SPARK-4502) Spark SQL unnecessarily reads the entire nested column from Parquet

2014-11-19 Thread Liwen Sun (JIRA)
Liwen Sun created SPARK-4502: Summary: Spark SQL unnecessarily reads the entire nested column from Parquet Key: SPARK-4502 URL: https://issues.apache.org/jira/browse/SPARK-4502 Project: Spark

[jira] [Updated] (SPARK-4502) Spark SQL unnecessarily reads the entire nested column from Parquet

2014-11-19 Thread Liwen Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liwen Sun updated SPARK-4502: - Component/s: SQL Spark SQL unnecessarily reads the entire nested column from Parquet

[jira] [Updated] (SPARK-4502) Spark SQL unnecessarily reads the entire nested column from Parquet

2014-11-19 Thread Liwen Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liwen Sun updated SPARK-4502: - Description: When reading a field of a nested column from Parquet, SparkSQL reads and assemble all the

[jira] [Updated] (SPARK-4502) Spark SQL reads unneccesary fields from Parquet

2014-11-19 Thread Liwen Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liwen Sun updated SPARK-4502: - Summary: Spark SQL reads unneccesary fields from Parquet (was: Spark SQL unnecessarily reads the entire

[jira] [Updated] (SPARK-4502) Spark SQL reads unneccesary fields from Parquet

2014-11-19 Thread Liwen Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liwen Sun updated SPARK-4502: - Description: When reading a field of a nested column from Parquet, SparkSQL reads and assemble all the

[jira] [Closed] (SPARK-4478) totalRegisteredExecutors not updated properly

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4478. Resolution: Fixed Fix Version/s: 1.2.0 Target Version/s: 1.2.0 totalRegisteredExecutors

[jira] [Updated] (SPARK-4478) totalRegisteredExecutors not updated properly

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4478: - Assignee: Akshat Aranya totalRegisteredExecutors not updated properly

[jira] [Updated] (SPARK-4484) [CORE] Treat maxResultSize as unlimited when set to 0; improve error message

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4484: - Assignee: Nishkam Ravi [CORE] Treat maxResultSize as unlimited when set to 0; improve error message

[jira] [Closed] (SPARK-4484) [CORE] Treat maxResultSize as unlimited when set to 0; improve error message

2014-11-19 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4484. Resolution: Fixed Fix Version/s: 1.2.0 Target Version/s: 1.2.0 [CORE] Treat maxResultSize

[jira] [Commented] (SPARK-1405) parallel Latent Dirichlet Allocation (LDA) atop of spark in MLlib

2014-11-19 Thread Debasish Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218845#comment-14218845 ] Debasish Das commented on SPARK-1405: - I would like to compare the LSA formulations
