[jira] [Created] (SPARK-16028) Remove the need to pass in a SparkContext for spark.lapply

2016-06-17 Thread Shivaram Venkataraman (JIRA)
Shivaram Venkataraman created SPARK-16028: - Summary: Remove the need to pass in a SparkContext for spark.lapply Key: SPARK-16028 URL: https://issues.apache.org/jira/browse/SPARK-16028

[jira] [Commented] (SPARK-16027) Fix SparkR session unit test

2016-06-17 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337151#comment-15337151 ] Shivaram Venkataraman commented on SPARK-16027: --- cc [~felixcheung] > Fix SparkR session

[jira] [Created] (SPARK-16027) Fix SparkR session unit test

2016-06-17 Thread Shivaram Venkataraman (JIRA)
Shivaram Venkataraman created SPARK-16027: - Summary: Fix SparkR session unit test Key: SPARK-16027 URL: https://issues.apache.org/jira/browse/SPARK-16027 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16026) Cost-based optimizer framework

2016-06-17 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337135#comment-15337135 ] Reynold Xin commented on SPARK-16026: - Note one implementation from Huawei is

[jira] [Created] (SPARK-16026) Cost-based optimizer framework

2016-06-17 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-16026: --- Summary: Cost-based optimizer framework Key: SPARK-16026 URL: https://issues.apache.org/jira/browse/SPARK-16026 Project: Spark Issue Type: New Feature

[jira] [Assigned] (SPARK-16025) Document OFF_HEAP storage level in 2.0

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16025: Assignee: (was: Apache Spark) > Document OFF_HEAP storage level in 2.0 >

[jira] [Assigned] (SPARK-16025) Document OFF_HEAP storage level in 2.0

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16025: Assignee: Apache Spark > Document OFF_HEAP storage level in 2.0 >

[jira] [Commented] (SPARK-16025) Document OFF_HEAP storage level in 2.0

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337087#comment-15337087 ] Apache Spark commented on SPARK-16025: -- User 'ericl' has created a pull request for this issue:

[jira] [Created] (SPARK-16025) Document OFF_HEAP storage level in 2.0

2016-06-17 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16025: -- Summary: Document OFF_HEAP storage level in 2.0 Key: SPARK-16025 URL: https://issues.apache.org/jira/browse/SPARK-16025 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-16024) column comment is ignored for datasource table

2016-06-17 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-16024: --- Summary: column comment is ignored for datasource table Key: SPARK-16024 URL: https://issues.apache.org/jira/browse/SPARK-16024 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-15916) JDBC AND/OR operator push down does not respect lower OR operator precedence

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337023#comment-15337023 ] Apache Spark commented on SPARK-15916: -- User 'clockfly' has created a pull request for this issue:

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Trystan Leftwich (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337004#comment-15337004 ] Trystan Leftwich commented on SPARK-16017: -- [~zsxwing] I'll run the tests here shortly and get

[jira] [Commented] (SPARK-16000) Make model loading backward compatible with saved models using old vector columns

2016-06-17 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336990#comment-15336990 ] Yanbo Liang commented on SPARK-16000: - It's a good idea to split the task into smaller ones, since

[jira] [Commented] (SPARK-16015) Datasource register for shutdown?

2016-06-17 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336925#comment-15336925 ] Sean Owen commented on SPARK-16015: --- Is

[jira] [Commented] (SPARK-15999) Wrong/Missing information for Spark UI/REST interface

2016-06-17 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336920#comment-15336920 ] Sean Owen commented on SPARK-15999: --- Yes, you can specify any parameters you want when running your

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336919#comment-15336919 ] Shixiong Zhu commented on SPARK-16017: -- FYI, I reverted SPARK-15395 for branch-1.6. >

[jira] [Assigned] (SPARK-16023) Move InMemoryRelation to its own file

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16023: Assignee: Andrew Or (was: Apache Spark) > Move InMemoryRelation to its own file >

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336914#comment-15336914 ] Shixiong Zhu commented on SPARK-16017: -- [~tleftwich] Could you help test

[jira] [Resolved] (SPARK-16018) Shade netty for shuffle to work on YARN

2016-06-17 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-16018. --- Resolution: Fixed > Shade netty for shuffle to work on YARN >

[jira] [Assigned] (SPARK-16023) Move InMemoryRelation to its own file

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16023: Assignee: Apache Spark (was: Andrew Or) > Move InMemoryRelation to its own file >

[jira] [Commented] (SPARK-16023) Move InMemoryRelation to its own file

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336912#comment-15336912 ] Apache Spark commented on SPARK-16023: -- User 'andrewor14' has created a pull request for this issue:

[jira] [Updated] (SPARK-16018) Shade netty for shuffle to work on YARN

2016-06-17 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-16018: -- Assignee: Dhruve Ashar > Shade netty for shuffle to work on YARN >

[jira] [Created] (SPARK-16023) Move InMemoryRelation to its own file

2016-06-17 Thread Andrew Or (JIRA)
Andrew Or created SPARK-16023: - Summary: Move InMemoryRelation to its own file Key: SPARK-16023 URL: https://issues.apache.org/jira/browse/SPARK-16023 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336908#comment-15336908 ] Marcelo Vanzin commented on SPARK-16017: Yes, given the HDFS restrictions, that should be enough.

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336906#comment-15336906 ] Marcelo Vanzin commented on SPARK-16017: That seems it might create a similar situation, but it

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336902#comment-15336902 ] Shixiong Zhu commented on SPARK-16017: -- Just to confirm that: you agree that just sending the

[jira] [Resolved] (SPARK-16016) where i can find the code of Extreme Learning Machine(elm) on spark

2016-06-17 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16016. --- Resolution: Invalid Target Version/s: (was: 1.6.0) > where i can find the code of

[jira] [Resolved] (SPARK-16022) Input size is different when I use 1 or 3 nodes but the shufle size remains +- icual, do you know why?

2016-06-17 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16022. --- Resolution: Not A Problem This should be a question at user@ first.

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336895#comment-15336895 ] Sean Owen commented on SPARK-16017: --- BTW while we're here... is this related to what

[jira] [Commented] (SPARK-15395) Use getHostString to create RpcAddress

2016-06-17 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336892#comment-15336892 ] Shixiong Zhu commented on SPARK-15395: -- Just reverted this one for branch 1.6:

[jira] [Comment Edited] (SPARK-12177) Update KafkaDStreams to new Kafka 0.10 Consumer API

2016-06-17 Thread Mirza Gaush Beg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336811#comment-15336811 ] Mirza Gaush Beg edited comment on SPARK-12177 at 6/17/16 8:34 PM: -- any

[jira] [Updated] (SPARK-15395) Use getHostString to create RpcAddress

2016-06-17 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-15395: - Fix Version/s: (was: 1.6.2) > Use getHostString to create RpcAddress >

[jira] [Assigned] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16017: Assignee: (was: Apache Spark) > YarnClientSchedulerBackend now registers backends as

[jira] [Assigned] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16017: Assignee: Apache Spark > YarnClientSchedulerBackend now registers backends as IPs instead

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336884#comment-15336884 ] Apache Spark commented on SPARK-16017: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336880#comment-15336880 ] Marcelo Vanzin commented on SPARK-16017: The default configuration of HDFS seems to disallow the

[jira] [Commented] (SPARK-12177) Update KafkaDStreams to new Kafka 0.10 Consumer API

2016-06-17 Thread Mirza Gaush Beg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336811#comment-15336811 ] Mirza Gaush Beg commented on SPARK-12177: - Any ETA on this fix? We are using Spark 1.6 and Kafka

[jira] [Updated] (SPARK-14459) SQL partitioning must match existing tables, but is not checked.

2016-06-17 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-14459: - Labels: release_notes releasenotes (was: ) > SQL partitioning must match existing tables, but is not

[jira] [Created] (SPARK-16022) Input size is different when I use 1 or 3 nodes but the shufle size remains +- icual, do you know why?

2016-06-17 Thread jon (JIRA)
jon created SPARK-16022: --- Summary: Input size is different when I use 1 or 3 nodes but the shufle size remains +- icual, do you know why? Key: SPARK-16022 URL: https://issues.apache.org/jira/browse/SPARK-16022

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336765#comment-15336765 ] Shixiong Zhu commented on SPARK-16017: -- [~vanzin] Just realized one problem: If the user starts the

[jira] [Commented] (SPARK-15926) Improve readability of DAGScheduler stage creation methods

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336753#comment-15336753 ] Apache Spark commented on SPARK-15926: -- User 'kayousterhout' has created a pull request for this

[jira] [Resolved] (SPARK-15926) Improve readability of DAGScheduler stage creation methods

2016-06-17 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-15926. Resolution: Fixed Fix Version/s: 2.1.0 Fixed by

[jira] [Commented] (SPARK-16021) Zero out freed memory in test to help catch correctness bugs

2016-06-17 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336731#comment-15336731 ] Sameer Agarwal commented on SPARK-16021: +1 > Zero out freed memory in test to help catch

[jira] [Created] (SPARK-16021) Zero out freed memory in test to help catch correctness bugs

2016-06-17 Thread Eric Liang (JIRA)
Eric Liang created SPARK-16021: -- Summary: Zero out freed memory in test to help catch correctness bugs Key: SPARK-16021 URL: https://issues.apache.org/jira/browse/SPARK-16021 Project: Spark

[jira] [Updated] (SPARK-15999) Wrong/Missing information for Spark UI/REST interface

2016-06-17 Thread Faisal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Faisal updated SPARK-15999: --- Summary: Wrong/Missing information for Spark UI/REST interface (was: Wrong/Missing information for Spark

[jira] [Comment Edited] (SPARK-15999) Wrong/Missing information for Spark UI/REST port

2016-06-17 Thread Faisal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336685#comment-15336685 ] Faisal edited comment on SPARK-15999 at 6/17/16 6:50 PM: - - Is the Spark REST

[jira] [Assigned] (SPARK-16020) Fix complete mode aggregation with console sink

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16020: Assignee: Apache Spark (was: Shixiong Zhu) > Fix complete mode aggregation with console

[jira] [Assigned] (SPARK-16020) Fix complete mode aggregation with console sink

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16020: Assignee: Shixiong Zhu (was: Apache Spark) > Fix complete mode aggregation with console

[jira] [Commented] (SPARK-15999) Wrong/Missing information for Spark UI/REST port

2016-06-17 Thread Faisal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336685#comment-15336685 ] Faisal commented on SPARK-15999: - Is the Spark REST service exposed on *spark.yarn.am.port*? - Under

[jira] [Commented] (SPARK-16020) Fix complete mode aggregation with console sink

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336687#comment-15336687 ] Apache Spark commented on SPARK-16020: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336667#comment-15336667 ] Shixiong Zhu commented on SPARK-16017: -- Yes. Thanks for correcting. I will submit a PR for 2.0. In

[jira] [Created] (SPARK-16020) Fix complete mode aggregation with console sink

2016-06-17 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-16020: Summary: Fix complete mode aggregation with console sink Key: SPARK-16020 URL: https://issues.apache.org/jira/browse/SPARK-16020 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336630#comment-15336630 ] Marcelo Vanzin commented on SPARK-16017: I assume you mean from {{CoarseGrainedExecutorBackend}}?

[jira] [Updated] (SPARK-15644) Replace SQLContext with SparkSession in MLlib

2016-06-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-15644: -- Shepherd: Joseph K. Bradley Assignee: Xiao Li > Replace SQLContext with

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336610#comment-15336610 ] Shixiong Zhu commented on SPARK-16017: -- [~vanzin] what do you think if we just send hostname from

[jira] [Commented] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-06-17 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336600#comment-15336600 ] Joseph K. Bradley commented on SPARK-15501: --- Thanks! > ML 2.0 QA: Scala APIs audit for

[jira] [Updated] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-16017: --- Component/s: Spark Core > YarnClientSchedulerBackend now registers backends as IPs instead

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336584#comment-15336584 ] Marcelo Vanzin commented on SPARK-16017: In fact this shouldn't affect just YARN, but anything

[jira] [Commented] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336570#comment-15336570 ] Marcelo Vanzin commented on SPARK-16017: BTW SPARK-15395 doesn't really explain a real use case

[jira] [Commented] (SPARK-13753) Column nullable is derived incorrectly

2016-06-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336567#comment-15336567 ] Davies Liu commented on SPARK-13753: After discussing with [~cloud_fan], we do have a runtime check to

[jira] [Updated] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-16017: --- Affects Version/s: 2.0.0 1.6.2 Target Version/s: 1.6.2, 2.0.0

[jira] [Commented] (SPARK-15340) Limit the size of the map used to cache JobConfs to void OOM

2016-06-17 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336511#comment-15336511 ] Sean Zhong commented on SPARK-15340: [~DoingDone9] I did some tests, and didn't see the OOM you

[jira] [Assigned] (SPARK-16018) Shade netty for shuffle to work on YARN

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16018: Assignee: (was: Apache Spark) > Shade netty for shuffle to work on YARN >

[jira] [Assigned] (SPARK-16018) Shade netty for shuffle to work on YARN

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-16018: Assignee: Apache Spark > Shade netty for shuffle to work on YARN >

[jira] [Commented] (SPARK-16018) Shade netty for shuffle to work on YARN

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336485#comment-15336485 ] Apache Spark commented on SPARK-16018: -- User 'dhruve' has created a pull request for this issue:

[jira] [Updated] (SPARK-15660) Update RDD `variance/stdev` description and add popVariance/popStdev

2016-06-17 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-15660: -- Priority: Minor (was: Major) Description: In Spark-11490, `variance/stdev` are

[jira] [Resolved] (SPARK-16008) ML Logistic Regression aggregator serializes unnecessary data

2016-06-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-16008. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13729

[jira] [Updated] (SPARK-16000) Make model loading backward compatible with saved models using old vector columns

2016-06-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-16000: -- Assignee: yuhao yang > Make model loading backward compatible with saved models using old

[jira] [Commented] (SPARK-16000) Make model loading backward compatible with saved models using old vector columns

2016-06-17 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336446#comment-15336446 ] Xiangrui Meng commented on SPARK-16000: --- That's great! Please let me know if you want to split the

[jira] [Commented] (SPARK-15993) PySpark RuntimeConfig should be immutable

2016-06-17 Thread Vladimir Feinberg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336381#comment-15336381 ] Vladimir Feinberg commented on SPARK-15993: --- So the intent is that changing {{RuntimeConfig}}

[jira] [Updated] (SPARK-15989) PySpark SQL python-only UDTs don't support nested types

2016-06-17 Thread Vladimir Feinberg (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Feinberg updated SPARK-15989: -- Component/s: SQL > PySpark SQL python-only UDTs don't support nested types >

[jira] [Created] (SPARK-16019) Eliminate unexpected delay during spark on yarn job launch

2016-06-17 Thread Olasoji (JIRA)
Olasoji created SPARK-16019: --- Summary: Eliminate unexpected delay during spark on yarn job launch Key: SPARK-16019 URL: https://issues.apache.org/jira/browse/SPARK-16019 Project: Spark Issue Type:

[jira] [Commented] (SPARK-11227) Spark1.5+ HDFS HA mode throw java.net.UnknownHostException: nameservice1

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336348#comment-15336348 ] Apache Spark commented on SPARK-11227: -- User 'sarutak' has created a pull request for this issue:

[jira] [Updated] (SPARK-15967) Spark UI should show realtime value of storage memory instead of showing one static value all the time

2016-06-17 Thread Umesh K (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh K updated SPARK-15967: Description: As of Spark 1.6.x we have unified memory management and hence execution/storage memory

[jira] [Commented] (SPARK-16018) Shade netty for shuffle to work on YARN

2016-06-17 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336326#comment-15336326 ] Thomas Graves commented on SPARK-16018: --- Note we are seeing this on hadoop 2.7. I see hadoop 2.8

[jira] [Created] (SPARK-16018) Shade netty for shuffle to work on YARN

2016-06-17 Thread Dhruve Ashar (JIRA)
Dhruve Ashar created SPARK-16018: Summary: Shade netty for shuffle to work on YARN Key: SPARK-16018 URL: https://issues.apache.org/jira/browse/SPARK-16018 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-15691) Refactor and improve Hive support

2016-06-17 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-15691: - Description: Hive support is important to Spark SQL, as many Spark users use it to read from Hive. The

[jira] [Commented] (SPARK-15954) TestHive has issues being used in PySpark

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336240#comment-15336240 ] Apache Spark commented on SPARK-15954: -- User 'holdenk' has created a pull request for this issue:

[jira] [Created] (SPARK-16017) YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality.

2016-06-17 Thread Trystan Leftwich (JIRA)
Trystan Leftwich created SPARK-16017: Summary: YarnClientSchedulerBackend now registers backends as IPs instead of Hostnames which causes all tasks to run with RACK_LOCAL locality. Key: SPARK-16017 URL:

[jira] [Commented] (SPARK-15995) Gradient Boosted Trees - handling of Categorical Inputs

2016-06-17 Thread Taylor Baldwin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336132#comment-15336132 ] Taylor Baldwin commented on SPARK-15995: Will be closing this issue. Found everything we need in

[jira] [Closed] (SPARK-15995) Gradient Boosted Trees - handling of Categorical Inputs

2016-06-17 Thread Taylor Baldwin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Taylor Baldwin closed SPARK-15995. -- Resolution: Not A Problem > Gradient Boosted Trees - handling of Categorical Inputs >

[jira] [Commented] (SPARK-12113) Add timing metrics to blocking phases for spark sql

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15336015#comment-15336015 ] Apache Spark commented on SPARK-12113: -- User 'maropu' has created a pull request for this issue:

[jira] [Created] (SPARK-16016) where i can find the code of Extreme Learning Machine(elm) on spark

2016-06-17 Thread yueyou (JIRA)
yueyou created SPARK-16016: -- Summary: where i can find the code of Extreme Learning Machine(elm) on spark Key: SPARK-16016 URL: https://issues.apache.org/jira/browse/SPARK-16016 Project: Spark

[jira] [Commented] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335961#comment-15335961 ] Nick Pentreath commented on SPARK-15501: It's done - resolved it. > ML 2.0 QA: Scala APIs audit

[jira] [Resolved] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15501. Resolution: Fixed Fix Version/s: 2.0.0 > ML 2.0 QA: Scala APIs audit for

[jira] [Resolved] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15447. Resolution: Fixed Fix Version/s: 2.0.0 > Performance test for ALS in Spark 2.0 >

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335956#comment-15335956 ] Nick Pentreath commented on SPARK-15447: Finalized results in the linked Google sheet. Also

[jira] [Updated] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15447: --- Description: We made several changes to ALS in 2.0. It is necessary to run some tests to

[jira] [Updated] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15447: --- Description: We made several changes to ALS in 2.0. It is necessary to run some tests to

[jira] [Commented] (SPARK-15328) Word2Vec import for original binary format

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335922#comment-15335922 ] Apache Spark commented on SPARK-15328: -- User 'wangyum' has created a pull request for this issue:

[jira] [Commented] (SPARK-15343) NoClassDefFoundError when initializing Spark with YARN

2016-06-17 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335878#comment-15335878 ] Steve Loughran commented on SPARK-15343: ooh, this is a pain. FWIW, my current stance on

[jira] [Assigned] (SPARK-14995) Add "since" tag in Roxygen documentation for SparkR API methods

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14995: Assignee: Apache Spark > Add "since" tag in Roxygen documentation for SparkR API methods

[jira] [Assigned] (SPARK-14995) Add "since" tag in Roxygen documentation for SparkR API methods

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14995: Assignee: (was: Apache Spark) > Add "since" tag in Roxygen documentation for SparkR

[jira] [Commented] (SPARK-14995) Add "since" tag in Roxygen documentation for SparkR API methods

2016-06-17 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335862#comment-15335862 ] Apache Spark commented on SPARK-14995: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Created] (SPARK-16015) Datasource register for shutdown?

2016-06-17 Thread Michael Nitschinger (JIRA)
Michael Nitschinger created SPARK-16015: --- Summary: Datasource register for shutdown? Key: SPARK-16015 URL: https://issues.apache.org/jira/browse/SPARK-16015 Project: Spark Issue Type:

[jira] [Commented] (SPARK-15995) Gradient Boosted Trees - handling of Categorical Inputs

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335801#comment-15335801 ] Nick Pentreath commented on SPARK-15995: cc [~sethah] > Gradient Boosted Trees - handling of

[jira] [Updated] (SPARK-16008) ML Logistic Regression aggregator serializes unnecessary data

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16008: --- Assignee: Seth Hendrickson > ML Logistic Regression aggregator serializes unnecessary data >

[jira] [Commented] (SPARK-16013) Add option to disable HiveContext in spark-shell/pyspark

2016-06-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335747#comment-15335747 ] Jeff Zhang commented on SPARK-16013: Found SPARK-11562, although it is not necessary in spark 2.0, I

[jira] [Commented] (SPARK-15987) PostgreSQL CITEXT type JDBC support

2016-06-17 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335729#comment-15335729 ] Takeshi Yamamuro commented on SPARK-15987: -- It is easy to add new types as JDBC dialects in

[jira] [Commented] (SPARK-16013) Add option to disable HiveContext in spark-shell/pyspark

2016-06-17 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335727#comment-15335727 ] Jeff Zhang commented on SPARK-16013: I mean to introduce this to 1.6 as in spark 2.0 we can disable
