[jira] [Assigned] (SPARK-23129) Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap spill too much times

2018-01-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-23129: --- Assignee: zhoukang > Lazy init DiskMapIterator#deserializeStream to reduce memory usage

[jira] [Resolved] (SPARK-23129) Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap spill too much times

2018-01-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23129. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20292

[jira] [Resolved] (SPARK-23211) SparkR MLlib randomFroest parameter problem

2018-01-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23211. --- Resolution: Invalid I can't make out what you're asking. Please put this to the mailing list first.

[jira] [Created] (SPARK-23211) SparkR MLlib randomFroest parameter problem

2018-01-24 Thread JIRA
黄龙龙 created SPARK-23211: --- Summary: SparkR MLlib randomFroest parameter problem Key: SPARK-23211 URL: https://issues.apache.org/jira/browse/SPARK-23211 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-23187) Accumulator object can not be sent from Executor to Driver

2018-01-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338737#comment-16338737 ] Saisai Shao commented on SPARK-23187: - Actually heartbeat report is OK according to my investigation,

[jira] [Commented] (SPARK-23210) Introduce the concept of default value to schema

2018-01-24 Thread LvDongrong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338675#comment-16338675 ] LvDongrong commented on SPARK-23210: Can we set the default value to be null, like hive? @maropu

[jira] [Created] (SPARK-23210) Introduce the concept of default value to schema

2018-01-24 Thread LvDongrong (JIRA)
LvDongrong created SPARK-23210: -- Summary: Introduce the concept of default value to schema Key: SPARK-23210 URL: https://issues.apache.org/jira/browse/SPARK-23210 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23206) Additional Memory Tuning Metrics

2018-01-24 Thread Edwina Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338559#comment-16338559 ] Edwina Lu commented on SPARK-23206: --- Thanks, [~zsxwing]. Making the new metrics available in the

[jira] [Commented] (SPARK-23206) Additional Memory Tuning Metrics

2018-01-24 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338553#comment-16338553 ] Saisai Shao commented on SPARK-23206: - I think this Jira duplicates SPARK-9103. Also seems some

[jira] [Created] (SPARK-23209) HiveDelegationTokenProvider throws an exception if Hive jars are not the classpath

2018-01-24 Thread Sahil Takiar (JIRA)
Sahil Takiar created SPARK-23209: Summary: HiveDelegationTokenProvider throws an exception if Hive jars are not the classpath Key: SPARK-23209 URL: https://issues.apache.org/jira/browse/SPARK-23209

[jira] [Commented] (SPARK-23201) Cannot create view when duplicate columns exist in subquery

2018-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338551#comment-16338551 ] Dongjoon Hyun commented on SPARK-23201: --- Apache Spark 1.6.3 (released on November 7, 2016) also has

[jira] [Updated] (SPARK-23201) Cannot create view when duplicate columns exist in subquery

2018-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23201: -- Affects Version/s: 1.6.3 > Cannot create view when duplicate columns exist in subquery >

[jira] [Updated] (SPARK-21717) Decouple the generated codes of consuming rows in operators under whole-stage codegen

2018-01-24 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-21717: --- Target Version/s: 2.3.0 Priority: Critical (was: Major) > Decouple the

[jira] [Updated] (SPARK-22221) Add User Documentation for Working with Arrow in Spark

2018-01-24 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-22221: -- Target Version/s: 2.3.0 > Add User Documentation for Working with Arrow in Spark >

[jira] [Assigned] (SPARK-23208) GenArrayData produces illegal code

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23208: Assignee: Apache Spark (was: Herman van Hovell) > GenArrayData produces illegal code >

[jira] [Assigned] (SPARK-23208) GenArrayData produces illegal code

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23208: Assignee: Herman van Hovell (was: Apache Spark) > GenArrayData produces illegal code >

[jira] [Commented] (SPARK-23208) GenArrayData produces illegal code

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338537#comment-16338537 ] Apache Spark commented on SPARK-23208: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23207) Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss

2018-01-24 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal reassigned SPARK-23207: -- Assignee: Jiang Xingbo > Shuffle+Repartition on an RDD/DataFrame could lead to Data

[jira] [Updated] (SPARK-23207) Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss

2018-01-24 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-23207: --- Priority: Blocker (was: Major) > Shuffle+Repartition on an RDD/DataFrame could lead to Data

[jira] [Comment Edited] (SPARK-23201) Cannot create view when duplicate columns exist in subquery

2018-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338525#comment-16338525 ] Dongjoon Hyun edited comment on SPARK-23201 at 1/25/18 1:11 AM: Hi,

[jira] [Updated] (SPARK-23208) GenArrayData produces illegal code

2018-01-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-23208: -- Target Version/s: 2.3.0 > GenArrayData produces illegal code >

[jira] [Commented] (SPARK-23201) Cannot create view when duplicate columns exist in subquery

2018-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338525#comment-16338525 ] Dongjoon Hyun commented on SPARK-23201: --- Hi, [~joha0123]. It seems to work in the latest Apache

[jira] [Assigned] (SPARK-23081) Add colRegex API to PySpark

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23081: Assignee: (was: Apache Spark) > Add colRegex API to PySpark >

[jira] [Assigned] (SPARK-23081) Add colRegex API to PySpark

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23081: Assignee: Apache Spark > Add colRegex API to PySpark > --- > >

[jira] [Commented] (SPARK-23081) Add colRegex API to PySpark

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338486#comment-16338486 ] Apache Spark commented on SPARK-23081: -- User 'huaxingao' has created a pull request for this issue:

[jira] [Created] (SPARK-23208) GenArrayData produces illegal code

2018-01-24 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-23208: - Summary: GenArrayData produces illegal code Key: SPARK-23208 URL: https://issues.apache.org/jira/browse/SPARK-23208 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-23206) Additional Memory Tuning Metrics

2018-01-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338427#comment-16338427 ] Shixiong Zhu commented on SPARK-23206: -- We can also just add more information to metrics system and

[jira] [Commented] (SPARK-23206) Additional Memory Tuning Metrics

2018-01-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338421#comment-16338421 ] Shixiong Zhu commented on SPARK-23206: -- Also cc [~jerryshao] since you were working on metrics

[jira] [Commented] (SPARK-23207) Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss

2018-01-24 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338419#comment-16338419 ] Jiang Xingbo commented on SPARK-23207: -- I'm working on this. > Shuffle+Repartition on an

[jira] [Created] (SPARK-23207) Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss

2018-01-24 Thread Jiang Xingbo (JIRA)
Jiang Xingbo created SPARK-23207: Summary: Shuffle+Repartition on an RDD/DataFrame could lead to Data Loss Key: SPARK-23207 URL: https://issues.apache.org/jira/browse/SPARK-23207 Project: Spark

[jira] [Commented] (SPARK-20641) Key-value store abstraction and implementation for storing application data

2018-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338402#comment-16338402 ] Marcelo Vanzin commented on SPARK-20641: [~rxin] sorry I missed your comment. As I explained in

[jira] [Assigned] (SPARK-20650) Remove JobProgressListener (and other unneeded classes)

2018-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-20650: -- Assignee: Marcelo Vanzin > Remove JobProgressListener (and other unneeded classes) >

[jira] [Assigned] (SPARK-20641) Key-value store abstraction and implementation for storing application data

2018-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-20641: -- Assignee: Marcelo Vanzin > Key-value store abstraction and implementation for storing

[jira] [Commented] (SPARK-23206) Additional Memory Tuning Metrics

2018-01-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338396#comment-16338396 ] Shixiong Zhu commented on SPARK-23206: -- cc [~vanzin]  > Additional Memory Tuning Metrics >

[jira] [Commented] (SPARK-23206) Additional Memory Tuning Metrics

2018-01-24 Thread Ye Zhou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338378#comment-16338378 ] Ye Zhou commented on SPARK-23206: - [~zsxwing] Hi, Can you help find some one who can help review this

[jira] [Updated] (SPARK-23206) Additional Memory Tuning Metrics

2018-01-24 Thread Edwina Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edwina Lu updated SPARK-23206: -- Description: At LinkedIn, we have multiple clusters, running thousands of Spark applications, and

[jira] [Updated] (SPARK-23206) Additional Memory Tuning Metrics

2018-01-24 Thread Edwina Lu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edwina Lu updated SPARK-23206: -- Attachment: MemoryTuningMetricsDesignDoc.pdf > Additional Memory Tuning Metrics >

[jira] [Created] (SPARK-23206) Additional Memory Tuning Metrics

2018-01-24 Thread Edwina Lu (JIRA)
Edwina Lu created SPARK-23206: - Summary: Additional Memory Tuning Metrics Key: SPARK-23206 URL: https://issues.apache.org/jira/browse/SPARK-23206 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-23205) ImageSchema.readImages incorrectly sets alpha channel to 255 for four-channel images

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23205: Assignee: Apache Spark > ImageSchema.readImages incorrectly sets alpha channel to 255 for

[jira] [Commented] (SPARK-23205) ImageSchema.readImages incorrectly sets alpha channel to 255 for four-channel images

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338362#comment-16338362 ] Apache Spark commented on SPARK-23205: -- User 'smurching' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23205) ImageSchema.readImages incorrectly sets alpha channel to 255 for four-channel images

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23205: Assignee: (was: Apache Spark) > ImageSchema.readImages incorrectly sets alpha channel

[jira] [Commented] (SPARK-23205) ImageSchema.readImages incorrectly sets alpha channel to 255 for four-channel images

2018-01-24 Thread Siddharth Murching (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338360#comment-16338360 ] Siddharth Murching commented on SPARK-23205: Working on a PR to address this issue >

[jira] [Comment Edited] (SPARK-23205) ImageSchema.readImages incorrectly sets alpha channel to 255 for four-channel images

2018-01-24 Thread Siddharth Murching (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338360#comment-16338360 ] Siddharth Murching edited comment on SPARK-23205 at 1/24/18 10:40 PM:

[jira] [Created] (SPARK-23205) ImageSchema.readImages incorrectly sets alpha channel to 255 for four-channel images

2018-01-24 Thread Siddharth Murching (JIRA)
Siddharth Murching created SPARK-23205: -- Summary: ImageSchema.readImages incorrectly sets alpha channel to 255 for four-channel images Key: SPARK-23205 URL: https://issues.apache.org/jira/browse/SPARK-23205

[jira] [Commented] (SPARK-23020) Re-enable Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338317#comment-16338317 ] Apache Spark commented on SPARK-23020: -- User 'vanzin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23198) Fix KafkaContinuousSourceStressForDontFailOnDataLossSuite to test ContinuousExecution

2018-01-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-23198: Assignee: Dongjoon Hyun > Fix KafkaContinuousSourceStressForDontFailOnDataLossSuite to

[jira] [Resolved] (SPARK-23198) Fix KafkaContinuousSourceStressForDontFailOnDataLossSuite to test ContinuousExecution

2018-01-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-23198. -- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20374

[jira] [Commented] (SPARK-23203) DataSourceV2 should use immutable trees.

2018-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338120#comment-16338120 ] Marcelo Vanzin commented on SPARK-23203: Ah, cool, wasn't aware that it was still experimental.

[jira] [Commented] (SPARK-22711) _pickle.PicklingError: args[0] from __newobj__ args has the wrong class from cloudpickle.py

2018-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338116#comment-16338116 ] Bryan Cutler commented on SPARK-22711: -- Yes, normally you would not need to import inside the

[jira] [Commented] (SPARK-23203) DataSourceV2 should use immutable trees.

2018-01-24 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338112#comment-16338112 ] Ryan Blue commented on SPARK-23203: --- [~vanzin], given that DataSourceV2 is experimental, I don't think

[jira] [Commented] (SPARK-23204) DataSourceV2 should support named tables in DataFrameReader, DataFrameWriter

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338089#comment-16338089 ] Apache Spark commented on SPARK-23204: -- User 'rdblue' has created a pull request for this issue:

[jira] [Commented] (SPARK-23203) DataSourceV2 should use immutable trees.

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338088#comment-16338088 ] Apache Spark commented on SPARK-23203: -- User 'rdblue' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23203) DataSourceV2 should use immutable trees.

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23203: Assignee: (was: Apache Spark) > DataSourceV2 should use immutable trees. >

[jira] [Assigned] (SPARK-23204) DataSourceV2 should support named tables in DataFrameReader, DataFrameWriter

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23204: Assignee: Apache Spark > DataSourceV2 should support named tables in DataFrameReader,

[jira] [Assigned] (SPARK-23203) DataSourceV2 should use immutable trees.

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23203: Assignee: Apache Spark > DataSourceV2 should use immutable trees. >

[jira] [Assigned] (SPARK-23204) DataSourceV2 should support named tables in DataFrameReader, DataFrameWriter

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23204: Assignee: (was: Apache Spark) > DataSourceV2 should support named tables in

[jira] [Commented] (SPARK-22711) _pickle.PicklingError: args[0] from __newobj__ args has the wrong class from cloudpickle.py

2018-01-24 Thread Prateek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338086#comment-16338086 ] Prateek commented on SPARK-22711: - Thanks [~bryanc]. I will check. Do we have to import the libraries in

[jira] [Updated] (SPARK-23204) DataSourceV2 should support named tables in DataFrameReader, DataFrameWriter

2018-01-24 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-23204: -- Description: DataSourceV2 is currently only configured with a path, passed in options as {{path}}. 

[jira] [Commented] (SPARK-23189) reflect stage level blacklisting on executor tab

2018-01-24 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338083#comment-16338083 ] Imran Rashid commented on SPARK-23189: -- OK, since nobody feels strongly and to avoid bike-shedding

[jira] [Updated] (SPARK-23204) DataSourceV2 should support named tables in DataFrameReader, DataFrameWriter

2018-01-24 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-23204: -- Description: DataSourceV2 is currently only configured with a path, passed in options as {{path}}. 

[jira] [Created] (SPARK-23204) DataSourceV2 should support named tables in DataFrameReader, DataFrameWriter

2018-01-24 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-23204: - Summary: DataSourceV2 should support named tables in DataFrameReader, DataFrameWriter Key: SPARK-23204 URL: https://issues.apache.org/jira/browse/SPARK-23204 Project:

[jira] [Commented] (SPARK-22386) Data Source V2 improvements

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338071#comment-16338071 ] Apache Spark commented on SPARK-22386: -- User 'rdblue' has created a pull request for this issue:

[jira] [Assigned] (SPARK-22386) Data Source V2 improvements

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22386: Assignee: (was: Apache Spark) > Data Source V2 improvements >

[jira] [Assigned] (SPARK-22386) Data Source V2 improvements

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22386: Assignee: Apache Spark > Data Source V2 improvements > --- > >

[jira] [Commented] (SPARK-22711) _pickle.PicklingError: args[0] from __newobj__ args has the wrong class from cloudpickle.py

2018-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338070#comment-16338070 ] Bryan Cutler commented on SPARK-22711: -- Hi [~PrateekRM], here is your code trimmed down to where the

[jira] [Commented] (SPARK-23203) DataSourceV2 should use immutable trees.

2018-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338063#comment-16338063 ] Marcelo Vanzin commented on SPARK-23203: [~rdblue] is this something that affects the API or is

[jira] [Created] (SPARK-23203) DataSourceV2 should use immutable trees.

2018-01-24 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-23203: - Summary: DataSourceV2 should use immutable trees. Key: SPARK-23203 URL: https://issues.apache.org/jira/browse/SPARK-23203 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-23203) DataSourceV2 should use immutable trees.

2018-01-24 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-23203: -- Environment: (was: The DataSourceV2 integration doesn't use [immutable

[jira] [Updated] (SPARK-23203) DataSourceV2 should use immutable trees.

2018-01-24 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-23203: -- Description: The DataSourceV2 integration doesn't use [immutable

[jira] [Commented] (SPARK-23117) SparkR 2.3 QA: Check for new R APIs requiring example code

2018-01-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338043#comment-16338043 ] Felix Cheung commented on SPARK-23117: -- I'm ok to sign off if we don't have example for SPARK-20307

[jira] [Commented] (SPARK-17147) Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2018-01-24 Thread Justin Miller (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338032#comment-16338032 ] Justin Miller commented on SPARK-17147: --- I'm also seeing this behavior on a topic that has

[jira] [Assigned] (SPARK-22297) Flaky test: BlockManagerSuite "Shuffle registration timeout and maxAttempts conf"

2018-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-22297: -- Assignee: Mark Petruska > Flaky test: BlockManagerSuite "Shuffle registration timeout

[jira] [Resolved] (SPARK-22297) Flaky test: BlockManagerSuite "Shuffle registration timeout and maxAttempts conf"

2018-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-22297. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 19671

[jira] [Commented] (SPARK-23020) Re-enable Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher

2018-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338023#comment-16338023 ] Marcelo Vanzin commented on SPARK-23020: Argh. Feel free to disable it in branch-2.3; please

[jira] [Commented] (SPARK-23020) Re-enable Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher

2018-01-24 Thread Sameer Agarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338016#comment-16338016 ] Sameer Agarwal commented on SPARK-23020: FYI The {{SparkLauncherSuite}} test is still failing

[jira] [Assigned] (SPARK-23152) Invalid guard condition in org.apache.spark.ml.classification.Classifier

2018-01-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-23152: - Assignee: Matthew Tovbin > Invalid guard condition in

[jira] [Resolved] (SPARK-23152) Invalid guard condition in org.apache.spark.ml.classification.Classifier

2018-01-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23152. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20321

[jira] [Resolved] (SPARK-22837) Session timeout checker does not work in SessionManager

2018-01-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-22837. - Resolution: Fixed Assignee: zuotingbing Fix Version/s: 2.3.0 > Session timeout checker

[jira] [Commented] (SPARK-23189) reflect stage level blacklisting on executor tab

2018-01-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338005#comment-16338005 ] Thomas Graves commented on SPARK-23189: --- for large jobs the specific stage page is a pain to

[jira] [Commented] (SPARK-23189) reflect stage level blacklisting on executor tab

2018-01-24 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337957#comment-16337957 ] Imran Rashid commented on SPARK-23189: -- [~tgraves] -- why do you use the executors page instead of

[jira] [Resolved] (SPARK-23115) SparkR 2.3 QA: New R APIs and API docs

2018-01-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-23115. -- Resolution: Fixed Assignee: Felix Cheung > SparkR 2.3 QA: New R APIs and API docs >

[jira] [Commented] (SPARK-23115) SparkR 2.3 QA: New R APIs and API docs

2018-01-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337947#comment-16337947 ] Felix Cheung commented on SPARK-23115: -- done > SparkR 2.3 QA: New R APIs and API docs >

[jira] [Resolved] (SPARK-22577) executor page blacklist status should update with TaskSet level blacklisting

2018-01-24 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-22577. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20203

[jira] [Assigned] (SPARK-22577) executor page blacklist status should update with TaskSet level blacklisting

2018-01-24 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reassigned SPARK-22577: Assignee: Attila Zsolt Piros > executor page blacklist status should update with TaskSet

[jira] [Commented] (SPARK-23202) Break down DataSourceV2Writer.commit into two phase

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337927#comment-16337927 ] Apache Spark commented on SPARK-23202: -- User 'gengliangwang' has created a pull request for this

[jira] [Assigned] (SPARK-23202) Break down DataSourceV2Writer.commit into two phase

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23202: Assignee: (was: Apache Spark) > Break down DataSourceV2Writer.commit into two phase >

[jira] [Assigned] (SPARK-23202) Break down DataSourceV2Writer.commit into two phase

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23202: Assignee: Apache Spark > Break down DataSourceV2Writer.commit into two phase >

[jira] [Created] (SPARK-23202) Break down DataSourceV2Writer.commit into two phase

2018-01-24 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-23202: -- Summary: Break down DataSourceV2Writer.commit into two phase Key: SPARK-23202 URL: https://issues.apache.org/jira/browse/SPARK-23202 Project: Spark

[jira] [Commented] (SPARK-22711) _pickle.PicklingError: args[0] from __newobj__ args has the wrong class from cloudpickle.py

2018-01-24 Thread Prateek (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337875#comment-16337875 ] Prateek commented on SPARK-22711: - I don't understand what you mean. updated code is in attachment

[jira] [Assigned] (SPARK-21396) Spark Hive Thriftserver doesn't return UDT field

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21396: Assignee: Apache Spark > Spark Hive Thriftserver doesn't return UDT field >

[jira] [Assigned] (SPARK-23195) Hint of cached data is lost

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23195: Assignee: Apache Spark (was: Xiao Li) > Hint of cached data is lost >

[jira] [Assigned] (SPARK-23195) Hint of cached data is lost

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23195: Assignee: Xiao Li (was: Apache Spark) > Hint of cached data is lost >

[jira] [Assigned] (SPARK-21396) Spark Hive Thriftserver doesn't return UDT field

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21396: Assignee: (was: Apache Spark) > Spark Hive Thriftserver doesn't return UDT field >

[jira] [Commented] (SPARK-21396) Spark Hive Thriftserver doesn't return UDT field

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337820#comment-16337820 ] Apache Spark commented on SPARK-21396: -- User 'atallahhezbor' has created a pull request for this

[jira] [Reopened] (SPARK-23195) Hint of cached data is lost

2018-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reopened SPARK-23195: --- Since this is reverted, let's keep this open until the on-going PR is merged back. > Hint of

[jira] [Updated] (SPARK-23195) Hint of cached data is lost

2018-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23195: -- Fix Version/s: (was: 2.3.1) > Hint of cached data is lost > --- >

[jira] [Commented] (SPARK-13108) Encoding not working with non-ascii compatible encodings (UTF-16/32 etc.)

2018-01-24 Thread Rafael Cavazin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337775#comment-16337775 ] Rafael Cavazin commented on SPARK-13108: [~hyukjin.kwon] was this issue fixed? the PR was closed

[jira] [Commented] (SPARK-23195) Hint of cached data is lost

2018-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337771#comment-16337771 ] Apache Spark commented on SPARK-23195: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Commented] (SPARK-15348) Hive ACID

2018-01-24 Thread Arvind Jajoo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337760#comment-16337760 ] Arvind Jajoo commented on SPARK-15348: -- I think in order to have an end to end streaming ETL

[jira] [Resolved] (SPARK-22784) Configure reading buffer size in Spark History Server

2018-01-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-22784. --- Resolution: Won't Fix > Configure reading buffer size in Spark History Server >
