[jira] [Resolved] (SPARK-23515) JsonProtocol.sparkEventToJson can OOM when jsonifying an event

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-23515. - Resolution: Won't Fix > JsonProtocol.sparkEventToJson can OOM when jsonifying an event >

[jira] [Commented] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-10 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610089#comment-16610089 ] Felix Cheung commented on SPARK-23200: -- probably need someone to rebuild on the current config

[jira] [Updated] (SPARK-23243) Shuffle+Repartition on an RDD could lead to incorrect answers

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-23243: Fix Version/s: 2.2.3 > Shuffle+Repartition on an RDD could lead to incorrect answers >

[jira] [Updated] (SPARK-20715) MapStatuses shouldn't be redundantly stored in both ShuffleMapStage and MapOutputTracker

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-20715: Fix Version/s: 2.2.3 > MapStatuses shouldn't be redundantly stored in both ShuffleMapStage and >

[jira] [Updated] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-22632: Target Version/s: 3.0.0 (was: 2.4.0) > Fix the behavior of timestamp values for R's DataFrame to

[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-09-10 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610083#comment-16610083 ] Felix Cheung commented on SPARK-22632: -- mismatch between R and JVM time zone could be an issue but

[jira] [Commented] (SPARK-23580) Interpreted mode fallback should be implemented for all expressions & projections

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610082#comment-16610082 ] Wenchen Fan commented on SPARK-23580: - There are still 2 open PRs, one is inactive for a while, one

[jira] [Commented] (SPARK-25367) The column attributes obtained by Spark sql are inconsistent with hive

2018-09-10 Thread yy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610056#comment-16610056 ] yy commented on SPARK-25367: [~hyukjin.kwon] Thank you for your correction, I will pay attention next time.

[jira] [Commented] (SPARK-24882) data source v2 API improvement

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610029#comment-16610029 ] Apache Spark commented on SPARK-24882: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Commented] (SPARK-25313) Fix regression in FileFormatWriter output schema

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610021#comment-16610021 ] Apache Spark commented on SPARK-25313: -- User 'wangyum' has created a pull request for this issue:

[jira] [Updated] (SPARK-25184) Flaky test: FlatMapGroupsWithState "streaming with processing time timeout"

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-25184: Fix Version/s: (was: 3.0.0) 2.4.0 > Flaky test: FlatMapGroupsWithState

[jira] [Commented] (SPARK-25397) SparkSession.conf fails when given default value with Python 3

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609928#comment-16609928 ] Hyukjin Kwon commented on SPARK-25397: -- That's fixed in

[jira] [Assigned] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25399: Assignee: (was: Apache Spark) > Reusing execution threads from continuous processing

[jira] [Commented] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609924#comment-16609924 ] Apache Spark commented on SPARK-25399: -- User 'mukulmurthy' has created a pull request for this

[jira] [Assigned] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25399: Assignee: Apache Spark > Reusing execution threads from continuous processing for

[jira] [Commented] (SPARK-25072) PySpark custom Row class can be given extra parameters

2018-09-10 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609920#comment-16609920 ] Dongjoon Hyun commented on SPARK-25072: --- [~bryanc] and [~smilegator] Since this is reverted from

[jira] [Updated] (SPARK-25072) PySpark custom Row class can be given extra parameters

2018-09-10 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25072: -- Fix Version/s: (was: 2.3.2) > PySpark custom Row class can be given extra parameters >

[jira] [Commented] (SPARK-23597) Audit Spark SQL code base for non-interpreted expressions

2018-09-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609918#comment-16609918 ] Liang-Chi Hsieh commented on SPARK-23597: - At least, I didn't find expressions that do not

[jira] [Updated] (SPARK-13587) Support virtualenv in PySpark

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-13587: - Target Version/s: 3.0.0 (was: 2.4.0) > Support virtualenv in PySpark >

[jira] [Commented] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609892#comment-16609892 ] Reynold Xin commented on SPARK-25331: - Yes I would rely on idempotency here. Retries upon failure +

[jira] [Assigned] (SPARK-25400) Increase timeouts in schedulerIntegrationSuite

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25400: Assignee: (was: Apache Spark) > Increase timeouts in schedulerIntegrationSuite >

[jira] [Commented] (SPARK-25400) Increase timeouts in schedulerIntegrationSuite

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609841#comment-16609841 ] Apache Spark commented on SPARK-25400: -- User 'squito' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25400) Increase timeouts in schedulerIntegrationSuite

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25400: Assignee: Apache Spark > Increase timeouts in schedulerIntegrationSuite >

[jira] [Commented] (SPARK-25400) Increase timeouts in schedulerIntegrationSuite

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609839#comment-16609839 ] Apache Spark commented on SPARK-25400: -- User 'squito' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25400) Increase timeouts in schedulerIntegrationSuite

2018-09-10 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reassigned SPARK-25400: Assignee: (was: Imran Rashid) > Increase timeouts in schedulerIntegrationSuite >

[jira] [Created] (SPARK-25400) Increase timeouts in schedulerIntegrationSuite

2018-09-10 Thread Imran Rashid (JIRA)
Imran Rashid created SPARK-25400: Summary: Increase timeouts in schedulerIntegrationSuite Key: SPARK-25400 URL: https://issues.apache.org/jira/browse/SPARK-25400 Project: Spark Issue Type:

[jira] [Updated] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues

2018-09-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-25399: Priority: Critical (was: Blocker) > Reusing execution threads from continuous processing for microbatch

[jira] [Updated] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues

2018-09-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-25399: Labels: correctness (was: ) > Reusing execution threads from continuous processing for microbatch

[jira] [Updated] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues

2018-09-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-25399: Priority: Blocker (was: Major) > Reusing execution threads from continuous processing for microbatch

[jira] [Updated] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues

2018-09-10 Thread Mukul Murthy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Murthy updated SPARK-25399: - Priority: Major (was: Blocker) > Reusing execution threads from continuous processing for

[jira] [Commented] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues

2018-09-10 Thread Mukul Murthy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609819#comment-16609819 ] Mukul Murthy commented on SPARK-25399: -- cc [~joseph.torres] and [~tdas] > Reusing execution

[jira] [Created] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues

2018-09-10 Thread Mukul Murthy (JIRA)
Mukul Murthy created SPARK-25399: Summary: Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues Key: SPARK-25399 URL:

[jira] [Commented] (SPARK-23580) Interpreted mode fallback should be implemented for all expressions & projections

2018-09-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609787#comment-16609787 ] Reynold Xin commented on SPARK-23580: - 90% or 100%? > Interpreted mode fallback should be

[jira] [Commented] (SPARK-25398) Minor bugs from comparing unrelated types

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609730#comment-16609730 ] Apache Spark commented on SPARK-25398: -- User 'srowen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25398) Minor bugs from comparing unrelated types

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25398: Assignee: Sean Owen (was: Apache Spark) > Minor bugs from comparing unrelated types >

[jira] [Assigned] (SPARK-25398) Minor bugs from comparing unrelated types

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25398: Assignee: Apache Spark (was: Sean Owen) > Minor bugs from comparing unrelated types >

[jira] [Commented] (SPARK-25398) Minor bugs from comparing unrelated types

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609728#comment-16609728 ] Apache Spark commented on SPARK-25398: -- User 'srowen' has created a pull request for this issue:

[jira] [Created] (SPARK-25398) Minor bugs from comparing unrelated types

2018-09-10 Thread Sean Owen (JIRA)
Sean Owen created SPARK-25398: - Summary: Minor bugs from comparing unrelated types Key: SPARK-25398 URL: https://issues.apache.org/jira/browse/SPARK-25398 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-23986) CompileException when using too many avg aggregation after joining

2018-09-10 Thread Dmitry Zanozin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609708#comment-16609708 ] Dmitry Zanozin edited comment on SPARK-23986 at 9/10/18 7:47 PM: - Spark

[jira] [Commented] (SPARK-23986) CompileException when using too many avg aggregation after joining

2018-09-10 Thread Dmitry Zanozin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609708#comment-16609708 ] Dmitry Zanozin commented on SPARK-23986: Spark 2.3.1 still generates methods with duplicate

[jira] [Resolved] (SPARK-23672) Document Support returning lists in Arrow UDFs

2018-09-10 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-23672. -- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Issue resolved by pull

[jira] [Assigned] (SPARK-23672) Document Support returning lists in Arrow UDFs

2018-09-10 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-23672: Assignee: holdenk > Document Support returning lists in Arrow UDFs >

[jira] [Comment Edited] (SPARK-12417) Orc bloom filter options are not propagated during file write in spark

2018-09-10 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609638#comment-16609638 ] Dongjoon Hyun edited comment on SPARK-12417 at 9/10/18 6:23 PM: This is

[jira] [Resolved] (SPARK-12417) Orc bloom filter options are not propagated during file write in spark

2018-09-10 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-12417. --- Resolution: Fixed Fix Version/s: 2.0.0 This is fixed since 2.0.0. {code} scala>

[jira] [Updated] (SPARK-23425) load data for hdfs file path with wild card usage is not working properly

2018-09-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23425: -- Docs Text: Release notes: Wildcard symbols {{*}} and {{?}} can now be used in SQL paths when loading

[jira] [Commented] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609605#comment-16609605 ] Maxim Gekk commented on SPARK-25396: I have a concern regarding to when I should close Jackson

[jira] [Commented] (SPARK-23425) load data for hdfs file path with wild card usage is not working properly

2018-09-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609595#comment-16609595 ] Shixiong Zhu commented on SPARK-23425: -- Added "release-note" label. Previously, when INPATH

[jira] [Commented] (SPARK-25397) SparkSession.conf fails when given default value with Python 3

2018-09-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609593#comment-16609593 ] Joseph K. Bradley commented on SPARK-25397: --- CC [~smilegator], [~cloud_fan] for visibility >

[jira] [Resolved] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-25395. Resolution: Duplicate > Remove Spark Optional Java API > -- >

[jira] [Updated] (SPARK-25397) SparkSession.conf fails when given default value with Python 3

2018-09-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-25397: -- Priority: Minor (was: Major) > SparkSession.conf fails when given default value with

[jira] [Resolved] (SPARK-25091) UNCACHE TABLE, CLEAR CACHE, rdd.unpersist() does not clean up executor memory

2018-09-10 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-25091. Resolution: Duplicate > UNCACHE TABLE, CLEAR CACHE, rdd.unpersist() does not clean up

[jira] [Updated] (SPARK-23425) load data for hdfs file path with wild card usage is not working properly

2018-09-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-23425: - Labels: release-notes (was: release) > load data for hdfs file path with wild card usage is

[jira] [Created] (SPARK-25397) SparkSession.conf fails when given default value with Python 3

2018-09-10 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-25397: - Summary: SparkSession.conf fails when given default value with Python 3 Key: SPARK-25397 URL: https://issues.apache.org/jira/browse/SPARK-25397 Project:

[jira] [Updated] (SPARK-23425) load data for hdfs file path with wild card usage is not working properly

2018-09-10 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-23425: - Labels: release (was: ) > load data for hdfs file path with wild card usage is not working

[jira] [Commented] (SPARK-25332) Instead of broadcast hash join ,Sort merge join has selected when restart spark-shell/spark-JDBC for hive provider

2018-09-10 Thread Babulal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609534#comment-16609534 ] Babulal commented on SPARK-25332: - Hi [~maropu] it seems to be a straightforward issue so raised

[jira] [Commented] (SPARK-21542) Helper functions for custom Python Persistence

2018-09-10 Thread Peter Knight (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609495#comment-16609495 ] Peter Knight commented on SPARK-21542: -- It would be really helpful to have some example code on how

[jira] [Commented] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609479#comment-16609479 ] Hyukjin Kwon commented on SPARK-25396: -- At that time, there's no multiple mode or json functions.

[jira] [Commented] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609475#comment-16609475 ] Hyukjin Kwon commented on SPARK-25396: -- Oh haha yea I tried this by myself before and kind of

[jira] [Updated] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxim Gekk updated SPARK-25396: --- Description: If a JSON file has a structure like below: {code} [ {

[jira] [Commented] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609469#comment-16609469 ] Maxim Gekk commented on SPARK-25396: [~hyukjin.kwon] WDYT > Read array of JSON objects via an

[jira] [Updated] (SPARK-25378) ArrayData.toArray(StringType) assume UTF8String in 2.4

2018-09-10 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-25378: -- Summary: ArrayData.toArray(StringType) assume UTF8String in 2.4 (was: ArrayData.toArray

[jira] [Created] (SPARK-25396) Read array of JSON objects via an Iterator

2018-09-10 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25396: -- Summary: Read array of JSON objects via an Iterator Key: SPARK-25396 URL: https://issues.apache.org/jira/browse/SPARK-25396 Project: Spark Issue Type:

[jira] [Commented] (SPARK-25378) ArrayData.toArray assume UTF8String

2018-09-10 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609467#comment-16609467 ] Xiangrui Meng commented on SPARK-25378: --- I sent a PR to spark-tensorflow-connector at

[jira] [Commented] (SPARK-23597) Audit Spark SQL code base for non-interpreted expressions

2018-09-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609411#comment-16609411 ] Marco Gaido commented on SPARK-23597: - I haven't, [~hvanhovell]? > Audit Spark SQL code base for

[jira] [Commented] (SPARK-25376) Scenarios we should handle but missed in 2.4 for barrier execution mode

2018-09-10 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609391#comment-16609391 ] Imran Rashid commented on SPARK-25376: -- I raised some of my concerns on one of the earlier PRs,

[jira] [Commented] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609357#comment-16609357 ] Apache Spark commented on SPARK-25395: -- User 'mmolimar' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25395: Assignee: (was: Apache Spark) > Remove Spark Optional Java API >

[jira] [Commented] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609356#comment-16609356 ] Apache Spark commented on SPARK-25395: -- User 'mmolimar' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25395: Assignee: Apache Spark > Remove Spark Optional Java API > --

[jira] [Commented] (SPARK-21291) R bucketBy partitionBy API

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609325#comment-16609325 ] Wenchen Fan commented on SPARK-21291: - I'm removing the target version, since no one is working on

[jira] [Updated] (SPARK-21291) R bucketBy partitionBy API

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21291: Target Version/s: (was: 2.4.0) > R bucketBy partitionBy API > -- > >

[jira] [Created] (SPARK-25395) Remove Spark Optional Java API

2018-09-10 Thread Mario Molina (JIRA)
Mario Molina created SPARK-25395: Summary: Remove Spark Optional Java API Key: SPARK-25395 URL: https://issues.apache.org/jira/browse/SPARK-25395 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-21320) Make sure all expressions support interpreted evaluation

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21320. - Resolution: Duplicate > Make sure all expressions support interpreted evaluation >

[jira] [Commented] (SPARK-21320) Make sure all expressions support interpreted evaluation

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609300#comment-16609300 ] Wenchen Fan commented on SPARK-21320: - This is replaced by

[jira] [Updated] (SPARK-21395) Spark SQL hive-thriftserver doesn't register operation log before execute sql statement

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21395: Target Version/s: (was: 2.4.0) > Spark SQL hive-thriftserver doesn't register operation log

[jira] [Commented] (SPARK-21395) Spark SQL hive-thriftserver doesn't register operation log before execute sql statement

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609298#comment-16609298 ] Wenchen Fan commented on SPARK-21395: - I'm removing the target version, since we are not going to

[jira] [Updated] (SPARK-21940) Support timezone for timestamps in SparkR

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21940: Target Version/s: (was: 2.4.0) > Support timezone for timestamps in SparkR >

[jira] [Commented] (SPARK-21940) Support timezone for timestamps in SparkR

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609297#comment-16609297 ] Wenchen Fan commented on SPARK-21940: - I'm removing the target version, since no one is working on

[jira] [Updated] (SPARK-21972) Allow users to control input data persistence in ML Estimators via a handlePersistence ml.Param

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-21972: Target Version/s: (was: 2.4.0) > Allow users to control input data persistence in ML Estimators

[jira] [Commented] (SPARK-21972) Allow users to control input data persistence in ML Estimators via a handlePersistence ml.Param

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609296#comment-16609296 ] Wenchen Fan commented on SPARK-21972: - I'm removing the target version, since we are not going to

[jira] [Updated] (SPARK-22054) Allow release managers to inject their keys

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-22054: Target Version/s: (was: 2.4.0) > Allow release managers to inject their keys >

[jira] [Commented] (SPARK-22054) Allow release managers to inject their keys

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609295#comment-16609295 ] Wenchen Fan commented on SPARK-22054: - I'm removing the target version, since we can't make it

[jira] [Updated] (SPARK-22055) Port release scripts

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-22055: Target Version/s: (was: 2.4.0) > Port release scripts > > >

[jira] [Commented] (SPARK-22055) Port release scripts

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609294#comment-16609294 ] Wenchen Fan commented on SPARK-22055: - I'm removing the target version, since we can't make it

[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609292#comment-16609292 ] Wenchen Fan commented on SPARK-22632: - Is this still a problem now? > Fix the behavior of timestamp

[jira] [Commented] (SPARK-20715) MapStatuses shouldn't be redundantly stored in both ShuffleMapStage and MapOutputTracker

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609291#comment-16609291 ] Apache Spark commented on SPARK-20715: -- User 'bersprockets' has created a pull request for this

[jira] [Commented] (SPARK-20715) MapStatuses shouldn't be redundantly stored in both ShuffleMapStage and MapOutputTracker

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609290#comment-16609290 ] Apache Spark commented on SPARK-20715: -- User 'bersprockets' has created a pull request for this

[jira] [Commented] (SPARK-23243) Shuffle+Repartition on an RDD could lead to incorrect answers

2018-09-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609289#comment-16609289 ] Apache Spark commented on SPARK-23243: -- User 'bersprockets' has created a pull request for this

[jira] [Commented] (SPARK-22796) Add multiple column support to PySpark QuantileDiscretizer

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609286#comment-16609286 ] Wenchen Fan commented on SPARK-22796: - I'm removing the target version, since no progress yet. >

[jira] [Updated] (SPARK-22796) Add multiple column support to PySpark QuantileDiscretizer

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-22796: Target Version/s: (was: 2.4.0) > Add multiple column support to PySpark QuantileDiscretizer >

[jira] [Commented] (SPARK-22798) Add multiple column support to PySpark StringIndexer

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609285#comment-16609285 ] Wenchen Fan commented on SPARK-22798: - I'm removing the target version, since no progress yet. >

[jira] [Updated] (SPARK-22798) Add multiple column support to PySpark StringIndexer

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-22798: Target Version/s: (was: 2.4.0) > Add multiple column support to PySpark StringIndexer >

[jira] [Commented] (SPARK-23153) Support application dependencies in submission client's local file system

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609283#comment-16609283 ] Wenchen Fan commented on SPARK-23153: - I'm removing the target version, since no one is working on

[jira] [Updated] (SPARK-23153) Support application dependencies in submission client's local file system

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-23153: Target Version/s: (was: 2.4.0) > Support application dependencies in submission client's local

[jira] [Commented] (SPARK-23160) Add more window sql tests

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609282#comment-16609282 ] Wenchen Fan commented on SPARK-23160: - I'm removing the target version, since no one is working on

[jira] [Updated] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-23171: Target Version/s: (was: 2.4.0) > Reduce the time costs of the rule runs that do not change the

[jira] [Updated] (SPARK-23160) Add more window sql tests

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-23160: Target Version/s: (was: 2.4.0) > Add more window sql tests

[jira] [Commented] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609278#comment-16609278 ] Wenchen Fan commented on SPARK-23200: - Is there any followup here? This seems an important fix to

[jira] [Commented] (SPARK-23171) Reduce the time costs of the rule runs that do not change the plans

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609279#comment-16609279 ] Wenchen Fan commented on SPARK-23171: - I'm removing the target version, since no progress so far. >

[jira] [Commented] (SPARK-23483) Feature parity for Python vs Scala APIs

2018-09-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609273#comment-16609273 ] Wenchen Fan commented on SPARK-23483: - Can we resolve this ticket? seems 90% done. > Feature parity
