[jira] [Created] (SPARK-28743) YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has to many entry

2019-08-15 Thread Jiandan Yang (JIRA)
Jiandan Yang created SPARK-28743: - Summary: YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has to many entry Key: SPARK-28743 URL: https://issues.apache.org/jira/browse/SPARK-28743

[jira] [Created] (SPARK-28745) Add benchmarks for `extract()`

2019-08-15 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-28745: -- Summary: Add benchmarks for `extract()` Key: SPARK-28745 URL: https://issues.apache.org/jira/browse/SPARK-28745 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-28746) Add repartitionby hint to support RepartitionByExpression

2019-08-15 Thread ulysses you (JIRA)
ulysses you created SPARK-28746: --- Summary: Add repartitionby hint to support RepartitionByExpression Key: SPARK-28746 URL: https://issues.apache.org/jira/browse/SPARK-28746 Project: Spark

[jira] [Assigned] (SPARK-27592) Set the bucketed data source table SerDe correctly

2019-08-15 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-27592: --- Assignee: Yuming Wang > Set the bucketed data source table SerDe correctly >

[jira] [Resolved] (SPARK-27592) Set the bucketed data source table SerDe correctly

2019-08-15 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-27592. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24486

[jira] [Assigned] (SPARK-28543) Document Spark Jobs page

2019-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-28543: - Assignee: Pablo Langa Blanco > Document Spark Jobs page > > >

[jira] [Resolved] (SPARK-28543) Document Spark Jobs page

2019-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-28543. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 25424

[jira] [Updated] (SPARK-28741) Throw exceptions when casting to integers causes overflow

2019-08-15 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-28741: --- Description: To follow ANSI SQL, we should support a configurable mode that throws

[jira] [Updated] (SPARK-28742) StackOverflowError when using otherwise(col()) in a loop

2019-08-15 Thread Ivan Tsukanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Tsukanov updated SPARK-28742: -- Description: The following code {code:java} val rdd = sparkContext.makeRDD(Seq(Row("1"))) val

[jira] [Updated] (SPARK-28743) YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has to many entry

2019-08-15 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated SPARK-28743: -- Affects Version/s: (was: 2.2.3) 2.4.0 > YarnShuffleService leads

[jira] [Updated] (SPARK-28742) StackOverflowError when using otherwise(col()) in a loop

2019-08-15 Thread Ivan Tsukanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Tsukanov updated SPARK-28742: -- Description: The following code {code:java} val rdd = sparkContext.makeRDD(Seq(Row("1"))) val

[jira] [Updated] (SPARK-28742) StackOverflowError when using otherwise(col()) in a loop

2019-08-15 Thread Ivan Tsukanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Tsukanov updated SPARK-28742: -- Description: The following code {code:java} val rdd = sparkContext.makeRDD(Seq(Row("1"))) val

[jira] [Created] (SPARK-28741) Throw exceptions when casting to integers causes overflow

2019-08-15 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-28741: -- Summary: Throw exceptions when casting to integers causes overflow Key: SPARK-28741 URL: https://issues.apache.org/jira/browse/SPARK-28741 Project: Spark

[jira] [Updated] (SPARK-28743) YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has to many entry

2019-08-15 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated SPARK-28743: -- Affects Version/s: (was: 2.4.0) 2.3.0 > YarnShuffleService leads

[jira] [Resolved] (SPARK-28695) Make Kafka source more robust with CaseInsensitiveMap

2019-08-15 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-28695. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 25418

[jira] [Assigned] (SPARK-28695) Make Kafka source more robust with CaseInsensitiveMap

2019-08-15 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-28695: --- Assignee: Gabor Somogyi > Make Kafka source more robust with CaseInsensitiveMap >

[jira] [Updated] (SPARK-28742) StackOverflowError when using otherwise(col()) in a loop

2019-08-15 Thread Ivan Tsukanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Tsukanov updated SPARK-28742: -- Attachment: image-2019-08-15-15-19-33-319.png > StackOverflowError when using

[jira] [Updated] (SPARK-28742) StackOverflowError when using otherwise(col()) in a loop

2019-08-15 Thread Ivan Tsukanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Tsukanov updated SPARK-28742: -- Attachment: (was: image-2019-08-15-15-19-33-319.png) > StackOverflowError when using

[jira] [Created] (SPARK-28742) StackOverflowError when using otherwise(col()) in a loop

2019-08-15 Thread Ivan Tsukanov (JIRA)
Ivan Tsukanov created SPARK-28742: - Summary: StackOverflowError when using otherwise(col()) in a loop Key: SPARK-28742 URL: https://issues.apache.org/jira/browse/SPARK-28742 Project: Spark

[jira] [Updated] (SPARK-28742) StackOverflowError when using otherwise(col()) in a loop

2019-08-15 Thread Ivan Tsukanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan Tsukanov updated SPARK-28742: -- Description: The following code {code:java} val rdd = sparkContext.makeRDD(Seq(Row("1"))) val

[jira] [Created] (SPARK-28744) rename SharedSQLContext to SharedSparkSession

2019-08-15 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-28744: --- Summary: rename SharedSQLContext to SharedSparkSession Key: SPARK-28744 URL: https://issues.apache.org/jira/browse/SPARK-28744 Project: Spark Issue Type: Test

[jira] [Updated] (SPARK-28743) YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has too many entries

2019-08-15 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated SPARK-28743: -- Summary: YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has too

[jira] [Updated] (SPARK-28743) YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has to many entry

2019-08-15 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated SPARK-28743: -- Attachment: histo.jpg > YarnShuffleService leads to NodeManager OOM because

[jira] [Updated] (SPARK-28743) YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has to many entry

2019-08-15 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated SPARK-28743: -- Attachment: dominator.jpg > YarnShuffleService leads to NodeManager OOM because

[jira] [Updated] (SPARK-28743) YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has to many entry

2019-08-15 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated SPARK-28743: -- Description: NodeManager heap size is 4G, io.netty.channel.ChannelOutboundBuffer$Entry

[jira] [Updated] (SPARK-28743) YarnShuffleService leads to NodeManager OOM because ChannelOutboundBuffer has to many entry

2019-08-15 Thread Jiandan Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiandan Yang updated SPARK-28743: -- Description: NodeManager heap size is 4G, io.netty.channel.ChannelOutboundBuffer$Entry

[jira] [Created] (SPARK-28747) merge the two data source v2 fallback configs

2019-08-15 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-28747: --- Summary: merge the two data source v2 fallback configs Key: SPARK-28747 URL: https://issues.apache.org/jira/browse/SPARK-28747 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-23977) Add commit protocol binding to Hadoop 3.1 PathOutputCommitter mechanism

2019-08-15 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-23977: -- Assignee: Steve Loughran > Add commit protocol binding to Hadoop 3.1

[jira] [Resolved] (SPARK-23977) Add commit protocol binding to Hadoop 3.1 PathOutputCommitter mechanism

2019-08-15 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-23977. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24970

[jira] [Commented] (SPARK-24666) Word2Vec generate infinity vectors when numIterations are large

2019-08-15 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908267#comment-16908267 ] holdenk commented on SPARK-24666: - [~zhongyu09] specific code & data which leads to repro can help. >

[jira] [Created] (SPARK-28748) Zeros(0) inserted as decimal (n , n) in hive tables shows as null in spark sql

2019-08-15 Thread Rohit Sindhu (JIRA)
Rohit Sindhu created SPARK-28748: Summary: Zeros(0) inserted as decimal (n , n) in hive tables shows as null in spark sql Key: SPARK-28748 URL: https://issues.apache.org/jira/browse/SPARK-28748

[jira] [Updated] (SPARK-28634) Failed to start SparkSession with Keytab file

2019-08-15 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-28634: --- Priority: Minor (was: Major) > Failed to start SparkSession with Keytab file >

[jira] [Commented] (SPARK-28733) DataFrameReader of Spark not able to recognize the very first quote character, while custom unicode quote character is used

2019-08-15 Thread Christian Hollinger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908512#comment-16908512 ] Christian Hollinger commented on SPARK-28733: - I can confirm this issue on PopOS + OpenJDK 8

[jira] [Created] (SPARK-28750) Use `--release 8` for javac

2019-08-15 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-28750: - Summary: Use `--release 8` for javac Key: SPARK-28750 URL: https://issues.apache.org/jira/browse/SPARK-28750 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-27683) Remove usage of TraversableOnce

2019-08-15 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908410#comment-16908410 ] holdenk commented on SPARK-27683: - Interesting related discussion over in 

[jira] [Updated] (SPARK-28749) Fix PySpark tests not to require kafka-0-8 in branch-2.4

2019-08-15 Thread Matt Foley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Foley updated SPARK-28749: --- Description: As noted in SPARK-27550 we want to encourage testing of Spark 2.4.x with Scala-2.12,

[jira] [Updated] (SPARK-28749) Fix PySpark tests not to require kafka-0-8 in branch-2.4

2019-08-15 Thread Matt Foley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Foley updated SPARK-28749: --- Description: As noted in SPARK-27550 we want to encourage testing of Spark 2.4.x with Scala-2.12,

[jira] [Assigned] (SPARK-28745) Add benchmarks for `extract()`

2019-08-15 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-28745: - Assignee: Maxim Gekk > Add benchmarks for `extract()` > --

[jira] [Resolved] (SPARK-28745) Add benchmarks for `extract()`

2019-08-15 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-28745. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 25462

[jira] [Updated] (SPARK-28735) MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction fails on JDK11

2019-08-15 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28735: -- Component/s: ML > MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction

[jira] [Commented] (SPARK-28735) MultilayerPerceptronClassifierTest.test_raw_and_probability_prediction fails on JDK11

2019-08-15 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908446#comment-16908446 ] Dongjoon Hyun commented on SPARK-28735: --- Thank you so much, [~hyukjin.kwon]! >

[jira] [Updated] (SPARK-28736) pyspark.mllib.clustering fails on JDK11

2019-08-15 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28736: -- Component/s: MLlib > pyspark.mllib.clustering fails on JDK11 >

[jira] [Reopened] (SPARK-28634) Failed to start SparkSession with Keytab file

2019-08-15 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reopened SPARK-28634: > Failed to start SparkSession with Keytab file >

[jira] [Commented] (SPARK-28634) Failed to start SparkSession with Keytab file

2019-08-15 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908450#comment-16908450 ] Marcelo Vanzin commented on SPARK-28634: I think it's still worth it to fix it, so that users

[jira] [Created] (SPARK-28749) Fix PySpark tests not to require kafka-0-8 in branch-2.4

2019-08-15 Thread Matt Foley (JIRA)
Matt Foley created SPARK-28749: -- Summary: Fix PySpark tests not to require kafka-0-8 in branch-2.4 Key: SPARK-28749 URL: https://issues.apache.org/jira/browse/SPARK-28749 Project: Spark Issue

[jira] [Assigned] (SPARK-28578) Improve Github pull request template

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-28578: Assignee: Hyukjin Kwon > Improve Github pull request template >

[jira] [Resolved] (SPARK-28578) Improve Github pull request template

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-28578. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 25310

[jira] [Commented] (SPARK-28750) Use `--release 8` for javac

2019-08-15 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908587#comment-16908587 ] Sean Owen commented on SPARK-28750: --- [~dongjoon] I've tried this for an hour, and can't get it to

[jira] [Created] (SPARK-28751) Imporve java serializer deserialization performance

2019-08-15 Thread Xianyang Liu (JIRA)
Xianyang Liu created SPARK-28751: Summary: Imporve java serializer deserialization performance Key: SPARK-28751 URL: https://issues.apache.org/jira/browse/SPARK-28751 Project: Spark Issue

[jira] [Created] (SPARK-28753) Dynamically reuse subqueries in AQE

2019-08-15 Thread Maryann Xue (JIRA)
Maryann Xue created SPARK-28753: --- Summary: Dynamically reuse subqueries in AQE Key: SPARK-28753 URL: https://issues.apache.org/jira/browse/SPARK-28753 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-23829) spark-sql-kafka source in spark 2.3 causes reading stream failure frequently

2019-08-15 Thread leo.zhi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908682#comment-16908682 ] leo.zhi commented on SPARK-23829: - I am using 2.4.0-chd6.3.0, and got this error

[jira] [Updated] (SPARK-28749) Fix PySpark tests not to require kafka-0-8 in branch-2.4

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28749: - Target Version/s: (was: 2.4.4) > Fix PySpark tests not to require kafka-0-8 in branch-2.4 >

[jira] [Updated] (SPARK-28732) org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java' when storing t

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28732: - Priority: Major (was: Blocker) >

[jira] [Commented] (SPARK-28732) org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java' when storing

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908706#comment-16908706 ] Hyukjin Kwon commented on SPARK-28732: -- [~ametivier], please provide self-contained reproducer. >

[jira] [Resolved] (SPARK-28733) DataFrameReader of Spark not able to recognize the very first quote character, while custom unicode quote character is used

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-28733. -- Resolution: Cannot Reproduce Resolving this. It would be nicer if somebody identifies the

[jira] [Updated] (SPARK-28733) DataFrameReader of Spark not able to recognize the very first quote character, while custom unicode quote character is used

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28733: - Description: I have encountered a strange behaviour recently, while reading a CSV file using

[jira] [Updated] (SPARK-28732) org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java' when storing t

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28732: - Description: I am using agg function on a dataset, and i want to count the number of lines

[jira] [Created] (SPARK-28754) [UDF] Supports for alter, rename, owner change should be supported in Spark

2019-08-15 Thread ABHISHEK KUMAR GUPTA (JIRA)
ABHISHEK KUMAR GUPTA created SPARK-28754: Summary: [UDF] Supports for alter, rename, owner change should be supported in Spark Key: SPARK-28754 URL: https://issues.apache.org/jira/browse/SPARK-28754

[jira] [Commented] (SPARK-28575) Time lag between two consecutive spark actions using Spark 2.3.1

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908710#comment-16908710 ] Hyukjin Kwon commented on SPARK-28575: -- [~kumahaja], can you please show some codes for your

[jira] [Commented] (SPARK-25474) Support `spark.sql.statistics.fallBackToHdfs` in data source tables

2019-08-15 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908660#comment-16908660 ] Yuming Wang commented on SPARK-25474: - More benchmark result about

[jira] [Created] (SPARK-28752) Documentation build script to support Python 3

2019-08-15 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-28752: Summary: Documentation build script to support Python 3 Key: SPARK-28752 URL: https://issues.apache.org/jira/browse/SPARK-28752 Project: Spark Issue Type:

[jira] [Reopened] (SPARK-27884) Deprecate Python 2 support in Spark 3.0

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-27884: -- > Deprecate Python 2 support in Spark 3.0 > --- > >

[jira] [Commented] (SPARK-28587) JDBC data source's partition whereClause should support jdbc dialect

2019-08-15 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908694#comment-16908694 ] Takeshi Yamamuro commented on SPARK-28587: -- Thanks for the info, but I'm currently not sure

[jira] [Updated] (SPARK-28742) StackOverflowError when using otherwise(col()) in a loop

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28742: - Component/s: (was: Spark Core) SQL > StackOverflowError when using

[jira] [Comment Edited] (SPARK-28726) Spark with DynamicAllocation always got connect rest by peers

2019-08-15 Thread angerszhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908711#comment-16908711 ] angerszhu edited comment on SPARK-28726 at 8/16/19 5:04 AM: [~hyukjin.kwon]

[jira] [Commented] (SPARK-28726) Spark with DynamicAllocation always got connect rest by peers

2019-08-15 Thread angerszhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908714#comment-16908714 ] angerszhu commented on SPARK-28726: --- [~hyukjin.kwon] Just SparkthriftServer run sql with dynamic

[jira] [Commented] (SPARK-28752) Documentation build script to support Python 3

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908664#comment-16908664 ] Hyukjin Kwon commented on SPARK-28752: -- cc [~dongjoon] and [~WeichenXu123] I am kind of stuck for

[jira] [Commented] (SPARK-28749) Fix PySpark tests not to require kafka-0-8 in branch-2.4

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908700#comment-16908700 ] Hyukjin Kwon commented on SPARK-28749: -- Seems you can workaround by explicitly setting

[jira] [Updated] (SPARK-28748) Zeros(0) inserted as decimal (n , n) in hive tables shows as null in spark sql

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28748: - Target Version/s: (was: 2.2.1, 2.3.1, 2.3.2, 2.4.3) > Zeros(0) inserted as decimal (n , n) in

[jira] [Commented] (SPARK-28738) Add ability to include metadata in CanCommitOffsets API

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908704#comment-16908704 ] Hyukjin Kwon commented on SPARK-28738: -- [~jrciii], please clarify the usecase, examples and

[jira] [Updated] (SPARK-28733) DataFrameReader of Spark not able to recognize the very first quote character, while custom unicode quote character is used

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28733: - Priority: Major (was: Critical) > DataFrameReader of Spark not able to recognize the very

[jira] [Updated] (SPARK-28733) DataFrameReader of Spark not able to recognize the very first quote character, while custom unicode quote character is used

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28733: - Description: I have encountered a strange behaviour recently, while reading a CSV file using

[jira] [Commented] (SPARK-28726) Spark with DynamicAllocation always got connect rest by peers

2019-08-15 Thread angerszhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908711#comment-16908711 ] angerszhu commented on SPARK-28726: --- [~hyukjin.kwon] spark thrift server, sql dynamic allocation

[jira] [Comment Edited] (SPARK-28726) Spark with DynamicAllocation always got connect rest by peers

2019-08-15 Thread angerszhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908711#comment-16908711 ] angerszhu edited comment on SPARK-28726 at 8/16/19 5:00 AM: [~hyukjin.kwon]

[jira] [Resolved] (SPARK-27020) Unable to insert data with partial dynamic partition with Spark & Hive 3

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-27020. -- Resolution: Cannot Reproduce > Unable to insert data with partial dynamic partition with

[jira] [Comment Edited] (SPARK-28726) Spark with DynamicAllocation always got connect rest by peers

2019-08-15 Thread angerszhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908711#comment-16908711 ] angerszhu edited comment on SPARK-28726 at 8/16/19 4:59 AM: [~hyukjin.kwon]

[jira] [Updated] (SPARK-28751) Imporve java serializer deserialization performance

2019-08-15 Thread Xianyang Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianyang Liu updated SPARK-28751: - Description: Improve the performance of java serializer deserialization by caching the

[jira] [Updated] (SPARK-28748) Zeros(0) inserted as decimal (n , n) in hive tables shows as null in spark sql

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28748: - Description: Zeros(0) inserted as decimal (n , n) in hive tables shows as null in spark sql.

[jira] [Updated] (SPARK-28748) 0 as decimal (n , n) in Hive tables shows as NULL in Spark

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28748: - Summary: 0 as decimal (n , n) in Hive tables shows as NULL in Spark (was: Zeros(0) inserted as

[jira] [Updated] (SPARK-28748) Zeros(0) inserted as decimal (n , n) in hive tables shows as null in spark sql

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-28748: - Labels: (was: Spark-SQL spark-sql) > Zeros(0) inserted as decimal (n , n) in hive tables

[jira] [Resolved] (SPARK-28712) spark structured stream with kafka don't really delete temp files in spark standalone cluster

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-28712. -- Resolution: Invalid > spark structured stream with kafka don't really delete temp files in

[jira] [Commented] (SPARK-28712) spark structured stream with kafka don't really delete temp files in spark standalone cluster

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908709#comment-16908709 ] Hyukjin Kwon commented on SPARK-28712: -- [~yangcong3643] if you're not clear if it is an issue or

[jira] [Commented] (SPARK-28726) Spark with DynamicAllocation always got connect rest by peers

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908708#comment-16908708 ] Hyukjin Kwon commented on SPARK-28726: -- Can you clarify what codes you ran and which configuration

[jira] [Commented] (SPARK-28726) Spark with DynamicAllocation always got connect rest by peers

2019-08-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908713#comment-16908713 ] Hyukjin Kwon commented on SPARK-28726: -- [~angerszhuuu], please clarify the codes you ran and exact

[jira] [Comment Edited] (SPARK-23829) spark-sql-kafka source in spark 2.3 causes reading stream failure frequently

2019-08-15 Thread leo.zhi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908682#comment-16908682 ] leo.zhi edited comment on SPARK-23829 at 8/16/19 5:34 AM: -- I am

[jira] [Commented] (SPARK-28749) Fix PySpark tests not to require kafka-0-8 in branch-2.4

2019-08-15 Thread Matt Foley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908751#comment-16908751 ] Matt Foley commented on SPARK-28749: Hi [~hyukjin.kwon], thanks for looking at the issue. I did try

[jira] [Comment Edited] (SPARK-28749) Fix PySpark tests not to require kafka-0-8 in branch-2.4

2019-08-15 Thread Matt Foley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908751#comment-16908751 ] Matt Foley edited comment on SPARK-28749 at 8/16/19 5:55 AM: - Hi

[jira] [Commented] (SPARK-28750) Use `--release 8` for javac

2019-08-15 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908748#comment-16908748 ] Dongjoon Hyun commented on SPARK-28750: --- Thank you for update, [~srowen]! > Use `--release 8` for

[jira] [Commented] (SPARK-28752) Documentation build script to support Python 3

2019-08-15 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908721#comment-16908721 ] Dongjoon Hyun commented on SPARK-28752: --- Got it. I'll take a look at too. For now, it seems that