[jira] [Updated] (SPARK-21008) Streaming applications read stale credentials file when recovering from checkpoint.

2017-06-19 Thread Xing Shi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xing Shi updated SPARK-21008: - Description: On a security(Kerberos) enabled cluster, streaming applications renew HDFS delegation token

[jira] [Created] (SPARK-21135) On history server page,duration of incompleted applications should be hidden instead of showing up as 0

2017-06-19 Thread Jinhua Fu (JIRA)
Jinhua Fu created SPARK-21135: - Summary: On history server page,duration of incompleted applications should be hidden instead of showing up as 0 Key: SPARK-21135 URL: https://issues.apache.org/jira/browse/SPARK-21135

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053551#comment-16053551 ] Liang-Chi Hsieh commented on SPARK-21109: - The {{Dataset.union}} method has the f

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053559#comment-16053559 ] Liang-Chi Hsieh commented on SPARK-21109: - Specifically, both data1 and data2 are

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053563#comment-16053563 ] Liang-Chi Hsieh commented on SPARK-21109: - So if you don't have more comments on

[jira] [Commented] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053564#comment-16053564 ] Liang-Chi Hsieh commented on SPARK-21109: - Btw, there is a related ticket SPARK-2

[jira] [Resolved] (SPARK-20896) spark executor get java.lang.ClassCastException when trigger two job at same time

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20896. --- Resolution: Not A Problem Fix Version/s: (was: 1.6.4) Target Version/s: (was: 1

[jira] [Reopened] (SPARK-20896) spark executor get java.lang.ClassCastException when trigger two job at same time

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-20896: --- > spark executor get java.lang.ClassCastException when trigger two job at same > time >

[jira] [Resolved] (SPARK-21132) DISTINCT modifier of function arguments should not be silently ignored

2017-06-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-21132. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 18340 [https://githu

[jira] [Created] (SPARK-21136) Misleading error message for typo in SQL

2017-06-19 Thread Daniel Darabos (JIRA)
Daniel Darabos created SPARK-21136: -- Summary: Misleading error message for typo in SQL Key: SPARK-21136 URL: https://issues.apache.org/jira/browse/SPARK-21136 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21101) Error running Hive temporary UDTF on latest Spark 2.2

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053647#comment-16053647 ] Liang-Chi Hsieh commented on SPARK-21101: - May I ask what Hive version your UDTF

[jira] [Comment Edited] (SPARK-21101) Error running Hive temporary UDTF on latest Spark 2.2

2017-06-19 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053647#comment-16053647 ] Liang-Chi Hsieh edited comment on SPARK-21101 at 6/19/17 9:01 AM: -

[jira] [Commented] (SPARK-21135) On history server page,duration of incompleted applications should be hidden instead of showing up as 0

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053665#comment-16053665 ] Apache Spark commented on SPARK-21135: -- User 'fjh100456' has created a pull request

[jira] [Assigned] (SPARK-21135) On history server page,duration of incompleted applications should be hidden instead of showing up as 0

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21135: Assignee: Apache Spark > On history server page,duration of incompleted applications shoul

[jira] [Assigned] (SPARK-21135) On history server page,duration of incompleted applications should be hidden instead of showing up as 0

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21135: Assignee: (was: Apache Spark) > On history server page,duration of incompleted applica

[jira] [Resolved] (SPARK-21109) union two dataset[A] don't work as expected if one of the datasets is originated from a dataframe

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21109. --- Resolution: Not A Problem > union two dataset[A] don't work as expected if one of the datasets is >

[jira] [Resolved] (SPARK-18934) Writing to dynamic partitions does not preserve sort order if spill occurs

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18934. --- Resolution: Not A Problem > Writing to dynamic partitions does not preserve sort order if spill occur

[jira] [Resolved] (SPARK-8674) 2-sample, 2-sided Kolmogorov Smirnov Test

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-8674. -- Resolution: Won't Fix > 2-sample, 2-sided Kolmogorov Smirnov Test >

[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-06-19 Thread Ritesh Tijoriwala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053745#comment-16053745 ] Ritesh Tijoriwala commented on SPARK-650: - [~Skamandros] - Any similar tricks for s

[jira] [Created] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
sam created SPARK-21137: --- Summary: Spark cannot read many small files (wholeTextFiles) Key: SPARK-21137 URL: https://issues.apache.org/jira/browse/SPARK-21137 Project: Spark Issue Type: Bug C

[jira] [Resolved] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21137. --- Resolution: Invalid This is a question for the mailing list, not JIRA. It's not clear you're actually

[jira] [Commented] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053808#comment-16053808 ] sam commented on SPARK-21137: - [~srowen] Sorry about the lack of detail Sean. I guess I just

[jira] [Updated] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam updated SPARK-21137: Description: A very common use case in big data is to read a large number of small files. For example the Enron e

[jira] [Resolved] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21137. --- Resolution: Invalid Don't reopen this please. Someone will do that if it's appropriate. This still d

[jira] [Updated] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam updated SPARK-21137: Description: A very common use case in big data is to read a large number of small files. For example the Enron e

[jira] [Reopened] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam reopened SPARK-21137: - Reopened after adding detail. > Spark cannot read many small files (wholeTextFiles) > --

[jira] [Updated] (SPARK-21138) Cannot delete staging dir when the clusters of "spark.yarn.stagingDir" and "spark.hadoop.fs.defaultFS" are different

2017-06-19 Thread sharkd tu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sharkd tu updated SPARK-21138: -- Description: When I set different clusters for "spark.hadoop.fs.defaultFS" and "spark.yarn.stagingDir"

[jira] [Created] (SPARK-21138) Cannot delete staging dir when the clusters of "spark.yarn.stagingDir" and "spark.hadoop.fs.defaultFS" are different

2017-06-19 Thread sharkd tu (JIRA)
sharkd tu created SPARK-21138: - Summary: Cannot delete staging dir when the clusters of "spark.yarn.stagingDir" and "spark.hadoop.fs.defaultFS" are different Key: SPARK-21138 URL: https://issues.apache.org/jira/brows

[jira] [Comment Edited] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053808#comment-16053808 ] sam edited comment on SPARK-21137 at 6/19/17 11:14 AM: --- [~srowen] S

[jira] [Closed] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-21137. - > Spark cannot read many small files (wholeTextFiles) > --- >

[jira] [Commented] (SPARK-21138) Cannot delete staging dir when the clusters of "spark.yarn.stagingDir" and "spark.hadoop.fs.defaultFS" are different

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053817#comment-16053817 ] Apache Spark commented on SPARK-21138: -- User 'sharkdtu' has created a pull request f

[jira] [Assigned] (SPARK-21138) Cannot delete staging dir when the clusters of "spark.yarn.stagingDir" and "spark.hadoop.fs.defaultFS" are different

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21138: Assignee: Apache Spark > Cannot delete staging dir when the clusters of "spark.yarn.stagin

[jira] [Assigned] (SPARK-21138) Cannot delete staging dir when the clusters of "spark.yarn.stagingDir" and "spark.hadoop.fs.defaultFS" are different

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21138: Assignee: (was: Apache Spark) > Cannot delete staging dir when the clusters of "spark.

[jira] [Commented] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053820#comment-16053820 ] Sean Owen commented on SPARK-21137: --- Here's a hint, or example of what could be going w

[jira] [Commented] (SPARK-20568) Delete files after processing in structured streaming

2017-06-19 Thread Fei Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053871#comment-16053871 ] Fei Shao commented on SPARK-20568: -- I also do not support this feature too. If we delet

[jira] [Created] (SPARK-21139) java.util.concurrent.RejectedExecutionException: rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tas

2017-06-19 Thread shining (JIRA)
shining created SPARK-21139: --- Summary: java.util.concurrent.RejectedExecutionException: rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]

[jira] [Commented] (SPARK-21139) java.util.concurrent.RejectedExecutionException: rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued t

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053911#comment-16053911 ] Sean Owen commented on SPARK-21139: --- That looks like an issue from the HBase client, no

[jira] [Resolved] (SPARK-20931) Built-in SQL Function ABS support string type

2017-06-19 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-20931. - Resolution: Fixed Fix Version/s: 2.3.0 > Built-in SQL Function ABS support string type > -

[jira] [Updated] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam updated SPARK-21137: Description: A very common use case in big data is to read a large number of small files. For example the Enron e

[jira] [Commented] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053977#comment-16053977 ] sam commented on SPARK-21137: - [~srowen] So I've provided full reproduce steps here (includi

[jira] [Created] (SPARK-21140) Reduce collect high memory requrements

2017-06-19 Thread michael procopio (JIRA)
michael procopio created SPARK-21140: Summary: Reduce collect high memory requrements Key: SPARK-21140 URL: https://issues.apache.org/jira/browse/SPARK-21140 Project: Spark Issue Type: Im

[jira] [Created] (SPARK-21141) spark-update --version is hard to parse

2017-06-19 Thread michael procopio (JIRA)
michael procopio created SPARK-21141: Summary: spark-update --version is hard to parse Key: SPARK-21141 URL: https://issues.apache.org/jira/browse/SPARK-21141 Project: Spark Issue Type: I

[jira] [Resolved] (SPARK-21140) Reduce collect high memory requrements

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21140. --- Resolution: Invalid There's no real detail here. Executor memory doesn't directly matter to how much

[jira] [Resolved] (SPARK-21141) spark-update --version is hard to parse

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21141. --- Resolution: Not A Problem There is no spark-update. It is not intended as an API to determine the ver

[jira] [Commented] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054004#comment-16054004 ] Sean Owen commented on SPARK-21137: --- As i say, you're not setting anything about the pa

[jira] [Commented] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054026#comment-16054026 ] sam commented on SPARK-21137: - [~srowen] As I said in the description, which you may have mi

[jira] [Commented] (SPARK-19809) NullPointerException on empty ORC file

2017-06-19 Thread Renu Yadav (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054031#comment-16054031 ] Renu Yadav commented on SPARK-19809: What is the resolution of this issue. spark.sql.

[jira] [Comment Edited] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054026#comment-16054026 ] sam edited comment on SPARK-21137 at 6/19/17 1:53 PM: -- [~srowen] As

[jira] [Updated] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam updated SPARK-21137: Description: A very common use case in big data is to read a large number of small files. For example the Enron e

[jira] [Commented] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054042#comment-16054042 ] Sean Owen commented on SPARK-21137: --- Are you sure it's not just appearing to be stuck r

[jira] [Commented] (SPARK-21140) Reduce collect high memory requrements

2017-06-19 Thread michael procopio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054043#comment-16054043 ] michael procopio commented on SPARK-21140: -- I disagree executor memory does depe

[jira] [Comment Edited] (SPARK-21140) Reduce collect high memory requrements

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054043#comment-16054043 ] Sean Owen edited comment on SPARK-21140 at 6/19/17 2:02 PM: I

[jira] [Reopened] (SPARK-21140) Reduce collect high memory requrements

2017-06-19 Thread michael procopio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] michael procopio reopened SPARK-21140: -- I am not sure what detail you are looking for. I provided the test code I was using. See

[jira] [Reopened] (SPARK-21141) spark-update --version is hard to parse

2017-06-19 Thread michael procopio (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] michael procopio reopened SPARK-21141: -- My apologies, I mean spark-submit --version. > spark-update --version is hard to parse >

[jira] [Commented] (SPARK-21140) Reduce collect high memory requrements

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054055#comment-16054055 ] Sean Owen commented on SPARK-21140: --- Yes, it's possible the executor makes a copy of so

[jira] [Resolved] (SPARK-21141) spark-update --version is hard to parse

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-21141. --- Resolution: Not A Problem [~mprocop] please don't reopen JIRAs. We can reopen if needed. As I say, I

[jira] [Commented] (SPARK-19809) NullPointerException on empty ORC file

2017-06-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054076#comment-16054076 ] Hyukjin Kwon commented on SPARK-19809: -- What you see is what you get. This is "Reope

[jira] [Commented] (SPARK-19809) NullPointerException on empty ORC file

2017-06-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054108#comment-16054108 ] Dongjoon Hyun commented on SPARK-19809: --- Yep. I'm trying to fix this with new ORC d

[jira] [Comment Edited] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054111#comment-16054111 ] sam edited comment on SPARK-21137 at 6/19/17 2:35 PM: -- [~srowen] >

[jira] [Commented] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054111#comment-16054111 ] sam commented on SPARK-21137: - [~srowen] > what stages are executing if any? *None, no tas

[jira] [Comment Edited] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054111#comment-16054111 ] sam edited comment on SPARK-21137 at 6/19/17 2:36 PM: -- [~srowen] >

[jira] [Commented] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054115#comment-16054115 ] Sean Owen commented on SPARK-21137: --- Try a thread dump on the driver. Until there's som

[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2017-06-19 Thread Aleksander Eskilson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054118#comment-16054118 ] Aleksander Eskilson commented on SPARK-18016: - [~cloud_fan], [~divshukla], ye

[jira] [Commented] (SPARK-21142) spark-streaming-kafka-0-10 has too fat dependency on kafka

2017-06-19 Thread Tim Van Wassenhove (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054130#comment-16054130 ] Tim Van Wassenhove commented on SPARK-21142: Opened a PR on github: https://g

[jira] [Created] (SPARK-21142) spark-streaming-kafka-0-10 has too fat dependency on kafka

2017-06-19 Thread Tim Van Wassenhove (JIRA)
Tim Van Wassenhove created SPARK-21142: -- Summary: spark-streaming-kafka-0-10 has too fat dependency on kafka Key: SPARK-21142 URL: https://issues.apache.org/jira/browse/SPARK-21142 Project: Spark

[jira] [Resolved] (SPARK-17176) Task are sorted by "Index" in Stage Page.

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17176. --- Resolution: Won't Fix > Task are sorted by "Index" in Stage Page. > -

[jira] [Commented] (SPARK-21080) Workaround for HDFS delegation token expiry broken with some Hadoop versions

2017-06-19 Thread Lukasz Raszka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054155#comment-16054155 ] Lukasz Raszka commented on SPARK-21080: --- [~jerryshao] Yes, it's in HA mode. Updatin

[jira] [Commented] (SPARK-650) Add a "setup hook" API for running initialization code on each executor

2017-06-19 Thread Michael Schmeißer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054162#comment-16054162 ] Michael Schmeißer commented on SPARK-650: - [~riteshtijoriwala] - Sorry, but I am no

[jira] [Commented] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054176#comment-16054176 ] sam commented on SPARK-21137: - [~srowen] Ah OK, sorry, not used to that process. On other pr

[jira] [Comment Edited] (SPARK-21137) Spark cannot read many small files (wholeTextFiles)

2017-06-19 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054176#comment-16054176 ] sam edited comment on SPARK-21137 at 6/19/17 3:20 PM: -- [~srowen] Ah

[jira] [Created] (SPARK-21143) Fail to fetch blocks >1MB in size in presence of conflicting Netty version

2017-06-19 Thread Ryan Williams (JIRA)
Ryan Williams created SPARK-21143: - Summary: Fail to fetch blocks >1MB in size in presence of conflicting Netty version Key: SPARK-21143 URL: https://issues.apache.org/jira/browse/SPARK-21143 Project:

[jira] [Resolved] (SPARK-19688) Spark on Yarn Credentials File set to different application directory

2017-06-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19688. Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 2.

[jira] [Updated] (SPARK-16430) Add an option in file stream source to read 1 file at a time

2017-06-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-16430: - Fix Version/s: 2.1.0 > Add an option in file stream source to read 1 file at a time > ---

[jira] [Updated] (SPARK-16430) Add an option in file stream source to read 1 file at a time

2017-06-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-16430: - Fix Version/s: (was: 2.1.0) 2.0.0 > Add an option in file stream source to

[jira] [Resolved] (SPARK-21123) Options for file stream source are in a wrong table

2017-06-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21123. -- Resolution: Fixed Fix Version/s: 2.3.0 2.2.0 > Options for file strea

[jira] [Updated] (SPARK-21142) spark-streaming-kafka-0-10 has too fat dependency on kafka

2017-06-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21142: - Component/s: (was: Structured Streaming) DStreams > spark-streaming-kafka-0-

[jira] [Commented] (SPARK-12414) Remove closure serializer

2017-06-19 Thread Ritesh Tijoriwala (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054529#comment-16054529 ] Ritesh Tijoriwala commented on SPARK-12414: --- I have a similar situation. I have

[jira] [Commented] (SPARK-21102) Refresh command is too aggressive in parsing

2017-06-19 Thread Anton Okolnychyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054591#comment-16054591 ] Anton Okolnychyi commented on SPARK-21102: -- Hi [~rxin], I took a look at this i

[jira] [Resolved] (SPARK-19975) Add map_keys and map_values functions to Python

2017-06-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-19975. - Resolution: Fixed Assignee: Yong Tang Fix Version/s: 2.3.0 > Add map_keys and map_values

[jira] [Commented] (SPARK-21143) Fail to fetch blocks >1MB in size in presence of conflicting Netty version

2017-06-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054592#comment-16054592 ] Shixiong Zhu commented on SPARK-21143: -- As Netty is so core to Spark, it's too risky

[jira] [Commented] (SPARK-21143) Fail to fetch blocks >1MB in size in presence of conflicting Netty version

2017-06-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054593#comment-16054593 ] Shixiong Zhu commented on SPARK-21143: -- The reason you cannot use 4.0.42.Final is be

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-06-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054596#comment-16054596 ] Michael Armbrust commented on SPARK-20928: -- Hi Cody, I do plan to flesh this out

[jira] [Commented] (SPARK-21102) Refresh command is too aggressive in parsing

2017-06-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054600#comment-16054600 ] Reynold Xin commented on SPARK-21102: - Can you submit a pull request so we can discus

[jira] [Updated] (SPARK-21133) HighlyCompressedMapStatus#writeExternal throws NPE

2017-06-19 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-21133: - Target Version/s: 2.2.0 Priority: Blocker (was: Major) Description:

[jira] [Assigned] (SPARK-21142) spark-streaming-kafka-0-10 has too fat dependency on kafka

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21142: Assignee: (was: Apache Spark) > spark-streaming-kafka-0-10 has too fat dependency on k

[jira] [Assigned] (SPARK-21142) spark-streaming-kafka-0-10 has too fat dependency on kafka

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-21142: Assignee: Apache Spark > spark-streaming-kafka-0-10 has too fat dependency on kafka >

[jira] [Commented] (SPARK-21142) spark-streaming-kafka-0-10 has too fat dependency on kafka

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054612#comment-16054612 ] Apache Spark commented on SPARK-21142: -- User 'timvw' has created a pull request for

[jira] [Commented] (SPARK-21143) Fail to fetch blocks >1MB in size in presence of conflicting Netty version

2017-06-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054631#comment-16054631 ] Sean Owen commented on SPARK-21143: --- If this reduces to a 4.0 vs 4.1 conflict, then thi

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-06-19 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054690#comment-16054690 ] Cody Koeninger commented on SPARK-20928: Cool, can you label it SPIP so it shows

[jira] [Commented] (SPARK-11170) EOFException on History server reading in progress lz4

2017-06-19 Thread remoteServer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054701#comment-16054701 ] remoteServer commented on SPARK-11170: -- Do we have steps to reproduce the issue? I h

[jira] [Commented] (SPARK-21143) Fail to fetch blocks >1MB in size in presence of conflicting Netty version

2017-06-19 Thread Ryan Williams (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054707#comment-16054707 ] Ryan Williams commented on SPARK-21143: --- [~zsxwing] bq. it's too risky to upgrade

[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2017-06-19 Thread Aleksander Eskilson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054784#comment-16054784 ] Aleksander Eskilson commented on SPARK-18016: - [~cloud_fan], [~divshukla], I'

[jira] [Created] (SPARK-21144) Unexpected results when the data schema and partition schema have the duplicate columns

2017-06-19 Thread Xiao Li (JIRA)
Xiao Li created SPARK-21144: --- Summary: Unexpected results when the data schema and partition schema have the duplicate columns Key: SPARK-21144 URL: https://issues.apache.org/jira/browse/SPARK-21144 Project

[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2017-06-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054785#comment-16054785 ] Apache Spark commented on SPARK-18016: -- User 'bdrillard' has created a pull request

[jira] [Commented] (SPARK-21144) Unexpected results when the data schema and partition schema have the duplicate columns

2017-06-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054788#comment-16054788 ] Xiao Li commented on SPARK-21144: - cc [~maropu] > Unexpected results when the data schem

[jira] [Updated] (SPARK-21144) Unexpected results when the data schema and partition schema have the duplicate columns

2017-06-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21144: Target Version/s: 2.2.0 > Unexpected results when the data schema and partition schema have the > duplicat

[jira] [Resolved] (SPARK-21124) Wrong user shown in UI when using kerberos

2017-06-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-21124. Resolution: Fixed Assignee: Marcelo Vanzin Fix Version/s: 2.3.0 > Wrong use

[jira] [Resolved] (SPARK-21138) Cannot delete staging dir when the clusters of "spark.yarn.stagingDir" and "spark.hadoop.fs.defaultFS" are different

2017-06-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-21138. Resolution: Fixed Assignee: sharkd tu Fix Version/s: 2.3.0

[jira] [Created] (SPARK-21145) Restarted queries reuse same StateStoreProvider, causing multiple concurrent tasks to update same StateStore

2017-06-19 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-21145: - Summary: Restarted queries reuse same StateStoreProvider, causing multiple concurrent tasks to update same StateStore Key: SPARK-21145 URL: https://issues.apache.org/jira/browse
