[jira] [Comment Edited] (SPARK-22248) spark marks all columns as null when its unable to parse single column

2017-10-11 Thread Gaurav Shah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201433#comment-16201433 ] Gaurav Shah edited comment on SPARK-22248 at 10/12/17 4:49 AM: --- [~maropu] I

[jira] [Updated] (SPARK-22261) Collect and show failed task metrics for ui

2017-10-11 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-22261: - Attachment: (was: fetchfailed.png) > Collect and show failed task metrics for ui >

[jira] [Commented] (SPARK-22248) spark marks all columns as null when its unable to parse single column

2017-10-11 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201438#comment-16201438 ] Takeshi Yamamuro commented on SPARK-22248: -- yea, ok. Probably, you need to support both modes:

[jira] [Updated] (SPARK-22261) Collect and show failed task metrics for ui

2017-10-11 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-22261: - Attachment: fetchfailed.png taskkill-taskresultlost.png > Collect and show failed task

[jira] [Comment Edited] (SPARK-22248) spark marks all columns as null when its unable to parse single column

2017-10-11 Thread Gaurav Shah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200074#comment-16200074 ] Gaurav Shah edited comment on SPARK-22248 at 10/12/17 4:50 AM: --- We can work

[jira] [Commented] (SPARK-22248) spark marks all columns as null when its unable to parse single column

2017-10-11 Thread Gaurav Shah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201433#comment-16201433 ] Gaurav Shah commented on SPARK-22248: - [~maropu] I am not sure on CSV, but on JSON we tokenize the

[jira] [Updated] (SPARK-22261) Collect and show failed task metrics for ui

2017-10-11 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-22261: - Attachment: (was: taskkill-taskresultlost.png) > Collect and show failed task metrics for ui >

[jira] [Updated] (SPARK-22261) Collect and show failed task metrics for ui

2017-10-11 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-22261: - Attachment: fetchfailed.png taskkill-taskresultlost.png > Collect and show failed task

[jira] [Updated] (SPARK-22261) Collect and show failed task metrics for ui

2017-10-11 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-22261: - Description: Currently, the metrics of failed tasks cannot be shown on the UI since metrics did not

[jira] [Updated] (SPARK-22261) Collect and show failed task metrics for ui

2017-10-11 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-22261: - Description: Currently, the metrics of failed tasks cannot be shown on the UI since metrics did not

[jira] [Updated] (SPARK-22261) Collect and show failed task metrics for ui

2017-10-11 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoukang updated SPARK-22261: - Description: Currently, the metrics of failed tasks cannot be shown on the UI since metrics did not

[jira] [Created] (SPARK-22261) Collect and show failed task metrics for ui

2017-10-11 Thread zhoukang (JIRA)
zhoukang created SPARK-22261: Summary: Collect and show failed task metrics for ui Key: SPARK-22261 URL: https://issues.apache.org/jira/browse/SPARK-22261 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-22260) java.lang.RuntimeException: hdfs://HdfsHA/logrep/1/sspstatistic/_metadata is not a Parquet file (too small)

2017-10-11 Thread Liu Dinghua (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Dinghua updated SPARK-22260: Description: the code that encountered errors is as follows: *

[jira] [Updated] (SPARK-22260) java.lang.RuntimeException: hdfs://HdfsHA/logrep/1/sspstatistic/_metadata is not a Parquet file (too small)

2017-10-11 Thread Liu Dinghua (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Dinghua updated SPARK-22260: Description: the code that encountered errors is as follows: *

[jira] [Created] (SPARK-22260) java.lang.RuntimeException: hdfs://HdfsHA/logrep/1/sspstatistic/_metadata is not a Parquet file (too small)

2017-10-11 Thread Liu Dinghua (JIRA)
Liu Dinghua created SPARK-22260: --- Summary: java.lang.RuntimeException: hdfs://HdfsHA/logrep/1/sspstatistic/_metadata is not a Parquet file (too small) Key: SPARK-22260 URL:

[jira] [Created] (SPARK-22259) hdfs://HdfsHA/logrep/1/sspstatistic/_metadata is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [5, 28, 21, 12]

2017-10-11 Thread Liu Dinghua (JIRA)
Liu Dinghua created SPARK-22259: --- Summary: hdfs://HdfsHA/logrep/1/sspstatistic/_metadata is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [5, 28, 21, 12] Key: SPARK-22259 URL:

[jira] [Comment Edited] (SPARK-21762) FileFormatWriter/BasicWriteTaskStatsTracker metrics collection fails if a new file isn't yet visible

2017-10-11 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201418#comment-16201418 ] Dongjoon Hyun edited comment on SPARK-21762 at 10/12/17 4:21 AM: - Since

[jira] [Commented] (SPARK-21762) FileFormatWriter/BasicWriteTaskStatsTracker metrics collection fails if a new file isn't yet visible

2017-10-11 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201418#comment-16201418 ] Dongjoon Hyun commented on SPARK-21762: --- Since this is a regression like SPARK-22258, I updated the

[jira] [Updated] (SPARK-21762) FileFormatWriter/BasicWriteTaskStatsTracker metrics collection fails if a new file isn't yet visible

2017-10-11 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-21762: -- Priority: Major (was: Minor) > FileFormatWriter/BasicWriteTaskStatsTracker metrics collection

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201407#comment-16201407 ] Hyukjin Kwon commented on SPARK-22240: -- There were multiple JIRAs for this feature. I believe

[jira] [Resolved] (SPARK-22258) Writing empty dataset fails with ORC format

2017-10-11 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-22258. --- Resolution: Duplicate I missed the ongoing PR at SPARK-21762. > Writing empty dataset

[jira] [Updated] (SPARK-22242) streaming job failed to restart from checkpoint

2017-10-11 Thread StephenZou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] StephenZou updated SPARK-22242: --- Summary: streaming job failed to restart from checkpoint (was: job failed to restart from

[jira] [Updated] (SPARK-22242) streaming job failed to restart from checkpoint

2017-10-11 Thread StephenZou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] StephenZou updated SPARK-22242: --- Component/s: (was: Spark Core) DStreams > streaming job failed to restart from

[jira] [Commented] (SPARK-22258) Writing empty dataset fails with ORC format

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201363#comment-16201363 ] Apache Spark commented on SPARK-22258: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-22258) Writing empty dataset fails with ORC format

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22258: Assignee: (was: Apache Spark) > Writing empty dataset fails with ORC format >

[jira] [Assigned] (SPARK-22258) Writing empty dataset fails with ORC format

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22258: Assignee: Apache Spark > Writing empty dataset fails with ORC format >

[jira] [Commented] (SPARK-2243) Support multiple SparkContexts in the same JVM

2017-10-11 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201364#comment-16201364 ] ASF GitHub Bot commented on SPARK-2243: --- Github user 561152 commented on the issue:

[jira] [Created] (SPARK-22258) Writing empty dataset fails with ORC format

2017-10-11 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-22258: - Summary: Writing empty dataset fails with ORC format Key: SPARK-22258 URL: https://issues.apache.org/jira/browse/SPARK-22258 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22062) BlockManager does not account for memory consumed by remote fetches

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201334#comment-16201334 ] Apache Spark commented on SPARK-22062: -- User 'jerryshao' has created a pull request for this issue:

[jira] [Assigned] (SPARK-22062) BlockManager does not account for memory consumed by remote fetches

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22062: Assignee: (was: Apache Spark) > BlockManager does not account for memory consumed by

[jira] [Assigned] (SPARK-22062) BlockManager does not account for memory consumed by remote fetches

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22062: Assignee: Apache Spark > BlockManager does not account for memory consumed by remote

[jira] [Commented] (SPARK-22248) spark marks all columns as null when its unable to parse single column

2017-10-11 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201309#comment-16201309 ] Takeshi Yamamuro commented on SPARK-22248: -- Once failed, I feel it is difficult to recover the

[jira] [Commented] (SPARK-22062) BlockManager does not account for memory consumed by remote fetches

2017-10-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201305#comment-16201305 ] Saisai Shao commented on SPARK-22062: - Yes, there is potentially an OOM problem, but I think this kind

[jira] [Commented] (SPARK-22257) Reserve all non-deterministic expressions in ExpressionSet.

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201090#comment-16201090 ] Apache Spark commented on SPARK-22257: -- User 'gengliangwang' has created a pull request for this

[jira] [Assigned] (SPARK-22257) Reserve all non-deterministic expressions in ExpressionSet.

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22257: Assignee: (was: Apache Spark) > Reserve all non-deterministic expressions in

[jira] [Assigned] (SPARK-22257) Reserve all non-deterministic expressions in ExpressionSet.

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22257: Assignee: Apache Spark > Reserve all non-deterministic expressions in ExpressionSet. >

[jira] [Created] (SPARK-22257) Reserve all non-deterministic expressions in ExpressionSet.

2017-10-11 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-22257: -- Summary: Reserve all non-deterministic expressions in ExpressionSet. Key: SPARK-22257 URL: https://issues.apache.org/jira/browse/SPARK-22257 Project: Spark

[jira] [Updated] (SPARK-21988) Add default stats to StreamingRelation and StreamingExecutionRelation

2017-10-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21988: - Summary: Add default stats to StreamingRelation and StreamingExecutionRelation (was: Add

[jira] [Resolved] (SPARK-21988) Add default stats to StreamingExecutionRelation

2017-10-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-21988. -- Resolution: Fixed > Add default stats to StreamingExecutionRelation >

[jira] [Commented] (SPARK-22256) Introduce spark.mesos.driver.memoryOverhead

2017-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200876#comment-16200876 ] Sean Owen commented on SPARK-22256: --- You don't need it assigned > Introduce

[jira] [Commented] (SPARK-22256) Introduce spark.mesos.driver.memoryOverhead

2017-10-11 Thread Cosmin Lehene (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200855#comment-16200855 ] Cosmin Lehene commented on SPARK-22256: --- It seems I can't assign this issue to myself. [~srowen]

[jira] [Commented] (SPARK-22256) Introduce spark.mesos.driver.memoryOverhead

2017-10-11 Thread Cosmin Lehene (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200852#comment-16200852 ] Cosmin Lehene commented on SPARK-22256: --- I have a patch, still testing it. I will go over the

[jira] [Commented] (SPARK-22255) SPARK-22255 FileAppender InputStream Read timeout and blocking state

2017-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200833#comment-16200833 ] Sean Owen commented on SPARK-22255: --- Yes, that's all by design. The normal case is having 0 bytes

[jira] [Commented] (SPARK-18359) Let user specify locale in CSV parsing

2017-10-11 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200831#comment-16200831 ] Andrew Ash commented on SPARK-18359: I agree with Sean -- using the submitting JVM's locale is

[jira] [Commented] (SPARK-22255) SPARK-22255 FileAppender InputStream Read timeout and blocking state

2017-10-11 Thread Mariusz Galus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200824#comment-16200824 ] Mariusz Galus commented on SPARK-22255: --- Sean, can you describe a case where there are 0 input

[jira] [Commented] (SPARK-22255) SPARK-22255 FileAppender InputStream Read timeout and blocking state

2017-10-11 Thread Mariusz Galus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200807#comment-16200807 ] Mariusz Galus commented on SPARK-22255: --- It is wrapped in a while loop, so it will constantly be

[jira] [Updated] (SPARK-22256) Introduce spark.mesos.driver.memoryOverhead

2017-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-22256: -- Target Version/s: (was: 2.2.1) Priority: Minor (was: Major) Fix Version/s:

[jira] [Created] (SPARK-22256) Introduce spark.mesos.driver.memoryOverhead

2017-10-11 Thread Cosmin Lehene (JIRA)
Cosmin Lehene created SPARK-22256: - Summary: Introduce spark.mesos.driver.memoryOverhead Key: SPARK-22256 URL: https://issues.apache.org/jira/browse/SPARK-22256 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22255) SPARK-22255 FileAppender InputStream Read timeout and blocking state

2017-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200759#comment-16200759 ] Sean Owen commented on SPARK-22255: --- The purpose of the class is to transfer all the bytes from an

[jira] [Commented] (SPARK-22255) SPARK-22255 FileAppender InputStream Read timeout and blocking state

2017-10-11 Thread Mariusz Galus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200749#comment-16200749 ] Mariusz Galus commented on SPARK-22255: --- I would like an answer on why we need to block. I am using

[jira] [Commented] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200724#comment-16200724 ] Sean Owen commented on SPARK-22253: --- I don't think it has to do with JDK 8. Spark has required and run

[jira] [Updated] (SPARK-22255) SPARK-22255 FileAppender InputStream Read timeout and blocking state

2017-10-11 Thread Mariusz Galus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariusz Galus updated SPARK-22255: -- Description: The FileAppender logic when reading from InputStream blocks. This can be simply

[jira] [Commented] (SPARK-22255) FileAppender InputStream.read() timeout and blocking state

2017-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200732#comment-16200732 ] Sean Owen commented on SPARK-22255: --- You have to block here, right? I'm not clear how you wait on input

[jira] [Resolved] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-22253. --- Resolution: Not A Problem That kind of thing is a knock-on error from something else failing. The

[jira] [Updated] (SPARK-22255) FileAppender InputStream.read() timeout and blocking state

2017-10-11 Thread Mariusz Galus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariusz Galus updated SPARK-22255: -- Summary: FileAppender InputStream.read() timeout and blocking state (was: FileAppender

[jira] [Commented] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Swaapnika Guntaka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200730#comment-16200730 ] Swaapnika Guntaka commented on SPARK-22253: --- After the above error I see a bunch of lost tasks

[jira] [Updated] (SPARK-22255) SPARK-22255 FileAppender InputStream Read timeout and blocking state

2017-10-11 Thread Mariusz Galus (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariusz Galus updated SPARK-22255: -- Summary: SPARK-22255 FileAppender InputStream Read timeout and blocking state (was:

[jira] [Created] (SPARK-22255) FileAppender InputStream.read() timeout

2017-10-11 Thread Mariusz Galus (JIRA)
Mariusz Galus created SPARK-22255: - Summary: FileAppender InputStream.read() timeout Key: SPARK-22255 URL: https://issues.apache.org/jira/browse/SPARK-22255 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18492) GeneratedIterator grows beyond 64 KB

2017-10-11 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200682#comment-16200682 ] Kazuaki Ishizaki commented on SPARK-18492: -- [~ramzanfarooq] I see. We do not need your original

[jira] [Commented] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine

2017-10-11 Thread Yuval Degani (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200664#comment-16200664 ] Yuval Degani commented on SPARK-22229: -- Yes, transitioning from a ShuffleManager to a

[jira] [Commented] (SPARK-19700) Design an API for pluggable scheduler implementations

2017-10-11 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200659#comment-16200659 ] Felix Cheung commented on SPARK-19700: -- Not that I'm aware of - I agree it is very important to take

[jira] [Created] (SPARK-22254) clean up the implementation of `growToSize` in CompactBuffer

2017-10-11 Thread Feng Liu (JIRA)
Feng Liu created SPARK-22254: Summary: clean up the implementation of `growToSize` in CompactBuffer Key: SPARK-22254 URL: https://issues.apache.org/jira/browse/SPARK-22254 Project: Spark Issue

[jira] [Commented] (SPARK-22226) splitExpression can create too many method calls (generating a Constant Pool limit error)

2017-10-11 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200629#comment-16200629 ] Kazuaki Ishizaki commented on SPARK-22226: -- [~mgaido] I think that you can reopen this JIRA with

[jira] [Updated] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Swaapnika Guntaka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swaapnika Guntaka updated SPARK-22253: -- Description: Python packaging fails with {Java EOF Exception} when run using spark

[jira] [Updated] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Swaapnika Guntaka (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swaapnika Guntaka updated SPARK-22253: -- Shepherd: Sean Owen > Python packaging using JDK 8 fails when run using Spark 2.2 >

[jira] [Commented] (SPARK-18492) GeneratedIterator grows beyond 64 KB

2017-10-11 Thread Muhammad Ramzan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200625#comment-16200625 ] Muhammad Ramzan commented on SPARK-18492: - [~kiszk] I am sorry, I cannot share the code since its

[jira] [Created] (SPARK-22253) Python packaging using JDK 8 fails when run using Spark 2.2

2017-10-11 Thread Swaapnika Guntaka (JIRA)
Swaapnika Guntaka created SPARK-22253: - Summary: Python packaging using JDK 8 fails when run using Spark 2.2 Key: SPARK-22253 URL: https://issues.apache.org/jira/browse/SPARK-22253 Project: Spark

[jira] [Commented] (SPARK-18492) GeneratedIterator grows beyond 64 KB

2017-10-11 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200606#comment-16200606 ] Kazuaki Ishizaki commented on SPARK-18492: -- [~cenyuhai][~ramzanfarooq] Thank you for reporting

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-11 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200485#comment-16200485 ] Steve Loughran commented on SPARK-22240: What's the link to the multiline JIRA? As that could

[jira] [Resolved] (SPARK-21649) Support writing data into hive bucket table.

2017-10-11 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil resolved SPARK-21649. - Resolution: Duplicate > Support writing data into hive bucket table. >

[jira] [Commented] (SPARK-21649) Support writing data into hive bucket table.

2017-10-11 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200428#comment-16200428 ] Tejas Patil commented on SPARK-21649: - yes. It is duplicate of SPARK-19256 > Support writing data

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200413#comment-16200413 ] Hyukjin Kwon commented on SPARK-22240: -- Hm, but this particular issue looks more like related when

[jira] [Assigned] (SPARK-21661) SparkSQL can't merge load table from Hadoop

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-21661: Assignee: Li Yuanjian > SparkSQL can't merge load table from Hadoop >

[jira] [Resolved] (SPARK-21661) SparkSQL can't merge load table from Hadoop

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21661. -- Resolution: Fixed I think it was fixed in the PR above. Please reopen this if this still

[jira] [Commented] (SPARK-21649) Support writing data into hive bucket table.

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200373#comment-16200373 ] Hyukjin Kwon commented on SPARK-21649: -- So, is it a duplicate of SPARK-19256? Could you maybe

[jira] [Commented] (SPARK-22252) FileFormatWriter should respect the input query schema

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200405#comment-16200405 ] Apache Spark commented on SPARK-22252: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-22252) FileFormatWriter should respect the input query schema

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22252: Assignee: Wenchen Fan (was: Apache Spark) > FileFormatWriter should respect the input

[jira] [Assigned] (SPARK-22252) FileFormatWriter should respect the input query schema

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22252: Assignee: Apache Spark (was: Wenchen Fan) > FileFormatWriter should respect the input

[jira] [Commented] (SPARK-21730) Consider officially dropping PyPy pre-2.5 support

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200381#comment-16200381 ] Hyukjin Kwon commented on SPARK-21730: -- +1 > Consider officially dropping PyPy pre-2.5 support >

[jira] [Resolved] (SPARK-21763) InferSchema option does not infer the correct schema (timestamp) from xlsx file.

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21763. -- Resolution: Invalid I think this should be asked at https://github.com/crealytics/spark-excel.

[jira] [Comment Edited] (SPARK-22022) Unable to use Python Profiler with SparkSession

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200393#comment-16200393 ] Hyukjin Kwon edited comment on SPARK-22022 at 10/11/17 2:48 PM: I am

[jira] [Resolved] (SPARK-21978) schemaInference option not to convert strings with leading zeros to int/long

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21978. -- Resolution: Won't Fix I am resolving this as {{Won't Fix}} assuming there is no argument

[jira] [Resolved] (SPARK-22022) Unable to use Python Profiler with SparkSession

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-22022. -- Resolution: Won't Fix I am resolving this assuming there is no argument against ^. > Unable

[jira] [Commented] (SPARK-22243) streaming job failed to restart from checkpoint

2017-10-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200372#comment-16200372 ] Saisai Shao commented on SPARK-22243: - Yes, it is a related issue regarding Spark Streaming

[jira] [Resolved] (SPARK-21790) Running Docker-based Integration Test Suites throws exception

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21790. -- Resolution: Fixed > Running Docker-based Integration Test Suites throws exception >

[jira] [Created] (SPARK-22252) FileFormatWriter should respect the input query schema

2017-10-11 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-22252: --- Summary: FileFormatWriter should respect the input query schema Key: SPARK-22252 URL: https://issues.apache.org/jira/browse/SPARK-22252 Project: Spark Issue

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-11 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200367#comment-16200367 ] Steve Loughran commented on SPARK-22240: Amazon EMR is Amazon's own fork of Spark & Hadoop, with

[jira] [Assigned] (SPARK-22251) Metric "aggregate time" is incorrect when codegen is off

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22251: Assignee: Apache Spark > Metric "aggregate time" is incorrect when codegen is off >

[jira] [Commented] (SPARK-22251) Metric "aggregate time" is incorrect when codegen is off

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200355#comment-16200355 ] Apache Spark commented on SPARK-22251: -- User 'ala' has created a pull request for this issue:

[jira] [Assigned] (SPARK-22251) Metric "aggregate time" is incorrect when codegen is off

2017-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22251: Assignee: (was: Apache Spark) > Metric "aggregate time" is incorrect when codegen is

[jira] [Resolved] (SPARK-21032) Support add_years and add_days functions

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21032. -- Resolution: Won't Fix Not sure for aliases. I won't mind reopening if there are some

[jira] [Commented] (SPARK-22251) Metric "aggregate time" is incorrect when codegen is off

2017-10-11 Thread Ala Luszczak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200345#comment-16200345 ] Ala Luszczak commented on SPARK-22251: -- I checked that if you type

[jira] [Commented] (SPARK-21001) Staging folders from Hive table are not being cleared.

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200340#comment-16200340 ] Hyukjin Kwon commented on SPARK-21001: -- [~ajaycherukuri], Would you have some time to check if this

[jira] [Commented] (SPARK-22251) Metric "aggregate time" is incorrect when codegen is off

2017-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200338#comment-16200338 ] Sean Owen commented on SPARK-22251: --- Did you try setting this before the app starts? I'm not actually

[jira] [Created] (SPARK-22251) Metric "aggregate time" is incorrect when codegen is off

2017-10-11 Thread Ala Luszczak (JIRA)
Ala Luszczak created SPARK-22251: Summary: Metric "aggregate time" is incorrect when codegen is off Key: SPARK-22251 URL: https://issues.apache.org/jira/browse/SPARK-22251 Project: Spark

[jira] [Commented] (SPARK-22240) S3 CSV number of partitions incorrectly computed

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200333#comment-16200333 ] Hyukjin Kwon commented on SPARK-22240: -- This is an unfortunate limitation when {{multiLine}} is

[jira] [Commented] (SPARK-22250) Be less restrictive on type checking

2017-10-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200309#comment-16200309 ] Hyukjin Kwon commented on SPARK-22250: -- {{createDataFrame(... verifySchema=False)}}? > Be less

[jira] [Updated] (SPARK-22242) job failed to restart from checkpoint

2017-10-11 Thread StephenZou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] StephenZou updated SPARK-22242: --- Shepherd: (was: Shixiong Zhu) > job failed to restart from checkpoint >

[jira] [Created] (SPARK-22250) Be less restrictive on type checking

2017-10-11 Thread Fernando Pereira (JIRA)
Fernando Pereira created SPARK-22250: Summary: Be less restrictive on type checking Key: SPARK-22250 URL: https://issues.apache.org/jira/browse/SPARK-22250 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-22249) UnsupportedOperationException: empty.reduceLeft when caching a dataframe

2017-10-11 Thread Andreas Maier (JIRA)
Andreas Maier created SPARK-22249: - Summary: UnsupportedOperationException: empty.reduceLeft when caching a dataframe Key: SPARK-22249 URL: https://issues.apache.org/jira/browse/SPARK-22249 Project:
