[jira] [Assigned] (SPARK-20781) the location of Dockerfile in docker.properties.template is wrong

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20781: Assignee: Apache Spark > the location of Dockerfile in docker.properties.template is

[jira] [Assigned] (SPARK-20781) the location of Dockerfile in docker.properties.template is wrong

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20781: Assignee: (was: Apache Spark) > the location of Dockerfile in

[jira] [Commented] (SPARK-20781) the location of Dockerfile in docker.properties.template is wrong

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013563#comment-16013563 ] Apache Spark commented on SPARK-20781: -- User 'liu-zhaokun' has created a pull request for this

[jira] [Created] (SPARK-20781) the location of Dockerfile in docker.properties.template is wrong

2017-05-16 Thread liuzhaokun (JIRA)
liuzhaokun created SPARK-20781: -- Summary: the location of Dockerfile in docker.properties.template is wrong Key: SPARK-20781 URL: https://issues.apache.org/jira/browse/SPARK-20781 Project: Spark

[jira] [Commented] (SPARK-12297) Add work-around for Parquet/Hive int96 timestamp bug.

2017-05-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013540#comment-16013540 ] Reynold Xin commented on SPARK-12297: - I don't think the CSV example you gave makes sense. It is still

[jira] [Comment Edited] (SPARK-12297) Add work-around for Parquet/Hive int96 timestamp bug.

2017-05-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013540#comment-16013540 ] Reynold Xin edited comment on SPARK-12297 at 5/17/17 5:12 AM: -- I don't think

[jira] [Resolved] (SPARK-20776) Fix JobProgressListener perf. problems caused by empty TaskMetrics initialization

2017-05-16 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20776. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 18008

[jira] [Resolved] (SPARK-20690) Analyzer shouldn't add missing attributes through subquery

2017-05-16 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20690. - Resolution: Fixed Assignee: Liang-Chi Hsieh Fix Version/s: 2.3.0 > Analyzer

[jira] [Updated] (SPARK-20690) Analyzer shouldn't add missing attributes through subquery

2017-05-16 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-20690: Labels: release-notes (was: ) > Analyzer shouldn't add missing attributes through subquery >

[jira] [Updated] (SPARK-20780) Spark Kafka10 Consumer Hangs

2017-05-16 Thread jayadeepj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jayadeepj updated SPARK-20780: -- Environment: Spark 2.1.0 Spark Streaming Kafka 010 Yarn - Cluster Mode CDH 5.8.4 CentOS Linux release

[jira] [Updated] (SPARK-20762) Make String Params Case-Insensitive

2017-05-16 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-20762: - Description: Make String Params (except Cols) case-insensitive: {{solver}} {{modelType}}

[jira] [Updated] (SPARK-20779) The ASF header placed in an incorrect location in some files

2017-05-16 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-20779: Description: When I tested some examples, I found that the license is not at the top of some files. and

[jira] [Updated] (SPARK-20780) Spark Kafka10 Consumer Hangs

2017-05-16 Thread jayadeepj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jayadeepj updated SPARK-20780: -- Environment: Spark 2.1.0 Spark Streaming Kafka 010 CDH 5.8.4 CentOS Linux release 7.2 was: Spark

[jira] [Updated] (SPARK-20780) Spark Kafka10 Consumer Hangs

2017-05-16 Thread jayadeepj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jayadeepj updated SPARK-20780: -- Attachment: streaming_1.png streaming_2.png tasks_timing_out_3.png >

[jira] [Updated] (SPARK-20779) The ASF header placed in an incorrect location in some files

2017-05-16 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-20779: Issue Type: Improvement (was: Bug) > The ASF header placed in an incorrect location in some files

[jira] [Created] (SPARK-20780) Spark Kafka10 Consumer Hangs

2017-05-16 Thread jayadeepj (JIRA)
jayadeepj created SPARK-20780: - Summary: Spark Kafka10 Consumer Hangs Key: SPARK-20780 URL: https://issues.apache.org/jira/browse/SPARK-20780 Project: Spark Issue Type: Bug Components:

[jira] [Assigned] (SPARK-20779) The ASF header placed in an incorrect location in some files

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20779: Assignee: (was: Apache Spark) > The ASF header placed in an incorrect location in

[jira] [Assigned] (SPARK-20779) The ASF header placed in an incorrect location in some files

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20779: Assignee: Apache Spark > The ASF header placed in an incorrect location in some files >

[jira] [Commented] (SPARK-20779) The ASF header placed in an incorrect location in some files

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013465#comment-16013465 ] Apache Spark commented on SPARK-20779: -- User 'zuotingbing' has created a pull request for this

[jira] [Created] (SPARK-20779) The ASF header placed in an incorrect location in some files

2017-05-16 Thread zuotingbing (JIRA)
zuotingbing created SPARK-20779: --- Summary: The ASF header placed in an incorrect location in some files Key: SPARK-20779 URL: https://issues.apache.org/jira/browse/SPARK-20779 Project: Spark

[jira] [Commented] (SPARK-20772) Add support for query parameters in redirects on Yarn

2017-05-16 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013431#comment-16013431 ] Saisai Shao commented on SPARK-20772: - I'm guessing if it is an issue of {{AmIpFilter}}, should be a

[jira] [Updated] (SPARK-20772) Add support for query parameters in redirects on Yarn

2017-05-16 Thread Bjorn Jonsson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bjorn Jonsson updated SPARK-20772: -- Description: Spark uses rewrites of query parameters to paths (http://:4040/jobs/job?id=0 -->

[jira] [Assigned] (SPARK-19089) Support nested arrays/seqs in Datasets

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19089: Assignee: (was: Apache Spark) > Support nested arrays/seqs in Datasets >

[jira] [Assigned] (SPARK-19089) Support nested arrays/seqs in Datasets

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19089: Assignee: Apache Spark > Support nested arrays/seqs in Datasets >

[jira] [Commented] (SPARK-19089) Support nested arrays/seqs in Datasets

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013350#comment-16013350 ] Apache Spark commented on SPARK-19089: -- User 'michalsenkyr' has created a pull request for this

[jira] [Assigned] (SPARK-20778) Implement array_intersect function

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20778: Assignee: (was: Apache Spark) > Implement array_intersect function >

[jira] [Assigned] (SPARK-20778) Implement array_intersect function

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20778: Assignee: Apache Spark > Implement array_intersect function >

[jira] [Commented] (SPARK-20778) Implement array_intersect function

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013337#comment-16013337 ] Apache Spark commented on SPARK-20778: -- User 'ericvandenbergfb' has created a pull request for this

[jira] [Created] (SPARK-20778) Implement array_intersect function

2017-05-16 Thread Eric Vandenberg (JIRA)
Eric Vandenberg created SPARK-20778: --- Summary: Implement array_intersect function Key: SPARK-20778 URL: https://issues.apache.org/jira/browse/SPARK-20778 Project: Spark Issue Type:

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-05-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013317#comment-16013317 ] Josh Rosen commented on SPARK-18838: I think that SPARK-20776 /

[jira] [Commented] (SPARK-20235) Hive on S3 s3:sse and non S3:sse buckets

2017-05-16 Thread Franck Tago (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013290#comment-16013290 ] Franck Tago commented on SPARK-20235: - Was this comment meant for me? What does that mean? > Hive

[jira] [Commented] (SPARK-18891) Support for specific collection types

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013278#comment-16013278 ] Apache Spark commented on SPARK-18891: -- User 'michalsenkyr' has created a pull request for this

[jira] [Created] (SPARK-20777) Spark Streaming NullPointerException when restoring from hdfs checkpoint

2017-05-16 Thread Richard Moorhead (JIRA)
Richard Moorhead created SPARK-20777: Summary: Spark Streaming NullPointerException when restoring from hdfs checkpoint Key: SPARK-20777 URL: https://issues.apache.org/jira/browse/SPARK-20777

[jira] [Updated] (SPARK-20776) Fix JobProgressListener perf. problems caused by empty TaskMetrics initialization

2017-05-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-20776: --- Attachment: (was: screenshot-1.png) > Fix JobProgressListener perf. problems caused by empty

[jira] [Updated] (SPARK-20776) Fix JobProgressListener perf. problems caused by empty TaskMetrics initialization

2017-05-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-20776: --- Description: In {code} ./bin/spark-shell --master=local[64] {code} I ran {code}

[jira] [Updated] (SPARK-20776) Fix JobProgressListener perf. problems caused by empty TaskMetrics initialization

2017-05-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-20776: --- Description: In {code} ./bin/spark-shell --master=local[64] {code} I ran {code}

[jira] [Updated] (SPARK-20776) Fix JobProgressListener perf. problems caused by empty TaskMetrics initialization

2017-05-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-20776: --- Summary: Fix JobProgressListener perf. problems caused by empty TaskMetrics initialization (was:

[jira] [Commented] (SPARK-16441) Spark application hang when dynamic allocation is enabled

2017-05-16 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013247#comment-16013247 ] Ruslan Dautkhanov commented on SPARK-16441: --- We did not have

[jira] [Commented] (SPARK-20776) Fix performance problems in TaskMetrics.nameToAccums map initialization

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013220#comment-16013220 ] Apache Spark commented on SPARK-20776: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20776) Fix performance problems in TaskMetrics.nameToAccums map initialization

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20776: Assignee: Apache Spark (was: Josh Rosen) > Fix performance problems in

[jira] [Assigned] (SPARK-20776) Fix performance problems in TaskMetrics.nameToAccums map initialization

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20776: Assignee: Josh Rosen (was: Apache Spark) > Fix performance problems in

[jira] [Created] (SPARK-20776) Fix performance problems in TaskMetrics.nameToAccums map initialization

2017-05-16 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-20776: -- Summary: Fix performance problems in TaskMetrics.nameToAccums map initialization Key: SPARK-20776 URL: https://issues.apache.org/jira/browse/SPARK-20776 Project: Spark

[jira] [Updated] (SPARK-20776) Fix performance problems in TaskMetrics.nameToAccums map initialization

2017-05-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-20776: --- Attachment: screenshot-1.png > Fix performance problems in TaskMetrics.nameToAccums map

[jira] [Commented] (SPARK-15703) Make ListenerBus event queue size configurable

2017-05-16 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013208#comment-16013208 ] Ruslan Dautkhanov commented on SPARK-15703: --- We keep running into this issue too - would be

[jira] [Comment Edited] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-16 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013199#comment-16013199 ] Jiang Xingbo edited comment on SPARK-20700 at 5/16/17 10:23 PM: In the

[jira] [Commented] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-16 Thread Jiang Xingbo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013199#comment-16013199 ] Jiang Xingbo commented on SPARK-20700: -- In the previous approach we used `aliasMap` to link an

[jira] [Resolved] (SPARK-20140) Remove hardcoded kinesis retry wait and max retries

2017-05-16 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-20140. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 > Remove hardcoded

[jira] [Assigned] (SPARK-20140) Remove hardcoded kinesis retry wait and max retries

2017-05-16 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz reassigned SPARK-20140: --- Assignee: Yash Sharma > Remove hardcoded kinesis retry wait and max retries >

[jira] [Commented] (SPARK-20140) Remove hardcoded kinesis retry wait and max retries

2017-05-16 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013191#comment-16013191 ] Burak Yavuz commented on SPARK-20140: - resolved by https://github.com/apache/spark/pull/17467 >

[jira] [Assigned] (SPARK-19372) Code generation for Filter predicate including many OR conditions exceeds JVM method size limit

2017-05-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-19372: Assignee: Kazuaki Ishizaki > Code generation for Filter predicate including many OR

[jira] [Resolved] (SPARK-19372) Code generation for Filter predicate including many OR conditions exceeds JVM method size limit

2017-05-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19372. -- Resolution: Fixed Fix Version/s: 2.3.0 > Code generation for Filter predicate including

[jira] [Created] (SPARK-20775) from_json should also have an API where the schema is specified with a string

2017-05-16 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-20775: --- Summary: from_json should also have an API where the schema is specified with a string Key: SPARK-20775 URL: https://issues.apache.org/jira/browse/SPARK-20775 Project:

[jira] [Updated] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-20503: -- Fix Version/s: 2.2.0 > ML 2.2 QA: API: Python API coverage >

[jira] [Resolved] (SPARK-20509) SparkR 2.2 QA: New R APIs and API docs

2017-05-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-20509. --- Resolution: Done Fix Version/s: 2.2.0 > SparkR 2.2 QA: New R APIs and API

[jira] [Created] (SPARK-20774) BroadcastExchangeExec doesn't cancel the Spark job if broadcasting a relation timeouts.

2017-05-16 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-20774: Summary: BroadcastExchangeExec doesn't cancel the Spark job if broadcasting a relation timeouts. Key: SPARK-20774 URL: https://issues.apache.org/jira/browse/SPARK-20774

[jira] [Commented] (SPARK-20509) SparkR 2.2 QA: New R APIs and API docs

2017-05-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013060#comment-16013060 ] Joseph K. Bradley commented on SPARK-20509: --- I checked the new and changed APIs, comparing the

[jira] [Commented] (SPARK-14584) Improve recognition of non-nullability in Dataset transformations

2017-05-16 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013001#comment-16013001 ] Takeshi Yamamuro commented on SPARK-14584: -- Could we close this as resolved? It seems the merged

[jira] [Assigned] (SPARK-20773) ParquetWriteSupport.writeFields is quadratic in number of fields

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20773: Assignee: (was: Apache Spark) > ParquetWriteSupport.writeFields is quadratic in

[jira] [Assigned] (SPARK-20773) ParquetWriteSupport.writeFields is quadratic in number of fields

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20773: Assignee: Apache Spark > ParquetWriteSupport.writeFields is quadratic in number of fields

[jira] [Commented] (SPARK-20773) ParquetWriteSupport.writeFields is quadratic in number of fields

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012992#comment-16012992 ] Apache Spark commented on SPARK-20773: -- User 'tpoterba' has created a pull request for this issue:

[jira] [Updated] (SPARK-20773) ParquetWriteSupport.writeFields is quadratic in number of fields

2017-05-16 Thread T Poterba (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Poterba updated SPARK-20773: -- Summary: ParquetWriteSupport.writeFields is quadratic in number of fields (was:

[jira] [Commented] (SPARK-20746) Built-in SQL Function Improvement

2017-05-16 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012982#comment-16012982 ] Takeshi Yamamuro commented on SPARK-20746: -- Could I take on some of them? [~smilegator] >

[jira] [Created] (SPARK-20773) ParquetWriteSupport.writeFields has is quadratic in number of fields

2017-05-16 Thread T Poterba (JIRA)
T Poterba created SPARK-20773: - Summary: ParquetWriteSupport.writeFields has is quadratic in number of fields Key: SPARK-20773 URL: https://issues.apache.org/jira/browse/SPARK-20773 Project: Spark

[jira] [Commented] (SPARK-20509) SparkR 2.2 QA: New R APIs and API docs

2017-05-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012913#comment-16012913 ] Joseph K. Bradley commented on SPARK-20509: --- I'll take this one > SparkR 2.2 QA: New R APIs

[jira] [Assigned] (SPARK-20509) SparkR 2.2 QA: New R APIs and API docs

2017-05-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-20509: - Assignee: Joseph K. Bradley > SparkR 2.2 QA: New R APIs and API docs >

[jira] [Created] (SPARK-20772) Add support for query parameters in redirects on Yarn

2017-05-16 Thread Bjorn Jonsson (JIRA)
Bjorn Jonsson created SPARK-20772: - Summary: Add support for query parameters in redirects on Yarn Key: SPARK-20772 URL: https://issues.apache.org/jira/browse/SPARK-20772 Project: Spark

[jira] [Updated] (SPARK-20771) Usability issues with weekofyear()

2017-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-20771: -- Description: The weekofyear() implementation follows HIVE / ISO 8601 week number. However it

[jira] [Created] (SPARK-20771) Usability issues with weekofyear()

2017-05-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-20771: - Summary: Usability issues with weekofyear() Key: SPARK-20771 URL: https://issues.apache.org/jira/browse/SPARK-20771 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2017-05-16 Thread Brian Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012818#comment-16012818 ] Brian Zhang commented on SPARK-15616: - Hello, Just wondering what's the current status of this issue?

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012813#comment-16012813 ] Apache Spark commented on SPARK-18838: -- User 'bOOm-X' has created a pull request for this issue:

[jira] [Resolved] (SPARK-20529) Worker should not use the received Master address

2017-05-16 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-20529. -- Resolution: Fixed Fix Version/s: 2.2.0 > Worker should not use the received Master

[jira] [Commented] (SPARK-20503) ML 2.2 QA: API: Python API coverage

2017-05-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012774#comment-16012774 ] Joseph K. Bradley commented on SPARK-20503: --- Thanks a lot! > ML 2.2 QA: API: Python API

[jira] [Assigned] (SPARK-20770) Improve ColumnStats

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20770: Assignee: Apache Spark > Improve ColumnStats > --- > >

[jira] [Commented] (SPARK-20770) Improve ColumnStats

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012636#comment-16012636 ] Apache Spark commented on SPARK-20770: -- User 'kiszk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20770) Improve ColumnStats

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20770: Assignee: (was: Apache Spark) > Improve ColumnStats > --- > >

[jira] [Commented] (SPARK-9215) Implement WAL-free Kinesis receiver that give at-least once guarantee

2017-05-16 Thread Richard Moorhead (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012591#comment-16012591 ] Richard Moorhead commented on SPARK-9215: - WAL is not necessary for fault tolerant Kinesis

[jira] [Comment Edited] (SPARK-12297) Add work-around for Parquet/Hive int96 timestamp bug.

2017-05-16 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012542#comment-16012542 ] Zoltan Ivanfi edited comment on SPARK-12297 at 5/16/17 3:16 PM: What I

[jira] [Comment Edited] (SPARK-12297) Add work-around for Parquet/Hive int96 timestamp bug.

2017-05-16 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012542#comment-16012542 ] Zoltan Ivanfi edited comment on SPARK-12297 at 5/16/17 3:11 PM: What I

[jira] [Commented] (SPARK-12297) Add work-around for Parquet/Hive int96 timestamp bug.

2017-05-16 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012542#comment-16012542 ] Zoltan Ivanfi commented on SPARK-12297: --- What I meant is that if a CSV file ("STORED AS TEXTFILE"

[jira] [Created] (SPARK-20770) Improve ColumnStats

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20770: Summary: Improve ColumnStats Key: SPARK-20770 URL: https://issues.apache.org/jira/browse/SPARK-20770 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-20769) Incorrect documentation for using Jupyter notebook

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20769: Assignee: Apache Spark > Incorrect documentation for using Jupyter notebook >

[jira] [Assigned] (SPARK-20769) Incorrect documentation for using Jupyter notebook

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20769: Assignee: (was: Apache Spark) > Incorrect documentation for using Jupyter notebook >

[jira] [Commented] (SPARK-20769) Incorrect documentation for using Jupyter notebook

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012527#comment-16012527 ] Apache Spark commented on SPARK-20769: -- User 'aray' has created a pull request for this issue:

[jira] [Updated] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-14098: - Description:

[jira] [Created] (SPARK-20769) Incorrect documentation for using Jupyter notebook

2017-05-16 Thread Andrew Ray (JIRA)
Andrew Ray created SPARK-20769: -- Summary: Incorrect documentation for using Jupyter notebook Key: SPARK-20769 URL: https://issues.apache.org/jira/browse/SPARK-20769 Project: Spark Issue Type:

[jira] [Updated] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-14098: - Issue Type: Umbrella (was: Improvement) > Generate Java code to build

[jira] [Commented] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012488#comment-16012488 ] Kazuaki Ishizaki commented on SPARK-14098: -- [~lins05] Sorry, I overlooked this message. I synced

[jira] [Updated] (SPARK-14098) Generate Java code to build CachedColumnarBatch and get values from CachedColumnarBatch when DataFrame.cache() is called

2017-05-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-14098: - Summary: Generate Java code to build CachedColumnarBatch and get values from

[jira] [Commented] (SPARK-20748) Built-in SQL Function Support - CH[A]R

2017-05-16 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012487#comment-16012487 ] Yuming Wang commented on SPARK-20748: - I am working on this > Built-in SQL Function Support - CH[A]R

[jira] [Commented] (SPARK-20765) Cannot load persisted PySpark ML Pipeline that includes 3rd party stage (Transformer or Estimator) if the package name of stage is not "org.apache.spark" and "pyspark"

2017-05-16 Thread APeng Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012465#comment-16012465 ] APeng Zhang commented on SPARK-20765: - Yes, the class is on the classpath. The problem is the current

[jira] [Commented] (SPARK-18359) Let user specify locale in CSV parsing

2017-05-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012455#comment-16012455 ] Sean Owen commented on SPARK-18359: --- Using the JVM locale is a bad way to get this behavior, because

[jira] [Commented] (SPARK-20740) Expose UserDefinedType make sure could extends it

2017-05-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012453#comment-16012453 ] Hyukjin Kwon commented on SPARK-20740: -- ping [~darion]. If we can't explain the use case, I would

[jira] [Resolved] (SPARK-20761) Union uses column order rather than schema

2017-05-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-20761. -- Resolution: Duplicate I am pretty sure that it is a duplicate of SPARK-15918. Please reopen

[jira] [Commented] (SPARK-20765) Cannot load persisted PySpark ML Pipeline that includes 3rd party stage (Transformer or Estimator) if the package name of stage is not "org.apache.spark" and "pyspark"

2017-05-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012439#comment-16012439 ] Sean Owen commented on SPARK-20765: --- Yes, but doesn't this code lead com.abc.xyz as com.abc.xyz as

[jira] [Commented] (SPARK-20364) Parquet predicate pushdown on columns with dots return empty results

2017-05-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012422#comment-16012422 ] Apache Spark commented on SPARK-20364: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Commented] (SPARK-20765) Cannot load persisted PySpark ML Pipeline that includes 3rd party stage (Transformer or Estimator) if the package name of stage is not "org.apache.spark" and "pyspark"

2017-05-16 Thread APeng Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012405#comment-16012405 ] APeng Zhang commented on SPARK-20765: - PySpark will get the Python class name from Scala class name

[jira] [Commented] (SPARK-20555) Incorrect handling of Oracle's decimal types via JDBC

2017-05-16 Thread Gabor Feher (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012394#comment-16012394 ] Gabor Feher commented on SPARK-20555: - Hi, Maybe it was not clear from the title, but this issue is

[jira] [Commented] (SPARK-18359) Let user specify locale in CSV parsing

2017-05-16 Thread Alexander Enns (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012377#comment-16012377 ] Alexander Enns commented on SPARK-18359: This is exactly why there is a possibility to set the

[jira] [Comment Edited] (SPARK-20712) [SQL] Spark can't read Hive table when column type has length greater than 4000 bytes

2017-05-16 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012277#comment-16012277 ] Maciej Bryński edited comment on SPARK-20712 at 5/16/17 1:00 PM: - CC:

[jira] [Commented] (SPARK-16731) use StructType in CatalogTable and remove CatalogColumn

2017-05-16 Thread Maciej Bryński (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012303#comment-16012303 ] Maciej Bryński commented on SPARK-16731: [~cloud_fan] Is your PR connected with this problem
