[jira] [Created] (SPARK-23054) Incorrect results of casting UserDefinedType to String

2018-01-11 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-23054: Summary: Incorrect results of casting UserDefinedType to String Key: SPARK-23054 URL: https://issues.apache.org/jira/browse/SPARK-23054 Project: Spark

[jira] [Commented] (SPARK-23021) AnalysisBarrier should not cut off the explain output for Parsed Logical Plan

2018-01-11 Thread Kris Mok (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323672#comment-16323672 ] Kris Mok commented on SPARK-23021: -- Hi [~maropu]-san, Thanks! I'm not familiar with this part of the

[jira] [Resolved] (SPARK-22986) Avoid instantiating multiple instances of broadcast variables

2018-01-11 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-22986. - Resolution: Fixed Assignee: ho3rexqj Fix Version/s: 2.3.0 > Avoid instantiating

[jira] [Commented] (SPARK-23027) optimizer a simple query using a non-existent data is too slow

2018-01-11 Thread wangminfeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323621#comment-16323621 ] wangminfeng commented on SPARK-23027: - 2.1 performance nice, perfect。 > optimizer a simple query

[jira] [Closed] (SPARK-23027) optimizer a simple query using a non-existent data is too slow

2018-01-11 Thread wangminfeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangminfeng closed SPARK-23027. --- > optimizer a simple query using a non-existent data is too slow >

[jira] [Commented] (SPARK-23021) AnalysisBarrier should not cut off the explain output for Parsed Logical Plan

2018-01-11 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323623#comment-16323623 ] Takeshi Yamamuro commented on SPARK-23021: -- yea, sure, I have bandwidth to do :)) Is the fix

[jira] [Commented] (SPARK-21475) Change to use NIO's Files API for external shuffle service

2018-01-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323617#comment-16323617 ] Shixiong Zhu commented on SPARK-21475: -- Fixed > Change to use NIO's Files API for external shuffle

[jira] [Updated] (SPARK-21475) Change to use NIO's Files API for external shuffle service

2018-01-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-21475: - Fix Version/s: (was: 3.0.0) > Change to use NIO's Files API for external shuffle service >

[jira] [Commented] (SPARK-23021) AnalysisBarrier should not cut off the explain output for Parsed Logical Plan

2018-01-11 Thread Kris Mok (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323612#comment-16323612 ] Kris Mok commented on SPARK-23021: -- Hi [~maropu]-san, Thanks for looking at it! No, I only noticed this

[jira] [Commented] (SPARK-22995) Spark UI stdout/stderr links point to executors internal address

2018-01-11 Thread guoxiaolongzte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323607#comment-16323607 ] guoxiaolongzte commented on SPARK-22995: Please close this jira, thank you. > Spark UI

[jira] [Commented] (SPARK-23021) AnalysisBarrier should not cut off the explain output for Parsed Logical Plan

2018-01-11 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323599#comment-16323599 ] Takeshi Yamamuro commented on SPARK-23021: -- hi, kris, you're working on it? I think we just

[jira] [Commented] (SPARK-21475) Change to use NIO's Files API for external shuffle service

2018-01-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323590#comment-16323590 ] Saisai Shao commented on SPARK-21475: - [~zsxwing] is 3.0.0 the valid fix version? > Change to use

[jira] [Commented] (SPARK-22958) Spark is stuck when the only one executor fails to register with driver

2018-01-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323569#comment-16323569 ] Saisai Shao commented on SPARK-22958: - If executor is failed to register itself to driver, it will

[jira] [Commented] (SPARK-21213) Support collecting partition-level statistics: rowCount and sizeInBytes

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323547#comment-16323547 ] Apache Spark commented on SPARK-21213: -- User 'maropu' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23053) taskBinarySerialization and task partitions calculate in DagScheduler.submitMissingTasks should keep the same RDD checkpoint status

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23053: Assignee: (was: Apache Spark) > taskBinarySerialization and task partitions calculate

[jira] [Commented] (SPARK-23053) taskBinarySerialization and task partitions calculate in DagScheduler.submitMissingTasks should keep the same RDD checkpoint status

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323541#comment-16323541 ] Apache Spark commented on SPARK-23053: -- User 'ivoson' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23053) taskBinarySerialization and task partitions calculate in DagScheduler.submitMissingTasks should keep the same RDD checkpoint status

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23053: Assignee: Apache Spark > taskBinarySerialization and task partitions calculate in >

[jira] [Created] (SPARK-23053) taskBinarySerialization and task partitions calculate in DagScheduler.submitMissingTasks should keep the same RDD checkpoint status

2018-01-11 Thread huangtengfei (JIRA)
huangtengfei created SPARK-23053: Summary: taskBinarySerialization and task partitions calculate in DagScheduler.submitMissingTasks should keep the same RDD checkpoint status Key: SPARK-23053 URL:

[jira] [Resolved] (SPARK-23029) Setting spark.shuffle.file.buffer will make the shuffle fail

2018-01-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saisai Shao resolved SPARK-23029. - Resolution: Not A Problem > Setting spark.shuffle.file.buffer will make the shuffle fail >

[jira] [Commented] (SPARK-23029) Setting spark.shuffle.file.buffer will make the shuffle fail

2018-01-11 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323496#comment-16323496 ] Saisai Shao commented on SPARK-23029: - Please see the comments in the code in

[jira] [Assigned] (SPARK-23052) Migrate ConsoleSink to v2

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23052: Assignee: (was: Apache Spark) > Migrate ConsoleSink to v2 > -

[jira] [Commented] (SPARK-23052) Migrate ConsoleSink to v2

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323492#comment-16323492 ] Apache Spark commented on SPARK-23052: -- User 'jose-torres' has created a pull request for this

[jira] [Assigned] (SPARK-23052) Migrate ConsoleSink to v2

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23052: Assignee: Apache Spark > Migrate ConsoleSink to v2 > - > >

[jira] [Created] (SPARK-23052) Migrate ConsoleSink to v2

2018-01-11 Thread Jose Torres (JIRA)
Jose Torres created SPARK-23052: --- Summary: Migrate ConsoleSink to v2 Key: SPARK-23052 URL: https://issues.apache.org/jira/browse/SPARK-23052 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-23051) job description in Spark UI is broken

2018-01-11 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323368#comment-16323368 ] Wenchen Fan commented on SPARK-23051: - cc [~vanzin] > job description in Spark UI is broken >

[jira] [Commented] (SPARK-23051) job description in Spark UI is broken

2018-01-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323360#comment-16323360 ] Shixiong Zhu commented on SPARK-23051: -- I marked this as a blocker since it's a regression. > job

[jira] [Updated] (SPARK-23051) job description in Spark UI is broken

2018-01-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-23051: - Labels: regression (was: ) > job description in Spark UI is broken >

[jira] [Created] (SPARK-23051) job description in Spark UI is broken

2018-01-11 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-23051: Summary: job description in Spark UI is broken Key: SPARK-23051 URL: https://issues.apache.org/jira/browse/SPARK-23051 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323329#comment-16323329 ] Sean Owen commented on SPARK-23050: --- Also have you read Steve's documentation on how S3 works with

[jira] [Comment Edited] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323320#comment-16323320 ] Shixiong Zhu edited comment on SPARK-23050 at 1/12/18 12:56 AM: How do

[jira] [Commented] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323320#comment-16323320 ] Shixiong Zhu commented on SPARK-23050: -- How do you read the output? If you use Spark to read the

[jira] [Commented] (SPARK-23008) OnehotEncoderEstimator python API

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323297#comment-16323297 ] Apache Spark commented on SPARK-23008: -- User 'WeichenXu123' has created a pull request for this

[jira] [Resolved] (SPARK-23008) OnehotEncoderEstimator python API

2018-01-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-23008. --- Resolution: Fixed Fix Version/s: 2.3.0 Resolved with

[jira] [Commented] (SPARK-23008) OnehotEncoderEstimator python API

2018-01-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323275#comment-16323275 ] Joseph K. Bradley commented on SPARK-23008: --- I'm going to backport this to branch-2.3 as a

[jira] [Assigned] (SPARK-23008) OnehotEncoderEstimator python API

2018-01-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-23008: - Assignee: Weichen Xu > OnehotEncoderEstimator python API >

[jira] [Commented] (SPARK-17888) Memory leak in streaming driver when use SparkSQL in Streaming

2018-01-11 Thread William Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323260#comment-16323260 ] William Shen commented on SPARK-17888: -- [~FireThief], did you ever find out what was wrong? We are

[jira] [Updated] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-11 Thread Yash Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yash Sharma updated SPARK-23050: Description: Spark Structured streaming with S3 file source duplicates data because of eventual

[jira] [Created] (SPARK-23050) Structured Streaming with S3 file source duplicates data because of eventual consistency.

2018-01-11 Thread Yash Sharma (JIRA)
Yash Sharma created SPARK-23050: --- Summary: Structured Streaming with S3 file source duplicates data because of eventual consistency. Key: SPARK-23050 URL: https://issues.apache.org/jira/browse/SPARK-23050

[jira] [Commented] (SPARK-23049) `spark.sql.files.ignoreCorruptFiles` should work for ORC files

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323143#comment-16323143 ] Apache Spark commented on SPARK-23049: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-23049) `spark.sql.files.ignoreCorruptFiles` should work for ORC files

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23049: Assignee: Apache Spark > `spark.sql.files.ignoreCorruptFiles` should work for ORC files >

[jira] [Assigned] (SPARK-23049) `spark.sql.files.ignoreCorruptFiles` should work for ORC files

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23049: Assignee: (was: Apache Spark) > `spark.sql.files.ignoreCorruptFiles` should work for

[jira] [Created] (SPARK-23049) `spark.sql.files.ignoreCorruptFiles` should work for ORC files

2018-01-11 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-23049: - Summary: `spark.sql.files.ignoreCorruptFiles` should work for ORC files Key: SPARK-23049 URL: https://issues.apache.org/jira/browse/SPARK-23049 Project: Spark

[jira] [Updated] (SPARK-23046) Have RFormula include VectorSizeHint in pipeline

2018-01-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-23046: -- Target Version/s: 2.3.0, 2.4.0 > Have RFormula include VectorSizeHint in pipeline >

[jira] [Resolved] (SPARK-23046) Have RFormula include VectorSizeHint in pipeline

2018-01-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-23046. --- Resolution: Fixed Fix Version/s: 2.4.0 2.3.0 Resolved by

[jira] [Assigned] (SPARK-23046) Have RFormula include VectorSizeHint in pipeline

2018-01-11 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-23046: - Assignee: Bago Amirbekian > Have RFormula include VectorSizeHint in pipeline >

[jira] [Created] (SPARK-23048) Update mllib docs to replace OneHotEncoder with OneHotEncoderEstimator

2018-01-11 Thread Bago Amirbekian (JIRA)
Bago Amirbekian created SPARK-23048: --- Summary: Update mllib docs to replace OneHotEncoder with OneHotEncoderEstimator Key: SPARK-23048 URL: https://issues.apache.org/jira/browse/SPARK-23048

[jira] [Updated] (SPARK-22974) CountVectorModel does not attach attributes to output column

2018-01-11 Thread William Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] William Zhang updated SPARK-22974: -- Description: If CountVectorModel transforms columns, the output column will not have

[jira] [Assigned] (SPARK-23047) Change MapVector to NullableMapVector in ArrowColumnVector

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23047: Assignee: Apache Spark > Change MapVector to NullableMapVector in ArrowColumnVector >

[jira] [Assigned] (SPARK-23047) Change MapVector to NullableMapVector in ArrowColumnVector

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23047: Assignee: (was: Apache Spark) > Change MapVector to NullableMapVector in

[jira] [Commented] (SPARK-23047) Change MapVector to NullableMapVector in ArrowColumnVector

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322916#comment-16322916 ] Apache Spark commented on SPARK-23047: -- User 'icexelloss' has created a pull request for this issue:

[jira] [Created] (SPARK-23047) Change MapVector to NullableMapVector in ArrowColumnVector

2018-01-11 Thread Li Jin (JIRA)
Li Jin created SPARK-23047: -- Summary: Change MapVector to NullableMapVector in ArrowColumnVector Key: SPARK-23047 URL: https://issues.apache.org/jira/browse/SPARK-23047 Project: Spark Issue Type:

[jira] [Comment Edited] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-01-11 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322847#comment-16322847 ] Li Jin edited comment on SPARK-22947 at 1/11/18 7:48 PM: - I have looked at

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-01-11 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322847#comment-16322847 ] Li Jin commented on SPARK-22947: I have looked at SPARK-8682 and tried to figure out general way to

[jira] [Commented] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine

2018-01-11 Thread Yuval Degani (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322747#comment-16322747 ] Yuval Degani commented on SPARK-22229: -- [~byronyi], thank you for kicking off the discussion again.

[jira] [Resolved] (SPARK-22908) add basic continuous kafka source

2018-01-11 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-22908. --- Resolution: Fixed Fix Version/s: 2.3.0 3.0.0 Issue resolved by

[jira] [Assigned] (SPARK-22908) add basic continuous kafka source

2018-01-11 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das reassigned SPARK-22908: - Assignee: Jose Torres > add basic continuous kafka source >

[jira] [Resolved] (SPARK-22994) Require a single container image for Spark-on-K8S

2018-01-11 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-22994. Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20192

[jira] [Assigned] (SPARK-22994) Require a single container image for Spark-on-K8S

2018-01-11 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-22994: -- Assignee: Marcelo Vanzin > Require a single container image for Spark-on-K8S >

[jira] [Assigned] (SPARK-23046) Have RFormula include VectorSizeHint in pipeline

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23046: Assignee: (was: Apache Spark) > Have RFormula include VectorSizeHint in pipeline >

[jira] [Assigned] (SPARK-23046) Have RFormula include VectorSizeHint in pipeline

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23046: Assignee: Apache Spark > Have RFormula include VectorSizeHint in pipeline >

[jira] [Commented] (SPARK-23046) Have RFormula include VectorSizeHint in pipeline

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322605#comment-16322605 ] Apache Spark commented on SPARK-23046: -- User 'MrBago' has created a pull request for this issue:

[jira] [Created] (SPARK-23046) Have RFormula include VectorSizeHint in pipeline

2018-01-11 Thread Bago Amirbekian (JIRA)
Bago Amirbekian created SPARK-23046: --- Summary: Have RFormula include VectorSizeHint in pipeline Key: SPARK-23046 URL: https://issues.apache.org/jira/browse/SPARK-23046 Project: Spark Issue

[jira] [Assigned] (SPARK-23045) Have RFormula use OneHoEncoderEstimator

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23045: Assignee: (was: Apache Spark) > Have RFormula use OneHoEncoderEstimator >

[jira] [Commented] (SPARK-23045) Have RFormula use OneHoEncoderEstimator

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322574#comment-16322574 ] Apache Spark commented on SPARK-23045: -- User 'MrBago' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23045) Have RFormula use OneHoEncoderEstimator

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23045: Assignee: Apache Spark > Have RFormula use OneHoEncoderEstimator >

[jira] [Updated] (SPARK-23045) Have RFormula use OneHoEncoderEstimator

2018-01-11 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-23045: Summary: Have RFormula use OneHoEncoderEstimator (was: Have RFormula use OneHotEstimator)

[jira] [Updated] (SPARK-23037) RFormula should not use deprecated OneHotEncoder and should include VectorSizeHint in pipeline

2018-01-11 Thread Bago Amirbekian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bago Amirbekian updated SPARK-23037: Affects Version/s: (was: 2.2.0) 2.3.0 > RFormula should not use

[jira] [Created] (SPARK-23045) Have RFormula use OneHotEstimator

2018-01-11 Thread Bago Amirbekian (JIRA)
Bago Amirbekian created SPARK-23045: --- Summary: Have RFormula use OneHotEstimator Key: SPARK-23045 URL: https://issues.apache.org/jira/browse/SPARK-23045 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-22577) executor page blacklist status should update with TaskSet level blacklisting

2018-01-11 Thread Attila Zsolt Piros (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Zsolt Piros updated SPARK-22577: --- Attachment: app_blacklisting.png > executor page blacklist status should update with

[jira] [Resolved] (SPARK-22517) NullPointerException in ShuffleExternalSorter.spill()

2018-01-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-22517. --- Resolution: Duplicate > NullPointerException in ShuffleExternalSorter.spill() >

[jira] [Resolved] (SPARK-23027) optimizer a simple query using a non-existent data is too slow

2018-01-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23027. --- Resolution: Invalid > optimizer a simple query using a non-existent data is too slow >

[jira] [Assigned] (SPARK-22980) Using pandas_udf when inputs are not Pandas's Series or DataFrame

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22980: Assignee: Apache Spark > Using pandas_udf when inputs are not Pandas's Series or

[jira] [Assigned] (SPARK-22980) Using pandas_udf when inputs are not Pandas's Series or DataFrame

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22980: Assignee: (was: Apache Spark) > Using pandas_udf when inputs are not Pandas's Series

[jira] [Commented] (SPARK-22980) Using pandas_udf when inputs are not Pandas's Series or DataFrame

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322435#comment-16322435 ] Apache Spark commented on SPARK-22980: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-23044) merge script has bug when assigning jiras to non-contributors

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23044: Assignee: Apache Spark > merge script has bug when assigning jiras to non-contributors >

[jira] [Assigned] (SPARK-23044) merge script has bug when assigning jiras to non-contributors

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23044: Assignee: (was: Apache Spark) > merge script has bug when assigning jiras to

[jira] [Commented] (SPARK-23044) merge script has bug when assigning jiras to non-contributors

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322433#comment-16322433 ] Apache Spark commented on SPARK-23044: -- User 'squito' has created a pull request for this issue:

[jira] [Commented] (SPARK-22921) Merge script should prompt for assigning jiras

2018-01-11 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322414#comment-16322414 ] Imran Rashid commented on SPARK-22921: -- doh, sorry -- will add some error handling, filed

[jira] [Created] (SPARK-23044) merge script has bug when assigning jiras to non-contributors

2018-01-11 Thread Imran Rashid (JIRA)
Imran Rashid created SPARK-23044: Summary: merge script has bug when assigning jiras to non-contributors Key: SPARK-23044 URL: https://issues.apache.org/jira/browse/SPARK-23044 Project: Spark

[jira] [Assigned] (SPARK-22887) ML test for StructuredStreaming: spark.ml.fpm

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22887: Assignee: Apache Spark > ML test for StructuredStreaming: spark.ml.fpm >

[jira] [Commented] (SPARK-22887) ML test for StructuredStreaming: spark.ml.fpm

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322208#comment-16322208 ] Apache Spark commented on SPARK-22887: -- User 'smurakozi' has created a pull request for this issue:

[jira] [Assigned] (SPARK-22887) ML test for StructuredStreaming: spark.ml.fpm

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22887: Assignee: (was: Apache Spark) > ML test for StructuredStreaming: spark.ml.fpm >

[jira] [Resolved] (SPARK-22967) VersionSuite failed on Windows caused by Windows format path

2018-01-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-22967. -- Resolution: Fixed Assignee: wuyi Fix Version/s: 2.3.0 Fixed in

[jira] [Commented] (SPARK-19732) DataFrame.fillna() does not work for bools in PySpark

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322139#comment-16322139 ] Apache Spark commented on SPARK-19732: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Commented] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine

2018-01-11 Thread Bairen Yi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322129#comment-16322129 ] Bairen Yi commented on SPARK-22229: --- [~srowen] It is very clear to me that no GPL code is introduced in

[jira] [Commented] (SPARK-23043) Upgrade json4s-jackson to 3.5.3

2018-01-11 Thread Takako Shimamoto (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322102#comment-16322102 ] Takako Shimamoto commented on SPARK-23043: -- I verified that the code still compiles, passes

[jira] [Assigned] (SPARK-20657) Speed up Stage page

2018-01-11 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-20657: --- Assignee: Marcelo Vanzin > Speed up Stage page > --- > >

[jira] [Resolved] (SPARK-20657) Speed up Stage page

2018-01-11 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20657. - Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 20013

[jira] [Commented] (SPARK-23043) Upgrade json4s-jackson to 3.5.3

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322077#comment-16322077 ] Apache Spark commented on SPARK-23043: -- User 'shimamoto' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23043) Upgrade json4s-jackson to 3.5.3

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23043: Assignee: (was: Apache Spark) > Upgrade json4s-jackson to 3.5.3 >

[jira] [Assigned] (SPARK-23043) Upgrade json4s-jackson to 3.5.3

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23043: Assignee: Apache Spark > Upgrade json4s-jackson to 3.5.3 >

[jira] [Commented] (SPARK-23043) Upgrade json4s-jackson to 3.5.3

2018-01-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322059#comment-16322059 ] Sean Owen commented on SPARK-23043: --- This is good info, but as you say, there are problems in updating

[jira] [Commented] (SPARK-21179) Unable to return Hive INT data type into Spark via Hive JDBC driver: Caused by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.

2018-01-11 Thread Matthew Walton (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322058#comment-16322058 ] Matthew Walton commented on SPARK-21179: I'll check with Simba to see if they ever actually

[jira] [Commented] (SPARK-21179) Unable to return Hive INT data type into Spark via Hive JDBC driver: Caused by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.

2018-01-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322057#comment-16322057 ] Sean Owen commented on SPARK-21179: --- This is good info, but I'm not clear if the suggestion is to add

[jira] [Created] (SPARK-23043) Upgrade json4s-jackson to 3.5.3

2018-01-11 Thread Takako Shimamoto (JIRA)
Takako Shimamoto created SPARK-23043: Summary: Upgrade json4s-jackson to 3.5.3 Key: SPARK-23043 URL: https://issues.apache.org/jira/browse/SPARK-23043 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22229) SPIP: RDMA Accelerated Shuffle Engine

2018-01-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322055#comment-16322055 ] Sean Owen commented on SPARK-22229: --- [~byronyi] what PR? There appears to be an unresolved question

[jira] [Commented] (SPARK-23042) Use OneHotEncoderModel to encode labels in MultilayerPerceptronClassifier

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16322033#comment-16322033 ] Apache Spark commented on SPARK-23042: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-23042) Use OneHotEncoderModel to encode labels in MultilayerPerceptronClassifier

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23042: Assignee: Apache Spark > Use OneHotEncoderModel to encode labels in

[jira] [Assigned] (SPARK-23042) Use OneHotEncoderModel to encode labels in MultilayerPerceptronClassifier

2018-01-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23042: Assignee: (was: Apache Spark) > Use OneHotEncoderModel to encode labels in

[jira] [Created] (SPARK-23042) Use OneHotEncoderModel to encode labels in MultilayerPerceptronClassifier

2018-01-11 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-23042: --- Summary: Use OneHotEncoderModel to encode labels in MultilayerPerceptronClassifier Key: SPARK-23042 URL: https://issues.apache.org/jira/browse/SPARK-23042
