[jira] [Resolved] (SPARK-24741) Have a built-in AVRO data source implementation

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24741. - Resolution: Fixed Assignee: Gengliang Wang > Have a built-in AVRO data source implementation >

[jira] [Resolved] (SPARK-23007) Add schema evolution test suite for file-based data sources

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23007. - Resolution: Fixed Assignee: Dongjoon Hyun Fix Version/s: 2.4.0 > Add schema evolution

[jira] [Commented] (SPARK-24793) Make spark-submit more useful with k8s

2018-07-12 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542102#comment-16542102 ] Erik Erlandson commented on SPARK-24793: Also a good point that {{--kill}} and {{--status}} are

[jira] [Comment Edited] (SPARK-24793) Make spark-submit more useful with k8s

2018-07-12 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542102#comment-16542102 ] Erik Erlandson edited comment on SPARK-24793 at 7/12/18 7:01 PM: - Also a

[jira] [Updated] (SPARK-24724) Discuss necessary info and access in barrier mode + Kubernetes

2018-07-12 Thread Yinan Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yinan Li updated SPARK-24724: - Component/s: Kubernetes > Discuss necessary info and access in barrier mode + Kubernetes >

[jira] [Comment Edited] (SPARK-24793) Make spark-submit more useful with k8s

2018-07-12 Thread Yinan Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542109#comment-16542109 ] Yinan Li edited comment on SPARK-24793 at 7/12/18 7:11 PM: --- Oh, yeah, {{kill}} 

[jira] [Commented] (SPARK-24793) Make spark-submit more useful with k8s

2018-07-12 Thread Yinan Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542109#comment-16542109 ] Yinan Li commented on SPARK-24793: -- Oh, yeah, {{kill}} and {{status}} are existing options of

[jira] [Assigned] (SPARK-24610) wholeTextFiles broken for small files

2018-07-12 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-24610: - Assignee: Dhruve Ashar > wholeTextFiles broken for small files >

[jira] [Resolved] (SPARK-24610) wholeTextFiles broken for small files

2018-07-12 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-24610. --- Resolution: Fixed Fix Version/s: 2.4.0 > wholeTextFiles broken for small files >

[jira] [Commented] (SPARK-24793) Make spark-submit more useful with k8s

2018-07-12 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542135#comment-16542135 ] Anirudh Ramanathan commented on SPARK-24793: Great! I'll take a stab at a PR in a few days.

[jira] [Commented] (SPARK-24741) Have a built-in AVRO data source implementation

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542207#comment-16542207 ] Xiao Li commented on SPARK-24741: - [~mn-mikke] So far, we do not have a plan to improve the

[jira] [Created] (SPARK-24796) Support GROUPED_AGG_PANDAS_UDF in Pivot

2018-07-12 Thread Xiao Li (JIRA)
Xiao Li created SPARK-24796: --- Summary: Support GROUPED_AGG_PANDAS_UDF in Pivot Key: SPARK-24796 URL: https://issues.apache.org/jira/browse/SPARK-24796 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-24797) Analyzer should respect spark.sql.hive.convertMetastoreOrc/Parquet when build the data source table

2018-07-12 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-24797: --- Summary: Analyzer should respect spark.sql.hive.convertMetastoreOrc/Parquet when build the data source table Key: SPARK-24797 URL: https://issues.apache.org/jira/browse/SPARK-24797

[jira] [Assigned] (SPARK-24537) Add array_remove / array_zip / map_from_arrays / array_distinct

2018-07-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-24537: Assignee: Huaxin Gao > Add array_remove / array_zip / map_from_arrays / array_distinct >

[jira] [Resolved] (SPARK-24537) Add array_remove / array_zip / map_from_arrays / array_distinct

2018-07-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24537. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21645

[jira] [Assigned] (SPARK-24797) Analyzer should respect spark.sql.hive.convertMetastoreOrc/Parquet when build the data source table

2018-07-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24797: Assignee: Apache Spark > Analyzer should respect

[jira] [Assigned] (SPARK-24797) Analyzer should respect spark.sql.hive.convertMetastoreOrc/Parquet when build the data source table

2018-07-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24797: Assignee: (was: Apache Spark) > Analyzer should respect

[jira] [Commented] (SPARK-24797) Analyzer should respect spark.sql.hive.convertMetastoreOrc/Parquet when build the data source table

2018-07-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542437#comment-16542437 ] Apache Spark commented on SPARK-24797: -- User 'CodingCat' has created a pull request for this issue:

[jira] [Commented] (SPARK-24796) Support GROUPED_AGG_PANDAS_UDF in Pivot

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542338#comment-16542338 ] Xiao Li commented on SPARK-24796: - cc [~icexelloss] [~bryanc] [~hyukjin.kwon] > Support

[jira] [Resolved] (SPARK-24790) Allow complex aggregate expressions in Pivot

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-24790. - Resolution: Fixed Assignee: Maryann Xue Fix Version/s: 2.4.0 > Allow complex aggregate

[jira] [Assigned] (SPARK-24795) Implement barrier execution mode

2018-07-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24795: Assignee: Apache Spark > Implement barrier execution mode >

[jira] [Commented] (SPARK-24795) Implement barrier execution mode

2018-07-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542538#comment-16542538 ] Apache Spark commented on SPARK-24795: -- User 'jiangxb1987' has created a pull request for this

[jira] [Assigned] (SPARK-24795) Implement barrier execution mode

2018-07-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24795: Assignee: (was: Apache Spark) > Implement barrier execution mode >

[jira] [Updated] (SPARK-24730) Add policy to choose max as global watermark when streaming query has multiple watermarks

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24730: Fix Version/s: (was: 3.0.0) 2.4.0 > Add policy to choose max as global watermark

[jira] [Updated] (SPARK-23033) disable task-level retry for continuous execution

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23033: Fix Version/s: (was: 3.0.0) > disable task-level retry for continuous execution >

[jira] [Updated] (SPARK-22839) Refactor Kubernetes code for configuring driver/executor pods to use consistent and cleaner abstraction

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-22839: Fix Version/s: (was: 3.0.0) 2.4.0 > Refactor Kubernetes code for configuring

[jira] [Updated] (SPARK-22908) add basic continuous kafka source

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-22908: Fix Version/s: (was: 3.0.0) > add basic continuous kafka source > - >

[jira] [Updated] (SPARK-23144) Add console sink for continuous queries

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23144: Fix Version/s: (was: 3.0.0) > Add console sink for continuous queries >

[jira] [Updated] (SPARK-21925) Update trigger interval documentation in docs with behavior change in Spark 2.2

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21925: Fix Version/s: (was: 3.0.0) 2.3.2 > Update trigger interval documentation in docs

[jira] [Updated] (SPARK-22238) EnsureStatefulOpPartitioning shouldn't ask for the child RDD before planning is completed

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-22238: Fix Version/s: (was: 3.0.0) 2.4.0 > EnsureStatefulOpPartitioning shouldn't ask for

[jira] [Updated] (SPARK-23004) Structured Streaming raise "IllegalStateException: Cannot remove after already committed or aborted"

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23004: Fix Version/s: (was: 3.0.0) 2.4.0 > Structured Streaming raise

[jira] [Updated] (SPARK-23143) Add Python support for continuous trigger

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23143: Fix Version/s: (was: 3.0.0) > Add Python support for continuous trigger >

[jira] [Updated] (SPARK-23099) Migrate foreach sink

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23099: Fix Version/s: (was: 3.0.0) > Migrate foreach sink > > > Key:

[jira] [Updated] (SPARK-23096) Migrate rate source to v2

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23096: Fix Version/s: (was: 3.0.0) 2.4.0 > Migrate rate source to v2 >

[jira] [Updated] (SPARK-23454) Add Trigger information to the Structured Streaming programming guide

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23454: Fix Version/s: (was: 3.0.0) 2.4.0 > Add Trigger information to the Structured

[jira] [Updated] (SPARK-23362) Migrate Kafka microbatch source to v2

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23362: Fix Version/s: (was: 3.0.0) 2.4.0 > Migrate Kafka microbatch source to v2 >

[jira] [Updated] (SPARK-22884) ML test for StructuredStreaming: spark.ml.clustering

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-22884: Fix Version/s: (was: 3.0.0) 2.4.0 > ML test for StructuredStreaming:

[jira] [Updated] (SPARK-22018) Catalyst Optimizer does not preserve top-level metadata while collapsing projects

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-22018: Fix Version/s: (was: 3.0.0) 2.4.0 > Catalyst Optimizer does not preserve top-level

[jira] [Updated] (SPARK-22017) watermark evaluation with multi-input stream operators is unspecified

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-22017: Fix Version/s: (was: 3.0.0) 2.4.0 > watermark evaluation with multi-input stream

[jira] [Updated] (SPARK-23097) Migrate text socket source to v2

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23097: Fix Version/s: (was: 3.0.0) 2.4.0 > Migrate text socket source to v2 >

[jira] [Updated] (SPARK-23092) Migrate MemoryStream to DataSource V2

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23092: Fix Version/s: (was: 3.0.0) 2.4.0 > Migrate MemoryStream to DataSource V2 >

[jira] [Updated] (SPARK-23142) Add documentation for Continuous Processing

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23142: Fix Version/s: (was: 3.0.0) > Add documentation for Continuous Processing >

[jira] [Updated] (SPARK-23052) Migrate Microbatch ConsoleSink to v2

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23052: Fix Version/s: (was: 3.0.0) > Migrate Microbatch ConsoleSink to v2 >

[jira] [Assigned] (SPARK-24615) Accelerator aware task scheduling for Spark

2018-07-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reassigned SPARK-24615: - Assignee: Saisai Shao > Accelerator aware task scheduling for Spark >

[jira] [Updated] (SPARK-23503) continuous execution should sequence committed epochs

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23503: Fix Version/s: (was: 3.0.0) 2.4.0 > continuous execution should sequence committed

[jira] [Updated] (SPARK-23827) StreamingJoinExec should ensure that input data is partitioned into specific number of partitions

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23827: Fix Version/s: (was: 3.0.0) > StreamingJoinExec should ensure that input data is partitioned into

[jira] [Updated] (SPARK-23747) Add EpochCoordinator unit tests

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23747: Fix Version/s: (was: 3.0.0) 2.4.0 > Add EpochCoordinator unit tests >

[jira] [Updated] (SPARK-23484) Fix possible race condition in KafkaContinuousReader

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23484: Fix Version/s: (was: 3.0.0) 2.4.0 > Fix possible race condition in

[jira] [Updated] (SPARK-24039) remove restarting iterators hack

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24039: Fix Version/s: (was: 3.0.0) 2.4.0 > remove restarting iterators hack >

[jira] [Updated] (SPARK-23559) add epoch ID to data writer factory

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23559: Fix Version/s: (was: 3.0.0) 2.4.0 > add epoch ID to data writer factory >

[jira] [Updated] (SPARK-23491) continuous symptom

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23491: Fix Version/s: (was: 3.0.0) 2.4.0 > continuous symptom > -- > >

[jira] [Updated] (SPARK-23748) Support select from temp tables

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23748: Fix Version/s: (was: 3.0.0) 2.4.0 > Support select from temp tables >

[jira] [Updated] (SPARK-23408) Flaky test: StreamingOuterJoinSuite.left outer early state exclusion on right

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23408: Fix Version/s: (was: 3.0.0) > Flaky test: StreamingOuterJoinSuite.left outer early state exclusion on

[jira] [Updated] (SPARK-24155) Instrumentation improvement for clustering

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24155: Fix Version/s: (was: 3.0.0) 2.4.0 > Instrumentation improvement for clustering >

[jira] [Updated] (SPARK-24094) Change description strings of v2 streaming sources to reflect the change

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24094: Fix Version/s: (was: 3.0.0) 2.4.0 > Change description strings of v2 streaming

[jira] [Updated] (SPARK-24157) Enable no-data micro batches for streaming aggregation and deduplication

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24157: Fix Version/s: (was: 3.0.0) 2.4.0 > Enable no-data micro batches for streaming

[jira] [Updated] (SPARK-23484) Fix possible race condition in KafkaContinuousReader

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23484: Fix Version/s: (was: 2.4.0) > Fix possible race condition in KafkaContinuousReader >

[jira] [Updated] (SPARK-23454) Add Trigger information to the Structured Streaming programming guide

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23454: Fix Version/s: (was: 2.4.0) > Add Trigger information to the Structured Streaming programming guide >

[jira] [Updated] (SPARK-21696) State Store can't handle corrupted snapshots

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21696: Fix Version/s: (was: 3.0.0) 2.3.0 > State Store can't handle corrupted snapshots >

[jira] [Updated] (SPARK-21925) Update trigger interval documentation in docs with behavior change in Spark 2.2

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21925: Fix Version/s: (was: 2.3.2) 2.3.0 > Update trigger interval documentation in docs

[jira] [Updated] (SPARK-20449) Upgrade breeze version to 0.13.1

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20449: Fix Version/s: (was: 3.0.0) 2.3.0 > Upgrade breeze version to 0.13.1 >

[jira] [Updated] (SPARK-21765) Ensure all leaf nodes that are derived from streaming sources have isStreaming=true

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21765: Fix Version/s: (was: 3.0.0) 2.3.0 > Ensure all leaf nodes that are derived from

[jira] [Updated] (SPARK-21587) Filter pushdown for EventTime Watermark Operator

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21587: Fix Version/s: (was: 3.0.0) 2.3.0 > Filter pushdown for EventTime Watermark

[jira] [Updated] (SPARK-19378) StateOperator metrics should still return the total number of rows in state even if there was no data for a trigger

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-19378: Fix Version/s: (was: 3.0.0) 2.2.0 > StateOperator metrics should still return the

[jira] [Updated] (SPARK-21464) Minimize deprecation warnings caused by ProcessingTime class

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21464: Fix Version/s: (was: 3.0.0) 2.3.0 > Minimize deprecation warnings caused by

[jira] [Resolved] (SPARK-23486) LookupFunctions should not check the same function name more than once

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23486. - Resolution: Fixed Assignee: kevin yu Fix Version/s: 2.4.0 > LookupFunctions should not

[jira] [Updated] (SPARK-24158) Enable no-data micro batches for streaming joins

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24158: Fix Version/s: (was: 3.0.0) 2.4.0 > Enable no-data micro batches for streaming

[jira] [Updated] (SPARK-24453) Fix error recovering from the failure in a no-data batch

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24453: Fix Version/s: (was: 3.0.0) 2.4.0 > Fix error recovering from the failure in a

[jira] [Updated] (SPARK-24231) Python API: Provide evaluateEachIteration method or equivalent for spark.ml GBTs

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24231: Fix Version/s: (was: 3.0.0) 2.4.0 > Python API: Provide evaluateEachIteration

[jira] [Updated] (SPARK-24386) implement continuous processing coalesce(1)

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24386: Fix Version/s: (was: 3.0.0) 2.4.0 > implement continuous processing coalesce(1) >

[jira] [Updated] (SPARK-24132) Instrumentation improvement for classification

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24132: Fix Version/s: (was: 3.0.0) 2.4.0 > Instrumentation improvement for classification

[jira] [Updated] (SPARK-24115) improve instrumentation for spark.ml.tuning

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24115: Fix Version/s: (was: 3.0.0) 2.4.0 > improve instrumentation for spark.ml.tuning >

[jira] [Updated] (SPARK-23966) Refactoring all checkpoint file writing logic in a common interface

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23966: Fix Version/s: (was: 3.0.0) 2.4.0 > Refactoring all checkpoint file writing logic

[jira] [Updated] (SPARK-24050) StreamingQuery does not calculate input / processing rates in some cases

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24050: Fix Version/s: (was: 3.0.0) 2.4.0 > StreamingQuery does not calculate input /

[jira] [Updated] (SPARK-24038) refactor continuous write exec to its own class

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24038: Fix Version/s: (was: 3.0.0) 2.4.0 > refactor continuous write exec to its own

[jira] [Updated] (SPARK-24697) Fix the reported start offsets in streaming query progress

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24697: Fix Version/s: (was: 3.0.0) 2.4.0 > Fix the reported start offsets in streaming

[jira] [Updated] (SPARK-24662) Structured Streaming should support LIMIT

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24662: Fix Version/s: (was: 3.0.0) 2.4.0 > Structured Streaming should support LIMIT >

[jira] [Updated] (SPARK-24396) Add Structured Streaming ForeachWriter for python

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24396: Fix Version/s: (was: 3.0.0) 2.4.0 > Add Structured Streaming ForeachWriter for

[jira] [Updated] (SPARK-24056) Make consumer creation lazy in Kafka source for Structured streaming

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24056: Fix Version/s: (was: 3.0.0) 2.4.0 > Make consumer creation lazy in Kafka source

[jira] [Updated] (SPARK-24137) [K8s] Mount temporary directories in emptydir volumes

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24137: Fix Version/s: (was: 3.0.0) 2.4.0 > [K8s] Mount temporary directories in emptydir

[jira] [Updated] (SPARK-24234) create the bottom-of-task RDD with row buffer

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24234: Fix Version/s: (was: 3.0.0) 2.4.0 > create the bottom-of-task RDD with row buffer

[jira] [Updated] (SPARK-24232) Allow referring to kubernetes secrets as env variable

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24232: Fix Version/s: (was: 3.0.0) 2.4.0 > Allow referring to kubernetes secrets as env

[jira] [Updated] (SPARK-24397) Add TaskContext.getLocalProperties in Python

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24397: Fix Version/s: (was: 3.0.0) 2.4.0 > Add TaskContext.getLocalProperties in Python >

[jira] [Updated] (SPARK-20449) Upgrade breeze version to 0.13.1

2018-07-12 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20449: Fix Version/s: (was: 2.3.0) > Upgrade breeze version to 0.13.1 > > >

[jira] [Commented] (SPARK-24568) Code refactoring for DataType equalsXXX methods

2018-07-12 Thread Swapnil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542516#comment-16542516 ] Swapnil commented on SPARK-24568: - I am working on this. I will create PR very soon. > Code refactoring

[jira] [Created] (SPARK-24793) Make spark-submit more useful with k8s

2018-07-12 Thread Anirudh Ramanathan (JIRA)
Anirudh Ramanathan created SPARK-24793: -- Summary: Make spark-submit more useful with k8s Key: SPARK-24793 URL: https://issues.apache.org/jira/browse/SPARK-24793 Project: Spark Issue

[jira] [Commented] (SPARK-24770) Supporting to convert a column into binary of AVRO format

2018-07-12 Thread Felipe Melo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541302#comment-16541302 ] Felipe Melo commented on SPARK-24770: - I believe I can help with it. Will dive into. > Supporting

[jira] [Resolved] (SPARK-23914) High-order function: array_union(x, y) → array

2018-07-12 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-23914. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21061

[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive

2018-07-12 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541191#comment-16541191 ] Felix Cheung commented on SPARK-20202: -- How likely will there be a hive release? HIVE-16391 is still

[jira] [Updated] (SPARK-24793) Make spark-submit more useful with k8s

2018-07-12 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anirudh Ramanathan updated SPARK-24793: --- Description: Support controlling the lifecycle of Spark Application through

[jira] [Comment Edited] (SPARK-24763) Remove redundant key data from value in streaming aggregation

2018-07-12 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541367#comment-16541367 ] Jungtaek Lim edited comment on SPARK-24763 at 7/12/18 9:21 AM: --- I had a

[jira] [Updated] (SPARK-24794) DriverWrapper should have both master addresses in -Dspark.master

2018-07-12 Thread Behroz Sikander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Behroz Sikander updated SPARK-24794: Description: In standalone cluster mode, one could launch a Driver with supervise mode

[jira] [Created] (SPARK-24794) DriverWrapper should have both master addresses in -Dspark.master

2018-07-12 Thread Behroz Sikander (JIRA)
Behroz Sikander created SPARK-24794: --- Summary: DriverWrapper should have both master addresses in -Dspark.master Key: SPARK-24794 URL: https://issues.apache.org/jira/browse/SPARK-24794 Project:

[jira] [Updated] (SPARK-24794) DriverWrapper should have both master addresses in -Dspark.master

2018-07-12 Thread Behroz Sikander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Behroz Sikander updated SPARK-24794: Description: In standalone cluster mode, one could launch a Driver with supervise mode

[jira] [Commented] (SPARK-24760) Pandas UDF does not handle NaN correctly

2018-07-12 Thread Mortada Mehyar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541209#comment-16541209 ] Mortada Mehyar commented on SPARK-24760: [~bryanc] thanks for the example. It looks like

[jira] [Comment Edited] (SPARK-24763) Remove redundant key data from value in streaming aggregation

2018-07-12 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541367#comment-16541367 ] Jungtaek Lim edited comment on SPARK-24763 at 7/12/18 9:20 AM: --- I had a

[jira] [Commented] (SPARK-24763) Remove redundant key data from value in streaming aggregation

2018-07-12 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541367#comment-16541367 ] Jungtaek Lim commented on SPARK-24763: -- I had a chance to craft various key/value cases (bigger

[jira] [Resolved] (SPARK-23885) trying to spark submit 2.3.0 on minikube

2018-07-12 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anirudh Ramanathan resolved SPARK-23885. Resolution: Not A Bug > trying to spark submit 2.3.0 on minikube >

[jira] [Commented] (SPARK-24432) Add support for dynamic resource allocation

2018-07-12 Thread Anirudh Ramanathan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541239#comment-16541239 ] Anirudh Ramanathan commented on SPARK-24432: Hi Mark, we did it once before in our fork in a

[jira] [Commented] (SPARK-21962) Distributed Tracing in Spark

2018-07-12 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541265#comment-16541265 ] Andrew Ash commented on SPARK-21962: Note that HTrace is now being removed from Hadoop – 
