[jira] [Updated] (SPARK-15777) Catalog federation

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-15777: Target Version/s: (was: 2.1.0) > Catalog federation

[jira] [Updated] (SPARK-17730) Show task size (including the broadcast variable for the task) in web UI

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17730: Target Version/s: (was: 2.1.0) > Show task size (including the broadcast variable for the task)

[jira] [Updated] (SPARK-16011) SQL metrics include duplicated attempts

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16011: Target Version/s: 2.2.0 (was: 2.1.0) > SQL metrics include duplicated attempts >

[jira] [Updated] (SPARK-16412) Generate Java code that gets an array in each column of CachedBatch when DataFrame.cache() is called

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16412: Target Version/s: 2.2.0 (was: 2.1.0) > Generate Java code that gets an array in each column of

[jira] [Updated] (SPARK-13421) Make output of a SparkPlan configurable

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-13421: Target Version/s: (was: 2.1.0) > Make output of a SparkPlan configurable >

[jira] [Closed] (SPARK-12978) Skip unnecessary final group-by when input data already clustered with group-by keys

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-12978. --- Resolution: Later > Skip unnecessary final group-by when input data already clustered with >

[jira] [Updated] (SPARK-13683) Finalize the public interface for OutputWriter[Factory]

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-13683: Target Version/s: (was: 2.1.0) > Finalize the public interface for OutputWriter[Factory] >

[jira] [Closed] (SPARK-17064) Reconsider spark.job.interruptOnCancel

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-17064. --- Resolution: Won't Fix Marking this as won't fix for now. We can still continue to discuss it here.

[jira] [Updated] (SPARK-13682) Finalize the public API for FileFormat

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-13682: Target Version/s: 2.2.0 (was: 2.1.0) > Finalize the public API for FileFormat >

[jira] [Updated] (SPARK-14083) Analyze JVM bytecode and turn closures into Catalyst expressions

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14083: Target Version/s: (was: 2.1.0) > Analyze JVM bytecode and turn closures into Catalyst

[jira] [Updated] (SPARK-17203) data source options should always be case insensitive

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17203: Target Version/s: 2.2.0 (was: 2.1.0) > data source options should always be case insensitive >

[jira] [Updated] (SPARK-14098) Generate code that get a float/double value in each column from CachedBatch when DataFrame.cache() is called

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14098: Target Version/s: 2.2.0 (was: 2.1.0) > Generate code that get a float/double value in each column

[jira] [Updated] (SPARK-16196) Optimize in-memory scan performance using ColumnarBatches

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16196: Target Version/s: 2.2.0 (was: 2.1.0) > Optimize in-memory scan performance using ColumnarBatches

[jira] [Updated] (SPARK-16475) Broadcast Hint for SQL Queries

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16475: Target Version/s: 2.2.0 (was: 2.1.0) > Broadcast Hint for SQL Queries >

[jira] [Updated] (SPARK-17982) Spark 2.0.0 CREATE VIEW statement fails :: java.lang.RuntimeException: Failed to analyze the canonicalized SQL. It is possible there is a bug in Spark.

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17982: Target Version/s: 2.1.0 > Spark 2.0.0 CREATE VIEW statement fails :: java.lang.RuntimeException:

[jira] [Updated] (SPARK-16452) basic INFORMATION_SCHEMA support

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16452: Target Version/s: 2.2.0 (was: 2.1.0) > basic INFORMATION_SCHEMA support >

[jira] [Commented] (SPARK-13649) Move CalendarInterval out of unsafe package

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626635#comment-15626635 ] Reynold Xin commented on SPARK-13649: - Do we actually return it to the user as an external data type?

[jira] [Updated] (SPARK-17912) Refactor code generation to get data for ColumnVector/ColumnarBatch

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17912: Target Version/s: 2.2.0 (was: 2.1.0) > Refactor code generation to get data for

[jira] [Updated] (SPARK-9576) DataFrame API improvement umbrella ticket (in Spark 2.x)

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-9576: --- Summary: DataFrame API improvement umbrella ticket (in Spark 2.x) (was: DataFrame API improvement

[jira] [Updated] (SPARK-9576) DataFrame API improvement umbrella ticket (Spark 2.0 and 2.1)

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-9576: --- Target Version/s: 2.2.0 (was: 2.1.0) > DataFrame API improvement umbrella ticket (Spark 2.0 and 2.1)

[jira] [Updated] (SPARK-15690) Fast single-node (single-process) in-memory shuffle

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-15690: Target Version/s: 2.2.0 (was: 2.1.0) > Fast single-node (single-process) in-memory shuffle >

[jira] [Updated] (SPARK-15691) Refactor and improve Hive support

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-15691: Target Version/s: 2.2.0 (was: 2.1.0) > Refactor and improve Hive support >

[jira] [Updated] (SPARK-15694) Implement ScriptTransformation in sql/core

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-15694: Target Version/s: 2.2.0 (was: 2.1.0) > Implement ScriptTransformation in sql/core >

[jira] [Updated] (SPARK-15693) Write schema definition out for file-based data sources to avoid schema inference

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-15693: Target Version/s: 2.2.0 (was: 2.1.0) > Write schema definition out for file-based data sources to

[jira] [Updated] (SPARK-15712) Proper temp table support

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-15712: Target Version/s: (was: 2.1.0) > Proper temp table support

[jira] [Closed] (SPARK-15576) Add back hive tests blacklisted by SPARK-15539

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-15576. --- Resolution: Won't Fix > Add back hive tests blacklisted by SPARK-15539 >

[jira] [Updated] (SPARK-4502) Spark SQL reads unneccesary nested fields from Parquet

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-4502: --- Target Version/s: 2.2.0 (was: 2.1.0) > Spark SQL reads unneccesary nested fields from Parquet >

[jira] [Updated] (SPARK-16217) Support SELECT INTO statement

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16217: Target Version/s: 2.2.0 (was: 2.1.0) > Support SELECT INTO statement >

[jira] [Updated] (SPARK-16282) Implement percentile SQL function

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16282: Target Version/s: 2.1.0 > Implement percentile SQL function

[jira] [Updated] (SPARK-16275) Implement all the Hive fallback functions

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16275: Target Version/s: 2.2.0 (was: 2.1.0) > Implement all the Hive fallback functions >

[jira] [Updated] (SPARK-17556) Executor side broadcast for broadcast joins

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17556: Target Version/s: 2.2.0 (was: 2.1.0) > Executor side broadcast for broadcast joins >

[jira] [Updated] (SPARK-7768) Make user-defined type (UDT) API public

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-7768: --- Target Version/s: 2.2.0 (was: 2.1.0) > Make user-defined type (UDT) API public >

[jira] [Updated] (SPARK-15687) Columnar execution engine

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-15687: Target Version/s: (was: 2.1.0) > Columnar execution engine

[jira] [Updated] (SPARK-17915) Prepare ColumnVector implementation for UnsafeData

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17915: Target Version/s: 2.2.0 (was: 2.1.0) > Prepare ColumnVector implementation for UnsafeData >

[jira] [Updated] (SPARK-14220) Build and test Spark against Scala 2.12

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14220: Target Version/s: 2.2.0 (was: 2.1.0) > Build and test Spark against Scala 2.12 >

[jira] [Updated] (SPARK-14393) monotonicallyIncreasingId not monotonically increasing with downstream coalesce

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14393: Priority: Blocker (was: Major) > monotonicallyIncreasingId not monotonically increasing with

[jira] [Updated] (SPARK-17813) Maximum data per trigger

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17813: Fix Version/s: (was: 2.0.3) 2.0.2 > Maximum data per trigger >

[jira] [Updated] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18189: Fix Version/s: (was: 2.0.3) 2.0.2 > task not serializable with groupByKey()

[jira] [Updated] (SPARK-18154) CLONE - Change Source API so that sources do not need to keep unbounded state

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18154: Fix Version/s: (was: 2.0.3) 2.0.2 > CLONE - Change Source API so that

[jira] [Updated] (SPARK-18164) ForeachSink should fail the Spark job if `process` throws exception

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18164: Fix Version/s: (was: 2.0.3) 2.0.2 > ForeachSink should fail the Spark job

[jira] [Updated] (SPARK-18143) History Server is broken because of the refactoring work in Structured Streaming

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18143: Fix Version/s: (was: 2.0.3) 2.0.2 > History Server is broken because of the

[jira] [Updated] (SPARK-18132) spark 2.0 branch's spark-release-publish failed because style check failed.

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18132: Fix Version/s: (was: 2.0.3) 2.0.2 > spark 2.0 branch's

[jira] [Updated] (SPARK-18148) Misleading Error Message for Aggregation Without Window/GroupBy

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18148: Fix Version/s: (was: 2.0.3) 2.0.2 > Misleading Error Message for

[jira] [Updated] (SPARK-18114) MesosClusterScheduler generate bad command options

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18114: Fix Version/s: (was: 2.0.3) 2.0.2 > MesosClusterScheduler generate bad

[jira] [Updated] (SPARK-16963) Change Source API so that sources do not need to keep unbounded state

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16963: Fix Version/s: (was: 2.0.3) 2.0.2 > Change Source API so that sources do

[jira] [Resolved] (SPARK-18148) Misleading Error Message for Aggregation Without Window/GroupBy

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18148. - Resolution: Fixed Assignee: Jiang Xingbo Fix Version/s: 2.1.0

[jira] [Resolved] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18189. - Resolution: Fixed Fix Version/s: 2.1.0 2.0.3 > task not serializable

[jira] [Resolved] (SPARK-18107) Insert overwrite statement runs much slower in spark-sql than it does in hive-client

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18107. - Resolution: Fixed Assignee: Liang-Chi Hsieh Fix Version/s: 2.1.0 > Insert

[jira] [Created] (SPARK-18192) Support all file formats in structured streaming

2016-11-01 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18192: --- Summary: Support all file formats in structured streaming Key: SPARK-18192 URL: https://issues.apache.org/jira/browse/SPARK-18192 Project: Spark Issue Type:

[jira] [Commented] (SPARK-18191) Port RDD API to use commit protocol

2016-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624483#comment-15624483 ] Reynold Xin commented on SPARK-18191: - cc [~jiangxb1987] want to take this one? > Port RDD API to

[jira] [Created] (SPARK-18191) Port RDD API to use commit protocol

2016-11-01 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18191: --- Summary: Port RDD API to use commit protocol Key: SPARK-18191 URL: https://issues.apache.org/jira/browse/SPARK-18191 Project: Spark Issue Type: Sub-task

[jira] [Resolved] (SPARK-18024) Introduce a commit protocol API along with OutputCommitter implementation

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18024. - Resolution: Fixed Fix Version/s: 2.1.0 > Introduce a commit protocol API along with

[jira] [Updated] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16827: Target Version/s: 2.0.3, 2.1.0 Fix Version/s: (was: 2.0.2)

[jira] [Resolved] (SPARK-18087) Optimize insert to not require REPAIR TABLE

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18087. - Resolution: Fixed Assignee: Eric Liang Fix Version/s: 2.1.0 > Optimize insert to

[jira] [Updated] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17637: Affects Version/s: (was: 2.1.0) Target Version/s: 2.1.0 > Packed scheduling for Spark

[jira] [Updated] (SPARK-14393) monotonicallyIncreasingId not monotonically increasing with downstream coalesce

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14393: Target Version/s: 2.1.0 > monotonicallyIncreasingId not monotonically increasing with downstream

[jira] [Updated] (SPARK-12469) Data Property Accumulators for Spark (formerly Consistent Accumulators)

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12469: Target Version/s: (was: 2.1.0) > Data Property Accumulators for Spark (formerly Consistent

[jira] [Updated] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18189: Target Version/s: 2.1.0 > task not serializable with groupByKey() + mapGroups() + map >

[jira] [Updated] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18189: Target Version/s: 2.0.3, 2.1.0 (was: 2.1.0) > task not serializable with groupByKey() +

[jira] [Updated] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18189: Description: just run the following code {code} val a =

[jira] [Updated] (SPARK-18173) data source tables should support truncating partition

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18173: Issue Type: Sub-task (was: New Feature) Parent: SPARK-17861 > data source tables should

[jira] [Updated] (SPARK-18184) INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource tables cannot handle partitions with custom locations

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18184: Issue Type: Sub-task (was: Bug) Parent: SPARK-17861 > INSERT [INTO|OVERWRITE] TABLE ...

[jira] [Updated] (SPARK-17992) HiveClient.getPartitionsByFilter throws an exception for some unsupported filters when hive.metastore.try.direct.sql=false

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17992: Issue Type: Sub-task (was: Bug) Parent: SPARK-17861 > HiveClient.getPartitionsByFilter

[jira] [Updated] (SPARK-18183) INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire Datasource table instead of just the specified partition

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18183: Issue Type: Sub-task (was: Bug) Parent: SPARK-17861 > INSERT OVERWRITE TABLE ...

[jira] [Updated] (SPARK-18087) Optimize insert to not require REPAIR TABLE

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18087: Target Version/s: 2.1.0 > Optimize insert to not require REPAIR TABLE >

[jira] [Resolved] (SPARK-18143) History Server is broken because of the refactoring work in Structured Streaming

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18143. - Resolution: Fixed Assignee: Shixiong Zhu Fix Version/s: 2.1.0

[jira] [Resolved] (SPARK-18103) Rename *FileCatalog to *FileProvider

2016-10-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18103. - Resolution: Fixed Assignee: Eric Liang Fix Version/s: 2.1.0 > Rename

[jira] [Updated] (SPARK-17791) Join reordering using star schema detection

2016-10-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17791: Assignee: Ioana Delaney > Join reordering using star schema detection >

[jira] [Updated] (SPARK-17626) TPC-DS performance improvements using star-schema heuristics

2016-10-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17626: Target Version/s: 2.2.0 > TPC-DS performance improvements using star-schema heuristics >

[jira] [Updated] (SPARK-17791) Join reordering using star schema detection

2016-10-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17791: Target Version/s: 2.2.0 > Join reordering using star schema detection >

[jira] [Updated] (SPARK-18138) Remove support for Python 2.6, Hadoop 2.6-, Java 7, and Scala 2.10

2016-10-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18138: Priority: Blocker (was: Major) > Remove support for Python 2.6, Hadoop 2.6-, Java 7, and Scala

[jira] [Created] (SPARK-18138) Remove support for Python 2.6, Hadoop 2.6-, Java 7, and Scala 2.10

2016-10-27 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18138: --- Summary: Remove support for Python 2.6, Hadoop 2.6-, Java 7, and Scala 2.10 Key: SPARK-18138 URL: https://issues.apache.org/jira/browse/SPARK-18138 Project: Spark

[jira] [Updated] (SPARK-18126) getIteratorZipWithIndex accepts negative value as index.

2016-10-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18126: Issue Type: Improvement (was: Bug) > getIteratorZipWithIndex accepts negative value as index. >

[jira] [Resolved] (SPARK-18126) getIteratorZipWithIndex accepts negative value as index.

2016-10-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18126. - Resolution: Fixed Assignee: Miao Wang Fix Version/s: 2.1.0 >

[jira] [Resolved] (SPARK-18094) Move group analytics test cases from `SQLQuerySuite` into a query file test

2016-10-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18094. - Resolution: Fixed Assignee: Jiang Xingbo Fix Version/s: 2.1.0 > Move group

[jira] [Resolved] (SPARK-18063) Failed to infer constraints over multiple aliases

2016-10-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18063. - Resolution: Fixed Assignee: Jiang Xingbo Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-17495) Hive hash implementation

2016-10-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15599246#comment-15599246 ] Reynold Xin commented on SPARK-17495: - [~tejasp] I am going to reopen this. I just realized the

[jira] [Reopened] (SPARK-17495) Hive hash implementation

2016-10-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reopened SPARK-17495: > Hive hash implementation > Key: SPARK-17495

[jira] [Updated] (SPARK-17698) Join predicates should not contain filter clauses

2016-10-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17698: Fix Version/s: 2.0.2 > Join predicates should not contain filter clauses >

[jira] [Updated] (SPARK-18038) Move output partitioning definition from UnaryNodeExec to its children

2016-10-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18038: Target Version/s: 2.1.0 > Move output partitioning definition from UnaryNodeExec to its children >

[jira] [Updated] (SPARK-18038) Move output partitioning definition from UnaryNodeExec to its children

2016-10-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18038: Assignee: Tejas Patil > Move output partitioning definition from UnaryNodeExec to its children >

[jira] [Resolved] (SPARK-928) Add support for Unsafe-based serializer in Kryo 2.22

2016-10-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-928. --- Resolution: Fixed Assignee: Sandeep Singh Fix Version/s: 2.1.0 > Add support for

[jira] [Resolved] (SPARK-18051) Custom PartitionCoalescer cause serialization exception

2016-10-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18051. - Resolution: Fixed Assignee: Weichen Xu Fix Version/s: 2.1.0 > Custom

[jira] [Updated] (SPARK-18013) R cross join API similar to python and Scala

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18013: Issue Type: Sub-task (was: Bug) Parent: SPARK-17298 > R cross join API similar to python

[jira] [Updated] (SPARK-17946) Python crossJoin API similar to Scala

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17946: Issue Type: Sub-task (was: Bug) Parent: SPARK-17298 > Python crossJoin API similar to

[jira] [Resolved] (SPARK-18042) OutputWriter needs to return the path of the file written

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18042. - Resolution: Fixed Fix Version/s: 2.1.0 > OutputWriter needs to return the path of the

[jira] [Commented] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596417#comment-15596417 ] Reynold Xin commented on SPARK-16216: - Good to know! If you want to be backward compatible, just set

[jira] [Commented] (SPARK-16216) CSV data source does not write date and timestamp correctly

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595892#comment-15595892 ] Reynold Xin commented on SPARK-16216: - [~barrybecker4] Hey Berry - can you try setting

[jira] [Commented] (SPARK-17829) Stable format for offset log

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594371#comment-15594371 ] Reynold Xin commented on SPARK-17829: - I like option 3! (in reality it is a more general version of

[jira] [Assigned] (SPARK-15472) Add support for writing partitioned `csv`, `json`, `text` formats in Structured Streaming

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-15472: --- Assignee: Reynold Xin > Add support for writing partitioned `csv`, `json`, `text` formats

[jira] [Commented] (SPARK-15472) Add support for writing partitioned `csv`, `json`, `text` formats in Structured Streaming

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15594369#comment-15594369 ] Reynold Xin commented on SPARK-15472: - This actually will be subsumed by SPARK-17924. > Add support

[jira] [Updated] (SPARK-18042) OutputWriter needs to return the path of the file written

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18042: Description: This patch adds a new "path" method on OutputWriter that returns the path of the

[jira] [Updated] (SPARK-17924) Consolidate streaming and batch write path

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17924: Description: Structured streaming and normal SQL operation currently have two separate write

[jira] [Updated] (SPARK-18042) OutputWriter needs to return the path of the file written

2016-10-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18042: Description: Without this we won't be able to actually use the normal OutputWriter in streaming.

[jira] [Created] (SPARK-18042) OutputWriter needs to return the path of the file written

2016-10-21 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18042: --- Summary: OutputWriter needs to return the path of the file written Key: SPARK-18042 URL: https://issues.apache.org/jira/browse/SPARK-18042 Project: Spark

[jira] [Commented] (SPARK-18038) Move output partitioning definition from UnaryNodeExec to its children

2016-10-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593583#comment-15593583 ] Reynold Xin commented on SPARK-18038: - It definitely does. > Move output partitioning definition

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592839#comment-15592839 ] Reynold Xin commented on SPARK-10915: - The current implementation of collect_list isn't going to work

[jira] [Resolved] (SPARK-18021) Refactor file name specification for data sources

2016-10-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18021. - Resolution: Fixed Fix Version/s: 2.1.0 > Refactor file name specification for data

[jira] [Commented] (SPARK-10915) Add support for UDAFs in Python

2016-10-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592544#comment-15592544 ] Reynold Xin commented on SPARK-10915: - But if you need strict ordering guarantees, materializing them

[jira] [Resolved] (SPARK-15780) Support mapValues on KeyValueGroupedDataset

2016-10-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-15780. - Resolution: Fixed Assignee: Koert Kuipers Fix Version/s: 2.1.0 > Support
