[jira] [Commented] (SPARK-15472) Add support for writing partitioned `csv`, `json`, `text` formats in Structured Streaming

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624475#comment-15624475 ] Apache Spark commented on SPARK-15472: -- User 'lw-lin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18025) Port streaming to use the commit protocol API

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18025: Assignee: Apache Spark > Port streaming to use the commit protocol API >

[jira] [Commented] (SPARK-18025) Port streaming to use the commit protocol API

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624420#comment-15624420 ] Apache Spark commented on SPARK-18025: -- User 'rxin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18025) Port streaming to use the commit protocol API

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18025: Assignee: (was: Apache Spark) > Port streaming to use the commit protocol API >

[jira] [Resolved] (SPARK-18024) Introduce a commit protocol API along with OutputCommitter implementation

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18024. - Resolution: Fixed; Fix Version/s: 2.1.0 > Introduce a commit protocol API along with

[jira] [Updated] (SPARK-16827) Stop reporting spill metrics as shuffle metrics

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16827: Target Version/s: 2.0.3, 2.1.0; Fix Version/s: (was: 2.0.2)

[jira] [Commented] (SPARK-18187) CompactibleFileStreamLog should not rely on "compactInterval" to detect a compaction batch

2016-10-31 Thread Liwei Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624263#comment-15624263 ] Liwei Lin commented on SPARK-18187: --- hi [~zsxwing] how are we planning to fix this? thanks >

[jira] [Commented] (SPARK-18190) Fix R version to not the latest in AppVeyor

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624236#comment-15624236 ] Apache Spark commented on SPARK-18190: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-18190) Fix R version to not the latest in AppVeyor

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18190: Assignee: (was: Apache Spark) > Fix R version to not the latest in AppVeyor >

[jira] [Assigned] (SPARK-18190) Fix R version to not the latest in AppVeyor

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18190: Assignee: Apache Spark > Fix R version to not the latest in AppVeyor >

[jira] [Updated] (SPARK-18160) spark.files should not passed to driver in yarn-cluster mode

2016-10-31 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-18160: --- Summary: spark.files should not passed to driver in yarn-cluster mode (was: SparkContext.addFile

[jira] [Updated] (SPARK-18160) spark.files should not be passed to driver in yarn-cluster mode

2016-10-31 Thread Jeff Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated SPARK-18160: --- Summary: spark.files should not be passed to driver in yarn-cluster mode (was: spark.files should

[jira] [Commented] (SPARK-18190) Fix R version to not the latest in AppVeyor

2016-10-31 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624176#comment-15624176 ] Shivaram Venkataraman commented on SPARK-18190: --- Yeah using a fixed version sounds good to

[jira] [Created] (SPARK-18190) Fix R version to not the latest in AppVeyor

2016-10-31 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-18190: Summary: Fix R version to not the latest in AppVeyor; Key: SPARK-18190; URL: https://issues.apache.org/jira/browse/SPARK-18190; Project: Spark; Issue Type:

[jira] [Resolved] (SPARK-18087) Optimize insert to not require REPAIR TABLE

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18087. - Resolution: Fixed; Assignee: Eric Liang; Fix Version/s: 2.1.0 > Optimize insert to

[jira] [Commented] (SPARK-17055) add labelKFold to CrossValidator

2016-10-31 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15624114#comment-15624114 ] Vincent commented on SPARK-17055: - [~srowen] May I ask the reason why we close this issue? It'd be

[jira] [Updated] (SPARK-17637) Packed scheduling for Spark tasks across executors

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17637: Affects Version/s: (was: 2.1.0); Target Version/s: 2.1.0 > Packed scheduling for Spark

[jira] [Updated] (SPARK-14393) monotonicallyIncreasingId not monotonically increasing with downstream coalesce

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14393: Target Version/s: 2.1.0 > monotonicallyIncreasingId not monotonically increasing with downstream

[jira] [Commented] (SPARK-18167) Flaky test when hive partition pruning is enabled

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623959#comment-15623959 ] Apache Spark commented on SPARK-18167: -- User 'ericl' has created a pull request for this issue:

[jira] [Commented] (SPARK-18024) Introduce a commit protocol API along with OutputCommitter implementation

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623957#comment-15623957 ] Apache Spark commented on SPARK-18024: -- User 'rxin' has created a pull request for this issue:

[jira] [Updated] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Ergin Seyfe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ergin Seyfe updated SPARK-18189: Component/s: SQL > task not serializable with groupByKey() + mapGroups() + map >

[jira] [Updated] (SPARK-12469) Data Property Accumulators for Spark (formerly Consistent Accumulators)

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12469: Target Version/s: (was: 2.1.0) > Data Property Accumulators for Spark (formerly Consistent

[jira] [Updated] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18189: Target Version/s: 2.1.0 > task not serializable with groupByKey() + mapGroups() + map >

[jira] [Updated] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18189: Target Version/s: 2.0.3, 2.1.0 (was: 2.1.0) > task not serializable with groupByKey() +

[jira] [Updated] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18189: Description: just run the following code {code} val a =

[jira] [Assigned] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18189: Assignee: (was: Apache Spark) > task not serializable with groupByKey() + mapGroups()

[jira] [Commented] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623869#comment-15623869 ] Apache Spark commented on SPARK-18189: -- User 'seyfe' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18189: Assignee: Apache Spark > task not serializable with groupByKey() + mapGroups() + map >

[jira] [Updated] (SPARK-18173) data source tables should support truncating partition

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18173: Issue Type: Sub-task (was: New Feature); Parent: SPARK-17861 > data source tables should

[jira] [Updated] (SPARK-18184) INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource tables cannot handle partitions with custom locations

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18184: Issue Type: Sub-task (was: Bug); Parent: SPARK-17861 > INSERT [INTO|OVERWRITE] TABLE ...

[jira] [Updated] (SPARK-17992) HiveClient.getPartitionsByFilter throws an exception for some unsupported filters when hive.metastore.try.direct.sql=false

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-17992: Issue Type: Sub-task (was: Bug); Parent: SPARK-17861 > HiveClient.getPartitionsByFilter

[jira] [Updated] (SPARK-18183) INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire Datasource table instead of just the specified partition

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18183: Issue Type: Sub-task (was: Bug); Parent: SPARK-17861 > INSERT OVERWRITE TABLE ...

[jira] [Updated] (SPARK-18087) Optimize insert to not require REPAIR TABLE

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18087: Target Version/s: 2.1.0 > Optimize insert to not require REPAIR TABLE >

[jira] [Created] (SPARK-18189) task not serializable with groupByKey() + mapGroups() + map

2016-10-31 Thread Yang Yang (JIRA)
Yang Yang created SPARK-18189: - Summary: task not serializable with groupByKey() + mapGroups() + map; Key: SPARK-18189; URL: https://issues.apache.org/jira/browse/SPARK-18189; Project: Spark; Issue

[jira] [Assigned] (SPARK-18184) INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource tables cannot handle partitions with custom locations

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18184: Assignee: (was: Apache Spark) > INSERT [INTO|OVERWRITE] TABLE ... PARTITION for

[jira] [Commented] (SPARK-18183) INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire Datasource table instead of just the specified partition

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623809#comment-15623809 ] Apache Spark commented on SPARK-18183: -- User 'ericl' has created a pull request for this issue:

[jira] [Commented] (SPARK-18184) INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource tables cannot handle partitions with custom locations

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623811#comment-15623811 ] Apache Spark commented on SPARK-18184: -- User 'ericl' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18184) INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource tables cannot handle partitions with custom locations

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18184: Assignee: Apache Spark > INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource

[jira] [Assigned] (SPARK-18183) INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire Datasource table instead of just the specified partition

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18183: Assignee: Apache Spark > INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire

[jira] [Assigned] (SPARK-18183) INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire Datasource table instead of just the specified partition

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18183: Assignee: (was: Apache Spark) > INSERT OVERWRITE TABLE ... PARTITION will overwrite

[jira] [Created] (SPARK-18188) Add checksum for block in Spark

2016-10-31 Thread Davies Liu (JIRA)
Davies Liu created SPARK-18188: -- Summary: Add checksum for block in Spark; Key: SPARK-18188; URL: https://issues.apache.org/jira/browse/SPARK-18188; Project: Spark; Issue Type: Improvement

[jira] [Created] (SPARK-18187) CompactibleFileStreamLog should not rely on "compactInterval" to detect a compaction batch

2016-10-31 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18187: Summary: CompactibleFileStreamLog should not rely on "compactInterval" to detect a compaction batch; Key: SPARK-18187; URL: https://issues.apache.org/jira/browse/SPARK-18187

[jira] [Updated] (SPARK-18167) Flaky test when hive partition pruning is enabled

2016-10-31 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-18167: - Assignee: Eric Liang > Flaky test when hive partition pruning is enabled >

[jira] [Resolved] (SPARK-18167) Flaky test when hive partition pruning is enabled

2016-10-31 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-18167. -- Resolution: Fixed; Fix Version/s: 2.1.0. Issue resolved by pull request 15701

[jira] [Resolved] (SPARK-18030) Flaky test: org.apache.spark.sql.streaming.FileStreamSourceSuite

2016-10-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18030. -- Resolution: Fixed; Fix Version/s: 2.1.0, 2.0.2 > Flaky test:

[jira] [Resolved] (SPARK-18143) History Server is broken because of the refactoring work in Structured Streaming

2016-10-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18143. - Resolution: Fixed; Assignee: Shixiong Zhu; Fix Version/s: 2.1.0

[jira] [Commented] (SPARK-17732) ALTER TABLE DROP PARTITION should support comparators

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623707#comment-15623707 ] Apache Spark commented on SPARK-17732: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-18186) Migrate HiveUDAFFunction to TypedImperativeAggregate for partial aggregation support

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18186: Assignee: Apache Spark (was: Cheng Lian) > Migrate HiveUDAFFunction to

[jira] [Assigned] (SPARK-18186) Migrate HiveUDAFFunction to TypedImperativeAggregate for partial aggregation support

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18186: Assignee: Cheng Lian (was: Apache Spark) > Migrate HiveUDAFFunction to

[jira] [Commented] (SPARK-18186) Migrate HiveUDAFFunction to TypedImperativeAggregate for partial aggregation support

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623688#comment-15623688 ] Apache Spark commented on SPARK-18186: -- User 'liancheng' has created a pull request for this issue:

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2016-10-31 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623661#comment-15623661 ] Seth Hendrickson commented on SPARK-15784: -- This seems like it fits the framework of a feature

[jira] [Created] (SPARK-18186) Migrate HiveUDAFFunction to TypedImperativeAggregate for partial aggregation support

2016-10-31 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-18186: -- Summary: Migrate HiveUDAFFunction to TypedImperativeAggregate for partial aggregation support; Key: SPARK-18186; URL: https://issues.apache.org/jira/browse/SPARK-18186

[jira] [Assigned] (SPARK-18124) Implement watermarking for handling late data

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18124: Assignee: Apache Spark (was: Michael Armbrust) > Implement watermarking for handling

[jira] [Assigned] (SPARK-18124) Implement watermarking for handling late data

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18124: Assignee: Michael Armbrust (was: Apache Spark) > Implement watermarking for handling

[jira] [Commented] (SPARK-18124) Implement watermarking for handling late data

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623603#comment-15623603 ] Apache Spark commented on SPARK-18124: -- User 'marmbrus' has created a pull request for this issue:

[jira] [Commented] (SPARK-18167) Flaky test when hive partition pruning is enabled

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623499#comment-15623499 ] Apache Spark commented on SPARK-18167: -- User 'ericl' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17964) Enable SparkR with Mesos client mode

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17964: Assignee: (was: Apache Spark) > Enable SparkR with Mesos client mode >

[jira] [Commented] (SPARK-17964) Enable SparkR with Mesos client mode

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623427#comment-15623427 ] Apache Spark commented on SPARK-17964: -- User 'susanxhuynh' has created a pull request for this

[jira] [Assigned] (SPARK-17964) Enable SparkR with Mesos client mode

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17964: Assignee: Apache Spark > Enable SparkR with Mesos client mode >

[jira] [Resolved] (SPARK-17972) Query planning slows down dramatically for large query plans even when sub-trees are cached

2016-10-31 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai resolved SPARK-17972. -- Resolution: Fixed Fix Version/s: 2.1.0 > Query planning slows down dramatically for large query

[jira] [Commented] (SPARK-17972) Query planning slows down dramatically for large query plans even when sub-trees are cached

2016-10-31 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623363#comment-15623363 ] Yin Huai commented on SPARK-17972: -- This issue has been resolved by

[jira] [Updated] (SPARK-17972) Query planning slows down dramatically for large query plans even when sub-trees are cached

2016-10-31 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17972: - Target Version/s: 2.1.0 (was: 2.0.2, 2.1.0) > Query planning slows down dramatically for large query

[jira] [Created] (SPARK-18185) Should disallow INSERT OVERWRITE TABLE of Datasource tables with dynamic partitions

2016-10-31 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18185: -- Summary: Should disallow INSERT OVERWRITE TABLE of Datasource tables with dynamic partitions; Key: SPARK-18185; URL: https://issues.apache.org/jira/browse/SPARK-18185

[jira] [Created] (SPARK-18184) INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource tables cannot handle partitions with custom locations

2016-10-31 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18184: -- Summary: INSERT [INTO|OVERWRITE] TABLE ... PARTITION for Datasource tables cannot handle partitions with custom locations; Key: SPARK-18184; URL:

[jira] [Updated] (SPARK-18183) INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire Datasource table instead of just the specified partition

2016-10-31 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18183: --- Component/s: SQL > INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire Datasource >

[jira] [Created] (SPARK-18183) INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire Datasource table instead of just the specified partition

2016-10-31 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18183: -- Summary: INSERT OVERWRITE TABLE ... PARTITION will overwrite the entire Datasource table instead of just the specified partition; Key: SPARK-18183; URL:

[jira] [Commented] (SPARK-18030) Flaky test: org.apache.spark.sql.streaming.FileStreamSourceSuite

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623225#comment-15623225 ] Apache Spark commented on SPARK-18030: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18030) Flaky test: org.apache.spark.sql.streaming.FileStreamSourceSuite

2016-10-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-18030: Assignee: Shixiong Zhu (was: Tathagata Das) > Flaky test:

[jira] [Assigned] (SPARK-18182) Expose ReplayListenerBus.replay() overload which accepts Iterator

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18182: Assignee: Josh Rosen (was: Apache Spark) > Expose ReplayListenerBus.replay() overload

[jira] [Commented] (SPARK-18182) Expose ReplayListenerBus.replay() overload which accepts Iterator

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623143#comment-15623143 ] Apache Spark commented on SPARK-18182: -- User 'JoshRosen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18182) Expose ReplayListenerBus.replay() overload which accepts Iterator

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18182: Assignee: Apache Spark (was: Josh Rosen) > Expose ReplayListenerBus.replay() overload

[jira] [Created] (SPARK-18182) Expose ReplayListenerBus.replay() overload which accepts Iterator

2016-10-31 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-18182: -- Summary: Expose ReplayListenerBus.replay() overload which accepts Iterator; Key: SPARK-18182; URL: https://issues.apache.org/jira/browse/SPARK-18182; Project: Spark

[jira] [Commented] (SPARK-18181) Huge managed memory leak (2.7G) when running reduceByKey

2016-10-31 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15623018#comment-15623018 ] Barry Becker commented on SPARK-18181: -- For this case to leak a lot of memory, I bin the numeric

[jira] [Created] (SPARK-18181) Huge managed memory leak (2.7G) when running reduceByKey

2016-10-31 Thread Barry Becker (JIRA)
Barry Becker created SPARK-18181: Summary: Huge managed memory leak (2.7G) when running reduceByKey; Key: SPARK-18181; URL: https://issues.apache.org/jira/browse/SPARK-18181; Project: Spark

[jira] [Updated] (SPARK-18177) Add missing 'subsamplingRate' of pyspark GBTClassifier

2016-10-31 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18177: -- Priority: Minor (was: Major) > Add missing 'subsamplingRate' of pyspark GBTClassifier

[jira] [Updated] (SPARK-18177) Add missing 'subsamplingRate' of pyspark GBTClassifier

2016-10-31 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18177: -- Issue Type: New Feature (was: Bug) > Add missing 'subsamplingRate' of pyspark

[jira] [Updated] (SPARK-18177) Add missing 'subsamplingRate' of pyspark GBTClassifier

2016-10-31 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18177: -- Shepherd: Joseph K. Bradley; Assignee: zhengruifeng > Add missing 'subsamplingRate'

[jira] [Commented] (SPARK-16522) [MESOS] Spark application throws exception on exit

2016-10-31 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622992#comment-15622992 ] Michael Gummelt commented on SPARK-16522: - This JIRA was for a bug in Mesos. If you're getting

[jira] [Commented] (SPARK-14363) Executor OOM due to a memory leak in Sorter

2016-10-31 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622975#comment-15622975 ] Barry Becker commented on SPARK-14363: -- I am hitting this issue in 1.6.2. In fact, I can make a case

[jira] [Commented] (SPARK-15784) Add Power Iteration Clustering to spark.ml

2016-10-31 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622915#comment-15622915 ] Joseph K. Bradley commented on SPARK-15784: --- [~wangmiao1981] Sorry for the slow response here.

[jira] [Commented] (SPARK-18024) Introduce a commit protocol API along with OutputCommitter implementation

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622872#comment-15622872 ] Apache Spark commented on SPARK-18024: -- User 'rxin' has created a pull request for this issue:

[jira] [Commented] (SPARK-18143) History Server is broken because of the refactoring work in Structured Streaming

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622740#comment-15622740 ] Apache Spark commented on SPARK-18143: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Commented] (SPARK-15616) Metastore relation should fallback to HDFS size of partitions that are involved in Query if statistics are not available.

2016-10-31 Thread Lianhui Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622645#comment-15622645 ] Lianhui Wang commented on SPARK-15616: -- For 2.0, I have created a new branch

[jira] [Created] (SPARK-18180) pyspark.sql.Row does not serialize well to json

2016-10-31 Thread Miguel Cabrera (JIRA)
Miguel Cabrera created SPARK-18180: -- Summary: pyspark.sql.Row does not serialize well to json; Key: SPARK-18180; URL: https://issues.apache.org/jira/browse/SPARK-18180; Project: Spark; Issue

[jira] [Commented] (SPARK-17695) Deserialization error when using DataFrameReader.json on JSON line that contains an empty JSON object

2016-10-31 Thread Miguel Cabrera (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622415#comment-15622415 ] Miguel Cabrera commented on SPARK-17695: Hi, is there a way to prevent this? besides not using

[jira] [Assigned] (SPARK-18179) Throws analysis exception with a proper message for unsupported argument types in reflect/java_method function

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18179: Assignee: (was: Apache Spark) > Throws analysis exception with a proper message for

[jira] [Commented] (SPARK-18179) Throws analysis exception with a proper message for unsupported argument types in reflect/java_method function

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622367#comment-15622367 ] Apache Spark commented on SPARK-18179: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-18179) Throws analysis exception with a proper message for unsupported argument types in reflect/java_method function

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18179: Assignee: Apache Spark > Throws analysis exception with a proper message for unsupported

[jira] [Created] (SPARK-18179) Throws analysis exception with a proper message for unsupported argument types in reflect/java_method function

2016-10-31 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-18179: Summary: Throws analysis exception with a proper message for unsupported argument types in reflect/java_method function Key: SPARK-18179 URL:

[jira] [Created] (SPARK-18178) Importing Pandas Tables with Missing Values

2016-10-31 Thread Kevin Mader (JIRA)
Kevin Mader created SPARK-18178: --- Summary: Importing Pandas Tables with Missing Values Key: SPARK-18178 URL: https://issues.apache.org/jira/browse/SPARK-18178 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-16740) joins.LongToUnsafeRowMap crashes with NegativeArraySizeException

2016-10-31 Thread Harish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620885#comment-15620885 ] Harish edited comment on SPARK-16740 at 10/31/16 12:38 PM: --- Thank you. I

[jira] [Resolved] (SPARK-882) Have link for feedback/suggestions in docs

2016-10-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-882. - Resolution: Not A Problem > Have link for feedback/suggestions in docs >

[jira] [Updated] (SPARK-18176) Kafka010 .createRDD() scala API should expect scala Map

2016-10-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18176: -- Priority: Minor (was: Major) > Kafka010 .createRDD() scala API should expect scala Map >

[jira] [Resolved] (SPARK-17055) add labelKFold to CrossValidator

2016-10-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17055. --- Resolution: Won't Fix > add labelKFold to CrossValidator > > >

[jira] [Commented] (SPARK-18125) Spark generated code causes CompileException when groupByKey, reduceGroups and map(_._2) are used

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621681#comment-15621681 ] Apache Spark commented on SPARK-18125: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18125) Spark generated code causes CompileException when groupByKey, reduceGroups and map(_._2) are used

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18125: Assignee: Apache Spark > Spark generated code causes CompileException when groupByKey,

[jira] [Assigned] (SPARK-18125) Spark generated code causes CompileException when groupByKey, reduceGroups and map(_._2) are used

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18125: Assignee: (was: Apache Spark) > Spark generated code causes CompileException when

[jira] [Commented] (SPARK-18112) Spark2.x does not support read data from Hive 2.x metastore

2016-10-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621678#comment-15621678 ] Sean Owen commented on SPARK-18112: --- Yes, this may be an instance where Hive 2 will have to shim to be

[jira] [Commented] (SPARK-18177) Add missing 'subsamplingRate' of pyspark GBTClassifier

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621651#comment-15621651 ] Apache Spark commented on SPARK-18177: -- User 'zhengruifeng' has created a pull request for this

[jira] [Assigned] (SPARK-18177) Add missing 'subsamplingRate' of pyspark GBTClassifier

2016-10-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18177: Assignee: Apache Spark > Add missing 'subsamplingRate' of pyspark GBTClassifier >
