[jira] [Assigned] (SPARK-18475) Be able to provide higher parallelization for StructuredStreaming Kafka Source

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18475: Assignee: (was: Apache Spark) > Be able to provide higher parallelization for

[jira] [Created] (SPARK-18652) Include the example data with the pyspark package

2016-11-30 Thread Shuai Lin (JIRA)
Shuai Lin created SPARK-18652: - Summary: Include the example data with the pyspark package Key: SPARK-18652 URL: https://issues.apache.org/jira/browse/SPARK-18652 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-18652) Include the example data with the pyspark package

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18652: Assignee: Apache Spark > Include the example data with the pyspark package >

[jira] [Commented] (SPARK-17608) Long type has incorrect serialization/deserialization

2016-11-30 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709192#comment-15709192 ] Shivaram Venkataraman commented on SPARK-17608: --- [~iamthomaspowell] would you be able to

[jira] [Updated] (SPARK-16589) Chained cartesian produces incorrect number of records

2016-11-30 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-16589: - Labels: correctness (was: ) > Chained cartesian produces incorrect number of records >

[jira] [Assigned] (SPARK-18652) Include the example data with the pyspark package

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18652: Assignee: (was: Apache Spark) > Include the example data with the pyspark package >

[jira] [Commented] (SPARK-18652) Include the example data with the pyspark package

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709159#comment-15709159 ] Apache Spark commented on SPARK-18652: -- User 'lins05' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18475) Be able to provide higher parallelization for StructuredStreaming Kafka Source

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18475: Assignee: Apache Spark > Be able to provide higher parallelization for

[jira] [Commented] (SPARK-16554) Spark should kill executors when they are blacklisted

2016-11-30 Thread Jose Soltren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709080#comment-15709080 ] Jose Soltren commented on SPARK-16554: -- This is a nice idea but I'm not sure how feasible it is. If

[jira] [Created] (SPARK-18651) KeyValueGroupedDataset[K, V].reduceGroups cannot handle primitive for V

2016-11-30 Thread koert kuipers (JIRA)
koert kuipers created SPARK-18651: - Summary: KeyValueGroupedDataset[K, V].reduceGroups cannot handle primitive for V Key: SPARK-18651 URL: https://issues.apache.org/jira/browse/SPARK-18651 Project:

[jira] [Updated] (SPARK-18652) Include the example data and third-party licenses in pyspark package

2016-11-30 Thread Shuai Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Lin updated SPARK-18652: -- Summary: Include the example data and third-party licenses in pyspark package (was: Include the

[jira] [Reopened] (SPARK-18651) KeyValueGroupedDataset[K, V].reduceGroups cannot handle primitive for V

2016-11-30 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] koert kuipers reopened SPARK-18651: --- > KeyValueGroupedDataset[K, V].reduceGroups cannot handle primitive for V >

[jira] [Assigned] (SPARK-18653) Dataset.show() generates incorrect padding for Unicode Character

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18653: Assignee: Apache Spark > Dataset.show() generates incorrect padding for Unicode Character

[jira] [Assigned] (SPARK-18653) Dataset.show() generates incorrect padding for Unicode Character

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18653: Assignee: (was: Apache Spark) > Dataset.show() generates incorrect padding for

[jira] [Commented] (SPARK-18653) Dataset.show() generates incorrect padding for Unicode Character

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709468#comment-15709468 ] Apache Spark commented on SPARK-18653: -- User 'kiszk' has created a pull request for this issue:

[jira] [Updated] (SPARK-18362) Use TextFileFormat in implementation of CSVFileFormat

2016-11-30 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-18362: --- Description: Spark's CSVFileFormat data source uses inefficient methods for reading files during

[jira] [Commented] (SPARK-18644) spark-submit fails to run python scripts with specific names

2016-11-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709220#comment-15709220 ] Marcelo Vanzin commented on SPARK-18644: I wonder if this is just a python issue. python has a

[jira] [Updated] (SPARK-18362) Use TextFileFormat in implementation of CSVFileFormat

2016-11-30 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-18362: --- Summary: Use TextFileFormat in implementation of CSVFileFormat (was: Use TextFileFormat in

[jira] [Resolved] (SPARK-18220) ClassCastException occurs when using select query on ORC file

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18220. - Resolution: Fixed Fix Version/s: 2.1.0 > ClassCastException occurs when using select

[jira] [Closed] (SPARK-18515) AlterTableDropPartitions fails for non-string columns

2016-11-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun closed SPARK-18515. - Resolution: Duplicate This bug was introduced by the first commit of SPARK-17732, but now it's

[jira] [Issue Comment Deleted] (SPARK-13061) Error in spark rest api application info for job names contains spaces

2016-11-30 Thread Sharad (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sharad updated SPARK-13061: --- Comment: was deleted (was: As Devraj mentioned the correct URL to access job details is of the form -

[jira] [Commented] (SPARK-18654) JacksonParser.makeRootConverter has effectively unreachable code

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709422#comment-15709422 ] Apache Spark commented on SPARK-18654: -- User 'NathanHowell' has created a pull request for this

[jira] [Assigned] (SPARK-18654) JacksonParser.makeRootConverter has effectively unreachable code

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18654: Assignee: Apache Spark > JacksonParser.makeRootConverter has effectively unreachable code

[jira] [Updated] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18655: - Fix Version/s: 2.1.0 > Ignore Structured Streaming 2.0.2 logs in history server >

[jira] [Updated] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18655: - Priority: Blocker (was: Major) > Ignore Structured Streaming 2.0.2 logs in history server >

[jira] [Updated] (SPARK-18652) Include the example data and third-party licenses in pyspark package

2016-11-30 Thread Shuai Lin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuai Lin updated SPARK-18652: -- Description: Since we already include the python examples in the pyspark package, we should include

[jira] [Commented] (SPARK-13061) Error in spark rest api application info for job names contains spaces

2016-11-30 Thread Sharad (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709325#comment-15709325 ] Sharad commented on SPARK-13061: As Devraj mentioned the correct URL to access job details is of the form

[jira] [Commented] (SPARK-13061) Error in spark rest api application info for job names contains spaces

2016-11-30 Thread Sharad (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709322#comment-15709322 ] Sharad commented on SPARK-13061: As Devraj mentioned the correct URL to access job details is of the form

[jira] [Created] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18655: Summary: Ignore Structured Streaming 2.0.2 logs in history server Key: SPARK-18655 URL: https://issues.apache.org/jira/browse/SPARK-18655 Project: Spark

[jira] [Issue Comment Deleted] (SPARK-16551) Accumulator Examples should demonstrate different use case from UDAFs

2016-11-30 Thread Ruiming Zhou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruiming Zhou updated SPARK-16551: - Comment: was deleted (was: I can look at this issue.) > Accumulator Examples should demonstrate

[jira] [Issue Comment Deleted] (SPARK-16205) dict -> StructType conversion is undocumented

2016-11-30 Thread Ruiming Zhou (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruiming Zhou updated SPARK-16205: - Comment: was deleted (was: I would like to take this if it is still available) > dict ->

[jira] [Commented] (SPARK-18515) AlterTableDropPartitions fails for non-string columns

2016-11-30 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709307#comment-15709307 ] Dongjoon Hyun commented on SPARK-18515: --- Hi, [~hvanhovell]. I closed this issue since this is not

[jira] [Commented] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-11-30 Thread Heji Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709430#comment-15709430 ] Heji Kim commented on SPARK-18506: -- Hi Cody. I have tried roughly a similar configuration on GCP with

[jira] [Assigned] (SPARK-18654) JacksonParser.makeRootConverter has effectively unreachable code

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18654: Assignee: (was: Apache Spark) > JacksonParser.makeRootConverter has effectively

[jira] [Resolved] (SPARK-18651) KeyValueGroupedDataset[K, V].reduceGroups cannot handle primitive for V

2016-11-30 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] koert kuipers resolved SPARK-18651. --- Resolution: Not A Bug Fixed in master, still issue in branch-2.0 >

[jira] [Assigned] (SPARK-18097) Can't drop a table from Hive if the schema is corrupt

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18097: Assignee: (was: Apache Spark) > Can't drop a table from Hive if the schema is corrupt

[jira] [Comment Edited] (SPARK-18651) KeyValueGroupedDataset[K, V].reduceGroups cannot handle primitive for V

2016-11-30 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709352#comment-15709352 ] koert kuipers edited comment on SPARK-18651 at 11/30/16 6:38 PM: - Fixed

[jira] [Assigned] (SPARK-18097) Can't drop a table from Hive if the schema is corrupt

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18097: Assignee: Apache Spark > Can't drop a table from Hive if the schema is corrupt >

[jira] [Commented] (SPARK-18097) Can't drop a table from Hive if the schema is corrupt

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709373#comment-15709373 ] Apache Spark commented on SPARK-18097: -- User 'jayadevanmurali' has created a pull request for this

[jira] [Commented] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709454#comment-15709454 ] Apache Spark commented on SPARK-18655: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18655: Assignee: Shixiong Zhu (was: Apache Spark) > Ignore Structured Streaming 2.0.2 logs in

[jira] [Assigned] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18655: Assignee: Apache Spark (was: Shixiong Zhu) > Ignore Structured Streaming 2.0.2 logs in

[jira] [Commented] (SPARK-18651) KeyValueGroupedDataset[K, V].reduceGroups cannot handle primitive for V

2016-11-30 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709349#comment-15709349 ] koert kuipers commented on SPARK-18651: --- i cannot reproduce the error in master. it seems to work

[jira] [Reopened] (SPARK-18506) kafka 0.10 with Spark 2.02 auto.offset.reset=earliest will only read from a single partition on a multi partition topic

2016-11-30 Thread Heji Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heji Kim reopened SPARK-18506: -- My team has asked me to reopen this issue to see if there can be any more progress. We have implemented a

[jira] [Created] (SPARK-18654) JacksonParser.makeRootConverter has effectively unreachable code

2016-11-30 Thread Nathan Howell (JIRA)
Nathan Howell created SPARK-18654: - Summary: JacksonParser.makeRootConverter has effectively unreachable code Key: SPARK-18654 URL: https://issues.apache.org/jira/browse/SPARK-18654 Project: Spark

[jira] [Commented] (SPARK-18536) Failed to save to hive table when case class with empty field

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709551#comment-15709551 ] Reynold Xin commented on SPARK-18536: - We need to add a PreWriteCheck for Parquet. > Failed to save

[jira] [Created] (SPARK-18653) Dataset.show() generates incorrect padding for Unicode Character

2016-11-30 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-18653: Summary: Dataset.show() generates incorrect padding for Unicode Character Key: SPARK-18653 URL: https://issues.apache.org/jira/browse/SPARK-18653 Project:

[jira] [Commented] (SPARK-16280) Implement histogram_numeric SQL function

2016-11-30 Thread Andy Dang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709318#comment-15709318 ] Andy Dang commented on SPARK-16280: --- What's the status of this issue? > Implement histogram_numeric

[jira] [Comment Edited] (SPARK-18651) KeyValueGroupedDataset[K, V].reduceGroups cannot handle primitive for V

2016-11-30 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709349#comment-15709349 ] koert kuipers edited comment on SPARK-18651 at 11/30/16 6:39 PM: - i

[jira] [Updated] (SPARK-18536) Failed to save to hive table when case class with empty field

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18536: Description: {code}import scala.collection.mutable.Queue import org.apache.spark.SparkConf import

[jira] [Created] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Sina Sohangir (JIRA)
Sina Sohangir created SPARK-18656: - Summary: org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns Key: SPARK-18656 URL:

[jira] [Updated] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18655: - Target Version/s: 2.1.0 > Ignore Structured Streaming 2.0.2 logs in history server >

[jira] [Updated] (SPARK-18655) Ignore Structured Streaming 2.0.2 logs in history server

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18655: - Fix Version/s: (was: 2.1.0) > Ignore Structured Streaming 2.0.2 logs in history

[jira] [Resolved] (SPARK-16545) Structured Streaming : foreachSink creates the Physical Plan multiple times per TriggerInterval

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-16545. -- Resolution: Later > Structured Streaming : foreachSink creates the Physical Plan

[jira] [Updated] (SPARK-18274) Memory leak in PySpark StringIndexer

2016-11-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-18274: -- Target Version/s: 2.0.3, 2.1.1, 2.2.0 (was: 2.0.3, 2.1.0) > Memory leak in PySpark

[jira] [Resolved] (SPARK-18318) ML, Graph 2.1 QA: API: New Scala APIs, docs

2016-11-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-18318. --- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 Issue

[jira] [Assigned] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18659: Assignee: Apache Spark > Incorrect behaviors in overwrite table for datasource tables >

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2016-11-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709943#comment-15709943 ] Marcelo Vanzin commented on SPARK-18085: I uploaded code for milestone 3 from the document:

[jira] [Updated] (SPARK-18588) KafkaSourceStressForDontFailOnDataLossSuite is flaky

2016-11-30 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-18588: - Target Version/s: 2.1.0 > KafkaSourceStressForDontFailOnDataLossSuite is flaky >

[jira] [Created] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-11-30 Thread Nathan Howell (JIRA)
Nathan Howell created SPARK-18658: - Summary: Writing to a text DataSource buffers one or more lines in memory Key: SPARK-18658 URL: https://issues.apache.org/jira/browse/SPARK-18658 Project: Spark

[jira] [Created] (SPARK-18659) Crash in overwrite table partitions due to hive metastore integration

2016-11-30 Thread Eric Liang (JIRA)
Eric Liang created SPARK-18659: -- Summary: Crash in overwrite table partitions due to hive metastore integration Key: SPARK-18659 URL: https://issues.apache.org/jira/browse/SPARK-18659 Project: Spark

[jira] [Resolved] (SPARK-18251) DataSet API | RuntimeException: Null value appeared in non-nullable field when holding Option Case Class

2016-11-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian resolved SPARK-18251. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 15979

[jira] [Updated] (SPARK-18251) DataSet API | RuntimeException: Null value appeared in non-nullable field when holding Option Case Class

2016-11-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-18251: --- Assignee: Wenchen Fan > DataSet API | RuntimeException: Null value appeared in non-nullable field >

[jira] [Updated] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18659: --- Description: The first three test cases fail due to a crash in hive client when dropping partitions

[jira] [Assigned] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18659: Assignee: (was: Apache Spark) > Incorrect behaviors in overwrite table for datasource

[jira] [Commented] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709938#comment-15709938 ] Apache Spark commented on SPARK-18659: -- User 'ericl' has created a pull request for this issue:

[jira] [Created] (SPARK-18657) Persist UUID across query restart

2016-11-30 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-18657: Summary: Persist UUID across query restart Key: SPARK-18657 URL: https://issues.apache.org/jira/browse/SPARK-18657 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-18563) mapWithState: initialState should have a timeout setting per record

2016-11-30 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18563: - Component/s: (was: Structured Streaming) DStreams > mapWithState:

[jira] [Commented] (SPARK-18318) ML, Graph 2.1 QA: API: New Scala APIs, docs

2016-11-30 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709811#comment-15709811 ] Joseph K. Bradley commented on SPARK-18318: --- I did a quick check too and did not see anything

[jira] [Commented] (SPARK-18251) DataSet API | RuntimeException: Null value appeared in non-nullable field when holding Option Case Class

2016-11-30 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709869#comment-15709869 ] Cheng Lian commented on SPARK-18251: One more comment about why we shouldn't allow a {{Option\[T <:

[jira] [Issue Comment Deleted] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Sina Sohangir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sina Sohangir updated SPARK-18656: -- Comment: was deleted (was: Created a PR: https://github.com/apache/spark/pull/16087 ) >

[jira] [Updated] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18659: --- Description: The following test cases fail due to a crash in hive client when dropping partitions

[jira] [Assigned] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18656: Assignee: Apache Spark

[jira] [Commented] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Sina Sohangir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709930#comment-15709930 ] Sina Sohangir commented on SPARK-18656: --- Create a PR: https://github.com/apache/spark/pull/16087

[jira] [Commented] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709929#comment-15709929 ] Apache Spark commented on SPARK-18656: -- User 'sinasohangirsc' has created a pull request for this

[jira] [Comment Edited] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Sina Sohangir (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709930#comment-15709930 ] Sina Sohangir edited comment on SPARK-18656 at 11/30/16 10:03 PM:

[jira] [Resolved] (SPARK-18546) UnsafeShuffleWriter corrupts encrypted shuffle files when merging

2016-11-30 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-18546. Resolution: Fixed Fix Version/s: 2.1.1 > UnsafeShuffleWriter corrupts encrypted

[jira] [Updated] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-11-30 Thread Eric Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Liang updated SPARK-18659: --- Summary: Incorrect behaviors in overwrite table for datasource tables (was: Crash in overwrite

[jira] [Assigned] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18656: Assignee: (was: Apache Spark)

[jira] [Created] (SPARK-18660) Parquet complains "Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl

2016-11-30 Thread Yin Huai (JIRA)
Yin Huai created SPARK-18660: Summary: Parquet complains "Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl " Key: SPARK-18660

[jira] [Commented] (SPARK-18645) spark-daemon.sh arguments error lead to throws Unrecognized option

2016-11-30 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708003#comment-15708003 ] Yuming Wang commented on SPARK-18645: - I will pull request for this issue later. > spark-daemon.sh

[jira] [Commented] (SPARK-17608) Long type has incorrect serialization/deserialization

2016-11-30 Thread Thomas Powell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708082#comment-15708082 ] Thomas Powell commented on SPARK-17608: --- Yes the confusing thing at the moment is the roundtripping

[jira] [Created] (SPARK-18645) spark-daemon.sh arguments error lead to throws Unrecognized option

2016-11-30 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-18645: --- Summary: spark-daemon.sh arguments error lead to throws Unrecognized option Key: SPARK-18645 URL: https://issues.apache.org/jira/browse/SPARK-18645 Project: Spark

[jira] [Commented] (SPARK-18471) In treeAggregate, generate (big) zeros instead of sending them.

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708004#comment-15708004 ] Apache Spark commented on SPARK-18471: -- User 'AnthonyTruchet' has created a pull request for this

[jira] [Resolved] (SPARK-18366) Add handleInvalid to Pyspark for QuantileDiscretizer and Bucketizer

2016-11-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-18366. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 15817

[jira] [Updated] (SPARK-18366) Add handleInvalid to Pyspark for QuantileDiscretizer and Bucketizer

2016-11-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-18366: --- Fix Version/s: (was: 2.1.0) 2.1.1 > Add handleInvalid to Pyspark for

[jira] [Resolved] (SPARK-18612) Leaked broadcasted variable Mllib

2016-11-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18612. --- Resolution: Fixed Fix Version/s: 2.1.1 Issue resolved by pull request 16040

[jira] [Updated] (SPARK-18612) Leaked broadcasted variable Mllib

2016-11-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18612: -- Assignee: Anthony Truchet > Leaked broadcasted variable Mllib > - > >

[jira] [Created] (SPARK-18644) spark-submit fails to run python scripts with specific names

2016-11-30 Thread Jussi Jousimo (JIRA)
Jussi Jousimo created SPARK-18644: - Summary: spark-submit fails to run python scripts with specific names Key: SPARK-18644 URL: https://issues.apache.org/jira/browse/SPARK-18644 Project: Spark

[jira] [Created] (SPARK-18646) ExecutorClassLoader for spark-shell does not honor spark.executor.userClassPathFirst

2016-11-30 Thread Min Shen (JIRA)
Min Shen created SPARK-18646: Summary: ExecutorClassLoader for spark-shell does not honor spark.executor.userClassPathFirst Key: SPARK-18646 URL: https://issues.apache.org/jira/browse/SPARK-18646

[jira] [Updated] (SPARK-18366) Add handleInvalid to Pyspark for QuantileDiscretizer and Bucketizer

2016-11-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-18366: --- Assignee: Sandeep Singh > Add handleInvalid to Pyspark for QuantileDiscretizer and

[jira] [Updated] (SPARK-18646) ExecutorClassLoader for spark-shell does not honor spark.executor.userClassPathFirst

2016-11-30 Thread Min Shen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Shen updated SPARK-18646: - Component/s: (was: Spark Core) Spark Shell > ExecutorClassLoader for spark-shell

[jira] [Created] (SPARK-18647) do not put provider in table properties for Hive serde table

2016-11-30 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-18647: --- Summary: do not put provider in table properties for Hive serde table Key: SPARK-18647 URL: https://issues.apache.org/jira/browse/SPARK-18647 Project: Spark

[jira] [Assigned] (SPARK-18645) spark-daemon.sh arguments error lead to throws Unrecognized option

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18645: Assignee: Apache Spark > spark-daemon.sh arguments error lead to throws Unrecognized

[jira] [Commented] (SPARK-18645) spark-daemon.sh arguments error lead to throws Unrecognized option

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708219#comment-15708219 ] Apache Spark commented on SPARK-18645: -- User 'wangyum' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18645) spark-daemon.sh arguments error lead to throws Unrecognized option

2016-11-30 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18645: Assignee: (was: Apache Spark) > spark-daemon.sh arguments error lead to throws

[jira] [Resolved] (SPARK-17932) Failed to run SQL "show table extended like table_name" in Spark2.0.0

2016-11-30 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17932. --- Resolution: Fixed Assignee: Jiang Xingbo Fix Version/s: 2.2.0 >

[jira] [Comment Edited] (SPARK-18097) Can't drop a table from Hive if the schema is corrupt

2016-11-30 Thread Thomas Sebastian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654685#comment-15654685 ] Thomas Sebastian edited comment on SPARK-18097 at 11/30/16 11:39 AM: -

[jira] [Commented] (SPARK-18551) Add functionality to delete event logs from the History Server UI

2016-11-30 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708360#comment-15708360 ] Steve Loughran commented on SPARK-18551: Which JIRA are you using here? FWIW, one issue with

[jira] [Commented] (SPARK-18374) Incorrect words in StopWords/english.txt

2016-11-30 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708181#comment-15708181 ] Sean Owen commented on SPARK-18374: --- I think you can proceed to remove things like "won" but also
