[jira] [Commented] (SPARK-17307) Document what all access is needed on S3 bucket when trying to save a model

2016-09-06 Thread Aseem Bansal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1540#comment-1540 ] Aseem Bansal commented on SPARK-17307: -- Not adding it there would be fine. But there needs to be

[jira] [Assigned] (SPARK-17410) Move Hive-generated Stats Info to HiveClientImpl

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17410: Assignee: (was: Apache Spark) > Move Hive-generated Stats Info to HiveClientImpl >

[jira] [Commented] (SPARK-17284) Remove statistics-related table properties from SHOW CREATE TABLE

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15466688#comment-15466688 ] Apache Spark commented on SPARK-17284: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17410) Move Hive-generated Stats Info to HiveClientImpl

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17410: Assignee: Apache Spark > Move Hive-generated Stats Info to HiveClientImpl >

[jira] [Reopened] (SPARK-11301) filter on partitioned column is case sensitive even the context is case insensitive

2016-09-06 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reopened SPARK-11301: --- Please see the following.

[jira] [Resolved] (SPARK-17369) MetastoreRelation toJSON throws exception

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17369. - Resolution: Fixed Fix Version/s: 2.0.1 > MetastoreRelation toJSON throws exception >

[jira] [Commented] (SPARK-11301) filter on partitioned column is case sensitive even the context is case insensitive

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1546#comment-1546 ] Apache Spark commented on SPARK-11301: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-17405) Simple aggregation query OOMing after SPARK-16525

2016-09-06 Thread Qifan Pu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15466676#comment-15466676 ] Qifan Pu commented on SPARK-17405: -- [~joshrosen] Thanks for reporting. I haven't been able to reproduce

[jira] [Created] (SPARK-17410) Move Hive-generated Stats Info to HiveClientImpl

2016-09-06 Thread Xiao Li (JIRA)
Xiao Li created SPARK-17410: --- Summary: Move Hive-generated Stats Info to HiveClientImpl Key: SPARK-17410 URL: https://issues.apache.org/jira/browse/SPARK-17410 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15466879#comment-15466879 ] Sean Owen commented on SPARK-17396: --- Yeah [~rdblue] is right on the mark then. I agree, I wasn't clear

[jira] [Issue Comment Deleted] (SPARK-12844) Spark documentation should be more precise about the algebraic properties of functions in various transformations

2016-09-06 Thread Jagadeesan A S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jagadeesan A S updated SPARK-12844: --- Comment: was deleted (was: Started working on this.) > Spark documentation should be more

[jira] [Updated] (SPARK-17397) Show example of what to do when awaitTermination() throws an Exception

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17397: -- Component/s: Documentation Issue Type: Improvement (was: Question) Summary: Show example

[jira] [Commented] (SPARK-17397) what to do when awaitTermination() throws?

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15466898#comment-15466898 ] Sean Owen commented on SPARK-17397: --- I think it's reasonable to show the try { awaitTermination() }
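
The comment above is cut off after the `try { awaitTermination() }` fragment, so what it recommends doing in the handler is not visible here. As an illustration only, one possible shape of such an example is sketched below. `StubStreamingContext` and `run_until_done` are hypothetical names invented for this sketch; the stub is a stand-in so the snippet runs without Spark and is not the PySpark class.

```python
class StubStreamingContext:
    """Hypothetical stand-in for a streaming context; NOT the PySpark class."""

    def __init__(self, fail: bool):
        self.fail = fail
        self.stopped = False

    def awaitTermination(self):
        # Simulate a streaming job that either ends cleanly or fails.
        if self.fail:
            raise RuntimeError("streaming job failed")

    def stop(self):
        self.stopped = True


def run_until_done(ssc) -> str:
    """Wait for the job, report a failure if one occurs, and always stop the context."""
    try:
        ssc.awaitTermination()
        return "finished"
    except Exception as err:
        # Assumption: the caller wants the failure reported rather than re-raised.
        return f"failed: {err}"
    finally:
        # Runs on both paths, so the context is stopped either way.
        ssc.stop()
```

Whether to swallow, log, or re-raise the exception is a design choice the truncated comment does not settle; the sketch only shows where that decision would sit.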

[jira] [Commented] (SPARK-17400) MinMaxScaler.transform() outputs DenseVector by default, which causes poor performance

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15466905#comment-15466905 ] Sean Owen commented on SPARK-17400: --- [~mlnick] is right -- scaling any sparse representation is going

[jira] [Commented] (SPARK-12844) Spark documentation should be more precise about the algebraic properties of functions in various transformations

2016-09-06 Thread Jagadeesan A S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15466916#comment-15466916 ] Jagadeesan A S commented on SPARK-12844: The algebraic properties have already been taken care by

[jira] [Commented] (SPARK-17400) MinMaxScaler.transform() outputs DenseVector by default, which causes poor performance

2016-09-06 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15466957#comment-15466957 ] Nick Pentreath commented on SPARK-17400: Could you explain further why you want to min-max scale

[jira] [Resolved] (SPARK-17361) file-based external table without path should not be created

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17361. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14921

[jira] [Created] (SPARK-17419) Mesos virtual network support

2016-09-06 Thread Michael Gummelt (JIRA)
Michael Gummelt created SPARK-17419: --- Summary: Mesos virtual network support Key: SPARK-17419 URL: https://issues.apache.org/jira/browse/SPARK-17419 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Ruben Hernando (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468528#comment-15468528 ] Ruben Hernando commented on SPARK-17403: I'm sorry I can't share the data. This is a 2 tables

[jira] [Updated] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17377: --- Description: Reproduction: 1) Read two Datasets from a partitioned Parquet file with different

[jira] [Commented] (SPARK-17316) Don't block StandaloneSchedulerBackend.executorRemoved

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468566#comment-15468566 ] Apache Spark commented on SPARK-17316: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-17377: -- Assignee: Davies Liu > Joining Datasets read and aggregated from a partitioned Parquet file

[jira] [Commented] (SPARK-17377) Joining Datasets read and aggregated from a partitioned Parquet file gives wrong results

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468583#comment-15468583 ] Davies Liu commented on SPARK-17377: Tested this with latest master and 2.0 on databricks[1], they

[jira] [Resolved] (SPARK-17299) TRIM/LTRIM/RTRIM strips characters other than spaces

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17299. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull

[jira] [Commented] (SPARK-17403) Fatal Error: Scan cached strings

2016-09-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468449#comment-15468449 ] Davies Liu commented on SPARK-17403: [~rhernando] Could you pull out the string column (SL_RD_ColR_N)

[jira] [Commented] (SPARK-17417) Fix # of partitions for RDD while checkpointing - Currently limited by 10000(%05d)

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468519#comment-15468519 ] Sean Owen commented on SPARK-17417: --- I'd bump the padding to allow 10 digits, because that would
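
To illustrate why the width of a `%05d`-style pad matters, here is a small sketch. It does not reproduce Spark's checkpoint code: `part_name` and the `part-` prefix are hypothetical, and the idea that the problem shows up through name ordering is an assumption, since the comment above is truncated. It only demonstrates a general property of zero-padded names: once an index needs more digits than the pad width, the names stop having a uniform length and no longer sort in numeric order.

```python
def part_name(index: int, width: int = 5) -> str:
    """Build a zero-padded name such as 'part-00042' (hypothetical naming scheme)."""
    return f"part-{index:0{width}d}"


# Indices listed in numeric order; the last one needs six digits.
indices = [9, 99999, 100000]

# Five-digit padding: the six-digit index overflows the pad width.
narrow = [part_name(i, 5) for i in indices]

# Ten-digit padding, the width suggested in the comment above.
wide = [part_name(i, 10) for i in indices]

# Lexicographic sorting keeps numeric order only while every name has the same length.
narrow_in_order = sorted(narrow) == narrow
wide_in_order = sorted(wide) == wide
```

With the five-digit pad, `narrow_in_order` is false because `part-100000` sorts before `part-99999`; with the ten-digit pad, `wide_in_order` is true.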

[jira] [Commented] (SPARK-17420) Install rmarkdown R package on Jenkins machines

2016-09-06 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468539#comment-15468539 ] Shivaram Venkataraman commented on SPARK-17420: --- This came up in

[jira] [Resolved] (SPARK-17378) Upgrade snappy-java to 1.1.2.6

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17378. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 1.6.3

[jira] [Commented] (SPARK-17417) Fix # of partitions for RDD while checkpointing - Currently limited by 10000(%05d)

2016-09-06 Thread Dhruve Ashar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468622#comment-15468622 ] Dhruve Ashar commented on SPARK-17417: -- Thanks for the suggestion. I'll work on the changes and

[jira] [Created] (SPARK-17422) Update Ganglia project with new license

2016-09-06 Thread Luciano Resende (JIRA)
Luciano Resende created SPARK-17422: --- Summary: Update Ganglia project with new license Key: SPARK-17422 URL: https://issues.apache.org/jira/browse/SPARK-17422 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17368) Scala value classes create encoder problems and break at runtime

2016-09-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468833#comment-15468833 ] Jakob Odersky commented on SPARK-17368: --- So I thought about this a bit more and although it is

[jira] [Updated] (SPARK-17422) Update Ganglia project with new license

2016-09-06 Thread Luciano Resende (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luciano Resende updated SPARK-17422: Description: It seems that Ganglia is now BSD licensed http://ganglia.info/ and

[jira] [Comment Edited] (SPARK-17368) Scala value classes create encoder problems and break at runtime

2016-09-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468833#comment-15468833 ] Jakob Odersky edited comment on SPARK-17368 at 9/6/16 10:57 PM: So I

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2016-09-06 Thread Srinath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468891#comment-15468891 ] Srinath commented on SPARK-16026: - I have a couple of comments/questions on the proposal. Regarding the

[jira] [Created] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

2016-09-06 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-17424: - Summary: Dataset job fails from unsound substitution in ScalaReflect Key: SPARK-17424 URL: https://issues.apache.org/jira/browse/SPARK-17424 Project: Spark Issue

[jira] [Assigned] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17396: Assignee: (was: Apache Spark) > Threads number keep increasing when query on external

[jira] [Assigned] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17396: Assignee: Apache Spark > Threads number keep increasing when query on external CSV

[jira] [Commented] (SPARK-17296) Spark SQL: cross join + two joins = BUG

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468928#comment-15468928 ] Apache Spark commented on SPARK-17296: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Commented] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468943#comment-15468943 ] Apache Spark commented on SPARK-17421: -- User 'frreiss' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17421: Assignee: (was: Apache Spark) > Warnings about "MaxPermSize" parameter when building

[jira] [Assigned] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17421: Assignee: Apache Spark > Warnings about "MaxPermSize" parameter when building with Maven

[jira] [Resolved] (SPARK-15891) Make YARN logs less noisy

2016-09-06 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-15891. Resolution: Fixed Assignee: Marcelo Vanzin Fix Version/s: 2.1.0 > Make

[jira] [Created] (SPARK-17423) Support IGNORE NULLS option in Window functions

2016-09-06 Thread Tim Chan (JIRA)
Tim Chan created SPARK-17423: Summary: Support IGNORE NULLS option in Window functions Key: SPARK-17423 URL: https://issues.apache.org/jira/browse/SPARK-17423 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-17253) Left join where ON clause does not reference the right table produces analysis error

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17253. --- Resolution: Duplicate Assignee: Herman van Hovell Fix Version/s:

[jira] [Resolved] (SPARK-17296) Spark SQL: cross join + two joins = BUG

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17296. --- Resolution: Fixed Fix Version/s: 2.1.0 > Spark SQL: cross join + two joins =

[jira] [Assigned] (SPARK-17296) Spark SQL: cross join + two joins = BUG

2016-09-06 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell reassigned SPARK-17296: - Assignee: Herman van Hovell > Spark SQL: cross join + two joins = BUG >

[jira] [Updated] (SPARK-17424) Dataset job fails from unsound substitution in ScalaReflect

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-17424: -- Description: I have a job that uses datasets in 1.6.1 and is failing with this error: {code} 16/09/02

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468929#comment-15468929 ] Apache Spark commented on SPARK-17396: -- User 'rdblue' has created a pull request for this issue:

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15468932#comment-15468932 ] Ryan Blue commented on SPARK-17396: --- I opened a PR with a fix. It still uses a ForkJoinPool because the

[jira] [Resolved] (SPARK-17356) A large Metadata filed in Alias can cause OOM when calling TreeNode.toJSON

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17356. - Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 > A large Metadata filed

[jira] [Commented] (SPARK-5091) Hooks for PySpark tasks

2016-09-06 Thread Semet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15466855#comment-15466855 ] Semet commented on SPARK-5091: -- It is a better option to use virtualenv and proper installation with pip,

[jira] [Updated] (SPARK-17356) A large Metadata filed in Alias can cause OOM when calling TreeNode.toJSON

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17356: Assignee: Sean Zhong > A large Metadata filed in Alias can cause OOM when calling TreeNode.toJSON

[jira] [Assigned] (SPARK-11301) filter on partitioned column is case sensitive even the context is case insensitive

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11301: Assignee: Apache Spark (was: Wenchen Fan) > filter on partitioned column is case

[jira] [Assigned] (SPARK-11301) filter on partitioned column is case sensitive even the context is case insensitive

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11301: Assignee: Wenchen Fan (was: Apache Spark) > filter on partitioned column is case

[jira] [Assigned] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17425: Assignee: Apache Spark > Override sameResult in HiveTableScanExec to make ReusedExchange

[jira] [Assigned] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17425: Assignee: (was: Apache Spark) > Override sameResult in HiveTableScanExec to make

[jira] [Commented] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReusedExchange work in text format table

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469249#comment-15469249 ] Apache Spark commented on SPARK-17425: -- User 'watermen' has created a pull request for this issue:

[jira] [Updated] (SPARK-17425) Override sameResult in HiveTableScanExec to make ReuseExchange work in text format table

2016-09-06 Thread Yadong Qi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated SPARK-17425: -- Summary: Override sameResult in HiveTableScanExec to make ReuseExchange work in text format table

[jira] [Updated] (SPARK-17279) better error message for exceptions during ScalaUDF execution

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17279: Fix Version/s: 2.0.1 > better error message for exceptions during ScalaUDF execution >

[jira] [Resolved] (SPARK-17372) Running a file stream on a directory with partitioned subdirs throw NotSerializableException/StackOverflowError

2016-09-06 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-17372. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by

[jira] [Commented] (SPARK-17110) Pyspark with locality ANY throw java.io.StreamCorruptedException

2016-09-06 Thread Tomer Kaftan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469295#comment-15469295 ] Tomer Kaftan commented on SPARK-17110: -- Thanks all who helped out with this! > Pyspark with

[jira] [Commented] (SPARK-17381) Memory leak org.apache.spark.sql.execution.ui.SQLTaskMetrics

2016-09-06 Thread Joao Duarte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467102#comment-15467102 ] Joao Duarte commented on SPARK-17381: - Well, the application is stable after 24h+ (and running). If

[jira] [Resolved] (SPARK-11301) filter on partitioned column is case sensitive even the context is case insensitive

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-11301. - Resolution: Fixed Fix Version/s: (was: 1.6.0) 1.6.2 > filter on

[jira] [Commented] (SPARK-8813) Combine files when there're many small files in table

2016-09-06 Thread Harsh J (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467084#comment-15467084 ] Harsh J commented on SPARK-8813: Note that this was done instead by SPARK-13664 and should be marked

[jira] [Commented] (SPARK-17307) Document what all access is needed on S3 bucket when trying to save a model

2016-09-06 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467247#comment-15467247 ] Steve Loughran commented on SPARK-17307: It's not yet in there. If you got the SPARK-7481 JIRA

[jira] [Comment Edited] (SPARK-8813) Combine files when there're many small files in table

2016-09-06 Thread Harsh J (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467084#comment-15467084 ] Harsh J edited comment on SPARK-8813 at 9/6/16 10:37 AM: - Note that this was done

[jira] [Commented] (SPARK-17381) Memory leak org.apache.spark.sql.execution.ui.SQLTaskMetrics

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467108#comment-15467108 ] Sean Owen commented on SPARK-17381: --- The issue is that it's maintaining min/max stats for columns,

[jira] [Updated] (SPARK-17381) Memory leak org.apache.spark.sql.execution.ui.SQLTaskMetrics

2016-09-06 Thread Joao Duarte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joao Duarte updated SPARK-17381: Issue Type: Improvement (was: Bug) > Memory leak

[jira] [Updated] (SPARK-17356) A large Metadata filed in Alias can cause OOM when calling TreeNode.toJSON

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17356: Fix Version/s: 1.6.3 > A large Metadata filed in Alias can cause OOM when calling TreeNode.toJSON

[jira] [Commented] (SPARK-17381) Memory leak org.apache.spark.sql.execution.ui.SQLTaskMetrics

2016-09-06 Thread Joao Duarte (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467224#comment-15467224 ] Joao Duarte commented on SPARK-17381: - Oh, I see. I'll change the Issue type from Bug to Improvement

[jira] [Reopened] (SPARK-8813) Combine files when there're many small files in table

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-8813: -- > Combine files when there're many small files in table >

[jira] [Resolved] (SPARK-8813) Combine files when there're many small files in table

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-8813. -- Resolution: Duplicate Assignee: (was: Michael Armbrust) Fix Version/s: (was:

[jira] [Commented] (SPARK-17381) Memory leak org.apache.spark.sql.execution.ui.SQLTaskMetrics

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467322#comment-15467322 ] Sean Owen commented on SPARK-17381: --- Yeah, I didn't mean disable particular types of stats for

[jira] [Created] (SPARK-17411) Cannot set fromOffsets in createDirectStream function

2016-09-06 Thread Piotr Milanowski (JIRA)
Piotr Milanowski created SPARK-17411: Summary: Cannot set fromOffsets in createDirectStream function Key: SPARK-17411 URL: https://issues.apache.org/jira/browse/SPARK-17411 Project: Spark

[jira] [Assigned] (SPARK-17306) QuantileSummaries doesn't compress

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17306: Assignee: (was: Apache Spark) > QuantileSummaries doesn't compress >

[jira] [Commented] (SPARK-17306) QuantileSummaries doesn't compress

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467466#comment-15467466 ] Apache Spark commented on SPARK-17306: -- User 'srowen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17306) QuantileSummaries doesn't compress

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17306: Assignee: Apache Spark > QuantileSummaries doesn't compress >

[jira] [Comment Edited] (SPARK-17412) FsHistoryProviderSuite - FAILED

2016-09-06 Thread Amita Chaudhary (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467479#comment-15467479 ] Amita Chaudhary edited comment on SPARK-17412 at 9/6/16 2:04 PM: - yes, it

[jira] [Commented] (SPARK-17413) spark-shell loses gnu readline support after suspend and continue

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467498#comment-15467498 ] Sean Owen commented on SPARK-17413: --- Is this a Spark issue though? > spark-shell loses gnu readline

[jira] [Closed] (SPARK-17411) Cannot set fromOffsets in createDirectStream function

2016-09-06 Thread Piotr Milanowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Milanowski closed SPARK-17411. Resolution: Fixed Duplicate of https://issues.apache.org/jira/browse/SPARK-16950 > Cannot

[jira] [Commented] (SPARK-17412) FsHistoryProviderSuite - FAILED

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467470#comment-15467470 ] Sean Owen commented on SPARK-17412: --- Does this fail consistently? It seems to pass in master. It could

[jira] [Commented] (SPARK-17411) Cannot set fromOffsets in createDirectStream function

2016-09-06 Thread Piotr Milanowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467469#comment-15467469 ] Piotr Milanowski commented on SPARK-17411: -- I'll just add that I am using Python 3.5 and

[jira] [Commented] (SPARK-17412) FsHistoryProviderSuite - FAILED

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467501#comment-15467501 ] Sean Owen commented on SPARK-17412: --- It's not failing in the Spark CI environment, which suggests the

[jira] [Commented] (SPARK-3261) KMeans clusterer can return duplicate cluster centers

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467564#comment-15467564 ] Apache Spark commented on SPARK-3261: - User 'srowen' has created a pull request for this issue:

[jira] [Commented] (SPARK-6235) Address various 2G limits

2016-09-06 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467563#comment-15467563 ] Apache Spark commented on SPARK-6235: - User 'witgo' has created a pull request for this issue:

[jira] [Reopened] (SPARK-17411) Cannot set fromOffsets in createDirectStream function

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-17411: --- "Fixed" is not the right resolution -- Duplicate is more useful. > Cannot set fromOffsets in

[jira] [Resolved] (SPARK-17411) Cannot set fromOffsets in createDirectStream function

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17411. --- Resolution: Duplicate > Cannot set fromOffsets in createDirectStream function >

[jira] [Created] (SPARK-17413) spark-shell loses gnu readline support after suspend and continue

2016-09-06 Thread Carl Zmola (JIRA)
Carl Zmola created SPARK-17413: -- Summary: spark-shell loses gnu readline support after suspend and continue Key: SPARK-17413 URL: https://issues.apache.org/jira/browse/SPARK-17413 Project: Spark

[jira] [Commented] (SPARK-17412) FsHistoryProviderSuite - FAILED

2016-09-06 Thread Amita Chaudhary (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467479#comment-15467479 ] Amita Chaudhary commented on SPARK-17412: - yes, it is failing consistently for me, is there any

[jira] [Commented] (SPARK-17413) spark-shell loses gnu readline support after suspend and continue

2016-09-06 Thread Carl Zmola (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467511#comment-15467511 ] Carl Zmola commented on SPARK-17413: I don't know. Is there an upstream project that I can check

[jira] [Resolved] (SPARK-17374) Improves the error message when fails to parse some json file lines in DataFrameReader

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-17374. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14929

[jira] [Commented] (SPARK-17413) spark-shell loses gnu readline support after suspend and continue

2016-09-06 Thread Carl Zmola (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467544#comment-15467544 ] Carl Zmola commented on SPARK-17413: Our comments crossed paths. It doesn't work in Scala 2.9.2 or

[jira] [Updated] (SPARK-17374) Improves the error message when fails to parse some json file lines in DataFrameReader

2016-09-06 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-17374: Assignee: Sean Zhong > Improves the error message when fails to parse some json file lines in >

[jira] [Created] (SPARK-17412) FsHistoryProviderSuite - FAILED

2016-09-06 Thread Amita Chaudhary (JIRA)
Amita Chaudhary created SPARK-17412: --- Summary: FsHistoryProviderSuite - FAILED Key: SPARK-17412 URL: https://issues.apache.org/jira/browse/SPARK-17412 Project: Spark Issue Type: Test

[jira] [Updated] (SPARK-17306) QuantileSummaries doesn't compress

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17306: -- Summary: QuantileSummaries doesn't compress (was: Memory leak in QuantileSummaries) >

[jira] [Commented] (SPARK-17413) spark-shell loses gnu readline support after suspend and continue

2016-09-06 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467527#comment-15467527 ] Sean Owen commented on SPARK-17413: --- I was going to say the Scala shell though that seems to work (in

[jira] [Commented] (SPARK-17413) spark-shell loses gnu readline support after suspend and continue

2016-09-06 Thread Carl Zmola (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467525#comment-15467525 ] Carl Zmola commented on SPARK-17413: The problem exists upstream with the Scala shell. I will file a

[jira] [Comment Edited] (SPARK-17413) spark-shell loses gnu readline support after suspend and continue

2016-09-06 Thread Carl Zmola (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467511#comment-15467511 ] Carl Zmola edited comment on SPARK-17413 at 9/6/16 2:21 PM: I don't know. Is

[jira] [Created] (SPARK-17414) Set type is not supported for creating data frames

2016-09-06 Thread Emre Colak (JIRA)
Emre Colak created SPARK-17414: -- Summary: Set type is not supported for creating data frames Key: SPARK-17414 URL: https://issues.apache.org/jira/browse/SPARK-17414 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17396) Threads number keep increasing when query on external CSV partitioned table

2016-09-06 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15467956#comment-15467956 ] Ryan Blue commented on SPARK-17396: --- I'll put together a patch for this with a shared executor service.
