[jira] [Created] (SPARK-18869) Add lp and pp to plan nodes for getting logical plans and physical plans

2016-12-14 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18869: --- Summary: Add lp and pp to plan nodes for getting logical plans and physical plans Key: SPARK-18869 URL: https://issues.apache.org/jira/browse/SPARK-18869 Project:

[jira] [Assigned] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-18854: --- Assignee: Reynold Xin > getNodeNumbered and generateTreeString are not consistent >

[jira] [Resolved] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18854. - Resolution: Fixed Fix Version/s: 2.1.0 2.0.3 Target

[jira] [Commented] (SPARK-18853) Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749405#comment-15749405 ] Reynold Xin commented on SPARK-18853: - Let's do that separately (I thought about doing it but it

[jira] [Resolved] (SPARK-18730) Ask the build script to link to Jenkins test report page instead of full console output page when posting to GitHub

2016-12-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18730. - Resolution: Fixed Fix Version/s: 2.2.0 2.1.0 > Ask the build script to

[jira] [Commented] (SPARK-18853) Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748964#comment-15748964 ] Reynold Xin commented on SPARK-18853: - Can you say more? Are you talking about deeply nested arrays?

[jira] [Updated] (SPARK-18853) Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18853: Description: We currently define statistics in UnaryNode: {code} override def statistics:

[jira] [Created] (SPARK-18856) Newly created catalog table assumed to have 0 rows and 0 bytes

2016-12-13 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18856: --- Summary: Newly created catalog table assumed to have 0 rows and 0 bytes Key: SPARK-18856 URL: https://issues.apache.org/jira/browse/SPARK-18856 Project: Spark

[jira] [Updated] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18854: Target Version/s: 2.0.3, 2.1.1, 2.2.0 (was: 2.1.1, 2.2.0) > getNodeNumbered and

[jira] [Updated] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18854: Description: This is a bug introduced by subquery handling. generateTreeString numbers trees

[jira] [Updated] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18854: Description: This is a bug introduced by subquery handling. generateTreeString numbers trees

[jira] [Commented] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747080#comment-15747080 ] Reynold Xin commented on SPARK-18854: - To test this, introduce a subquery and call

[jira] [Updated] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18854: Description: This is a bug introduced by subquery handling. generateTreeString numbers trees

[jira] [Created] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18854: --- Summary: getNodeNumbered and generateTreeString are not consistent Key: SPARK-18854 URL: https://issues.apache.org/jira/browse/SPARK-18854 Project: Spark

[jira] [Commented] (SPARK-18854) getNodeNumbered and generateTreeString are not consistent

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15747078#comment-15747078 ] Reynold Xin commented on SPARK-18854: - cc [~smilegator] > getNodeNumbered and generateTreeString are

[jira] [Updated] (SPARK-18853) Project (UnaryNode) is way too aggressive in estimating statistics

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18853: Summary: Project (UnaryNode) is way too aggressive in estimating statistics (was: Project is way

[jira] [Created] (SPARK-18853) Project is way too aggressive in estimating statistics

2016-12-13 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18853: --- Summary: Project is way too aggressive in estimating statistics Key: SPARK-18853 URL: https://issues.apache.org/jira/browse/SPARK-18853 Project: Spark Issue

[jira] [Commented] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746414#comment-15746414 ] Reynold Xin commented on SPARK-18676: - That's the other option I was considering. It'd be good to

[jira] [Commented] (SPARK-18676) Spark 2.x query plan data size estimation can crash join queries versus 1.x

2016-12-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15743362#comment-15743362 ] Reynold Xin commented on SPARK-18676: - Can we just increase the size by 5X if it is a Parquet or ORC

[jira] [Updated] (SPARK-18815) NPE when collecting column stats for string/binary column having only null values

2016-12-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18815: Issue Type: Sub-task (was: Bug) Parent: SPARK-16026 > NPE when collecting column stats

[jira] [Resolved] (SPARK-18815) NPE when collecting column stats for string/binary column having only null values

2016-12-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18815. - Resolution: Fixed Assignee: Zhenhua Wang Fix Version/s: 2.2.0

[jira] [Commented] (SPARK-18814) CheckAnalysis rejects TPCDS query 32

2016-12-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15737074#comment-15737074 ] Reynold Xin commented on SPARK-18814: - cc [~hvanhovell] and [~nsyca] > CheckAnalysis rejects TPCDS

[jira] [Commented] (SPARK-18278) Support native submission of spark jobs to a kubernetes cluster

2016-12-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15734033#comment-15734033 ] Reynold Xin commented on SPARK-18278: - In the past few days I've given this a lot of thought. I'm

[jira] [Updated] (SPARK-18774) Ignore non-existing files when ignoreCorruptFiles is enabled

2016-12-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18774: Fix Version/s: 2.1.1 > Ignore non-existing files when ignoreCorruptFiles is enabled >

[jira] [Resolved] (SPARK-18760) Provide consistent format output for all file formats

2016-12-08 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18760. - Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 > Provide consistent

[jira] [Updated] (SPARK-3359) `sbt/sbt unidoc` doesn't work with Java 8

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-3359: --- Fix Version/s: (was: 2.1.1) 2.1.0 > `sbt/sbt unidoc` doesn't work with Java 8

[jira] [Updated] (SPARK-18615) Switch to multi-line doc to avoid a genjavadoc bug for backticks

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18615: Fix Version/s: (was: 2.1.1) 2.1.0 > Switch to multi-line doc to avoid a

[jira] [Updated] (SPARK-18685) Fix all tests in ExecutorClassLoaderSuite to pass on Windows

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18685: Fix Version/s: (was: 2.1.1) 2.1.0 > Fix all tests in

[jira] [Updated] (SPARK-18645) spark-daemon.sh arguments error lead to throws Unrecognized option

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18645: Fix Version/s: (was: 2.1.1) 2.1.0 > spark-daemon.sh arguments error lead to

[jira] [Updated] (SPARK-18762) Web UI should be http:4040 instead of https:4040

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18762: Fix Version/s: (was: 2.1.1) 2.1.0 > Web UI should be http:4040 instead of

[jira] [Updated] (SPARK-18546) UnsafeShuffleWriter corrupts encrypted shuffle files when merging

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18546: Fix Version/s: (was: 2.1.1) 2.1.0 > UnsafeShuffleWriter corrupts encrypted

[jira] [Updated] (SPARK-18774) Ignore non-existing files when ignoreCorruptFiles is enabled

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18774: Fix Version/s: (was: 2.1.1) 2.2.0 > Ignore non-existing files when

[jira] [Resolved] (SPARK-18774) Ignore non-existing files when ignoreCorruptFiles is enabled

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18774. - Resolution: Fixed Fix Version/s: 2.1.1 > Ignore non-existing files when

[jira] [Updated] (SPARK-18745) java.lang.IndexOutOfBoundsException running query 68 Spark SQL on (100TB)

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18745: Target Version/s: (was: 2.1.0) > java.lang.IndexOutOfBoundsException running query 68 Spark SQL

[jira] [Resolved] (SPARK-18654) JacksonParser.makeRootConverter has effectively unreachable code

2016-12-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18654. - Resolution: Fixed Assignee: Nathan Howell Fix Version/s: 2.2.0 >

[jira] [Created] (SPARK-18775) Limit the max number of records written per file

2016-12-07 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18775: --- Summary: Limit the max number of records written per file Key: SPARK-18775 URL: https://issues.apache.org/jira/browse/SPARK-18775 Project: Spark Issue Type:

[jira] [Created] (SPARK-18760) Provide consistent format output for all file formats

2016-12-06 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18760: --- Summary: Provide consistent format output for all file formats Key: SPARK-18760 URL: https://issues.apache.org/jira/browse/SPARK-18760 Project: Spark Issue

[jira] [Closed] (SPARK-11482) Maven repo in IsolatedClientLoader should be configurable.

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-11482. --- Resolution: Later > Maven repo in IsolatedClientLoader should be configurable. >

[jira] [Closed] (SPARK-7263) Add new shuffle manager which stores shuffle blocks in Parquet

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-7263. -- Resolution: Later > Add new shuffle manager which stores shuffle blocks in Parquet >

[jira] [Closed] (SPARK-8398) Consistently expose Hadoop Configuration/JobConf parameters for Hadoop input/output formats

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-8398. -- Resolution: Later > Consistently expose Hadoop Configuration/JobConf parameters for Hadoop >

[jira] [Resolved] (SPARK-16948) Use metastore schema instead of inferring schema for ORC in HiveMetastoreCatalog

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-16948. - Resolution: Fixed Assignee: Eric Liang Fix Version/s: 2.1.0 > Use metastore

[jira] [Updated] (SPARK-18681) Throw Filtering is supported only on partition keys of type string exception

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18681: Target Version/s: 2.1.0 > Throw Filtering is supported only on partition keys of type string

[jira] [Commented] (SPARK-18209) More robust view canonicalization without full SQL expansion

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726694#comment-15726694 ] Reynold Xin commented on SPARK-18209: - I took a look at the change quickly and here are my high level

[jira] [Comment Edited] (SPARK-18209) More robust view canonicalization without full SQL expansion

2016-12-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15726694#comment-15726694 ] Reynold Xin edited comment on SPARK-18209 at 12/6/16 9:01 PM: -- I took a look

[jira] [Resolved] (SPARK-18555) na.fill miss up original values in long integers

2016-12-05 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18555. - Resolution: Fixed Assignee: Song Jun Fix Version/s: 2.2.0 > na.fill miss up

[jira] [Updated] (SPARK-18284) Scheme of DataFrame generated from RDD is diffrent between master and 2.0

2016-12-05 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18284: Fix Version/s: (was: 2.1.0) 2.2.0 > Scheme of DataFrame generated from RDD

[jira] [Updated] (SPARK-18284) Scheme of DataFrame generated from RDD is different between master and 2.0

2016-12-05 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18284: Summary: Scheme of DataFrame generated from RDD is different between master and 2.0 (was: Scheme

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721488#comment-15721488 ] Reynold Xin commented on SPARK-18539: - Why don't we fix the parquet reader so it can tolerate

[jira] [Created] (SPARK-18714) Add a simple time function to SparkSession

2016-12-04 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18714: --- Summary: Add a simple time function to SparkSession Key: SPARK-18714 URL: https://issues.apache.org/jira/browse/SPARK-18714 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-18714) SparkSession.time - a simple timer function

2016-12-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18714: Summary: SparkSession.time - a simple timer function (was: Add a simple time function to

[jira] [Resolved] (SPARK-18702) input_file_block_start and input_file_block_length function

2016-12-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18702. - Resolution: Fixed Fix Version/s: 2.2.0 > input_file_block_start and

[jira] [Created] (SPARK-18702) input_file_block_start and input_file_block_length function

2016-12-03 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18702: --- Summary: input_file_block_start and input_file_block_length function Key: SPARK-18702 URL: https://issues.apache.org/jira/browse/SPARK-18702 Project: Spark

[jira] [Commented] (SPARK-8007) Support resolving virtual columns in DataFrames

2016-12-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15718790#comment-15718790 ] Reynold Xin commented on SPARK-8007: spark_partition_id() is available in PySpark starting 1.6. It's

[jira] [Resolved] (SPARK-18362) Use TextFileFormat in implementation of CSVFileFormat

2016-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18362. - Resolution: Fixed Fix Version/s: 2.2.0 > Use TextFileFormat in implementation of

[jira] [Resolved] (SPARK-18695) Bump master branch version to 2.2.0-SNAPSHOT

2016-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18695. - Resolution: Fixed Fix Version/s: 2.2.0 > Bump master branch version to 2.2.0-SNAPSHOT >

[jira] [Commented] (SPARK-18278) Support native submission of spark jobs to a kubernetes cluster

2016-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717149#comment-15717149 ] Reynold Xin commented on SPARK-18278: - Is there a way to get this working without the project having

[jira] [Updated] (SPARK-18695) Bump master branch version to 2.2.0-SNAPSHOT

2016-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18695: Summary: Bump master branch version to 2.2.0-SNAPSHOT (was: Bump master branch version to 2.2.0)

[jira] [Created] (SPARK-18695) Bump master branch version to 2.2.0

2016-12-02 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18695: --- Summary: Bump master branch version to 2.2.0 Key: SPARK-18695 URL: https://issues.apache.org/jira/browse/SPARK-18695 Project: Spark Issue Type: Task

[jira] [Resolved] (SPARK-18690) Backward compatibility of unbounded frames

2016-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18690. - Resolution: Fixed Assignee: Maciej Szymkiewicz Fix Version/s: 2.1.0 > Backward

[jira] [Closed] (SPARK-11705) Eliminate unnecessary Cartesian Join

2016-12-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-11705. --- Resolution: Cannot Reproduce > Eliminate unnecessary Cartesian Join >

[jira] [Updated] (SPARK-16845) org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificOrdering" grows beyond 64 KB

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16845: Component/s: (was: Java API) >

[jira] [Updated] (SPARK-18661) Creating a partitioned datasource table should not scan all files for table

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18661: Issue Type: Sub-task (was: Bug) Parent: SPARK-17861 > Creating a partitioned datasource

[jira] [Updated] (SPARK-18679) Regression in file listing performance

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18679: Issue Type: Sub-task (was: Bug) Parent: SPARK-17861 > Regression in file listing

[jira] [Updated] (SPARK-18659) Incorrect behaviors in overwrite table for datasource tables

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18659: Issue Type: Sub-task (was: Bug) Parent: SPARK-17861 > Incorrect behaviors in overwrite

[jira] [Resolved] (SPARK-18640) Fix minor synchronization issue in TaskSchedulerImpl.runningTasksByExecutors

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18640. - Resolution: Fixed Fix Version/s: 2.1.0 2.0.3 Target

[jira] [Commented] (SPARK-18640) Fix minor synchronization issue in TaskSchedulerImpl.runningTasksByExecutors

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15714189#comment-15714189 ] Reynold Xin commented on SPARK-18640: - [~andrewor14] how come you didn't close the ticket? > Fix

[jira] [Resolved] (SPARK-17213) Parquet String Pushdown for Non-Eq Comparisons Broken

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17213. - Resolution: Fixed Fix Version/s: 2.1.0 > Parquet String Pushdown for Non-Eq Comparisons

[jira] [Resolved] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18658. - Resolution: Fixed Fix Version/s: 2.2.0 > Writing to a text DataSource buffers one or more

[jira] [Resolved] (SPARK-18663) Simplify CountMinSketch aggregate implementation

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18663. - Resolution: Fixed Fix Version/s: 2.2.0 > Simplify CountMinSketch aggregate implementation

[jira] [Resolved] (SPARK-18639) Build only a single pip package

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18639. - Resolution: Fixed Fix Version/s: 2.1.0 > Build only a single pip package >

[jira] [Updated] (SPARK-18617) Close "kryo auto pick" feature for Spark Streaming

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18617: Fix Version/s: 2.0.3 > Close "kryo auto pick" feature for Spark Streaming >

[jira] [Resolved] (SPARK-18666) Remove the codes checking deprecated config spark.sql.unsafe.enabled

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18666. - Resolution: Fixed Assignee: Liang-Chi Hsieh Fix Version/s: 2.1.0 > Remove the

[jira] [Updated] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18658: Affects Version/s: (was: 2.0.2) > Writing to a text DataSource buffers one or more lines in

[jira] [Updated] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18658: Target Version/s: 2.2.0 > Writing to a text DataSource buffers one or more lines in memory >

[jira] [Updated] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18658: Issue Type: Sub-task (was: Improvement) Parent: SPARK-18352 > Writing to a text

[jira] [Updated] (SPARK-18658) Writing to a text DataSource buffers one or more lines in memory

2016-12-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18658: Assignee: Nathan Howell > Writing to a text DataSource buffers one or more lines in memory >

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710618#comment-15710618 ] Reynold Xin commented on SPARK-16026: - [~ZenWzh] can we start working on operator cardinality

[jira] [Created] (SPARK-18663) Simplify CountMinSketch aggregate implementation

2016-11-30 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18663: --- Summary: Simplify CountMinSketch aggregate implementation Key: SPARK-18663 URL: https://issues.apache.org/jira/browse/SPARK-18663 Project: Spark Issue Type:

[jira] [Commented] (SPARK-18536) Failed to save to hive table when case class with empty field

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709551#comment-15709551 ] Reynold Xin commented on SPARK-18536: - We need to add a PreWriteCheck for Parquet. > Failed to save

[jira] [Updated] (SPARK-18536) Failed to save to hive table when case class with empty field

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18536: Description: {code}import scala.collection.mutable.Queue import org.apache.spark.SparkConf import

[jira] [Resolved] (SPARK-18220) ClassCastException occurs when using select query on ORC file

2016-11-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18220. - Resolution: Fixed Fix Version/s: 2.1.0 > ClassCastException occurs when using select

[jira] [Resolved] (SPARK-18617) Close "kryo auto pick" feature for Spark Streaming

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18617. - Resolution: Fixed Assignee: Genmao Yu Fix Version/s: 2.1.0 > Close "kryo auto

[jira] [Resolved] (SPARK-18145) Update documentation for hive partition management in 2.1

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18145. - Resolution: Fixed Assignee: Eric Liang Fix Version/s: 2.1.0 > Update

[jira] [Resolved] (SPARK-17861) Store data source partitions in metastore and push partition pruning into metastore

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17861. - Resolution: Fixed Fix Version/s: 2.1.0 > Store data source partitions in metastore and

[jira] [Resolved] (SPARK-18632) AggregateFunction should not ImplicitCastInputTypes

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18632. - Resolution: Fixed Fix Version/s: 2.2.0 > AggregateFunction should not

[jira] [Created] (SPARK-18639) Build only a single pip package

2016-11-29 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-18639: --- Summary: Build only a single pip package Key: SPARK-18639 URL: https://issues.apache.org/jira/browse/SPARK-18639 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-18429) SQL aggregate function for CountMinSketch

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18429: Issue Type: Sub-task (was: New Feature) Parent: SPARK-16026 > SQL aggregate function for

[jira] [Resolved] (SPARK-18429) SQL aggregate function for CountMinSketch

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18429. - Resolution: Fixed Assignee: Zhenhua Wang Fix Version/s: 2.2.0 > SQL aggregate

[jira] [Updated] (SPARK-18429) SQL aggregate function for CountMinSketch

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18429: Summary: SQL aggregate function for CountMinSketch (was: implement a new Aggregate for

[jira] [Updated] (SPARK-18632) AggregateFunction should not ImplicitCastInputTypes

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18632: Target Version/s: 2.2.0 > AggregateFunction should not ImplicitCastInputTypes >

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706069#comment-15706069 ] Reynold Xin commented on SPARK-17204: - local-cluster is different from the local mode. It is a local

[jira] [Commented] (SPARK-17204) Spark 2.0 off heap RDD persistence with replication factor 2 leads to in-memory data corruption

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15706030#comment-15706030 ] Reynold Xin commented on SPARK-17204: - Can you try repro this using the local-cluster mode? > Spark

[jira] [Commented] (SPARK-18352) Parse normal, multi-line JSON files (not just JSON Lines)

2016-11-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15705944#comment-15705944 ] Reynold Xin commented on SPARK-18352: - I've asked [~joshrosen] to do that only for the text format,

[jira] [Updated] (SPARK-15689) Data source API v2

2016-11-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-15689: Labels: releasenotes (was: ) > Data source API v2 > -- > > Key:

[jira] [Updated] (SPARK-18350) Support session local timezone

2016-11-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18350: Labels: releasenotes (was: ) > Support session local timezone > -- >

[jira] [Updated] (SPARK-18352) Parse normal, multi-line JSON files (not just JSON Lines)

2016-11-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18352: Labels: releasenotes (was: ) > Parse normal, multi-line JSON files (not just JSON Lines) >

[jira] [Updated] (SPARK-16475) Broadcast Hint for SQL Queries

2016-11-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-16475: Labels: releasenotes (was: ) > Broadcast Hint for SQL Queries > -- >

[jira] [Updated] (SPARK-18590) R - Include package vignettes and help pages, build source package in Spark distribution

2016-11-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18590: Issue Type: New Feature (was: Bug) > R - Include package vignettes and help pages, build source

[jira] [Updated] (SPARK-18590) R - Include package vignettes and help pages, build source package in Spark distribution

2016-11-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18590: Priority: Major (was: Blocker) > R - Include package vignettes and help pages, build source

[jira] [Updated] (SPARK-18590) R - Include package vignettes and help pages, build source package in Spark distribution

2016-11-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18590: Target Version/s: (was: 2.1.0) > R - Include package vignettes and help pages, build source
