[jira] [Updated] (SPARK-21404) Simple Vectorized Python UDFs using Arrow

2017-10-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21404: Issue Type: Sub-task (was: Improvement) Parent: SPARK-22216 > Simple Vectorized Python

[jira] [Commented] (SPARK-22216) Improving PySpark/Pandas interoperability

2017-10-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195596#comment-16195596 ] Reynold Xin commented on SPARK-22216: - What you'd want to do is to move those to become subtasks. I

[jira] [Assigned] (SPARK-15689) Data source API v2

2017-10-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-15689: --- Assignee: Wenchen Fan > Data source API v2 > -- > > Key:

[jira] [Resolved] (SPARK-22160) Allow changing sample points per partition in range shuffle exchange

2017-09-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-22160. - Resolution: Fixed Fix Version/s: 2.3.0 > Allow changing sample points per partition in

[jira] [Closed] (SPARK-15687) Columnar execution engine

2017-09-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-15687. --- Resolution: Later Actually closing this, since with whole-stage code generation, it's unclear what

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-09-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16185166#comment-16185166 ] Reynold Xin commented on SPARK-21190: - OK it would be great to have a better error message, e.g.

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-09-28 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16185155#comment-16185155 ] Reynold Xin commented on SPARK-21190: - Where did we settle on 0-arg UDFs? I think we should just

[jira] [Created] (SPARK-22160) Allow changing sample points per partition in range shuffle exchange

2017-09-28 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-22160: --- Summary: Allow changing sample points per partition in range shuffle exchange Key: SPARK-22160 URL: https://issues.apache.org/jira/browse/SPARK-22160 Project: Spark

[jira] [Created] (SPARK-22159) spark.sql.execution.arrow.enable and spark.sql.codegen.aggregate.map.twolevel.enable -> enabled

2017-09-28 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-22159: --- Summary: spark.sql.execution.arrow.enable and spark.sql.codegen.aggregate.map.twolevel.enable -> enabled Key: SPARK-22159 URL: https://issues.apache.org/jira/browse/SPARK-22159

[jira] [Created] (SPARK-22153) Rename ShuffleExchange -> ShuffleExchangeExec

2017-09-27 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-22153: --- Summary: Rename ShuffleExchange -> ShuffleExchangeExec Key: SPARK-22153 URL: https://issues.apache.org/jira/browse/SPARK-22153 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-09-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177886#comment-16177886 ] Reynold Xin commented on SPARK-21190: - Maybe create an umbrella ticket so it is easier to link. >

[jira] [Commented] (SPARK-21914) Running examples as tests in SQL builtin function documentation

2017-09-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157542#comment-16157542 ] Reynold Xin commented on SPARK-21914: - Thanks for pinging me. Given the test coverage for this is

[jira] [Commented] (SPARK-21867) Support async spilling in UnsafeShuffleWriter

2017-08-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16148090#comment-16148090 ] Reynold Xin commented on SPARK-21867: - This makes sense. The devil is in the details though (e.g. how

[jira] [Commented] (SPARK-15689) Data source API v2

2017-08-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16145309#comment-16145309 ] Reynold Xin commented on SPARK-15689: - That seems like an issue orthogonal to the API described here.

[jira] [Commented] (SPARK-15689) Data source API v2

2017-08-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138607#comment-16138607 ] Reynold Xin commented on SPARK-15689: - Not the author but my guess is that the other approach

[jira] [Updated] (SPARK-21778) Simpler Dataset.sample API in Scala / Java

2017-08-17 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21778: Summary: Simpler Dataset.sample API in Scala / Java (was: Simpler Dataset.sample API in Scala) >

[jira] [Created] (SPARK-21779) Simpler Dataset.sample API in Python

2017-08-17 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21779: --- Summary: Simpler Dataset.sample API in Python Key: SPARK-21779 URL: https://issues.apache.org/jira/browse/SPARK-21779 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-21780) Simpler Dataset.sample API in R

2017-08-17 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21780: --- Summary: Simpler Dataset.sample API in R Key: SPARK-21780 URL: https://issues.apache.org/jira/browse/SPARK-21780 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-21778) Simpler Dataset.sample API in Scala

2017-08-17 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21778: --- Summary: Simpler Dataset.sample API in Scala Key: SPARK-21778 URL: https://issues.apache.org/jira/browse/SPARK-21778 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-21777) Simpler Dataset.sample API

2017-08-17 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21777: --- Summary: Simpler Dataset.sample API Key: SPARK-21777 URL: https://issues.apache.org/jira/browse/SPARK-21777 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-21726) Check for structural integrity of the plan in QO in test mode

2017-08-14 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21726: --- Summary: Check for structural integrity of the plan in QO in test mode Key: SPARK-21726 URL: https://issues.apache.org/jira/browse/SPARK-21726 Project: Spark

[jira] [Commented] (SPARK-21726) Check for structural integrity of the plan in QO in test mode

2017-08-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126115#comment-16126115 ] Reynold Xin commented on SPARK-21726: - cc [~viirya] would you be interested in doing this? > Check

[jira] [Updated] (SPARK-21699) Remove unused getTableOption in ExternalCatalog

2017-08-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21699: Fix Version/s: 2.2.1 > Remove unused getTableOption in ExternalCatalog >

[jira] [Resolved] (SPARK-21699) Remove unused getTableOption in ExternalCatalog

2017-08-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21699. - Resolution: Fixed Fix Version/s: 2.3.0 > Remove unused getTableOption in ExternalCatalog

[jira] [Created] (SPARK-21699) Remove unused getTableOption in ExternalCatalog

2017-08-10 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21699: --- Summary: Remove unused getTableOption in ExternalCatalog Key: SPARK-21699 URL: https://issues.apache.org/jira/browse/SPARK-21699 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-21669) Internal API for collecting metrics/stats during FileFormatWriter jobs

2017-08-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21669. - Resolution: Fixed Assignee: Adrian Ionescu Fix Version/s: 2.3.0 > Internal API

[jira] [Resolved] (SPARK-21551) pyspark's collect fails when getaddrinfo is too slow

2017-08-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21551. - Resolution: Fixed Assignee: peay Fix Version/s: 2.3.0 > pyspark's collect fails

[jira] [Closed] (SPARK-21362) Add JDBCDialect for Apache Drill

2017-08-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-21362. --- Resolution: Won't Fix See my comment on github ... > Add JDBCDialect for Apache Drill >

[jira] [Created] (SPARK-21644) LocalLimit.maxRows is defined incorrectly

2017-08-04 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21644: --- Summary: LocalLimit.maxRows is defined incorrectly Key: SPARK-21644 URL: https://issues.apache.org/jira/browse/SPARK-21644 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21634) Change OneRowRelation from a case object to case class

2017-08-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114689#comment-16114689 ] Reynold Xin commented on SPARK-21634: - Done in https://github.com/apache/spark/pull/18839 > Change

[jira] [Created] (SPARK-21634) Change OneRowRelation from a case object to case class

2017-08-03 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21634: --- Summary: Change OneRowRelation from a case object to case class Key: SPARK-21634 URL: https://issues.apache.org/jira/browse/SPARK-21634 Project: Spark Issue

[jira] [Commented] (SPARK-21619) Fail the execution of canonicalized plans explicitly

2017-08-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113598#comment-16113598 ] Reynold Xin commented on SPARK-21619: - Just look at structured streaming. That would be one example.

[jira] [Commented] (SPARK-21619) Fail the execution of canonicalized plans explicitly

2017-08-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113593#comment-16113593 ] Reynold Xin commented on SPARK-21619: - Just generate different physical plan? > Fail the

[jira] [Commented] (SPARK-21619) Fail the execution of canonicalized plans explicitly

2017-08-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113579#comment-16113579 ] Reynold Xin commented on SPARK-21619: - Ok so we are good with this one. Sorry I don't see why this

[jira] [Commented] (SPARK-21619) Fail the execution of canonicalized plans explicitly

2017-08-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113538#comment-16113538 ] Reynold Xin commented on SPARK-21619: - Also self-joins are very difficult to handle. They have

[jira] [Commented] (SPARK-21619) Fail the execution of canonicalized plans explicitly

2017-08-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113536#comment-16113536 ] Reynold Xin commented on SPARK-21619: - Mark that's a great point but you are going into the

[jira] [Commented] (SPARK-21619) Fail the execution of canonicalized plans explicitly

2017-08-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113516#comment-16113516 ] Reynold Xin commented on SPARK-21619: - Sorry I don't understand your question or point at all. Why

[jira] [Commented] (SPARK-21619) Fail the execution of canonicalized plans explicitly

2017-08-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113498#comment-16113498 ] Reynold Xin commented on SPARK-21619: - Canonicalized plan is used for semantic comparison. This has

[jira] [Created] (SPARK-21619) Fail the execution of canonicalized plans explicitly

2017-08-02 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21619: --- Summary: Fail the execution of canonicalized plans explicitly Key: SPARK-21619 URL: https://issues.apache.org/jira/browse/SPARK-21619 Project: Spark Issue

[jira] [Commented] (SPARK-21551) pyspark's collect fails when getaddrinfo is too slow

2017-07-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103543#comment-16103543 ] Reynold Xin commented on SPARK-21551: - Sure. > pyspark's collect fails when getaddrinfo is too slow

[jira] [Commented] (SPARK-21551) pyspark's collect fails when getaddrinfo is too slow

2017-07-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16103536#comment-16103536 ] Reynold Xin commented on SPARK-21551: - Do you want to submit a pull request? > pyspark's collect

[jira] [Resolved] (SPARK-21485) API Documentation for Spark SQL functions

2017-07-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21485. - Resolution: Fixed Assignee: Hyukjin Kwon Fix Version/s: 2.3.0 > API

[jira] [Resolved] (SPARK-12957) Derive and propagate data constrains in logical plan

2017-07-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-12957. - Resolution: Fixed Fix Version/s: 2.0.0 > Derive and propagate data constrains in logical

[jira] [Commented] (SPARK-21485) API Documentation for Spark SQL functions

2017-07-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095600#comment-16095600 ] Reynold Xin commented on SPARK-21485: - Pretty cool. Would be great to just generate the function list

[jira] [Commented] (SPARK-19842) Informational Referential Integrity Constraints Support in Spark

2017-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16093887#comment-16093887 ] Reynold Xin commented on SPARK-19842: - Are you guys doing any work here? > Informational Referential

[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications

2017-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089332#comment-16089332 ] Reynold Xin commented on SPARK-18085: - [~vanzin] That's actually not true anymore. It was re-licensed

[jira] [Commented] (SPARK-9686) Spark Thrift server doesn't return correct JDBC metadata

2017-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16089328#comment-16089328 ] Reynold Xin commented on SPARK-9686: The best way to advance the issue is probably for somebody to

[jira] [Updated] (SPARK-20236) Overwrite a partitioned data source table should only overwrite related partitions

2017-07-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-20236: Labels: releasenotes (was: ) > Overwrite a partitioned data source table should only overwrite

[jira] [Commented] (SPARK-18085) SPIP: Better History Server scalability for many / large applications

2017-07-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084277#comment-16084277 ] Reynold Xin commented on SPARK-18085: - You should email dev@ to notify the list about a new SPIP.

[jira] [Updated] (SPARK-18085) SPIP: Better History Server scalability for many / large applications

2017-07-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-18085: Summary: SPIP: Better History Server scalability for many / large applications (was: Better

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2017-07-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083612#comment-16083612 ] Reynold Xin commented on SPARK-18085: - That sounds good to me. I don't actually think the SPIP

[jira] [Comment Edited] (SPARK-20641) Key-value store abstraction and implementation for storing application data

2017-07-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083452#comment-16083452 ] Reynold Xin edited comment on SPARK-20641 at 7/12/17 5:06 AM: -- BTW why are

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2017-07-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083453#comment-16083453 ] Reynold Xin commented on SPARK-18085: - This is just large enough to warrant / deserve a SPIP to be

[jira] [Comment Edited] (SPARK-20641) Key-value store abstraction and implementation for storing application data

2017-07-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083452#comment-16083452 ] Reynold Xin edited comment on SPARK-20641 at 7/12/17 5:05 AM: -- BTW why are

[jira] [Commented] (SPARK-20641) Key-value store abstraction and implementation for storing application data

2017-07-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083452#comment-16083452 ] Reynold Xin commented on SPARK-20641: - BTW why are we not using RocksDB? I saw that you just

[jira] [Resolved] (SPARK-21358) Argument of repartitionandsortwithinpartitions at pyspark

2017-07-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21358. - Resolution: Fixed Assignee: chie hayashida Fix Version/s: 2.3.0 > Argument of

[jira] [Commented] (SPARK-21349) Make TASK_SIZE_TO_WARN_KB configurable

2017-07-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081128#comment-16081128 ] Reynold Xin commented on SPARK-21349: - cc [~cloud_fan] Shouldn't task metric just be a single

[jira] [Commented] (SPARK-18085) Better History Server scalability for many / large applications

2017-07-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077606#comment-16077606 ] Reynold Xin commented on SPARK-18085: - [~vanzin] seems like this should have a SPIP? Looks super

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-07-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16077484#comment-16077484 ] Reynold Xin commented on SPARK-21190: - [~bryanc] Sorry I don't think it makes sense to not introduce

[jira] [Updated] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-07-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21190: Description: *Background and Motivation* Python is one of the most popular programming languages

[jira] [Resolved] (SPARK-21323) Rename sql.catalyst.plans.logical.statsEstimation.Range to ValueInterval

2017-07-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21323. - Resolution: Fixed Assignee: Gengliang Wang Fix Version/s: 2.3.0 > Rename

[jira] [Commented] (SPARK-15533) Deprecate Dataset.explode

2017-07-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071325#comment-16071325 ] Reynold Xin commented on SPARK-15533: - Just use a star. On Sat, Jul 1, 2017 at 9:33 AM Sagara

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-06-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070890#comment-16070890 ] Reynold Xin commented on SPARK-21274: - Do you want to submit a pull request? > Implement EXCEPT ALL

[jira] [Created] (SPARK-21273) Decouple stats propagation from logical plan

2017-06-30 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21273: --- Summary: Decouple stats propagation from logical plan Key: SPARK-21273 URL: https://issues.apache.org/jira/browse/SPARK-21273 Project: Spark Issue Type:

[jira] [Closed] (SPARK-21270) Improvement for memory config.

2017-06-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-21270. --- Resolution: Won't Fix While I absolutely would love to see this feature, I don't think this is

[jira] [Resolved] (SPARK-17924) Consolidate streaming and batch write path

2017-06-30 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-17924. - Resolution: Fixed Fix Version/s: 2.3.0 > Consolidate streaming and batch write path >

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069490#comment-16069490 ] Reynold Xin commented on SPARK-21190: - That makes a lot of sense. So to design APIs similar to a lot

[jira] [Closed] (SPARK-18199) Support appending to Parquet files

2017-06-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-18199. --- Resolution: Invalid I'm closing this as invalid. It is not a good idea to append to an existing

[jira] [Commented] (SPARK-14220) Build and test Spark against Scala 2.12

2017-06-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063984#comment-16063984 ] Reynold Xin commented on SPARK-14220: - If all those issues have been released then it would be easy.

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063515#comment-16063515 ] Reynold Xin commented on SPARK-21190: - [~icexelloss] Thanks. Your proposal brings up a good point,

[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset

2017-06-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062207#comment-16062207 ] Reynold Xin commented on SPARK-18016: - Was this merged in 2.1? If yes we should revert it from

[jira] [Updated] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21190: Attachment: SPIPVectorizedUDFsforPython (1).pdf > SPIP: Vectorized UDFs in Python >

[jira] [Updated] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21190: Description: *Background and Motivation* Python is one of the most popular programming languages

[jira] [Closed] (SPARK-20817) Benchmark.getProcessorName() returns "Unknown processor" on ppc and 390 platforms

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-20817. --- Resolution: Won't Fix See github discussions. > Benchmark.getProcessorName() returns "Unknown

[jira] [Updated] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21190: Description: *Background and Motivation* Python is one of the most popular programming languages

[jira] [Commented] (SPARK-14220) Build and test Spark against Scala 2.12

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060549#comment-16060549 ] Reynold Xin commented on SPARK-14220: - Making it build isn't that much work, but getting the API to

[jira] [Updated] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21190: Description: *Background and Motivation* Python is one of the most popular programming languages

[jira] [Updated] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21190: Summary: SPIP: Vectorized UDFs in Python (was: SPIP: Vectorized UDFs for Python) > SPIP:

[jira] [Assigned] (SPARK-21190) SPIP: Vectorized UDFs for Python

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-21190: --- Assignee: Reynold Xin > SPIP: Vectorized UDFs for Python >

[jira] [Updated] (SPARK-21190) SPIP: Vectorized UDFs for Python

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21190: Description: *Background and Motivation* Python is one of the most popular programming languages

[jira] [Created] (SPARK-21190) SPIP: Vectorized UDFs for Python

2017-06-23 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21190: --- Summary: SPIP: Vectorized UDFs for Python Key: SPARK-21190 URL: https://issues.apache.org/jira/browse/SPARK-21190 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-14220) Build and test Spark against Scala 2.12

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-14220: Target Version/s: (was: 2.3.0) > Build and test Spark against Scala 2.12 >

[jira] [Commented] (SPARK-14220) Build and test Spark against Scala 2.12

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060469#comment-16060469 ] Reynold Xin commented on SPARK-14220: - I just removed the target version given the amount of work.

[jira] [Commented] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060455#comment-16060455 ] Reynold Xin commented on SPARK-21187: - Does Pandas support array / struct / map? > Complete support

[jira] [Updated] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-06-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-13534: Issue Type: Sub-task (was: New Feature) Parent: SPARK-21187 > Implement Apache Arrow

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-06-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060328#comment-16060328 ] Reynold Xin commented on SPARK-13534: - Was this done? I thought there are still other data types that

[jira] [Resolved] (SPARK-21103) QueryPlanConstraints should be part of LogicalPlan

2017-06-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21103. - Resolution: Fixed Fix Version/s: 2.3.0 > QueryPlanConstraints should be part of

[jira] [Commented] (SPARK-21102) Refresh command is too aggressive in parsing

2017-06-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16054600#comment-16054600 ] Reynold Xin commented on SPARK-21102: - Can you submit a pull request so we can discuss the details of

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2017-06-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16051013#comment-16051013 ] Reynold Xin commented on SPARK-13333: - But this ticket has nothing to do with SQL? > DataFrame

[jira] [Commented] (SPARK-13333) DataFrame filter + randn + unionAll has bad interaction

2017-06-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050730#comment-16050730 ] Reynold Xin commented on SPARK-13333: - What's left in this ticket? Didn't we fix it already? If it is

[jira] [Resolved] (SPARK-21092) Wire SQLConf in logical plan and expressions

2017-06-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21092. - Resolution: Fixed Fix Version/s: 2.3.0 > Wire SQLConf in logical plan and expressions >

[jira] [Created] (SPARK-21103) QueryPlanConstraints should be part of LogicalPlan

2017-06-14 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21103: --- Summary: QueryPlanConstraints should be part of LogicalPlan Key: SPARK-21103 URL: https://issues.apache.org/jira/browse/SPARK-21103 Project: Spark Issue Type:

[jira] [Created] (SPARK-21102) Refresh command is too aggressive in parsing

2017-06-14 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21102: --- Summary: Refresh command is too aggressive in parsing Key: SPARK-21102 URL: https://issues.apache.org/jira/browse/SPARK-21102 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-21102) Refresh command is too aggressive in parsing

2017-06-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21102: Labels: starter (was: ) > Refresh command is too aggressive in parsing >

[jira] [Resolved] (SPARK-21091) Move constraint code into QueryPlanConstraints

2017-06-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-21091. - Resolution: Fixed Fix Version/s: 2.3.0 > Move constraint code into QueryPlanConstraints >

[jira] [Created] (SPARK-21092) Wire SQLConf in logical plan and expressions

2017-06-14 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21092: --- Summary: Wire SQLConf in logical plan and expressions Key: SPARK-21092 URL: https://issues.apache.org/jira/browse/SPARK-21092 Project: Spark Issue Type: New

[jira] [Created] (SPARK-21091) Move constraint code into QueryPlanConstraints

2017-06-14 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21091: --- Summary: Move constraint code into QueryPlanConstraints Key: SPARK-21091 URL: https://issues.apache.org/jira/browse/SPARK-21091 Project: Spark Issue Type:

[jira] [Updated] (SPARK-21059) LikeSimplification can NPE on null pattern

2017-06-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21059: Summary: LikeSimplification can NPE on null pattern (was: LikeSimplification an NPE on null

[jira] [Created] (SPARK-21059) LikeSimplification an NPE on null pattern

2017-06-12 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-21059: --- Summary: LikeSimplification an NPE on null pattern Key: SPARK-21059 URL: https://issues.apache.org/jira/browse/SPARK-21059 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-21043) Add unionByName API to Dataset

2017-06-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21043: Description: It would be useful to add unionByName which resolves columns by name, in addition to
