[jira] [Updated] (SPARK-14717) Scala, Python APIs for Dataset.unpersist differ in default blocking value

2016-04-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14717: --- Assignee: Felix Cheung > Scala, Python APIs for Dataset.unpersist differ in default blocking value >

[jira] [Closed] (SPARK-13179) pyspark row name collision 'count'

2016-04-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu closed SPARK-13179. -- Resolution: Won't Fix > pyspark row name collision 'count' > -- > >

[jira] [Resolved] (SPARK-14491) refactor object operator framework to make it easy to eliminate serializations

2016-04-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14491. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12260 [https://github.

[jira] [Resolved] (SPARK-14614) Add `bround` function

2016-04-18 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14614. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12376 [https://github.

[jira] [Commented] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245209#comment-15245209 ] Davies Liu commented on SPARK-13352: corrected, thanks > BlockFetch does not scale w

[jira] [Comment Edited] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-17 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234559#comment-15234559 ] Davies Liu edited comment on SPARK-13352 at 4/18/16 6:40 AM: -

[jira] [Assigned] (SPARK-14669) Some SQL metrics is broken when whole-stage codegen enabled

2016-04-15 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-14669: -- Assignee: Davies Liu > Some SQL metrics is broken when whole-stage codegen enabled > -

[jira] [Created] (SPARK-14669) Some SQL metrics is broken when whole-stage codegen enabled

2016-04-15 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14669: -- Summary: Some SQL metrics is broken when whole-stage codegen enabled Key: SPARK-14669 URL: https://issues.apache.org/jira/browse/SPARK-14669 Project: Spark Issu

[jira] [Assigned] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-14607: -- Assignee: Davies Liu > Partition pruning is case sensitive even with HiveContext > ---

[jira] [Resolved] (SPARK-14484) Fail to create parquet filter if the column name does not match exactly

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14484. Resolution: Fixed Assignee: Davies Liu > Fail to create parquet filter if the column name doe

[jira] [Resolved] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14607. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12371 [https://github.

[jira] [Created] (SPARK-14607) Partition pruning is case sensitive even with HiveContext

2016-04-13 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14607: -- Summary: Partition pruning is case sensitive even with HiveContext Key: SPARK-14607 URL: https://issues.apache.org/jira/browse/SPARK-14607 Project: Spark Issue T

[jira] [Resolved] (SPARK-14581) Improve filter push down

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14581. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12342 [https://github.

[jira] [Commented] (SPARK-14600) Push predicates through Expand

2016-04-13 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239688#comment-15239688 ] Davies Liu commented on SPARK-14600: cc [~cloud_fan] > Push predicates through Expan

[jira] [Created] (SPARK-14600) Push predicates through Expand

2016-04-13 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14600: -- Summary: Push predicates through Expand Key: SPARK-14600 URL: https://issues.apache.org/jira/browse/SPARK-14600 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-14582) Increase the parallelism for small tables

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14582: -- Summary: Increase the parallelism for small tables Key: SPARK-14582 URL: https://issues.apache.org/jira/browse/SPARK-14582 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-14578) Can't load a json dataset with nested wide schema

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14578. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12338 [https://github.

[jira] [Created] (SPARK-14581) Improve filter push down

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14581: -- Summary: Improve filter push down Key: SPARK-14581 URL: https://issues.apache.org/jira/browse/SPARK-14581 Project: Spark Issue Type: Improvement Compon

[jira] [Resolved] (SPARK-14363) Executor OOM due to a memory leak in Sorter

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14363. Resolution: Fixed Fix Version/s: 1.6.2 2.0.0 Issue resolved by pull reque

[jira] [Resolved] (SPARK-14544) Spark UI is very slow in recent Chrome

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14544. Resolution: Fixed Fix Version/s: 2.0.0 > Spark UI is very slow in recent Chrome > --

[jira] [Created] (SPARK-14578) Can't load a json dataset with nested wide schema

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14578: -- Summary: Can't load a json dataset with nested wide schema Key: SPARK-14578 URL: https://issues.apache.org/jira/browse/SPARK-14578 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-14562) Improve constraints propagation in Union

2016-04-12 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14562. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12328 [https://github.

[jira] [Created] (SPARK-14562) Improve constraints propagation in Union

2016-04-12 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14562: -- Summary: Improve constraints propagation in Union Key: SPARK-14562 URL: https://issues.apache.org/jira/browse/SPARK-14562 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-14544) Spark UI is very slow in recent Chrome

2016-04-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14544: -- Summary: Spark UI is very slow in recent Chrome Key: SPARK-14544 URL: https://issues.apache.org/jira/browse/SPARK-14544 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-14541) SQL function: IFNULL, NULLIF, NVL and NVL2

2016-04-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14541: -- Summary: SQL function: IFNULL, NULLIF, NVL and NVL2 Key: SPARK-14541 URL: https://issues.apache.org/jira/browse/SPARK-14541 Project: Spark Issue Type: New Featur

[jira] [Updated] (SPARK-14471) The alias created in SELECT could be used in GROUP BY and followed expressions

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14471: --- Description: This query should be able to run: {code} select a a1, a1 + 1 as b, count(1) from t gro

[jira] [Updated] (SPARK-14471) The alias created in SELECT could be used in GROUP BY and followed expressions

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14471: --- Summary: The alias created in SELECT could be used in GROUP BY and followed expressions (was: The al

[jira] [Updated] (SPARK-14538) Increase the default stack size of spark shell

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14538: --- Assignee: (was: Davies Liu) > Increase the default stack size of spark shell > --

[jira] [Commented] (SPARK-14526) The catalog of SQLContext should not be case-sensitive

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235617#comment-15235617 ] Davies Liu commented on SPARK-14526: It seems like a feature for a long time. It mak

[jira] [Created] (SPARK-14538) Increase the default stack size of spark shell

2016-04-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14538: -- Summary: Increase the default stack size of spark shell Key: SPARK-14538 URL: https://issues.apache.org/jira/browse/SPARK-14538 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-14454) Better exception handling while marking tasks as failed

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14454: --- Fix Version/s: 1.6.2 > Better exception handling while marking tasks as failed >

[jira] [Updated] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13352: --- Fix Version/s: 1.6.2 > BlockFetch does not scale well on large block > --

[jira] [Resolved] (SPARK-14502) Add optimization for Binary Comparison Simplification

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14502. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12267 [https://github.

[jira] [Resolved] (SPARK-14528) SameResult on Union is broken

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14528. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12295 [https://github.

[jira] [Updated] (SPARK-14526) The catalog of SQLContext should not be case-sensitive

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14526: --- Priority: Blocker (was: Major) > The catalog of SQLContext should not be case-sensitive > -

[jira] [Resolved] (SPARK-14524) In SparkSQL, it can't be select column of String type because of UTF8String when setting more than 32G for executors.

2016-04-11 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14524. Resolution: Duplicate Assignee: Davies Liu > In SparkSQL, it can't be select column of String

[jira] [Created] (SPARK-14528) SameResult on Union is broken

2016-04-11 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14528: -- Summary: SameResult on Union is broken Key: SPARK-14528 URL: https://issues.apache.org/jira/browse/SPARK-14528 Project: Spark Issue Type: Bug Component

[jira] [Created] (SPARK-14526) The catalog of SQLContext should not be case-sensitive

2016-04-10 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14526: -- Summary: The catalog of SQLContext should not be case-sensitive Key: SPARK-14526 URL: https://issues.apache.org/jira/browse/SPARK-14526 Project: Spark Issue Typ

[jira] [Updated] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13352: --- Fix Version/s: (was: 1.6.2) > BlockFetch does not scale well on large block > ---

[jira] [Updated] (SPARK-14242) avoid too many copies in network when a network frame is large

2016-04-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14242: --- Fix Version/s: 1.6.2 > avoid too many copies in network when a network frame is large > -

[jira] [Resolved] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13352. Resolution: Fixed Fix Version/s: 2.0.0 1.6.2 > BlockFetch does not scale

[jira] [Comment Edited] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234559#comment-15234559 ] Davies Liu edited comment on SPARK-13352 at 4/11/16 6:35 AM: -

[jira] [Commented] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234559#comment-15234559 ] Davies Liu commented on SPARK-13352: The result is much better now: {code} 50M

[jira] [Updated] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-10 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13352: --- Assignee: Zhang, Liye > BlockFetch does not scale well on large block > -

[jira] [Resolved] (SPARK-14217) Vectorized parquet reader produces wrong result if data used dictionary encoding fallback

2016-04-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14217. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12279 [https://github.

[jira] [Resolved] (SPARK-14419) Improve the HashedRelation for key fit within Long

2016-04-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14419. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12190 [https://github.

[jira] [Resolved] (SPARK-14454) Better exception handling while marking tasks as failed

2016-04-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14454. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12234 [https://github.

[jira] [Resolved] (SPARK-14448) Improvements to ColumnVector

2016-04-08 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14448. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12225 [https://github.

[jira] [Created] (SPARK-14484) Fail to create parquet filter if the column name does not match exactly

2016-04-08 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14484: -- Summary: Fail to create parquet filter if the column name does not match exactly Key: SPARK-14484 URL: https://issues.apache.org/jira/browse/SPARK-14484 Project: Spark

[jira] [Commented] (SPARK-8632) Poor Python UDF performance because of RDD caching

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231585#comment-15231585 ] Davies Liu commented on SPARK-8632: --- [~bijay697] Python UDFs had been improved a lot rec

[jira] [Commented] (SPARK-14476) Show table name or path in string of DataSourceScan

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231354#comment-15231354 ] Davies Liu commented on SPARK-14476: cc [~lian cheng] > Show table name or path in s

[jira] [Created] (SPARK-14476) Show table name or path in string of DataSourceScan

2016-04-07 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14476: -- Summary: Show table name or path in string of DataSourceScan Key: SPARK-14476 URL: https://issues.apache.org/jira/browse/SPARK-14476 Project: Spark Issue Type: N

[jira] [Created] (SPARK-14471) The alias created in SELECT could be used in GROUP BY

2016-04-07 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14471: -- Summary: The alias created in SELECT could be used in GROUP BY Key: SPARK-14471 URL: https://issues.apache.org/jira/browse/SPARK-14471 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-12740) grouping()/grouping_id() should work with having and order by

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-12740. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12235 [https://github.

[jira] [Resolved] (SPARK-13932) CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisException

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13932. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12235 [https://github.

[jira] [Commented] (SPARK-13842) Consider __iter__ and __getitem__ methods for pyspark.sql.types.StructType

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230799#comment-15230799 ] Davies Liu commented on SPARK-13842: Sounds good to me. > Consider __iter__ and __ge

[jira] [Comment Edited] (SPARK-13932) CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisException

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230767#comment-15230767 ] Davies Liu edited comment on SPARK-13932 at 4/7/16 6:22 PM: T

[jira] [Commented] (SPARK-13932) CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisException

2016-04-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230767#comment-15230767 ] Davies Liu commented on SPARK-13932: This will be fixed in https://github.com/apache/

[jira] [Resolved] (SPARK-14223) Cannot project all columns from a parquet files with ~1,100 columns

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14223. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12047 [https://github.

[jira] [Resolved] (SPARK-14310) Fix scan whole stage codegen to determine if batches are produced based on schema

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14310. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12047 [https://github.

[jira] [Resolved] (SPARK-14224) Cannot project all columns from a table with ~1,100 columns

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14224. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12047 [https://github.

[jira] [Assigned] (SPARK-13966) Regression using .withColumn() on a parquet

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-13966: -- Assignee: Davies Liu > Regression using .withColumn() on a parquet > -

[jira] [Commented] (SPARK-13966) Regression using .withColumn() on a parquet

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229042#comment-15229042 ] Davies Liu commented on SPARK-13966: I checked this on latest master, it works, could

[jira] [Updated] (SPARK-14031) Dataframe to csv IO, system performance enters high CPU state and write operation takes 1 hour to complete

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14031: --- Priority: Critical (was: Minor) > Dataframe to csv IO, system performance enters high CPU state and

[jira] [Resolved] (SPARK-13867) Failed to bind reference when cume_dist is used

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13867. Resolution: Fixed Assignee: Cheng Lian https://github.com/apache/spark/pull/12040 > Failed t

[jira] [Commented] (SPARK-13346) Using DataFrames iteratively leads to massive query plans, which slows execution

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15228923#comment-15228923 ] Davies Liu commented on SPARK-13346: This is known issue since the beginning of DataF

[jira] [Updated] (SPARK-14317) Clean up hash join

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14317: --- Fix Version/s: 2.0.0 > Clean up hash join > -- > > Key: SPARK-14317 >

[jira] [Commented] (SPARK-14317) Clean up hash join

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15228908#comment-15228908 ] Davies Liu commented on SPARK-14317: https://github.com/apache/spark/pull/12102 > Cl

[jira] [Resolved] (SPARK-14317) Clean up hash join

2016-04-06 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14317. Resolution: Fixed > Clean up hash join > -- > > Key: SPARK-14317 >

[jira] [Created] (SPARK-14419) Improve the HashedRelation for key fit within Long

2016-04-05 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14419: -- Summary: Improve the HashedRelation for key fit within Long Key: SPARK-14419 URL: https://issues.apache.org/jira/browse/SPARK-14419 Project: Spark Issue Type: Im

[jira] [Created] (SPARK-14418) Broadcast.unpersist() in PySpark is not consistent with that in Scala

2016-04-05 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14418: -- Summary: Broadcast.unpersist() in PySpark is not consistent with that in Scala Key: SPARK-14418 URL: https://issues.apache.org/jira/browse/SPARK-14418 Project: Spark

[jira] [Resolved] (SPARK-14353) Dataset Time Windowing API for Python, R, and SQL

2016-04-05 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14353. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12136 [https://github.

[jira] [Resolved] (SPARK-14334) Add toLocalIterator for Dataset

2016-04-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14334. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12114 [https://github.

[jira] [Resolved] (SPARK-12981) Dataframe distinct() followed by a filter(udf) in pyspark throws a casting error

2016-04-04 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-12981. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12127 [https://github.

[jira] [Updated] (SPARK-14231) JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision.

2016-04-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14231: --- Assignee: Hyukjin Kwon > JSON data source fails to infer floats as decimal when precision is bigger

[jira] [Resolved] (SPARK-14231) JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision.

2016-04-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14231. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12030 [https://github.

[jira] [Updated] (SPARK-13996) Add more not null attributes for Filter codegen

2016-04-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13996: --- Assignee: Liang-Chi Hsieh > Add more not null attributes for Filter codegen > ---

[jira] [Resolved] (SPARK-13996) Add more not null attributes for Filter codegen

2016-04-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13996. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 11810 [https://github.

[jira] [Updated] (SPARK-13996) Add more not null attributes for Filter codegen

2016-04-02 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13996: --- Fix Version/s: (was: 2.1.0) 2.0.0 > Add more not null attributes for Filter co

[jira] [Resolved] (SPARK-14138) Generated SpecificColumnarIterator code can exceed JVM size limit for cached DataFrames

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14138. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12108 [https://github.

[jira] [Updated] (SPARK-13674) Add wholestage codegen support to Sample

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13674: --- Fix Version/s: (was: 2.1.0) 2.0.0 > Add wholestage codegen support to Sample >

[jira] [Updated] (SPARK-13674) Add wholestage codegen support to Sample

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13674: --- Assignee: Liang-Chi Hsieh > Add wholestage codegen support to Sample > --

[jira] [Resolved] (SPARK-13674) Add wholestage codegen support to Sample

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13674. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 11517 [https://github.

[jira] [Created] (SPARK-14334) Add toLocalIterator for Dataset

2016-04-01 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14334: -- Summary: Add toLocalIterator for Dataset Key: SPARK-14334 URL: https://issues.apache.org/jira/browse/SPARK-14334 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1503#comment-1503 ] Davies Liu commented on SPARK-13352: cc [~adav] > BlockFetch does not scale well on

[jira] [Commented] (SPARK-13352) BlockFetch does not scale well on large block

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222199#comment-15222199 ] Davies Liu commented on SPARK-13352: After more investigating, it turned out that the

[jira] [Commented] (SPARK-14333) Duration of task should be the total time (not just computation time)

2016-04-01 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222103#comment-15222103 ] Davies Liu commented on SPARK-14333: cc [~andrewor14] > Duration of task should be t

[jira] [Created] (SPARK-14333) Duration of task should be the total time (not just computation time)

2016-04-01 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14333: -- Summary: Duration of task should be the total time (not just computation time) Key: SPARK-14333 URL: https://issues.apache.org/jira/browse/SPARK-14333 Project: Spark

[jira] [Resolved] (SPARK-14267) Execute multiple Python UDFs in single batch

2016-03-31 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14267. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12057 [https://github.

[jira] [Created] (SPARK-14317) Clean up hash join

2016-03-31 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14317: -- Summary: Clean up hash join Key: SPARK-14317 URL: https://issues.apache.org/jira/browse/SPARK-14317 Project: Spark Issue Type: Improvement Reporter:

[jira] [Commented] (SPARK-14230) Config the start time (jitter) for streaming jobs

2016-03-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218938#comment-15218938 ] Davies Liu commented on SPARK-14230: For non-window batch, could be supported via tri

[jira] [Commented] (SPARK-14141) Let user specify datatypes of pandas dataframe in toPandas()

2016-03-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218909#comment-15218909 ] Davies Liu commented on SPARK-14141: toLocalIterator is better than collect, but will

[jira] [Commented] (SPARK-13820) TPC-DS Query 10 fails to compile

2016-03-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218906#comment-15218906 ] Davies Liu commented on SPARK-13820: [~jfc...@us.ibm.com] How much modification have

[jira] [Commented] (SPARK-14230) Config the start time (jitter) for streaming jobs

2016-03-30 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218657#comment-15218657 ] Davies Liu commented on SPARK-14230: This will be supported in structured streaming:

[jira] [Created] (SPARK-14267) Execute multiple Python UDFs in single batch

2016-03-30 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14267: -- Summary: Execute multiple Python UDFs in single batch Key: SPARK-14267 URL: https://issues.apache.org/jira/browse/SPARK-14267 Project: Spark Issue Type: Improvem

[jira] [Resolved] (SPARK-14215) Support chained Python UDF

2016-03-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14215. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12014 [https://github.

[jira] [Resolved] (SPARK-14210) Add timing metric for how long the query spent in scan

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14210. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12007 [https://github.

[jira] [Resolved] (SPARK-14202) python_full_outer_join should use generator expression instead of list comp

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14202. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11998 [https://github.

[jira] [Created] (SPARK-14215) Support chained Python UDF

2016-03-28 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14215: -- Summary: Support chained Python UDF Key: SPARK-14215 URL: https://issues.apache.org/jira/browse/SPARK-14215 Project: Spark Issue Type: Improvement Comp

[jira] [Resolved] (SPARK-14052) Build BytesToBytesMap in HashedRelation

2016-03-28 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-14052. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11870 [https://github.
