[jira] [Comment Edited] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118468#comment-15118468 ] Wang, Gang edited comment on SPARK-13004 at 1/27/16 5:34 AM: - Yes, that is

[jira] [Closed] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-27 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang closed SPARK-13004. -- Resolution: Later Preparing actionable items, so closing it temporarily as Sean Owen suggested. >

[jira] [Commented] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-27 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15119970#comment-15119970 ] Wang, Gang commented on SPARK-13004: I have closed it according to your advice. Thanks. > Support

[jira] [Created] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
Wang, Gang created SPARK-13004: -- Summary: Support Non-Volatile Data and Operations Key: SPARK-13004 URL: https://issues.apache.org/jira/browse/SPARK-13004 Project: Spark Issue Type: Epic

[jira] [Updated] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-13004: --- Description: Based on our experiments, the SerDe-like operations have some significant negative

[jira] [Updated] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-13004: --- Description: Based on our experiments, the SerDe-like operations have some significant negative

[jira] [Updated] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-13004: --- Description: Based on our experiments, the SerDe-like operations have some significant negative

[jira] [Commented] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118288#comment-15118288 ] Wang, Gang commented on SPARK-13004: Yes, that is one prototype for proof of concept. We are

[jira] [Comment Edited] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118288#comment-15118288 ] Wang, Gang edited comment on SPARK-13004 at 1/27/16 12:44 AM: -- Yes, that is

[jira] [Commented] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118468#comment-15118468 ] Wang, Gang commented on SPARK-13004: Yes, that is one of our prototypes for proof of concept. We are

[jira] [Issue Comment Deleted] (SPARK-13004) Support Non-Volatile Data and Operations

2016-01-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-13004: --- Comment: was deleted (was: Yes, that is one of our prototypes for proof of concept. We are preparing

[jira] [Updated] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-23373: --- Issue Type: Bug (was: New Feature) > Can not execute "count distinct" queries on parquet formatted

[jira] [Updated] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-23373: --- Description: I failed to run the SQL "select count(distinct n_name) from nation"; table nation is

[jira] [Updated] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-23373: --- Description: I failed to run the SQL "select count(distinct n_name) from nation"; table nation is

[jira] [Created] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Wang, Gang (JIRA)
Wang, Gang created SPARK-23373: -- Summary: Can not execute "count distinct" queries on parquet formatted table Key: SPARK-23373 URL: https://issues.apache.org/jira/browse/SPARK-23373 Project: Spark

[jira] [Comment Edited] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358355#comment-16358355 ] Wang, Gang edited comment on SPARK-23373 at 2/9/18 1:01 PM: Yes. Seems

[jira] [Commented] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358355#comment-16358355 ] Wang, Gang commented on SPARK-23373: Yes. Seems related to my test environment. While, I tried in a

[jira] [Created] (SPARK-25401) Reorder the required ordering to match the table's output ordering for bucket join

2018-09-11 Thread Wang, Gang (JIRA)
Wang, Gang created SPARK-25401: -- Summary: Reorder the required ordering to match the table's output ordering for bucket join Key: SPARK-25401 URL: https://issues.apache.org/jira/browse/SPARK-25401

[jira] [Created] (SPARK-25411) Implement range partition in Spark

2018-09-11 Thread Wang, Gang (JIRA)
Wang, Gang created SPARK-25411: -- Summary: Implement range partition in Spark Key: SPARK-25411 URL: https://issues.apache.org/jira/browse/SPARK-25411 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-23839) consider bucket join in cost-based JoinReorder rule

2018-04-01 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421906#comment-16421906 ] Wang, Gang commented on SPARK-23839: Good point! Currently, Spark just takes data size into

[jira] [Commented] (SPARK-25411) Implement range partition in Spark

2018-10-23 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661610#comment-16661610 ] Wang, Gang commented on SPARK-25411: [~cloud_fan] What do you think of this feature? In our internal

[jira] [Commented] (SPARK-17570) Avoid Hash and Exchange in Sort Merge join if bucketing factor is multiple for tables

2018-10-08 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642730#comment-16642730 ] Wang, Gang commented on SPARK-17570: Any update? > Avoid Hash and Exchange in Sort Merge join if

[jira] [Updated] (SPARK-25411) Implement range partition in Spark

2018-09-25 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-25411: --- Attachment: range partition design doc.pdf > Implement range partition in Spark >

[jira] [Commented] (SPARK-25411) Implement range partition in Spark

2018-09-25 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627145#comment-16627145 ] Wang, Gang commented on SPARK-25411: Added a design doc; please help review it. > Implement range

[jira] [Updated] (SPARK-25411) Implement range partition in Spark

2018-09-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-25411: --- Attachment: range partition design doc.pdf > Implement range partition in Spark >

[jira] [Updated] (SPARK-25411) Implement range partition in Spark

2018-09-26 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-25411: --- Attachment: (was: range partition design doc.pdf) > Implement range partition in Spark >

[jira] [Updated] (SPARK-26672) SinglePartition may not satisfies HashClusteredDistribution/OrderedDistribution

2019-01-22 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-26672: --- Summary: SinglePartition may not satisfies HashClusteredDistribution/OrderedDistribution (was:

[jira] [Updated] (SPARK-26672) SinglePartition may not satisfies HashClusteredDistribution/OrderedDistribution

2019-01-22 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-26672: --- Description: If we are loading data to a *bucketed table* TEST_TABLE whose bucket number is not

[jira] [Created] (SPARK-26672) SinglePartition should not satisfies HashClusteredDistribution/OrderedDistribution

2019-01-21 Thread Wang, Gang (JIRA)
Wang, Gang created SPARK-26672: -- Summary: SinglePartition should not satisfies HashClusteredDistribution/OrderedDistribution Key: SPARK-26672 URL: https://issues.apache.org/jira/browse/SPARK-26672

[jira] [Closed] (SPARK-26672) SinglePartition may not satisfies HashClusteredDistribution/OrderedDistribution

2019-01-23 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang closed SPARK-26672. -- > SinglePartition may not satisfies > HashClusteredDistribution/OrderedDistribution >

[jira] [Resolved] (SPARK-26672) SinglePartition may not satisfies HashClusteredDistribution/OrderedDistribution

2019-01-23 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang resolved SPARK-26672. Resolution: Not A Problem This is only a bug in our internal Spark version. It's OK in community

[jira] [Created] (SPARK-26375) Rule PruneFileSourcePartitions should be fired before any other rules based on data size

2018-12-15 Thread Wang, Gang (JIRA)
Wang, Gang created SPARK-26375: -- Summary: Rule PruneFileSourcePartitions should be fired before any other rules based on data size Key: SPARK-26375 URL: https://issues.apache.org/jira/browse/SPARK-26375

[jira] [Updated] (SPARK-26375) Rule PruneFileSourcePartitions should be fired before any other rules based on table statistics

2018-12-15 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-26375: --- Summary: Rule PruneFileSourcePartitions should be fired before any other rules based on table

[jira] [Updated] (SPARK-26375) Rule PruneFileSourcePartitions should be fired before any other rules based on table statistics

2018-12-15 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-26375: --- Description: In Catalyst, some optimizer rules are based on table statistics, such as the rule ReorderJoin,

[jira] [Commented] (SPARK-25401) Reorder the required ordering to match the table's output ordering for bucket join

2018-12-07 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712873#comment-16712873 ] Wang, Gang commented on SPARK-25401: Yeah. I think so.  And please make sure the outputOrdering of 

[jira] [Updated] (SPARK-25411) Implement range partition in Spark

2018-09-18 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-25411: --- Description: In our PROD environment, there are some partitioned fact tables, which are all quite

[jira] [Updated] (SPARK-25411) Implement range partition in Spark

2018-09-18 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-25411: --- Description: In our production environment, there are some partitioned fact tables, which are all

[jira] [Resolved] (SPARK-26375) Rule PruneFileSourcePartitions should be fired before any other rules based on table statistics

2018-12-19 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang resolved SPARK-26375. Resolution: Not A Problem > Rule PruneFileSourcePartitions should be fired before any other rules

[jira] [Commented] (SPARK-26375) Rule PruneFileSourcePartitions should be fired before any other rules based on table statistics

2018-12-19 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724827#comment-16724827 ] Wang, Gang commented on SPARK-26375: Should be okay; a filter on partition columns is also regarded as

[jira] [Commented] (SPARK-25411) Implement range partition in Spark

2019-07-15 Thread Wang, Gang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885028#comment-16885028 ] Wang, Gang commented on SPARK-25411: I referred to Oracle DDLs, which are quite close to PG. > Implement

[jira] [Created] (SPARK-29189) Add an option to ignore block locations when listing file

2019-09-20 Thread Wang, Gang (Jira)
Wang, Gang created SPARK-29189: -- Summary: Add an option to ignore block locations when listing file Key: SPARK-29189 URL: https://issues.apache.org/jira/browse/SPARK-29189 Project: Spark Issue

[jira] [Updated] (SPARK-29189) Add an option to ignore block locations when listing file

2019-09-20 Thread Wang, Gang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-29189: --- Description: In our PROD env, we have a pure Spark cluster; I think this is also pretty common,

[jira] [Created] (SPARK-32464) Support skew handling on join with one side that has no query stage

2020-07-27 Thread Wang, Gang (Jira)
Wang, Gang created SPARK-32464: -- Summary: Support skew handling on join with one side that has no query stage Key: SPARK-32464 URL: https://issues.apache.org/jira/browse/SPARK-32464 Project: Spark

[jira] [Updated] (SPARK-32464) Support skew handling on join that has one side with no query stage

2020-07-27 Thread Wang, Gang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang, Gang updated SPARK-32464: --- Summary: Support skew handling on join that has one side with no query stage (was: Support skew

[jira] [Commented] (SPARK-32464) Support skew handling on join that has one side with no query stage

2020-07-27 Thread Wang, Gang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166071#comment-17166071 ] Wang, Gang commented on SPARK-32464: A PR [https://github.com/apache/spark/pull/29266] > Support