[GitHub] spark pull request: [SPARK-12850] [SQL] Support Bucket Pruning (Pr...

gatorsmile Fri, 29 Jan 2016 11:22:38 -0800

Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/10942#issuecomment-176920834
  
    @cloud-fan I still have a question about the low-level test for verifying 
if pruning works. 
    
    For example, below is the physical plan of the query 
`hiveContext.table("bucketed_table").filter($"j" === 0).queryExecution.toRdd`, 
where `j` is a bucketing key.
    ```
    Filter (j#21 = 0)
    +- Scan ParquetRelation[j#21,k#22,i#23] InputPaths: 
file:/private/var/folders/4b/sgmfldk15js406vk7lw5llzw0000gn/T/warehouse--3a744e2c-d2a6-4a1a-ba61-1dc4197744d6/bucketed_table,
 PushedFilters: [EqualTo(j,0)]
    ```
    
    When we run 
`hiveContext.table("bucketed_table").filter($"j" === 0).queryExecution.toRdd`, 
the `Filter` operator still removes all the ineligible rows in each partition 
even if bucket pruning does not work, so the returned rows are the same either 
way. Is my understanding right? 
    
    If so, how can we verify that pruning works? Thank you!
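    
    To make the concern concrete, here is a toy model in plain Scala. It is 
not Spark code: `Row`, `bucketId`, `toBuckets`, and `scan` are hypothetical 
names, and the modulo hash is only a stand-in for the real bucketing function. 
It shows that the filtered rows are identical with and without pruning, so a 
test probably has to look at how many buckets were read rather than at the 
rows that come back.
    ```scala
    // Toy model only; none of these names exist in Spark.
    object BucketPruningSketch {
      final case class Row(j: Int, k: Int)

      // Stand-in hash (assumption): the real bucket id function differs.
      def bucketId(key: Int, numBuckets: Int): Int =
        Math.floorMod(key, numBuckets)

      // Group rows into numBuckets buckets by the bucketing key j.
      def toBuckets(rows: Seq[Row], numBuckets: Int): Map[Int, Seq[Row]] =
        (0 until numBuckets)
          .map(b => b -> rows.filter(r => bucketId(r.j, numBuckets) == b))
          .toMap

      // Returns the rows that survive the filter j == key,
      // plus the number of buckets that were actually read.
      def scan(buckets: Map[Int, Seq[Row]], key: Int, prune: Boolean): (Seq[Row], Int) = {
        val target = bucketId(key, buckets.size)
        val toRead =
          if (prune) buckets.filter { case (id, _) => id == target }
          else buckets
        val rows = toRead.toSeq.sortBy(_._1).flatMap(_._2).filter(_.j == key)
        (rows, toRead.size)
      }

      def main(args: Array[String]): Unit = {
        val buckets = toBuckets((0 until 100).map(i => Row(i % 13, i)), 8)
        val (prunedRows, prunedRead) = scan(buckets, 0, prune = true)
        val (fullRows, fullRead) = scan(buckets, 0, prune = false)
        // The filter hides the difference: both scans return the same rows.
        assert(prunedRows == fullRows)
        // Only the number of buckets read tells the two cases apart.
        assert(prunedRead == 1 && fullRead == 8)
      }
    }
    ```
    
    In this sketch only the second assertion distinguishes the two cases, 
which is why I think the test needs some way to observe the scanned buckets.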



