[GitHub] spark issue #14712: [SPARK-17072] [SQL] support table-level statistics gener...

cloud-fan Sat, 27 Aug 2016 00:06:59 -0700

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14712
  
    I want to make sure we are on the same page:
    **top priority**: users analyze the table in Spark and query it in Spark. 
(this must work)
    **second priority**: the table is already analyzed by Hive, and users query it 
in Spark. (this should work)
    **low priority**: users analyze the table in Spark and query it in Hive. (it 
would be good if this works, but it is also fine if it doesn't)
    
    I'm a little hesitant about dealing with the 
`STATS_GENERATED_VIA_STATS_TASK` and `COLUMN_STATS_ACCURATE` flags. Are they 
only needed for the third (low priority) target? If they are, I'd like to ignore 
them to simplify the logic.
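
To illustrate the simplification being proposed, here is a minimal Python sketch of reading table-level statistics from a table's property map while ignoring the two flags. It is not Spark's implementation. The function name `read_table_stats`, the `TableStats` type, and the use of the `totalSize` and `numRows` property keys are assumptions made for illustration, as is the rule that a missing or non-positive `totalSize` means "no statistics".

```python
from typing import Mapping, NamedTuple, Optional


class TableStats(NamedTuple):
    """Hypothetical container for table-level statistics."""
    size_in_bytes: int
    row_count: Optional[int]


# Flags named in the comment above. This sketch never consults them.
IGNORED_FLAGS = ("STATS_GENERATED_VIA_STATS_TASK", "COLUMN_STATS_ACCURATE")


def _parse_int(value: Optional[str]) -> Optional[int]:
    """Return the integer value of a property, or None if absent or malformed."""
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def read_table_stats(props: Mapping[str, str]) -> Optional[TableStats]:
    """Build statistics from table properties, ignoring the accuracy flags.

    Assumption: 'totalSize' and 'numRows' are the property keys that hold
    the size in bytes and the row count.
    """
    total_size = _parse_int(props.get("totalSize"))
    if total_size is None or total_size <= 0:
        # Assumption: no usable size means the table has no statistics.
        return None
    num_rows = _parse_int(props.get("numRows"))
    if num_rows is not None and num_rows < 0:
        # Assumption: a negative row count means "unknown".
        num_rows = None
    return TableStats(size_in_bytes=total_size, row_count=num_rows)
```

With this shape, the top two priorities only need the size and row-count properties to be present, whoever wrote them. Keeping the flags consistent would matter only if Hive has to trust statistics written by Spark, which is the low-priority case.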


