Github user tigerquoll commented on the issue:

    https://github.com/apache/spark/pull/21306
  
    Sure,
    I am looking at this from the point of view of supporting Kudu.  Check out 
https://kudu.apache.org/docs/schema_design.html#partitioning for some of the 
details, and in particular 
https://kudu.apache.org/2016/08/23/new-range-partitioning-features.html.
    As Kudu is a column store, each column also has attributes associated with 
it, such as encoding and compression codecs.
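    For example, both the per-column attributes and the partitioning are 
declared as part of table creation (a rough sketch against the Kudu Java 
client from Scala; the table and column names are just for illustration):

```scala
import java.util.Arrays.asList

import org.apache.kudu.{ColumnSchema, Schema, Type}
import org.apache.kudu.client.{CreateTableOptions, KuduClient}

val client = new KuduClient.KuduClientBuilder("kudu-master:7051").build()

// Per-column attributes: encoding and compression are part of the schema.
val id = new ColumnSchema.ColumnSchemaBuilder("id", Type.INT64)
  .key(true)
  .encoding(ColumnSchema.Encoding.BIT_SHUFFLE)
  .build()
val ts = new ColumnSchema.ColumnSchemaBuilder("ts", Type.UNIXTIME_MICROS)
  .key(true)
  .build()
val name = new ColumnSchema.ColumnSchemaBuilder("name", Type.STRING)
  .encoding(ColumnSchema.Encoding.DICT_ENCODING)
  .compressionAlgorithm(ColumnSchema.CompressionAlgorithm.LZ4)
  .build()

// Partitioning is declared alongside the schema at creation time:
// hash partitioning on id, range partitioning on ts.
val options = new CreateTableOptions()
  .addHashPartitions(asList("id"), 4)
  .setRangePartitionColumns(asList("ts"))

client.createTable("metrics", new Schema(asList(id, ts, name)), options)
```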
    
    
    I really think that partitions should be considered part of the table 
schema.  They have an existence above and beyond the definition of a filter 
that matches a record: adding an empty partition changes the state of many 
underlying systems.  Many systems that support partitions also have APIs for 
adding and removing partition definitions, and some systems require partition 
information to be specified during table creation.  Those systems that support 
changing partitions after creation usually have specific operations for adding 
and removing partitions.
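    Kudu is a concrete example: partition changes after creation go through a 
dedicated alter-table API, not through anything filter-shaped (again a sketch 
against the Kudu Java client, continuing the illustrative "metrics" table 
above):

```scala
import org.apache.kudu.client.{AlterTableOptions, KuduClient}

val client = new KuduClient.KuduClientBuilder("kudu-master:7051").build()
val schema = client.openTable("metrics").getSchema

// Add a new, initially empty range partition covering September 2018.
// The ts column is UNIXTIME_MICROS, so bounds are microseconds since epoch.
val lower = schema.newPartialRow()
lower.addLong("ts", 1535760000000000L) // 2018-09-01T00:00:00Z
val upper = schema.newPartialRow()
upper.addLong("ts", 1538352000000000L) // 2018-10-01T00:00:00Z

client.alterTable("metrics",
  new AlterTableOptions().addRangePartition(lower, upper))

// Dropping a partition is an equally explicit operation.
client.alterTable("metrics",
  new AlterTableOptions().dropRangePartition(lower, upper))
```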
    
    
    Dale,
    
    ________________________________
    From: Ryan Blue <notificati...@github.com>
    Sent: Tuesday, 4 September 2018 4:20 PM
    To: apache/spark
    Cc: tigerquoll; Comment
    Subject: Re: [apache/spark] [SPARK-24252][SQL] Add catalog registration and 
table catalog APIs. (#21306)
    
    
    Can we support column range partition predicates please?
    
    This has an "apply" transform for passing other functions directly through, 
so that may help if you have additional transforms that aren't committed to 
Spark yet.
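    For example, something like this (a minimal sketch, assuming 
`Expressions`-style factory methods; the names here follow Spark's 
connector expressions API in `org.apache.spark.sql.connector.expressions` 
and may differ from this PR, and `zorder` is a stand-in for a 
source-specific function that Spark itself does not know):

```scala
import org.apache.spark.sql.connector.expressions.{Expressions, Transform}

// A source-specific transform passed through by name: Spark does not need
// to understand "zorder"; the data source is free to resolve or reject it.
val custom: Transform = Expressions.apply(
  "zorder", Expressions.column("x"), Expressions.column("y"))

// Built-in, widely understood transforms come from the same factory.
val byBucket: Transform = Expressions.bucket(16, "id")
val byDay: Transform = Expressions.days("ts")
```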
    
    As for range partitioning, can you be more specific about what you mean? 
What does that transform function look like? Part of the rationale for the 
existing proposal is that these are all widely used and understood. I want to 
make sure that as we expand the set of validated transforms, we aren't 
introducing confusion.
    
    Also, could you share the use case you intend for this? It would be great 
to hear about uses other than just Iceberg tables.
    


