[ 
https://issues.apache.org/jira/browse/ARROW-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17387396#comment-17387396
 ] 

Ben Kietzman commented on ARROW-11762:
--------------------------------------

You're correct for those filter expressions, but I was referring to the 
guarantees produced by partitions. Specifically, it is currently legal for a 
HivePartitioning to parse either of {{/a=0/}} or 
{{/a=0/b=__HIVE_DEFAULT_PARTITION__/}} as {{a == 0}} or as {{a == 0 and 
is_null(b)}}. The former guarantee doesn't include explicit information about 
field {{b}}, which we currently consider to be equivalent to specifying that 
it's null. This is not optimal; we'd prefer to be specific.
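
To make the preference for explicit guarantees concrete, here is a minimal 
sketch in plain Python. It is a toy model, not Arrow's C++ API: the names 
parse_hive_path and format_guarantee are hypothetical, and a guarantee is 
modelled as a dict in which None stands for is_null. Every partition field 
always gets an entry, so an absent key and the Hive null sentinel yield the 
same guarantee.

```python
from typing import Dict, List, Optional

# Directory value Hive uses for a null partition key (taken from the comment above).
HIVE_NULL = "__HIVE_DEFAULT_PARTITION__"


def parse_hive_path(path: str, fields: List[str]) -> Dict[str, Optional[str]]:
    """Toy parser (hypothetical, not Arrow's): one entry per partition field."""
    # Start with every field explicitly null, so absence and null coincide.
    guarantee: Dict[str, Optional[str]] = {name: None for name in fields}
    for segment in path.strip("/").split("/"):
        if "=" not in segment:
            continue  # skip anything that is not a key=value segment
        key, value = segment.split("=", 1)
        if key in guarantee and value != HIVE_NULL:
            guarantee[key] = value
    return guarantee


def format_guarantee(guarantee: Dict[str, Optional[str]]) -> str:
    """Render the guarantee as a conjunction, spelling out null fields."""
    terms = []
    for name, value in guarantee.items():
        terms.append(f"is_null({name})" if value is None else f"{name} == {value}")
    return " and ".join(terms)


absent = parse_hive_path("/a=0/", ["a", "b"])
explicit = parse_hive_path("/a=0/b=__HIVE_DEFAULT_PARTITION__/", ["a", "b"])
print(format_guarantee(absent))
print(format_guarantee(explicit))
```

Under these assumptions both paths produce an identical guarantee, and it 
always mentions field b, which is the "specific" form described above.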

> [C++][Dataset] Refactor Partitioning to explicitly treat null and absent 
> fields identically
> -------------------------------------------------------------------------------------------
>
>                 Key: ARROW-11762
>                 URL: https://issues.apache.org/jira/browse/ARROW-11762
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>    Affects Versions: 3.0.0
>            Reporter: Ben Kietzman
>            Assignee: Weston Pace
>            Priority: Major
>             Fix For: 6.0.0
>
>
> ARROW-10438 adds support for partition expressions with explicit absence of a 
> partition key by including an {{is_null(field_ref("absent key field name"))}} 
> in the conjunction. Whenever possible, this should be preferred to an 
> equivalent conjunction which simply omits an equality expression for the 
> missing field.
> Additionally, since an absent partition key is semantically equivalent to a 
> null-valued partition key, we should ensure there is no difference in 
> behavior. Currently, {{equal(field_ref("a"), literal(0))}} and 
> {{and_(equal(field_ref("a"), literal(0)), is_null("b"))}} are formatted 
> differently.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
