aokolnychyi commented on PR #53276:
URL: https://github.com/apache/spark/pull/53276#issuecomment-3614276778

   > Currently VariantAccessInfo represents an access to a variant column, so 
it has a member String columnName. What does a VariantExtraction represent?
   
   > Although you said "each variant_get expression as separate 
VariantExtraction", if there are multiple variant_gets for the same variant 
column, do you mean to have multiple VariantExtractions? Currently they are all 
represented by one VariantAccessInfo for the variant column, which I think 
makes more sense.
   
   I expect each `variant_get` and `try_variant_get` call to be converted to 
its own `VariantExtraction`, carrying the variant column name parts and the 
extraction JSON path. If a connector has shredded 2 out of 3 requested columns, 
it can simply mark with booleans what it supports and what must still be done 
in Spark. If we use `VariantAccessInfo`, the connector would have to create a 
new StructType to report partial support? That seems very complicated and 
error-prone.
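
   To make the boolean-marking idea concrete, here is a rough sketch. It is 
not the API in this PR: `VariantExtractionSketch`, `ToyScanBuilder`, 
`pushVariantExtractions`, the field names, and the `key()` helper are all 
hypothetical and only illustrate the shape I have in mind.

```java
import java.util.Set;

// Hypothetical sketch only; none of these names are taken from the PR.
final class VariantExtractionSketch {

  /** One variant_get / try_variant_get call: which column, which JSON path. */
  static final class VariantExtraction {
    final String[] columnNameParts; // e.g. {"event", "payload"} for event.payload
    final String jsonPath;          // e.g. "$.user.id"

    VariantExtraction(String[] columnNameParts, String jsonPath) {
      this.columnNameParts = columnNameParts;
      this.jsonPath = jsonPath;
    }

    /** Lookup key used by the toy connector below (an assumption of this sketch). */
    String key() {
      return String.join(".", columnNameParts) + ":" + jsonPath;
    }
  }

  /** Toy connector that knows which (column, path) pairs it has shredded. */
  static final class ToyScanBuilder {
    private final Set<String> shredded;

    ToyScanBuilder(Set<String> shredded) {
      this.shredded = shredded;
    }

    /**
     * Returns one flag per requested extraction: true means the connector
     * serves it from shredded data, false means Spark must evaluate it.
     */
    boolean[] pushVariantExtractions(VariantExtraction[] extractions) {
      boolean[] pushed = new boolean[extractions.length];
      for (int i = 0; i < extractions.length; i++) {
        pushed[i] = shredded.contains(extractions[i].key());
      }
      return pushed;
    }
  }

  /** Three extractions on column v; the connector has shredded only $.a and $.b. */
  static boolean[] demo() {
    ToyScanBuilder builder = new ToyScanBuilder(Set.of("v:$.a", "v:$.b"));
    VariantExtraction[] requested = {
      new VariantExtraction(new String[] {"v"}, "$.a"),
      new VariantExtraction(new String[] {"v"}, "$.b"),
      new VariantExtraction(new String[] {"v"}, "$.c")
    };
    return builder.pushVariantExtractions(requested);
  }
}
```

   In this sketch the connector never builds a new StructType; it only 
answers yes or no per extraction, and anything marked false stays in Spark.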
   
   > I think connectors can still read and parse the variant to the required 
type even if it is not a shredded variant. From the view of Spark and the DSv2 
API, we don't need to know how the connectors fulfill the pushdown requirement.
   
   I feel this is VERY dangerous. I read through the casting logic in Spark. It 
has so many edge cases. There is no way connectors will replicate this 
behavior. We don't want to have inconsistent shredding between connectors. In 
the future, we may add a casting function to `VariantExtraction` that Spark 
would provide. That said, I would not do it now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

