viirya commented on PR #52578:
URL: https://github.com/apache/spark/pull/52578#issuecomment-3398615399

   > Thanks @viirya for the reply! If I understand correctly, the core concern 
is: “If Spark pushes a variant scan into a DSv2 source that doesn’t understand 
it, we’ll see unexpected errors.” I agree — that’s exactly the failure mode we 
should avoid.
   > 
   > Would it be acceptable if I add a planner-side guard so this only happens 
when the source explicitly opts in?
   
   Thanks @huaxingao for the discussion and understanding.
   
   I think we need an explicit DSv2 API to define the contract between Spark and 
data source implementations for this variant pushdown feature. That is what this 
PR proposes to do.
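   
   As a rough illustration of what an explicit opt-in contract means here, the 
sketch below uses hypothetical names (`Scan`, `SupportsVariantPushdownSketch`, 
`PlannerSketch`); they are stand-ins for this example and are not the interfaces 
in Spark or in this PR. The assumption is that the planner pushes variant paths 
only to a source that implements the opt-in interface, and otherwise keeps the 
extraction in Spark.

```java
import java.util.List;

// Hypothetical stand-in for a DSv2 scan; not Spark's real interface.
interface Scan {
    String description();
}

// Hypothetical opt-in mix-in: a source implements this only if it can
// serve the requested variant paths itself.
interface SupportsVariantPushdownSketch extends Scan {
    // Returns true if the source accepts the requested variant paths.
    boolean pushVariantPaths(List<String> paths);
}

// A source that knows nothing about variant pushdown.
final class LegacyScan implements Scan {
    public String description() { return "legacy"; }
}

// A source that explicitly opts in and records what was pushed.
final class VariantAwareScan implements SupportsVariantPushdownSketch {
    private List<String> pushed = List.of();

    public String description() { return "variant-aware " + pushed; }

    public boolean pushVariantPaths(List<String> paths) {
        pushed = List.copyOf(paths);
        return true;
    }
}

final class PlannerSketch {
    // Pushes only when the source opted in and accepted the paths;
    // otherwise the caller keeps variant extraction in Spark.
    static boolean tryPushVariant(Scan scan, List<String> paths) {
        if (scan instanceof SupportsVariantPushdownSketch) {
            return ((SupportsVariantPushdownSketch) scan).pushVariantPaths(paths);
        }
        return false;
    }
}
```

   With this shape, a source that does not implement the interface never sees a 
variant scan, so the failure mode discussed above cannot occur by accident.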
   
   > On tests: I understand my InMemoryTable test might have issues. Can we fix 
it to exercise the planner contract correctly, or do you consider a built-in 
DSv2 Parquet test required for a DSv2 change?
   
   That is also what this PR proposes to do: it adds dedicated DSv2 variant 
pushdown tests for both the row-based and vectorized readers, with good test 
coverage.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

