rtpsw commented on PR #12601:
URL: https://github.com/apache/arrow/pull/12601#issuecomment-1113945812

   > As for the PR, I think this is part of the solution but I think we also 
want to handle the case where the last node in the declaration is not a project 
node. And we will want to have a plan in place for round-trip. A simple 
solution would be to just add a project node onto the end (if one is not there 
already) but that would mean our plan wouldn't round-trip cleanly.
   
   Agreed, this would not round-trip cleanly. It could be fixed by introducing 
a special projection node that is removed again on the way back, but that seems 
even less elegant.
   
   > Another approach could be to add a vector of names to 
`ConsumingSinkNodeOptions`. The sink node could then take the output schema 
from its input, swap out the names, and pass that on to the consumer. This 
should round-trip nicely.
   
   Yes, that sounds like it should work and round-trip nicely. I'll look into 
implementing it. One thing to keep in mind is that in the future we may want to 
use other sink nodes (e.g., "write" rather than "consuming_sink") in Substrait, 
so I think it makes sense to factor the names out into a reusable options class 
(to be included as a member of `ConsumingSinkNodeOptions`) and to put the 
schema-renaming operation into a reusable function.
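
   A minimal sketch of that factoring, assuming stand-in types rather than 
Arrow's real classes. `Field`, `Schema`, `OutputNamesOptions`, 
`SketchConsumingSinkOptions` and `RenameSchema` are all hypothetical names 
chosen for illustration, and treating an empty name list as "keep the input 
names" is also an assumption:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-ins for illustration only; these are NOT arrow::Field / arrow::Schema.
struct Field {
  std::string name;
  std::string type;
};

struct Schema {
  std::vector<Field> fields;
};

// Hypothetical reusable options class: just the output column names.
struct OutputNamesOptions {
  std::vector<std::string> names;
};

// Hypothetical sink options that embed the reusable names as a member, so a
// future "write" sink could embed the same class.
struct SketchConsumingSinkOptions {
  OutputNamesOptions output_names;
};

// Reusable helper: keep each field's type, swap in the requested name.
// Assumption: an empty name list means "leave the input schema unchanged".
Schema RenameSchema(const Schema& input, const OutputNamesOptions& options) {
  if (options.names.empty()) {
    return input;
  }
  if (options.names.size() != input.fields.size()) {
    throw std::invalid_argument("number of names must match number of fields");
  }
  Schema output;
  output.fields.reserve(input.fields.size());
  for (std::size_t i = 0; i < input.fields.size(); ++i) {
    output.fields.push_back(Field{options.names[i], input.fields[i].type});
  }
  return output;
}
```

   In this sketch the sink would call `RenameSchema(input_schema, 
options.output_names)` once and hand the result to the consumer. The plan 
itself gains no extra node, which is what keeps the round-trip clean.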
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
