rtpsw commented on PR #12601: URL: https://github.com/apache/arrow/pull/12601#issuecomment-1113945812
> As for the PR, I think this is part of the solution but I think we also want to handle the case where the last node in the declaration is not a project node. And we will want to have a plan in place for round-trip. A simple solution would be to just add a project node onto the end (if one is not there already) but that would mean our plan wouldn't round-trip cleanly.

Agreed, this would not round-trip cleanly. While this could be fixed by creating a special projection node that is removed again on the return trip, that seems even less elegant.

> Another approach could be to add a vector of names to `ConsumingSinkNodeOptions`. The sink node could then take the output schema from its input, swap out the names, and pass that on to the consumer. This should round-trip nicely.

Yes, this sounds like it should work and round-trip nicely. I'll look into implementing it. One thing to keep in mind is that in the future we may want to use other sink nodes in Substrait (e.g., "write" rather than "consuming_sink"), so I think it makes sense to factor the names out into a reusable options class (included as a member of `ConsumingSinkNodeOptions`) and to factor the schema-renaming operation into a reusable function.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
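
For illustration only, the factoring proposed in the comment above could look roughly like the sketch below. It uses only the standard library: `Field`, `Schema`, `OutputNamesOptions`, `SketchSinkNodeOptions`, and `RenameSchemaFields` are hypothetical names invented here, not Arrow's actual classes or the code in this PR. Treating an empty name list as "keep the input names" is likewise an assumption.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for a field and a schema; NOT the Arrow API.
struct Field {
  std::string name;
  std::string type;  // type kept as a plain string for illustration
};
using Schema = std::vector<Field>;

// Hypothetical reusable options class holding the output column names.
struct OutputNamesOptions {
  std::vector<std::string> names;
};

// Hypothetical sink options embedding the reusable names options as a member,
// so that other sink kinds (e.g. a "write" sink) could embed the same class.
struct SketchSinkNodeOptions {
  OutputNamesOptions output_names;
};

// Reusable renaming step: returns a copy of `input` whose field names are
// replaced by `options.names`, leaving the types untouched.
// Assumption: an empty name list means "keep the input names";
// a non-empty list of the wrong length is reported as an error.
Schema RenameSchemaFields(const Schema& input, const OutputNamesOptions& options) {
  if (options.names.empty()) return input;
  if (options.names.size() != input.size()) {
    throw std::invalid_argument("number of names does not match number of fields");
  }
  Schema output = input;
  for (std::size_t i = 0; i < output.size(); ++i) {
    output[i].name = options.names[i];
  }
  return output;
}
```

In this sketch a sink would call `RenameSchemaFields(input_schema, options.output_names)` once and hand the result to its consumer. Because the names live in the options rather than in an extra project node, the declaration itself stays unchanged, which is the property that should let the plan round-trip.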
