seddonm1 commented on pull request #9976: URL: https://github.com/apache/arrow/pull/9976#issuecomment-817216973
@alamb Thanks for the review. I wanted to raise this early because I appreciate the feedback (even if we end up closing this PR). The `RecordBatch` metadata is definitely the easiest mechanism for attaching this data, but I agree with Jorge that the modifications to the Arrow crate are undesirable. As this is ultimately a lineage-type capability (resolved by traversing the plan), I checked how Spark implements it: it throws an error, `'input_file_name' does not support more than one sources`, for queries outside the simple use case (i.e. a direct select against a `TableProvider`), for example when it is used in a SQL query with multiple tables. I will have a go at making the basic functionality work to ensure I'm not off on a wild goose chase.
