alamb commented on PR #4800:
URL: https://github.com/apache/arrow-rs/pull/4800#issuecomment-1711981166

   > If the metadata is inconsistent how does it know which metadata to
   > preserve?
   
   Right now the `Schema` of the output `RecordBatch` is the schema that was
provided by the caller as the first argument to `concat_batches`.
   
   To summarize a conversation I had with @tustvold over Slack:
   
   1. I believe his core concern with this PR is that making this check more
lax likely papers over what some people might perceive as a bug in the caller
(in this case, inconsistent metadata).
   
   2. An alternative interpretation might be that by checking for exactly the
same schema for all input batches, the `concat_batches` kernel is imposing a
particular definition of schema equality and enforcing an invariant that might
not be what other systems have in mind. From this point of view, removing the
`Schema` equality check entirely might be appropriate.
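
   To make the two positions concrete, here is a minimal sketch in plain
Rust. It is not the arrow-rs implementation: `Schema`, `Batch`, `SchemaCheck`,
and `concat_batches_sketch` are hypothetical stand-ins, and reading "more lax"
as "compare fields but ignore metadata" is an assumption. As with
`concat_batches`, the output always carries the schema the caller passed in.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a schema: field names plus
/// free-form key/value metadata. Not the arrow-rs `Schema` type.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<String>,
    metadata: HashMap<String, String>,
}

/// Hypothetical stand-in for a record batch: one `Vec<i64>` per field.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    schema: Schema,
    columns: Vec<Vec<i64>>,
}

/// How strictly each input batch's schema is compared to the output schema.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SchemaCheck {
    /// Fields and metadata must both match exactly.
    Strict,
    /// Assumed meaning of "more lax": only the fields must match.
    FieldsOnly,
}

/// Concatenate `batches` column by column. The result always uses the
/// caller-provided `schema`, so that is the metadata that is preserved.
fn concat_batches_sketch(
    schema: &Schema,
    batches: &[Batch],
    check: SchemaCheck,
) -> Result<Batch, String> {
    let mut columns: Vec<Vec<i64>> = vec![Vec::new(); schema.fields.len()];
    for (i, batch) in batches.iter().enumerate() {
        let compatible = match check {
            SchemaCheck::Strict => batch.schema == *schema,
            SchemaCheck::FieldsOnly => batch.schema.fields == schema.fields,
        };
        if !compatible {
            return Err(format!("batch {i} does not match the output schema"));
        }
        // Append this batch's values to the matching output column.
        for (out, col) in columns.iter_mut().zip(&batch.columns) {
            out.extend_from_slice(col);
        }
    }
    Ok(Batch { schema: schema.clone(), columns })
}
```

   In this sketch, two batches whose schemas differ only in metadata are
rejected under `SchemaCheck::Strict` and concatenated under
`SchemaCheck::FieldsOnly`, with the first argument's metadata winning.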
   
   I am sure I can fix my particular problem (see
https://github.com/influxdata/influxdb_iox/pull/8691/files#r1319044861) at
another level of the stack (e.g. in DataFusion), but it didn't feel right to me
that `concat_batches` was enforcing some particular invariant that is not
enforced elsewhere.

