Rafferty97 commented on PR #20604:
URL: https://github.com/apache/datafusion/pull/20604#issuecomment-3987662109

> > @Omega359 Ah, that does make sense.
> >
> > So, if I understand correctly, the primary pain point is that you can wind up in situations where you've got two dataframes with the same logical schema but different physical representations (e.g. `Utf8` vs. `Utf8View`), preventing them from being unioned together without some explicit casts?
> >
> > I think this points to a more fundamental tension between physical and logical types that is worth addressing in a higher-level discussion. I'll have a look around to see if the topic has come up before.
>
> I am not 100% certain where the issue lies, but I don't think I've seen it at the logical layer at all, nor with union. I think it's that I'm merging into a single table files that happen to have been written out with slightly different schemas, resulting in `Field::try_merge` failing with `Fail to merge schema field '<field name>' because the from data_type = Utf8 does not equal Utf8View`.

Ah right, so you're running into schema conflicts when writing files to disk, then reading them back into a single table?

Are you writing out Arrow files? I can't imagine any other format would round-trip `Utf8View`.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
