sundy-li commented on pull request #9596: URL: https://github.com/apache/arrow/pull/9596#issuecomment-788519719
> Also for memory/performance concerns - I think it makes sense to think about those as well. We will do some extra copies by casting, so `u8 + u8` will be converted if I understand correctly to `c(u8, u16) + c(u8, u16)`, but `u8 + u8 + u8` will involve more copies to something like `c(c(u8, u16) + c(u8, u16), u32) + c(u8, u32)` etc. as the subexpressions are coerced to u16 already first. This could be optimized out a bit maybe to avoid the double casting, but it shows it can be quite inefficient.
>
> Also when doing IO and keeping the result in memory will result in more storage costs, longer latency and/or higher memory usage if you don't convert them to use smaller types again.

Yes, it is. But correctness is more important than efficiency, so it may be worth doing. I'm a fan of ClickHouse, which always pursues the ultimate performance, but it had to do this too.

About the behavior in mysql:

```
mysql> select 3 / 2, 2 - 3, 127 + 3, 255 + 8;
+--------------+-------------+--------------+--------------+
| divide(3, 2) | minus(2, 3) | plus(127, 3) | plus(255, 8) |
+--------------+-------------+--------------+--------------+
|          1.5 |          -1 |          130 |          263 |
+--------------+-------------+--------------+--------------+
1 row in set (0.01 sec)
```
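To make the trade-off concrete, here is a minimal scalar sketch in Rust of the coercion discussed above. The function names are hypothetical and are not part of Arrow or DataFusion; the sketch only assumes that `c(x, T)` means "cast `x` to type `T`". In a real columnar kernel each cast would copy a whole array, which is where the extra cost comes from.

```rust
// Hypothetical helpers for illustration only; not Arrow/DataFusion APIs.

/// u8 + u8 computed in u8: the result wraps around on overflow.
fn add_u8_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// u8 + u8 with both operands cast first: c(u8, u16) + c(u8, u16).
fn add_u8_widened(a: u8, b: u8) -> u16 {
    u16::from(a) + u16::from(b)
}

/// u8 + u8 + u8 coerced step by step:
/// c(c(u8, u16) + c(u8, u16), u32) + c(u8, u32).
/// The intermediate u16 result is cast a second time (the "double casting").
fn add3_u8_stepwise(a: u8, b: u8, c: u8) -> u32 {
    let partial: u16 = add_u8_widened(a, b);
    u32::from(partial) + u32::from(c)
}

/// u8 - u8 needs a wider signed type so that 2 - 3 can be -1.
fn sub_u8_widened(a: u8, b: u8) -> i16 {
    i16::from(a) - i16::from(b)
}

fn main() {
    println!("wrapping 255 + 8 = {}", add_u8_wrapping(255, 8)); // 7
    println!("widened  255 + 8 = {}", add_u8_widened(255, 8)); // 263
    println!("widened  127 + 3 = {}", add_u8_widened(127, 3)); // 130
    println!("widened  2 - 3   = {}", sub_u8_widened(2, 3)); // -1
    println!("stepwise 255 * 3 = {}", add3_u8_stepwise(255, 255, 255)); // 765
}
```

Without widening, `255 + 8` silently becomes `7`; with it, the result matches the `263` in the query output above, at the price of a cast per operand.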