sundy-li commented on pull request #9596:
URL: https://github.com/apache/arrow/pull/9596#issuecomment-788519719
> Also for memory/performance concerns - I think it makes sense to think
> about those as well. We will do some extra copies by casting, so `u8 + u8` will
> be converted if I understand correctly to `c(u8, u16) + c(u8, u16)`, but `u8 +
> u8 + u8` will involve more copies to something like `c(c(u8, u16)+ c(u8, u16),
> u32) + c(u8, u32)` etc. as the subexpressions are coerced to u16 already first.
> This could be optimized out a bit maybe to avoid the double casting, but it
> shows it can be quite inefficient.
>
> Also when doing IO and keeping the result in memory will result in more
> storage costs, longer latency and/or higher memory usage if you don't convert
> them to use smaller types again.

Yes, it is. But correctness is more important than efficiency, so it may be
worth doing. I'm a fan of ClickHouse: it always pursues the ultimate
performance, yet it had to do this too.
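
To make the cost discussed above concrete, here is a minimal Rust sketch of the two evaluation strategies for `u8 + u8 + u8`. It is not the DataFusion implementation: plain `Vec`s stand in for Arrow arrays, and every function name below is hypothetical. The stepwise version allocates an intermediate buffer per cast and per add, while the direct version widens each operand once in a single pass.

```rust
// Hypothetical stand-ins for cast kernels: each one copies into a new buffer.
fn cast_u8_to_u16(col: &[u8]) -> Vec<u16> {
    col.iter().map(|&v| u16::from(v)).collect()
}

fn cast_u8_to_u32(col: &[u8]) -> Vec<u32> {
    col.iter().map(|&v| u32::from(v)).collect()
}

fn cast_u16_to_u32(col: &[u16]) -> Vec<u32> {
    col.iter().map(|&v| u32::from(v)).collect()
}

// Element-wise adds on already-widened columns.
// Inputs come from u8 values, so the u16 sum is at most 510 and cannot overflow.
fn add_u16(a: &[u16], b: &[u16]) -> Vec<u16> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

fn add_u32(a: &[u32], b: &[u32]) -> Vec<u32> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Stepwise coercion: c(c(a, u16) + c(b, u16), u32) + c(c, u32).
/// Six buffers are allocated in total (four casts, two adds).
fn sum3_stepwise(a: &[u8], b: &[u8], c: &[u8]) -> Vec<u32> {
    let ab = add_u16(&cast_u8_to_u16(a), &cast_u8_to_u16(b));
    add_u32(&cast_u16_to_u32(&ab), &cast_u8_to_u32(c))
}

/// Direct widening: every operand is cast to u32 once, in one pass,
/// and only the result buffer is allocated.
fn sum3_direct(a: &[u8], b: &[u8], c: &[u8]) -> Vec<u32> {
    a.iter()
        .zip(b)
        .zip(c)
        .map(|((&x, &y), &z)| u32::from(x) + u32::from(y) + u32::from(z))
        .collect()
}
```

Both functions return the same values; the difference is only in how many intermediate copies are made, which is the part an optimizer could remove.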
About the behavior in MySQL:
```
mysql> select 3 / 2, 2 - 3, 127 + 3, 255 + 8;
+--------------+-------------+--------------+--------------+
| divide(3, 2) | minus(2, 3) | plus(127, 3) | plus(255, 8) |
+--------------+-------------+--------------+--------------+
|          1.5 |          -1 |          130 |          263 |
+--------------+-------------+--------------+--------------+
1 row in set (0.01 sec)
```
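
For comparison, a small Rust sketch of what those four results look like with and without widening. It assumes, purely for illustration, that the operands live in 8-bit types; the helper names are hypothetical and not taken from the PR.

```rust
/// Add two u8 values after widening both to u16, so the sum cannot overflow.
fn widening_add_u8(a: u8, b: u8) -> u16 {
    u16::from(a) + u16::from(b)
}

/// Add two i8 values after widening both to i16.
fn widening_add_i8(a: i8, b: i8) -> i16 {
    i16::from(a) + i16::from(b)
}

/// Subtract two u8 values in i16, so a negative result is representable.
fn widening_sub_u8(a: u8, b: u8) -> i16 {
    i16::from(a) - i16::from(b)
}

/// Divide two u8 values as f64 instead of truncating integer division.
fn float_div_u8(a: u8, b: u8) -> f64 {
    f64::from(a) / f64::from(b)
}
```

Without widening, `255u8.checked_add(8)`, `127i8.checked_add(3)` and `2u8.checked_sub(3)` all return `None`, and `3u8 / 2` truncates to `1`. With the helpers above, the same inputs give 263, 130, -1 and 1.5, matching the output shown.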
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]