alamb commented on PR #8779:
URL:
https://github.com/apache/arrow-datafusion/pull/8779#issuecomment-1883576445
> I wonder whether it's possible to change the signature like
Indeed -- I think that sounds like a good plan
I think your implementation will somewhere have to use the relative sizes of
`key` and `values` -- so maybe something like
```rust
pub fn estimated_hashtable_size<K, V>(len: usize) -> usize {
...
}
```
(I think some of the calculations make assumptions on the size of keys --
like in hash join maybe that they are `u64` 🤔 )
If this PR is getting too complicated, perhaps it was a poor choice to
suggest as a good first project. I thought it was going to be a matter of
refactoring 3 copies of code into a single function but that appears not to be
the case
> physical-plan seems depends on physical-expr, so adding the code to
arrow-datafusion/datafusion/physical-plan/src/common.rs and add dependency in
arrow-datafusion/datafusion/physical-expr/Cargo.toml will lead cyclic package
dependency issue on my side.
Perhaps we could move the size calculation into `datafusion_common` (or into
physical-expr)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]