felipecrv commented on code in PR #43795:
URL: https://github.com/apache/arrow/pull/43795#discussion_r1728051117
##########
cpp/src/arrow/chunked_array.cc:
##########
@@ -48,16 +48,22 @@ ChunkedArray::ChunkedArray(ArrayVector chunks,
std::shared_ptr<DataType> type)
type_(std::move(type)),
length_(0),
null_count_(0),
+ device_type_(DeviceAllocationType::kCPU),
Review Comment:
I think we should consider not caching this piece of information as state in
the `ChunkedArray` instance and instead deriving it from the chunks when we
need it.
Additionally, one advantage of chunking is the flexibility that it brings
regarding allocation of buffers (they don't have to be contiguous), so now
requiring that all chunks be allocated on the same device seems too rigid.
I proposed a solution to this: chunked arrays producing a
`DeviceAllocationTypeSet` with all the allocation types of the chunks. This set
can be represented by a single 64-bit word in memory (I used C++ `<bitset>`), so
it can be copied and compared very efficiently.
Here is the draft PR:
https://github.com/apache/arrow/pull/43542/files#diff-b4ffb36b29cfaa2cf9be4fab774921b8344efdc595a358b02c3187ba04141f7eR89
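
To make the idea concrete, here is a minimal sketch of deriving the set from the chunks on demand. It is not the code in the draft PR: `DeviceKind`, `DeviceKindSet`, `Chunk`, and `DeviceKindsOf` are hypothetical names invented for this illustration, and the enum values are placeholders rather than Arrow's actual `DeviceAllocationType` values.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for arrow::DeviceAllocationType (values are illustrative).
enum class DeviceKind : std::size_t { kCPU = 1, kCUDA = 2, kCUDAHost = 3 };

// Hypothetical set type: one bit per device kind, backed by std::bitset<64>.
class DeviceKindSet {
 public:
  void Add(DeviceKind kind) {
    const auto index = static_cast<std::size_t>(kind);
    assert(index < 64);  // every kind must fit in the 64-bit set
    bits_.set(index);
  }

  bool Contains(DeviceKind kind) const {
    return bits_.test(static_cast<std::size_t>(kind));
  }

  // True when every chunk seen so far was allocated on the CPU.
  bool IsCpuOnly() const {
    return bits_.count() == 1 && Contains(DeviceKind::kCPU);
  }

  std::size_t size() const { return bits_.count(); }

  bool operator==(const DeviceKindSet& other) const {
    return bits_ == other.bits_;
  }

 private:
  std::bitset<64> bits_;
};

// Hypothetical chunk: only the field this sketch needs.
struct Chunk {
  DeviceKind device;
};

// Derive the set from the chunks when it is needed, instead of caching a
// single device type as state in the chunked array.
DeviceKindSet DeviceKindsOf(const std::vector<Chunk>& chunks) {
  DeviceKindSet kinds;
  for (const Chunk& chunk : chunks) {
    kinds.Add(chunk.device);
  }
  return kinds;
}
```

With something along these lines, a caller that needs CPU-only data can check `DeviceKindsOf(chunks).IsCpuOnly()`, while chunks remain free to live on different devices.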
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.