felipecrv commented on code in PR #41956:
URL: https://github.com/apache/arrow/pull/41956#discussion_r1628039524
##########
cpp/src/arrow/array/array_nested.cc:
##########
@@ -847,14 +847,13 @@ Result<std::shared_ptr<Array>>
MapArray::FromArraysInternal(
const auto& typed_offsets = checked_cast<const OffsetArrayType&>(*offsets);
BufferVector buffers;
- int64_t null_count;
- if (null_bitmap != nullptr) {
- buffers = BufferVector({std::move(null_bitmap), typed_offsets.values()});
- null_count = null_bitmap->size();
- } else {
- buffers = BufferVector({null_bitmap, typed_offsets.values()});
- null_count = 0;
+ buffers.resize(2);
+ int64_t null_count = 0;
+ if (null_bitmap) {
+ buffers[0] = std::move(null_bitmap);
Review Comment:
The closest to documentation that I can find to cite:
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rr-sharedptrparam-owner
`clang-tidy` also has a rule that complains about this (but note that we
don't enforce clang-tidy rules in Arrow currently).
I can also explain the mechanics with an example.
This is how we currently have some classes (ctor and factories taking
`const&`):
```cpp
class ListType {
ListType(const std::shared_ptr<DataType>& type)
: value_type_(type) {}
}
std::shared_ptr<ListType> list(const std::shared_ptr<DataType>& type) {
return std::make_shared<ListType>(type);
}
```
When I run:
```cpp
auto value_type = int8(); // creation of a shared_ptr (ref-count = 1)
// value_type is passed as a reference to the shared_ptr
// (a pointer to the smart pointer instance)
auto list_type = list(value_type);
// this in turn will call the ListType constructor by passing a reference
ListType::ListType(value_type);
// which will increment the ref-count of the shared_ptr on this line
: value_type_(type) { // ref-count = 2
// at the end of the scope `list_type` is destroyed
// which calls the `value_type_` destructor, since ref-count = 2,
// it only decrements the ref-count which ends-up at 1
// now value_type is destroyed after the ref-count reaches 0,
// so `DataType::~DataType()` is called
```
If instead we have ctors and factories be like this:
```cpp
class ListType {
ListType(std::shared_ptr<DataType> type)
: value_type_(std::move(type)) {} // note the move
}
std::shared_ptr<ListType> list(std::shared_ptr<DataType> type) {
return std::make_shared<ListType>(std::move(type));
}
```
when I run
```cpp
auto value_type = int8(); // creation of a shared_ptr (ref-count = 1)
// value_type is passed as a && to the shared_ptr because of the std::move
// (it's still a pointer, but it says that `value_type` can become
"moved-from")
auto list_type = list(std::move(value_type));
// this in turn will call the ListType constructor by passing a reference
ListType::ListType(std::move(value_type));
// which will construct value_type_ from an && meaning that
// the ref-count stays the same (1) and `value_type` is now a moved-from
`nullptr`
: value_type_(std::move(type)) { // ref-count = 1
// at the end of the scope `list_type` is destroyed
// which calls the `value_type_` destructor, since ref-count = 1,
// it calls `DataType::~DataType()` and destroys the `Int8Type` instance
// now value_type is destroyed but since it's `nullptr` nothing
// needs to happen
```
Since `shared_ptr`'s ref-count is atomic, the overhead of atomically
updating the integer adds up. The second version does 0 increments and 1
decrement while the first has to increment 1 time and perform 2 decrements.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]