edponce commented on a change in pull request #12055:
URL: https://github.com/apache/arrow/pull/12055#discussion_r840229428



##########
File path: cpp/src/arrow/chunked_array.cc
##########
@@ -147,13 +148,15 @@ bool ChunkedArray::ApproxEquals(const ChunkedArray& other,
 }
 
 Result<std::shared_ptr<Scalar>> ChunkedArray::GetScalar(int64_t index) const {
-  for (const auto& chunk : chunks_) {
-    if (index < chunk->length()) {
-      return chunk->GetScalar(index);
-    }
-    index -= chunk->length();
+  if (!chunk_resolver_) {
+    chunk_resolver_ = internal::make_unique<internal::ChunkResolver>(chunks_);

Review comment:
       For ChunkedArrays with a large number of Arrays, there would be 
noticeable overhead when creating the offsets for the ChunkResolver. If the 
application does not access the data multiple times (multiple `GetScalar()` 
calls), the overhead may outweigh the potential benefits. On the other hand, 
the lazy approach requires synchronization mechanisms; see the sketch below.
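
       To illustrate the trade-off, here is a minimal, hypothetical sketch (not 
Arrow's actual `ChunkResolver` internals; the names `SimpleChunkResolver` and 
`LazyChunkResolver` are made up for this comment): the eager variant pays an 
O(num_chunks) prefix-sum cost up front for O(log num_chunks) lookups, while the 
lazy variant defers that cost to the first lookup and uses `std::call_once` as 
one possible synchronization mechanism.

```cpp
// Hedged sketch only; these types are hypothetical, not Arrow's ChunkResolver.
#include <algorithm>
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Eager resolver: O(num_chunks) prefix-sum at construction,
// O(log num_chunks) per lookup via binary search.
class SimpleChunkResolver {
 public:
  explicit SimpleChunkResolver(const std::vector<int64_t>& chunk_lengths) {
    offsets_.reserve(chunk_lengths.size() + 1);
    int64_t offset = 0;
    offsets_.push_back(offset);
    for (int64_t length : chunk_lengths) {
      offset += length;
      offsets_.push_back(offset);
    }
  }

  // Returns {chunk index, index within that chunk} for a logical index.
  std::pair<int64_t, int64_t> Resolve(int64_t logical_index) const {
    auto it = std::upper_bound(offsets_.begin(), offsets_.end(), logical_index);
    int64_t chunk = static_cast<int64_t>(it - offsets_.begin()) - 1;
    return {chunk, logical_index - offsets_[chunk]};
  }

 private:
  std::vector<int64_t> offsets_;  // cumulative lengths, size == num_chunks + 1
};

// Lazy wrapper: callers that never resolve an index pay nothing, but the
// first lookup must be synchronized against concurrent callers, e.g. with
// std::call_once as shown here.
class LazyChunkResolver {
 public:
  explicit LazyChunkResolver(std::vector<int64_t> chunk_lengths)
      : chunk_lengths_(std::move(chunk_lengths)) {}

  std::pair<int64_t, int64_t> Resolve(int64_t logical_index) const {
    std::call_once(once_, [this] {
      resolver_ = std::make_unique<SimpleChunkResolver>(chunk_lengths_);
    });
    return resolver_->Resolve(logical_index);
  }

 private:
  std::vector<int64_t> chunk_lengths_;
  mutable std::once_flag once_;
  mutable std::unique_ptr<SimpleChunkResolver> resolver_;
};
```

       Whether the eager or lazy variant wins depends on how many `GetScalar()` 
calls follow construction, which is the cost/benefit question raised above.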



