mingmwang commented on PR #6065:
URL:
https://github.com/apache/arrow-datafusion/pull/6065#issuecomment-1518940220
The flame graph shows that the hot path inside the method
`slice_and_maybe_filter` is `SpecFromIter`.
Within `SpecFromIter::from_iter`, the hot path should in turn be the
slicing of the `ArrowArray`s, not the memory allocation for the `Vec`.
```rust
impl<T> SpecFromIter<T, IntoIter<T>> for Vec<T> {
    fn from_iter(iterator: IntoIter<T>) -> Self {
        // A common case is passing a vector into a function which immediately
        // re-collects into a vector. We can short circuit this if the IntoIter
        // has not been advanced at all.
        // When it has been advanced We can also reuse the memory and move the data to the front.
        // But we only do so when the resulting Vec wouldn't have more unused capacity
        // than creating it through the generic FromIterator implementation would. That limitation
        // is not strictly necessary as Vec's allocation behavior is intentionally unspecified.
        // But it is a conservative choice.
        let has_advanced = iterator.buf.as_ptr() as *const _ != iterator.ptr;
        if !has_advanced || iterator.len() >= iterator.cap / 2 {
            unsafe {
                let it = ManuallyDrop::new(iterator);
                if has_advanced {
                    ptr::copy(it.ptr, it.buf.as_ptr(), it.len());
                }
                return Vec::from_raw_parts(it.buf.as_ptr(), it.len(), it.cap);
            }
        }

        let mut vec = Vec::new();
        // must delegate to spec_extend() since extend() itself delegates
        // to spec_from for empty Vecs
        vec.spec_extend(iterator);
        vec
    }
}
```
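
To illustrate why slicing work can show up under `from_iter` in a profile, here is a minimal sketch. It assumes the method builds its result by collecting a lazily mapped iterator; `slice_range` and `collect_slices` are hypothetical stand-ins, not DataFusion or Arrow APIs. Because iterator adapters are lazy, the per-element closure only runs while `collect` drives the iterator, so its cost is attributed to the `from_iter` frames.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the per-array slicing work.
fn slice_range(data: &[i32], offset: usize, len: usize) -> Vec<i32> {
    data[offset..offset + len].to_vec()
}

/// Hypothetical helper: slices every array and collects the results.
/// `calls` counts how often the mapping closure has actually run.
fn collect_slices(
    arrays: &[Vec<i32>],
    offset: usize,
    len: usize,
    calls: &Cell<usize>,
) -> Vec<Vec<i32>> {
    let iter = arrays.iter().map(|a| {
        calls.set(calls.get() + 1);
        slice_range(a, offset, len)
    });
    // `map` is lazy: building the iterator has not run the closure yet.
    assert_eq!(calls.get(), 0);
    // The closure (and so the slicing) executes inside `collect`,
    // i.e. inside Vec's `from_iter` machinery.
    iter.collect()
}

fn main() {
    let arrays = vec![vec![1, 2, 3, 4], vec![5, 6, 7, 8]];
    let calls = Cell::new(0);
    let out = collect_slices(&arrays, 1, 2, &calls);
    assert_eq!(calls.get(), 2);
    assert_eq!(out, vec![vec![2, 3], vec![6, 7]]);
}
```

Under that assumption, time reported beneath `from_iter` mostly reflects the work done per element rather than the cost of allocating the output `Vec`.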