rockyzhengwu opened a new issue, #2875:
URL: https://github.com/apache/arrow-rs/issues/2875
**Describe the bug**
We use `ArrowWriter` and found a memory leak. I wrote a small sample program and profiled it with bytehound:
```
Total: 1
Leaked: 1
Backtrace:
#00 [mem-leak] _start
#01 [libc.so.6] __libc_start_main
#02 [libc.so.6] 7f7666c98d8f
#03 [mem-leak] main
#16 [mem-leak] mem_leak::main [main.rs:15]
#17 [mem-leak] parquet::arrow::arrow_writer::ArrowWriter<W>::close [mod.rs:234]
#18 [mem-leak] parquet::arrow::arrow_writer::ArrowWriter<W>::flush [mod.rs:161]
#19 [mem-leak] parquet::arrow::arrow_writer::ArrowWriter<W>::flush_rows [mod.rs:217]
#20 [mem-leak] parquet::arrow::arrow_writer::write_leaves [mod.rs:273]
#21 [mem-leak] parquet::file::writer::SerializedRowGroupWriter<W>::next_column [writer.rs:427]
#22 [mem-leak] parquet::file::writer::SerializedRowGroupWriter<W>::next_column_with_factory [writer.rs:415]
#23 [mem-leak] parquet::file::writer::SerializedRowGroupWriter<W>::next_column::{{closure}} [writer.rs:428]
#24 [mem-leak] parquet::column::writer::get_column_writer [mod.rs:78]
#25 [mem-leak] parquet::column::writer::GenericColumnWriter<E>::new [mod.rs:225]
#26 [mem-leak] <parquet::column::writer::encoder::ColumnValueEncoderImpl<T> as parquet::column::writer::encoder::ColumnValueEncoder>::try_new [encoder.rs:167]
#27 [mem-leak] core::bool::<impl bool>::then [bool.rs:71]
#28 [mem-leak] <parquet::column::writer::encoder::ColumnValueEncoderImpl<T> as parquet::column::writer::encoder::ColumnValueEncoder>::try_new::{{closure}} [encoder.rs:167]
#29 [mem-leak] parquet::encodings::encoding::dict_encoder::DictEncoder<T>::new [dict_encoder.rs:92]
#30 [mem-leak] parquet::util::interner::Interner<S>::new [interner.rs:56]
#31 [mem-leak] <ahash::random_state::RandomState as core::default::Default>::default [interner.rs:56]
#32 [mem-leak] ahash::random_state::RandomState::new [random_state.rs:216]
#33 [mem-leak] ahash::random_state::get_fixed_seeds [random_state.rs:216]
#34 [mem-leak] once_cell::race::once_box::OnceBox<T>::get_or_init [race.rs:256]
#35 [mem-leak] once_cell::race::once_box::OnceBox<T>::get_or_try_init [race.rs:276]
#36 [mem-leak] once_cell::race::once_box::OnceBox<T>::get_or_init::{{closure}} [race.rs:256]
#37 [mem-leak] ahash::random_state::get_fixed_seeds::{{closure}} [random_state.rs:78]
#38 [mem-leak] alloc::boxed::Box<T>::new [random_state.rs:78]
#39 [mem-leak] alloc::alloc::exchange_malloc [alloc.rs:330]
#40 [mem-leak] <alloc::alloc::Global as core::alloc::Allocator>::allocate [alloc.rs:330]
#41 [mem-leak] alloc::alloc::Global::alloc_impl [alloc.rs:181]
```
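The innermost frames (`get_fixed_seeds` calling `OnceBox::get_or_init`, then `Box::new`) together with `Total: 1` look like a single lazily initialized global value rather than per-write state. The following is only a sketch of that pattern using the standard library, not ahash's actual code; `SEEDS`, its contents, and this `get_fixed_seeds` are hypothetical stand-ins.

```rust
use std::sync::OnceLock;

// Hypothetical stand-in for a process-wide seed table (assumption, not ahash's real code).
static SEEDS: OnceLock<Box<[u64; 4]>> = OnceLock::new();

fn get_fixed_seeds() -> &'static [u64; 4] {
    // The closure runs at most once. The Box it returns is stored in a static,
    // and Rust does not run destructors for statics at exit, so the heap
    // allocation is never freed.
    SEEDS.get_or_init(|| Box::new([1, 2, 3, 4]))
}

fn main() {
    let first = get_fixed_seeds() as *const [u64; 4];
    for _ in 0..1000 {
        // Every later call hands back the same allocation.
        assert!(std::ptr::eq(first, get_fixed_seeds()));
    }
    println!("one allocation reused across 1001 calls");
}
```

If the reported allocation follows this pattern, the leaked count should stay at 1 no matter how many writers are created; a count that grows with the number of writes would point to a different cause.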
**To Reproduce**
A minimal example:
```rust
use arrow::array::Int64Array;
use arrow::array::ArrayRef;
use arrow::record_batch::RecordBatch;
use std::sync::Arc;
use parquet::arrow::ArrowWriter;

fn main() {
    let col = Arc::new(Int64Array::from_iter_values([1, 2, 3])) as ArrayRef;
    let to_write = RecordBatch::try_from_iter([("col", col)]).unwrap();
    let mut buffer = Vec::new();
    let mut writer = ArrowWriter::try_new(&mut buffer, to_write.schema(), None).unwrap();
    writer.write(&to_write).unwrap();
    writer.close().unwrap();
}
```
The Rust compiler was built from source with `debuginfo-level = 1` set in its config.
The example code was compiled with this profile:
```
[profile.release]
debug = 1
```
The bytehound script:
```
let groups = allocations()
    .only_leaked()
    .group_by_backtrace()
    .sort_by_size();

graph().add(groups).save();

fn analyze_group(list) {
    let list_all = allocations().only_matching_backtraces(list);
    graph()
        .add("Leaked", list_all.only_leaked())
        .add("Temporary", list_all)
        .save();
    println("Total: {}", list_all.len());
    println("Leaked: {}", list_all.only_leaked().len());
    println();
    println("Backtrace:");
    println(list_all[0].backtrace().strip());
}

analyze_group(groups[0]);
```