comphead commented on issue #5108:
URL:
https://github.com/apache/arrow-datafusion/issues/5108#issuecomment-1409599920
Thanks @tustvold , I ran the test with the memory maximum configured and spilling enabled.
```
#[tokio::test]
async fn test_huge_sort() -> Result<()> {
    let runtime_config = crate::execution::runtime_env::RuntimeConfig::new()
        .with_memory_pool(Arc::new(
            crate::execution::memory_pool::GreedyMemoryPool::new(1024 * 1024 * 1024),
        ))
        .with_disk_manager(
            crate::execution::disk_manager::DiskManagerConfig::new_specified(vec![
                "/Users/a/spill/".into(),
            ]),
        );
    let runtime =
        Arc::new(crate::execution::runtime_env::RuntimeEnv::new(runtime_config).unwrap());
    let ctx = SessionContext::with_config_rt(SessionConfig::new(), runtime);
    ctx.register_parquet(
        "lineitem",
        "/Users/a/lineitem.parquet",
        ParquetReadOptions::default(),
    )
    .await
    .unwrap();
    let sql = "select * from lineitem order by l_shipdate";
    let dataframe = ctx.sql(sql).await.unwrap();
    dataframe.show_limit(10).await?;
    // dataframe.write_parquet("/Users/a/lineitem_sorted.parquet", None).await;
    Ok(())
}
```
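As an aside, a hedged configuration sketch: DataFusion also ships a `FairSpillPool`, which splits the memory budget fairly across spilling consumers and is intended for plans that contain spilling operators such as sort. Swapping it in for the `GreedyMemoryPool` would look roughly like this (the 1 GiB budget and spill path are carried over from the test above; this fragment is untested here):

```rust
use std::sync::Arc;

use crate::execution::disk_manager::DiskManagerConfig;
use crate::execution::memory_pool::FairSpillPool;
use crate::execution::runtime_env::RuntimeConfig;

// Sketch: same budget and spill directory as the test, but with a
// FairSpillPool so spilling operators get a fair share of the pool.
fn spill_runtime_config() -> RuntimeConfig {
    RuntimeConfig::new()
        .with_memory_pool(Arc::new(FairSpillPool::new(1024 * 1024 * 1024)))
        .with_disk_manager(DiskManagerConfig::new_specified(vec![
            "/Users/a/spill/".into(),
        ]))
}
```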
@andygrove It seems the test still consumes only the memory configured for the pool, without exhausting all of the machine's memory:
```
Error: External(ResourcesExhausted("Failed to allocate additional 1419104 bytes for RepartitionExec[3] with 0 bytes already allocated - maximum available is 496736"))
test dataframe::tests::test_huge_sort ... FAILED
```
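For what it's worth, the numbers in that error are consistent with the pool enforcing its limit: the "maximum available" figure is just the pool size minus what other consumers already hold, so the rest of the 1 GiB pool must already be allocated elsewhere (e.g. by the in-progress sort). A minimal sketch of that accounting, assuming a greedy pool; `remaining` is a hypothetical helper for illustration, not a DataFusion API:

```rust
// Hypothetical greedy-pool headroom accounting, for illustration only.
fn remaining(pool_size: u64, allocated_by_others: u64) -> u64 {
    pool_size.saturating_sub(allocated_by_others)
}

fn main() {
    let pool = 1024 * 1024 * 1024u64; // 1 GiB, as configured in the test
    // For "maximum available is 496736" to be reported, other consumers
    // must already hold the rest of the pool:
    let held_by_others = pool - 496736;
    assert_eq!(remaining(pool, held_by_others), 496736);
    // The requested 1419104 bytes exceed that headroom, hence the error.
    assert!(1419104 > remaining(pool, held_by_others));
}
```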
@tustvold However, the DiskManager doesn't spill anything into the configured folder; is that expected?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]