alamb opened a new issue, #7419:
URL: https://github.com/apache/arrow-datafusion/issues/7419

   ### Is your feature request related to a problem or challenge?
   
   
   While trying to test https://github.com/apache/arrow-datafusion/pull/7400 
with `datafusion-cli`, I found that I can't, because `datafusion-cli` 
doesn't have a memory manager enabled.
   
   
   
   ### Describe the solution you'd like
   
   I would like to add two new command line options to `datafusion-cli`:
   1. `-m` / `--memory-limit`: if set, sets the memory pool size limit. 
If unset, no memory pool is used.
   2. `--mem-pool-type=<greedy|fair>` (defaults to `greedy`): specifies the 
pool type, 
[GreedyMemoryPool](https://docs.rs/datafusion/latest/datafusion/execution/memory_pool/struct.GreedyMemoryPool.html#)
 or 
[FairSpillPool](https://docs.rs/datafusion/latest/datafusion/execution/memory_pool/struct.FairSpillPool.html#)
 respectively.
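
As a rough illustration of the second option, the `greedy|fair` value could be parsed into a small enum. This is only a sketch: `PoolType` and its error message are hypothetical names made up here, not existing `datafusion-cli` code, and matching is assumed to be case-sensitive.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the proposed `--mem-pool-type` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum PoolType {
    /// Would select `GreedyMemoryPool` (the proposed default).
    #[default]
    Greedy,
    /// Would select `FairSpillPool`.
    Fair,
}

impl FromStr for PoolType {
    type Err = String;

    /// Accepts exactly "greedy" or "fair" (assumption: case-sensitive).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "greedy" => Ok(PoolType::Greedy),
            "fair" => Ok(PoolType::Fair),
            other => Err(format!(
                "invalid pool type '{other}', expected 'greedy' or 'fair'"
            )),
        }
    }
}
```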
   
   Examples of usage
   
   ```shell
   # memory is not limited
   datafusion-cli -c 'select 1, 2 from foo';
   
   # run query with greedy memory pool set to use 10G
   datafusion-cli --memory-limit 10G -c 'select 1, 2 from foo'; 
   
   # run query with greedy memory pool set to use 10G
   datafusion-cli -m 10G -c 'select 1, 2 from foo'; 
   
   # run query with fair memory pool set to use 10G
   datafusion-cli --mem-pool-type=fair -m 10G -c 'select 1, 2 from foo';
   ```
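The examples pass sizes such as `10G`, but how that string becomes a byte count is not specified here. A minimal sketch, assuming a single `K`, `M` or `G` suffix with binary multiples (1K = 1024 bytes) and a hypothetical helper name `parse_memory_limit`:

```rust
/// Hypothetical helper: parse a size such as "10G" into a number of bytes.
/// Assumptions (not part of the proposal): the value is an unsigned integer
/// followed by exactly one of K, M or G, using binary multiples.
pub fn parse_memory_limit(limit: &str) -> Result<usize, String> {
    let limit = limit.trim();
    // The final character is treated as the unit suffix.
    let unit = limit
        .chars()
        .last()
        .ok_or_else(|| "memory limit is empty".to_string())?;
    let number = &limit[..limit.len() - unit.len_utf8()];
    let multiplier: usize = match unit {
        'K' => 1024,
        'M' => 1024 * 1024,
        'G' => 1024 * 1024 * 1024,
        _ => {
            return Err(format!(
                "unsupported unit '{unit}' in '{limit}', expected K, M or G"
            ))
        }
    };
    let value: usize = number
        .parse()
        .map_err(|_| format!("invalid number '{number}' in '{limit}'"))?;
    // Guard against overflow instead of silently wrapping.
    value
        .checked_mul(multiplier)
        .ok_or_else(|| format!("memory limit '{limit}' overflows usize"))
}
```

Under these assumptions `-m 10G` would resolve to 10 * 1024^3 bytes, which could then be handed to `GreedyMemoryPool::new` or `FairSpillPool::new`, both of which take the pool size in bytes. A bare number without a suffix is rejected in this sketch; whether that should be allowed is an open design choice.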
   
   
   See 
https://docs.rs/datafusion/latest/datafusion/execution/memory_pool/struct.FairSpillPool.html
 for more details
   
   ### Describe alternatives you've considered
   
   I also thought about setting the pools via `SET` commands (like setting the 
target batch size). However, I don't think we should allow changing memory 
limits via SQL, because that is likely not something a multi-tenant system 
would want to permit. Limits should be set up before the session starts or by 
the runtime system, not by the user in SQL.
   
   
   ### Additional context
   
   Since this is well specified and is mostly an exercise in figuring out how 
`datafusion-cli` works, I think this would make a good first project

