wgtmac commented on PR #968:
URL: https://github.com/apache/parquet-mr/pull/968#issuecomment-1322172984

   > > > IMO, switching `ioThreadPool` and `processThreadPool` to the reader 
instance level will make it more flexible.
   > > 
   > > 
   > > I've changed the thread pool so that it is not initialized by default 
but I left them as static members. Ideally, there should be a single IO thread 
pool that handles all the IO for a process and the size of the pool is 
determined by the bandwidth of the underlying storage system. Making them per 
instance is not an issue though. The calling code can decide to set the same 
thread pool for all instances and achieve the same result. Let me update this.
   > > Also, any changes you want to make are fine with me, and the help is 
certainly appreciated!
   > 
   > I'm thinking of merging the thread pools into a single `ioThreadPool` and 
making it settable thru `ParquetReadOptions` (like the allocator is). The work 
being done by the `processThreadPool` is rather small and maybe we can do away 
with it. Adding the pool via `ParquetReadOptions` makes it easier to use with 
`ParquetReader` (used a lot in unit tests). WDYT?
   
   Sorry for my late reply.
   
   Setting the thread pools via `ParquetReadOptions` is a good idea, and it is 
exactly how I want to do away with the static members. Merging 
`ioThreadPool` and `processThreadPool` into a single pool should work as long as 
the tasks in the `processThreadPool` do not block waiting for tasks in the 
`ioThreadPool` to return. I will look into the details later.
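   
   A minimal sketch of why that condition matters, using only 
`java.util.concurrent`. The class and method names (`SharedPoolDemo`, 
`completes`, `runMerged`, `runSeparate`) are hypothetical and are not taken from 
the PR; the pools are sized to one thread purely to make the effect easy to 
see. If a "process" task blocks on an "IO" task submitted to the same bounded 
pool, the IO task may never get a thread:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; none of these names come from parquet-mr.
class SharedPoolDemo {

    // Submits a "process" task that blocks on an "IO" task.
    // Returns true if the process task finishes within the timeout.
    static boolean completes(ExecutorService processPool, ExecutorService ioPool)
            throws Exception {
        Future<Integer> outer = processPool.submit(() -> {
            Future<Integer> io = ioPool.submit(() -> 42); // simulated read
            return io.get(); // holds a worker thread until the IO task is done
        });
        try {
            outer.get(500, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the process task never got its IO result
        }
    }

    // One pool plays both roles: its only thread is held by the process task,
    // so the IO task stays queued behind it.
    static boolean runMerged() throws Exception {
        ExecutorService shared = Executors.newFixedThreadPool(1);
        try {
            return completes(shared, shared);
        } finally {
            shared.shutdownNow(); // interrupts the blocked worker
        }
    }

    // Two pools: the IO task runs on its own thread, so the wait is harmless.
    static boolean runSeparate() throws Exception {
        ExecutorService processPool = Executors.newFixedThreadPool(1);
        ExecutorService ioPool = Executors.newFixedThreadPool(1);
        try {
            return completes(processPool, ioPool);
        } finally {
            processPool.shutdownNow();
            ioPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("merged pool completes:   " + runMerged());
        System.out.println("separate pools complete: " + runSeparate());
    }
}
```

   With a larger shared pool the same starvation can still occur once every 
worker is occupied by a task that is waiting on queued IO work, which is why 
the dependency between the two kinds of tasks needs to be checked first.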
   
   BTW, I don't have permission to update your PR in place, as I am not yet a 
maintainer of the repo. I may need to open a new one by copying what you have 
done here and adding you as a co-author. WDYT? If that sounds good to you, I 
can proceed. @parthchandra 

