yma11 commented on issue #5884:
URL: 
https://github.com/apache/incubator-gluten/issues/5884#issuecomment-2153715084

   @zhli1142015 @FelixYBW @zhouyuan @zhztheplayer The code changes are 
available in following PRs: 
[Spark](https://github.com/yma11/spark/pull/4/files), 
[Gluten](https://github.com/yma11/gluten/pull/2), 
[Velox](https://github.com/yma11/velox/pull/1), please take a review. Next step 
I will test it in E2E and add some docs for it. Here are some explanations 
about code change:
   1) New files in `shims/common`: Existing memory allocator listeners such as 
`ManagedAllocationListener` live under the `gluten-data` package, and the native 
JNIs live under `backends-velox`. Since I need to call these classes/APIs from 
the injects, I put the new files in `shims/common`.
   2) Late initialization of the file cache: We use `GlutenMemStoreInjects` to 
read the cache configuration and then initialize the cache after the Velox 
backend has been initialized, which ensures the native libraries are loaded.
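The gist of this step is deferred, one-time initialization gated on backend readiness. Below is a minimal sketch of that pattern; the class and method names (`FileCacheInitializer`, `maybeInit`, `setAsyncDataCache`) are illustrative placeholders, not the actual Gluten APIs.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class FileCacheInitializer {
    private final AtomicBoolean initialized = new AtomicBoolean(false);
    private final AtomicInteger initCount = new AtomicInteger(0);

    // Intended to be called from a post-backend-initialization hook, so
    // the native bindings backing the cache are already linked by then.
    public void maybeInit(long cacheSizeBytes, String cachePath) {
        // compareAndSet makes initialization idempotent: repeated calls
        // run the native setup exactly once.
        if (initialized.compareAndSet(false, true)) {
            initCount.incrementAndGet();
            setAsyncDataCache(cacheSizeBytes, cachePath);
        }
    }

    public int initCount() {
        return initCount.get();
    }

    // Placeholder for the real native binding; in Gluten the actual call
    // lives under backends-velox and must not run before library load.
    private void setAsyncDataCache(long size, String path) {
        System.out.println("cache: " + size + " bytes at " + path);
    }
}
```

The atomic guard also keeps the initialization safe if multiple threads reach the hook concurrently.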
   3) Cache size setting: We need to pass a cache size when calling 
`setAsyncDataCache`; using the default `int64_t` max causes a `std::bad_alloc`. 
The size is sensitive because Velox's data cache uses this value to control 
memory allocation. If it is too small, allocation failures happen on the native 
side even though Spark doesn't report them on the Java side. Since we leverage 
the Spark memory manager to control the memory logic, we resolve this conflict 
by giving the AsyncDataCache a large fake size, perhaps the same as the 
off-heap size.
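The sizing rule above can be sketched as a small helper: cap the cache capacity at the off-heap budget Spark already enforces rather than passing `int64_t` max. This is an assumption based on the discussion, not the exact Gluten logic, and the fallback constant is purely illustrative.

```java
public class CacheSizing {
    // Fallback when no off-heap size is configured; illustrative only.
    static final long DEFAULT_CAPACITY = 4L * 1024 * 1024 * 1024; // 4 GiB

    // Long.MAX_VALUE makes Velox's allocator throw std::bad_alloc, while
    // a small value causes native-side allocation failures that Spark's
    // memory manager never sees. Tying the fake size to the off-heap
    // budget keeps Velox's bookkeeping consistent with what Spark grants.
    public static long asyncDataCacheCapacity(long offHeapBytes) {
        return offHeapBytes > 0 ? offHeapBytes : DEFAULT_CAPACITY;
    }
}
```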
   4) The SSD cache doesn't work well in my test: file cache entries easily 
exceed `8M`, which causes a check failure. 
[Issue](https://github.com/facebookincubator/velox/issues/10098) has been filed 
to track this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

