Hi,

Thanks for the valuable feedback. I have drafted a PR [1] to make the RocksDB dependency optional via a Maven profile.

[1] https://github.com/apache/amoro/pull/3018

Best,
Jinsong

On Wed, Jul 10, 2024 at 11:33 AM Qishang Zhong <zhongqish...@gmail.com> wrote:

> Thanks for starting this discussion.
>
> For some extreme cases, RocksDB is still needed to ensure stability under
> low memory conditions, even though it is slower.
>
> > 1. The first one is to add a Maven profile related to RocksDB, allowing
> > users to manually use this profile to build the project and enable this
> > feature when needed.
>
> +1, it is a good idea to preserve the user's ability to choose.
>
> > 2. The second method is to provide a bundled package for RocksDB,
> > allowing it to be dynamically added at runtime.
>
> Will it be downloaded over the Internet? This is an unstable factor.
>
> Best,
> Qishang Zhong
>
> On Tue, Jul 9, 2024 at 2:37 PM Xavier Bai <x...@apache.org> wrote:
>
> > +1 for option 1
> >
> > On Tue, Jul 9, 2024 at 2:25 PM Jinsong Zhou <jinsongz...@apache.org> wrote:
> >
> > > Hi,
> > >
> > > Thanks for the input from xuba. Yes, indeed, at this stage we may still
> > > need some way to allow users to add support for RocksDB when needed.
> > > However, we can consider removing it from the default installation
> > > package.
> > >
> > > In my opinion, there are two possible methods:
> > > 1. The first one is to add a Maven profile related to RocksDB, allowing
> > > users to manually use this profile to build the project and enable this
> > > feature when needed.
> > > 2. The second method is to provide a bundled package for RocksDB,
> > > allowing it to be dynamically added at runtime.
> > >
> > > Method one is much easier to implement, so we should implement it first
> > > and implement method two later if it is needed.
> > >
> > > What do you think?
> > >
> > > Best,
> > > Jinsong
> > >
> > > On Tue, Jul 9, 2024 at 11:09 AM Xavier Bai <x...@apache.org> wrote:
> > >
> > > > There are still many optimisers in PROD environments that have RocksDB
> > > > storage enabled. Removing dependencies from the project is acceptable,
> > > > but we should also provide documentation describing what to do if users
> > > > want to continue using the feature. For example, there could be support
> > > > for users to add dependencies individually, etc.
> > > >
> > > > On Mon, Jul 8, 2024 at 5:36 PM Jinsong Zhou <jinsongz...@apache.org> wrote:
> > > >
> > > > > Hi devs,
> > > > >
> > > > > Recently, I have been working on reducing the size of the Amoro
> > > > > installation package. Considering the Amoro installation package is
> > > > > almost 1GB in size, this task really should be done ASAP.
> > > > >
> > > > > I found that the largest dependency of Amoro is the RocksDB lib (more
> > > > > than 50MB). It is used to cache some data to disk storage when there
> > > > > is not enough memory. It was originally used to cache Iceberg delete
> > > > > records in optimizers. But now that we have improved delete-record
> > > > > caching with a Bloom filter, this feature is no longer really needed.
> > > > >
> > > > > So I am considering removing the RocksDB dependencies from the project
> > > > > to reduce the installation package size.
> > > > >
> > > > > I am looking forward to hearing any thoughts from anyone regarding
> > > > > this issue.
> > > > >
> > > > > Best regards,
> > > > > Jinsong
>
> --
> Best Regards,
> Qishang Zhong