Hi,

Thanks for the valuable feedback.
I have drafted a PR [1] that makes the RocksDB dependency optional via a
Maven profile.

[1] https://github.com/apache/amoro/pull/3018

Best,
Jinsong

On Wed, Jul 10, 2024 at 11:33 AM Qishang Zhong <zhongqish...@gmail.com>
wrote:

> Thanks for starting this discussion.
>
> In some extreme cases, RocksDB is still needed to ensure stability under
> low-memory conditions, even though it is slower.
>
> > 1. The first one is to add a Maven profile related to RocksDB, allowing
> > users to manually use this profile to build the project and enable this
> > feature when needed.
>
> +1, it is a good idea to preserve the user's ability to choose.
>
> > 2. The second method is to provide a bundled package for RocksDB,
> > allowing it to be dynamically added at runtime.
>
> Will it be downloaded over the Internet? This is an unstable factor.
>
> Best,
> Qishang Zhong
>
> On Tue, Jul 9, 2024 at 2:37 PM Xavier Bai <x...@apache.org> wrote:
>
> > +1 for option 1
> >
> > On Tue, Jul 9, 2024 at 2:25 PM Jinsong Zhou <jinsongz...@apache.org> wrote:
> >
> > > Hi,
> > >
> > > Thanks for the input from xuba. Yes, indeed, at this stage we may still
> > > need a way for users to add RocksDB support when needed. However, we
> > > can consider removing it from the default installation package.
> > >
> > > In my opinion, there are two possible methods:
> > > 1. The first one is to add a Maven profile related to RocksDB, allowing
> > > users to manually use this profile to build the project and enable this
> > > feature when needed.
> > > 2. The second method is to provide a bundled package for RocksDB,
> > > allowing it to be dynamically added at runtime.
> > >
> > > Method one is much easier to implement, so we should implement it
> > > first and add method two later if it is needed.
> > >
> > > What do you think?
> > >
> > > Best,
> > > Jinsong
> > >
> > > On Tue, Jul 9, 2024 at 11:09 AM Xavier Bai <x...@apache.org> wrote:
> > >
> > > > Many optimizers in PROD environments still have RocksDB storage
> > > > enabled. Removing the dependency from the project is acceptable, but
> > > > we should also document what users need to do if they want to keep
> > > > using the feature. For example, we could support users adding the
> > > > dependency themselves.
> > > >
> > > > On Mon, Jul 8, 2024 at 5:36 PM Jinsong Zhou <jinsongz...@apache.org> wrote:
> > > >
> > > > > Hi devs,
> > > > >
> > > > > Recently, I have been working on reducing the size of the Amoro
> > > > > installation package. Considering the Amoro installation package is
> > > > > almost 1GB in size, this task really should be done ASAP.
> > > > >
> > > > > I found that Amoro's largest dependency is the RocksDB lib (more
> > > > > than 50MB). It is used to cache some data to disk when memory is
> > > > > insufficient. It was originally used to cache Iceberg delete records
> > > > > in optimizers, but now that we have improved delete-record caching
> > > > > with a Bloom filter, this feature is not really needed anymore.
> > > > >
> > > > > So I am considering removing the RocksDB dependency from the
> > > > > project to reduce the installation package size.
> > > > >
> > > > > I look forward to hearing any thoughts on this issue.
> > > > >
> > > > > Best regards,
> > > > > Jinsong
> > > > >
> > > >
> > >
> >
>
>
> --
> Best Regards,
> Qishang Zhong
>
