Glad to see everyone's discussion. The compact topic is also a way to
implement KV semantic storage. However, if a compact index is introduced,
then unlike in other messaging products it cannot be used as a log cleaner;
it will only reduce the cost of message replay. If we want the compact topic
to clean up messages the way other messaging products do, the changes to the
storage layer will also be very large.
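
To make the replay-cost point concrete, here is a minimal sketch of a
compacted view that keeps only the newest entry per key. It is not RocketMQ
code: CompactionSketch, Entry, compact and replay are hypothetical names, and
it assumes the entries are already in offset order. Replaying the compacted
view gives the same key-value state as replaying the full log, with fewer
entries to read, while the underlying log itself is left untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: these names are hypothetical and not part of RocketMQ.
final class CompactionSketch {

    // One key-value record in an append-only log.
    static final class Entry {
        final long offset;
        final String key;
        final String value;

        Entry(long offset, String key, String value) {
            this.offset = offset;
            this.key = key;
            this.value = value;
        }
    }

    // Build a compacted view: the newest entry per key, still in offset order.
    // Assumes the input is sorted by offset. The input list is not modified.
    static List<Entry> compact(List<Entry> log) {
        Map<String, Entry> latest = new LinkedHashMap<>();
        for (Entry e : log) {
            // Remove first so the re-inserted key moves to the end of the map.
            latest.remove(e.key);
            latest.put(e.key, e);
        }
        return new ArrayList<>(latest.values());
    }

    // Replay entries in order to rebuild the key-value state.
    static Map<String, String> replay(List<Entry> entries) {
        Map<String, String> state = new HashMap<>();
        for (Entry e : entries) {
            state.put(e.key, e.value);
        }
        return state;
    }

    public static void main(String[] args) {
        List<Entry> log = Arrays.asList(
                new Entry(0, "checkpoint-a", "100"),
                new Entry(1, "checkpoint-b", "7"),
                new Entry(2, "checkpoint-a", "250"));

        List<Entry> compacted = compact(log);
        System.out.println("entries to replay: " + log.size() + " -> " + compacted.size());
        System.out.println("same state: " + replay(log).equals(replay(compacted)));
    }
}
```

Note that compact() only builds a smaller view for replay; turning it into a
real log cleaner would mean deleting the superseded entries from storage,
which is where the large storage-layer changes would come in.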

dongeforever <[email protected]> wrote on Fri, Sep 24, 2021 at 8:41 PM:

> Keep calm.
> Every RocketMQ contributor has the responsibility to maintain and develop
> the architecture.
> Integrating RocksDB has its pros and cons, and so does the compaction (not
> compression) topic.
> It would be better to continue the discussion with everyone laying out the
> details, without emotion.
> It is also OK to hold a meeting about this proposal.
>
> In my opinion, the KV feature is important for RocketMQ; the question is
> in what way to provide it.
>
> vongosling <[email protected]> wrote on Fri, Sep 24, 2021 at 3:58 PM:
>
> > I just wonder: if we do not open a new KV module, can we really not
> > solve the problem you mentioned? We could introduce a compression topic
> > for the last effective replay. Do we know how hard it is to maintain
> > RocksDB? My Facebook friends have talked to me a lot about their various
> > KV products built around RocksDB.
> >
> > Since we open-sourced RocketMQ, I've tried to help keep the architecture
> > as simple as possible, and I've warned you to learn to subtract while
> > adding things. While the code scale is still under control, I think the
> > first thing to really do is to design a good pluggable system that
> > replaces the existing simple SPI and API approach, but unfortunately, I
> > haven't seen any progress in that regard so far.
> >
> > Therefore, I think we should carefully examine our technical solutions.
> > We should not build technical solutions only for streams :-)
> >
> > heng du <[email protected]> wrote on Fri, Sep 24, 2021 at 12:43 PM:
> >
> > > Totally agree with this proposal. KV semantic storage can not only
> > > provide better support for streaming and connect, especially for the
> > > storage of checkpoints, but can also be used to better manage metadata
> > > in the future. At the same time, compared to the compact topic, this
> > > proposal can significantly reduce user replay costs and save failure
> > > recovery time, and KV semantic storage can actually be regarded as
> > > another index similar to CQ, which can be loaded on demand. In
> > > addition, there seems to be an unvoted RIP-22 proposal, but please
> > > don't mind that :)
> > >
> > >
> > > vongosling <[email protected]> wrote on Thu, Sep 23, 2021 at 8:35 PM:
> > >
> > > > Thanks for the clarification. I was confused by RIP-22: it seems
> > > > that number 22 is already taken, right [1]?
> > > >
> > > >
> > > > [1] https://github.com/apache/rocketmq/issues/2937
> > > >
> > > > > Amber Liu <[email protected]> wrote on Thu, Sep 23, 2021 at 3:29 PM:
> > > >
> > > > > Sorry about the format problem, below is the correct one.
> > > > > > RIP-22 Support KV semantic storage
> > > > > >
> > > > > > Status
> > > > > >
> > > > > >    - Current Status: Draft
> > > > > >    - Authors: ltamber <https://github.com/ltamber>
> > > > > >    - Shepherds: duhengforever <[email protected]>
> > > > > >    - Mailing List discussion: [email protected]
> > > > > >    - Pull Request: #PR_NUMBER
> > > > > >    - Released: <released_version>
> > > > >
> > > > > > Background & Motivation
> > > > > >
> > > > > > What do we need to do
> > > > > >
> > > > > >    - Will we add a new module? *no*.
> > > > > >    - Will we add new APIs? *yes*.
> > > > > >    - Will we add a new feature? *yes*.
> > > > >
> > > > > Why should we do that
> > > > >
> > > > > >    - Are there any problems with our current project?
> > > > > >      Currently, we can't get/put key-values from/into RocketMQ, so
> > > > > >      if we use a connector
> > > > > >      <https://github.com/apache/rocketmq-externals> such as
> > > > > >      FileSource or BinlogSource, we can't persist the current read
> > > > > >      position/dump position to RocketMQ and have to use an
> > > > > >      external meta store like ZooKeeper/MySQL instead. Introducing
> > > > > >      another component brings more operational risk. The same
> > > > > >      issue arises in streaming
> > > > > >      <https://github.com/apache/rocketmq-streams> scenarios when
> > > > > >      developers want to persist meta info such as checkpoints.
> > > > > >    - What can we benefit from the proposed changes?
> > > > > >      RocketMQ would not rely on external components such as
> > > > > >      ZooKeeper/etcd to support metadata storage.
> > > > >
> > > > > Goals
> > > > >
> > > > > >    - What problem is this proposal designed to solve?
> > > > > >      Design a distributed, persistent key-value store: an
> > > > > >      application can put a key-value into the broker and get the
> > > > > >      value back later. It can also offer abilities such as
> > > > > >      compareAndSet, prefix get and so on.
> > > > > >    - To what degree should we solve the problem?
> > > > > >      This RIP must guarantee the following points:
> > > > > >      1. High availability: if one broker in the broker group is
> > > > > >         down, applications can put/get key-values through another
> > > > > >         broker; the availability is the same as that of RocketMQ
> > > > > >         messages.
> > > > > >      2. High capacity: the number of key-values may be very large,
> > > > > >         so they cannot all be kept in memory; we must store them
> > > > > >         on disk.
> > > > >
> > > > > Non-Goals
> > > > >
> > > > >    - What problem is this proposal NOT designed to solve?
> > > > >       Nothing specific.
> > > > >    - Are there any limits of this proposal?
> > > > >       Nothing specific.
> > > > >
> > > > > > Changes
> > > > > >
> > > > > > Architecture
> > > > > >
> > > > > > We will introduce RocksDB <https://github.com/facebook/rocksdb> to
> > > > > > persist key-value data; more precisely, we use RocksDB to compact
> > > > > > the values that share the same key. In most cases we will not
> > > > > > enable the WAL in RocksDB, to decrease write amplification;
> > > > > > instead, we can recover the RocksDB state and its consistency by
> > > > > > redoing the RocketMQ commitlog. So the put/get flow shown in the
> > > > > > above figure is:
> > > > > > put: the key-value message is put into the commitlog first, and
> > > > > > then the reputService redoes the commitlog and puts the key-value
> > > > > > into RocksDB asynchronously; the broker does not respond to the
> > > > > > client until this reput has finished.
> > > > > > get: the application gets the key-value from RocksDB through the
> > > > > > broker directly.
> > > > > > In addition, if we don't want to introduce RocksDB
> > > > > > <https://github.com/facebook/rocksdb> and the metadata will not
> > > > > > occupy too much memory, we can also use a key-value store based on
> > > > > > a memory map. A periodic serialization and persistence thread will
> > > > > > guarantee that data is not lost if the broker restarts or the
> > > > > > system shuts down abnormally, and the consistency of the in-memory
> > > > > > state will also be guaranteed by redoing the RocketMQ commitlog.
> > > > > Interface Design/Change
> > > > >
> > > > > >    - Method signature changes. *No*
> > > > > >    - Method behavior changes. *No*
> > > > > >    - CLI command changes. *No*
> > > > > >    - Log format or content changes.
> > > > > >      Two flags will be added to the message properties: kv_opType
> > > > > >      indicates whether the request is a put or a get of a
> > > > > >      key-value, and key carries the request key for both put and
> > > > > >      get operations. To pass the key over the network in the
> > > > > >      request header, we will encode/decode the key (a byte array)
> > > > > >      using the base64
> > > > > >      <https://docs.oracle.com/javase/8/docs/api/java/util/Base64.html>
> > > > > >      encoding method.
> > > > >
> > > > >
> > > > > Compatibility, Deprecation, and Migration Plan
> > > > >
> > > > > >    - Are backward and forward compatibility taken into
> > > > > >      consideration?
> > > > > >      New RequestCode values between client and broker are added,
> > > > > >      so there are 2 compatibility situations:
> > > > > >      1. old client + new broker: old clients won't make requests
> > > > > >         with the key-value flag, so the broker will not receive
> > > > > >         key-value requests, which keeps everything as before.
> > > > > >      2. new client + old broker: new clients will send key-value
> > > > > >         requests, but the broker doesn't recognize the request
> > > > > >         code and will return an error message. So we should
> > > > > >         upgrade the broker first to support this feature.
> > > > > >    - Are there deprecated APIs?
> > > > > >      Nothing specific.
> > > > > >    - How do we do migration?
> > > > > >      Nothing specific.
> > > > >
> > > > > Implementation Outline
> > > > >
> > > > > > We will implement the proposed changes in two phases.
> > > > > > Phase 1
> > > > > >
> > > > > >    1. Implement reput logic from commitlog to RocksDB.
> > > > > >    2. Implement broker support for key-value requests and responses.
> > > > > >    3. Implement client support for key-value requests and responses.
> > > > > >    4. Implement a key-value store using a memory map.
> > > > > >    5. Implement a key-value store using RocksDB.
> > > > > >
> > > > > > Phase 2
> > > > > >
> > > > > >    1. Implement prefix get semantics.
> > > > > >    2. Implement compareAndSet semantics.
> > > > > >    3. Implement RocksDB snapshot export/import.
> > > > >
> > > > >
> > > > > > Amber Liu <[email protected]> wrote on Thu, Sep 23, 2021 at 10:10 AM:
> > > > >
> > > > > > # RIP-22 Support KV semantic storage
> > > > > >
> > > > > > ## Status
> > > > > > - Current Status: Draft
> > > > > > - Authors: [ltamber](https://github.com/ltamber)
> > > > > > - Shepherds: [duhengforever](mailto:[email protected])
> > > > > > - Mailing List discussion: <[email protected]>
> > > > > > - Pull Request: #PR_NUMBER
> > > > > > - Released: <released_version>
> > > > > > ## Background & Motivation
> > > > > > ### what do we need to do
> > > > > > - will we add a new module? **no**.
> > > > > > - will we add new APIs? **yes**.
> > > > > > - will we add new feature? **yes**.
> > > > > > ### Why should we do that
> > > > > > > - Are there any problems with our current project?
> > > > > > >   Currently, we can't get/put key-values from/into RocketMQ, so
> > > > > > >   if we use a
> > > > > > >   [connector](https://github.com/apache/rocketmq-externals) such
> > > > > > >   as FileSource or BinlogSource, we can't persist the current
> > > > > > >   read position/dump position to RocketMQ and have to use an
> > > > > > >   external meta store like ZooKeeper/MySQL instead. Introducing
> > > > > > >   another component brings more operational risk. The same issue
> > > > > > >   arises in
> > > > > > >   [streaming](https://github.com/apache/rocketmq-streams)
> > > > > > >   scenarios when developers want to persist meta info such as
> > > > > > >   checkpoints.
> > > > > > > - What can we benefit from the proposed changes?
> > > > > > >   RocketMQ would not rely on external components such as
> > > > > > >   ZooKeeper/etcd to support metadata storage.
> > > > > > ### Goals
> > > > > > > - What problem is this proposal designed to solve?
> > > > > > >   Design a distributed, persistent key-value store: an
> > > > > > >   application can put a key-value into the broker and get the
> > > > > > >   value back later. It can also offer abilities such as
> > > > > > >   compareAndSet, prefix get and so on.
> > > > > > > - To what degree should we solve the problem?
> > > > > > >   This RIP must guarantee the following points:
> > > > > > >   1. High availability: if one broker in the broker group is
> > > > > > >      down, applications can put/get key-values through another
> > > > > > >      broker; the availability is the same as that of RocketMQ
> > > > > > >      messages.
> > > > > > >   2. High capacity: the number of key-values may be very large,
> > > > > > >      so they cannot all be kept in memory; we must store them on
> > > > > > >      disk.
> > > > > > ### Non-Goals
> > > > > > - What problem is this proposal NOT designed to solve?
> > > > > >    Nothing specific.
> > > > > > - Are there any limits of this proposal?
> > > > > >    Nothing specific.
> > > > > > ## Changes
> > > > > > ### Architecture
> > > > > > > ![struct.png](https://github.com/ltamber/UsefulTools/raw/master/image/struct.png)
> > > > > > > We will introduce [rocksdb](https://github.com/facebook/rocksdb)
> > > > > > > to persist key-value data; more precisely, we use RocksDB to
> > > > > > > compact the values that share the same key. In most cases we
> > > > > > > will not enable the WAL in RocksDB, to decrease write
> > > > > > > amplification; instead, we can recover the RocksDB state and its
> > > > > > > consistency by redoing the RocketMQ commitlog. So the put/get
> > > > > > > flow shown in the above figure is:
> > > > > > > put: the key-value message is put into the commitlog first, and
> > > > > > > then the `reputService` redoes the commitlog and puts the
> > > > > > > key-value into RocksDB asynchronously; the broker does not
> > > > > > > respond to the client until this reput has finished.
> > > > > > > get: the application gets the key-value from RocksDB through the
> > > > > > > broker directly.
> > > > > > > In addition, if we don't want to introduce
> > > > > > > [rocksdb](https://github.com/facebook/rocksdb) and the metadata
> > > > > > > will not occupy too much memory, we can also use a key-value
> > > > > > > store based on a memory map. A periodic serialization and
> > > > > > > persistence thread will guarantee that data is not lost if the
> > > > > > > broker restarts or the system shuts down abnormally, and the
> > > > > > > consistency of the in-memory state will also be guaranteed by
> > > > > > > redoing the RocketMQ commitlog.
> > > > > > ### Interface Design/Change
> > > > > > - Method signature changes. **No**
> > > > > > - Method behavior changes. **No**
> > > > > > - CLI command changes. **No**
> > > > > > > - Log format or content changes.
> > > > > > >   Two flags will be added to the message properties: `kv_opType`
> > > > > > >   indicates whether the request is a put or a get of a
> > > > > > >   key-value, and `key` carries the request key for both put and
> > > > > > >   get operations. To pass the key over the network in the
> > > > > > >   request header, we will encode/decode the key (a byte array)
> > > > > > >   using the
> > > > > > >   [base64](https://docs.oracle.com/javase/8/docs/api/java/util/Base64.html)
> > > > > > >   encoding method.
> > > > > > >   ![serial](https://github.com/ltamber/UsefulTools/raw/master/image/serial.png)
> > > > > > ### Compatibility, Deprecation, and Migration Plan
> > > > > > > - Are backward and forward compatibility taken into
> > > > > > >   consideration?
> > > > > > >   New RequestCode values between client and broker are added, so
> > > > > > >   there are 2 compatibility situations:
> > > > > > >   1. old client + new broker: old clients won't make requests
> > > > > > >      with the key-value flag, so the broker will not receive
> > > > > > >      key-value requests, which keeps everything as before.
> > > > > > >   2. new client + old broker: new clients will send key-value
> > > > > > >      requests, but the broker doesn't recognize the request code
> > > > > > >      and will return an error message. So we should upgrade the
> > > > > > >      broker first to support this feature.
> > > > > > > - Are there deprecated APIs?
> > > > > > >   Nothing specific.
> > > > > > > - How do we do migration?
> > > > > > >   Nothing specific.
> > > > > > ### Implementation Outline
> > > > > > > We will implement the proposed changes in two phases.
> > > > > > > #### Phase 1
> > > > > > > 1. Implement reput logic from commitlog to RocksDB.
> > > > > > > 2. Implement broker support for key-value requests and responses.
> > > > > > > 3. Implement client support for key-value requests and responses.
> > > > > > > 4. Implement a key-value store using a memory map.
> > > > > > > 5. Implement a key-value store using RocksDB.
> > > > > > > #### Phase 2
> > > > > > > 1. Implement prefix get semantics.
> > > > > > > 2. Implement compareAndSet semantics.
> > > > > > > 3. Implement RocksDB snapshot export/import.
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Best Regards :-)
> > > >
> > >
> >
> >
> > --
> > Best Regards :-)
> >
>
