Maybe we could build openmessaging-kv on top of DLedger, and then have the RocketMQ nameserver use openmessaging-kv to reach Raft consensus.
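
To make the idea concrete, here is a minimal Java sketch. It is not DLedger or openmessaging-kv code. It assumes metadata updates are plain key/value entries in a replicated log and that every nameserver replays that log in order. All names (MetadataKvSketch, Log, KvStateMachine, the route/TopicA key) are hypothetical, and the in-memory Log only stands in for the log that a Raft library would replicate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these types exist in DLedger or openmessaging-kv.
final class MetadataKvSketch {

    /** One log entry; a null value means "delete this key". */
    static final class Entry {
        final String key;
        final String value;

        Entry(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** In-memory stand-in for the log that the consensus layer would replicate. */
    static final class Log {
        private final List<Entry> entries = new ArrayList<>();

        long append(Entry e) {
            entries.add(e);
            return entries.size() - 1; // index of the appended entry
        }

        Entry get(long index) {
            return entries.get((int) index);
        }

        long size() {
            return entries.size();
        }
    }

    /** Deterministic state machine: the same log prefix always yields the same map. */
    static final class KvStateMachine {
        private final Map<String, String> state = new TreeMap<>();
        private long nextIndex = 0; // next log index to apply

        /** Applies every entry that has not been applied yet, in log order. */
        void catchUp(Log log) {
            while (nextIndex < log.size()) {
                Entry e = log.get(nextIndex++);
                if (e.value == null) {
                    state.remove(e.key);
                } else {
                    state.put(e.key, e.value);
                }
            }
        }

        String get(String key) {
            return state.get(key);
        }

        long appliedCount() {
            return nextIndex;
        }
    }

    public static void main(String[] args) {
        Log log = new Log();
        log.append(new Entry("route/TopicA", "broker-a"));
        log.append(new Entry("route/TopicA", "broker-b"));

        // Two "nameservers" replay the same log and end up with the same view.
        KvStateMachine ns1 = new KvStateMachine();
        KvStateMachine ns2 = new KvStateMachine();
        ns1.catchUp(log);
        ns2.catchUp(log);
        System.out.println(ns1.get("route/TopicA").equals(ns2.get("route/TopicA")));
    }
}
```

The point of the sketch is only that the nameservers no longer need every broker to register with each of them: once the log order is agreed on, replaying it gives each nameserver the same routing view.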

On Thu, Apr 30, 2020 at 3:29 PM 金融通 <[email protected]> wrote:

> Hi RocketMQ Community,
>
> I think it is a good choice to start the evolution of the RocketMQ architecture with an upgrade of the metadata management architecture.
>
> Currently, the metadata consistency of RocketMQ is maintained through full connectivity. For example, each broker registers with each nameserver to ensure that all nameservers see the same view of the routing information, and each consumer instance sends heartbeats (carrying subscription information) to the brokers to ensure that all brokers see the same view of the subscription information. However, such consistency maintenance is weak. An unreliable network and delays may cause inconsistent views, which has caused a lot of issues.
>
> On the other hand, since RocketMQ 4.5.0 we have used the Raft protocol (DLedger) to solve the consistency problem of log replication. DLedger is a Raft-based log storage library. When we first designed it, we hoped to apply it to consistent metadata storage. If the metadata of RocketMQ is stored as a log and its consistency is guaranteed by the Raft protocol (DLedger), the issue of metadata consistency will be solved.
>
> So I submitted RIP-18 Metadata management architecture upgrade, which describes the specific plan in more detail. I hope to hear more voices from the community, so please tell me your thoughts by replying to this email or commenting on Google Docs.
>
> Best Regards!
> Rongtong Jin
>
> RIP-18 Metadata management architecture upgrade
> https://docs.google.com/document/d/1hQxlbtlMDwNxyVDGsIIUpDNWwfS6hP0PGKY9-A2KUOA/edit?usp=sharing
>
> > -----Original Message-----
> > From: "Gosling Von" <[email protected]>
> > Sent: 2019-01-30 17:54:47 (Wednesday)
> > To: dev <[email protected]>
> > Cc:
> > Subject: [DISCUSS] Thought of The Evolution of The Next Decade Architecture for RocketMQ
> >
> > Hi,
> >
> > I would like to say happy new year to everyone, especially to the guys from the eastern hemisphere. I think that when you see this topic, you already know what I want to say :-)
> >
> > After more than 6 years of inspection by the community and the market, Apache RocketMQ has been widely used in the field of financial and e-commerce online transactions. Known data has shown that, in China alone, RocketMQ covers more than 40% of the traditional messaging scene. With the globalization of the community in the past two years, this development has spread all over the world. However, through continuous community activities, including technical exchanges with experts from Microsoft, Berkeley, etc., coupled with the emergence of IoT, AI, Blockchain and other scenarios around the world, I began to think about the architecture evolution of RocketMQ. I hope we can make it the data infrastructure of the cloud computing era, so that it can serve better in the next decade.
> >
> > First of all, the overall architecture will adopt the separation of storage and computing and a pluggable architecture. Regarding the separation of storage and computing, I know that this is a controversial topic in the industry. You may also have seen that Twitter gave up their messaging solution EventBus, whose serving and storage layers are decoupled; one of the important reasons given was that it "introduces an additional hop". That's right, usually you don't need so much. But what I want to express here is that the value of separating storage and computing is just like single responsibility in our design patterns: each part has a sharper focus. For example, if the messaging engine is deployed at the edge, we could arrange for computing nodes to be deployed on demand. Because it is a computationally intensive task, we can focus on how to improve computing power and response speed without being concerned about the machine, operation and maintenance costs brought by storage. As another case, RocketMQ storage can be regarded as a kind of time series storage. It provides not only the capacity to store single records but also the capacity for bulk storage, but in any case it is a data-type-independent, sequential append storage. Under this architecture, realizing the current transaction capability still involves some complications, especially when you want to make RocketMQ a one-stop microservice transaction solution. We have already tried this. Known feedback from banks is that they have made some modifications to the storage in their financial systems. For example, when the file storage is replaced with relational, NoSQL or NewSQL storage, the benefits are enhanced maintainability and enhanced transaction processing capabilities. In this sense, we could make a pluggable design in RocketMQ 5.0; by default we will provide storage with the ultimate sequential append capability, which is also the best storage implementation for the disk seek algorithm. But it also brings another question: how to improve the ability to query and process data. Here I want to share another preliminary design idea: we could continue to use a data structure such as Commitlog to store the original data, and then build the index or intermediate aggregation results based on Commitlog. At present, our index structure is not well integrated and utilized, so from here we could continue to modify and optimize the index. In addition, we can use DPDK/SPDK and an atomic increment of the write position (write Pos) to achieve the best lock-free design. Considering the data that has already been committed, this line of work needs to be explored to a large extent, even including cooperation with some other communities and universities. So at this level, I think we could give RocketMQ the capability of separated deployment, while the storage is pluggable and can be replaced as needed.
> >
> > Second, support the OpenMessaging standard. I think many guys have already noticed the new messaging standard drafted by Alibaba, Yahoo and other companies. I am also the chair of this project. In the blueprint of this standard, a very important problem is solved, namely interoperability. This interoperability is not only between different messaging vendors, but also between the upstream and downstream of messaging. To the user, this interoperability shows up as the consistency of the API or the protocol. Although we think that the API is also a kind of protocol, I want to emphasize that the consistency of the protocol has been tried in countless scenes. But so far, I personally have not seen a particularly versatile and simple solution; whether it is AMQP, MQTT, or RSocket, which has recently been recognized by everyone, there is not much innovation at this level. And we want to avoid repetitive innovation. At this time, the API layer standard is particularly important, so RocketMQ 5.0 will focus on supporting OpenMessaging standardization in API testing. For the multi-language future, we hope that through this set of APIs we can completely solve all the problems that you currently encounter with RocketMQ in multiple languages.
> >
> > Natural support for multiple protocols is, I think, also very important. So in 5.0, we could reconstruct the remoting module to provide pluggable transport layer protocol support in the computing node. HTTP2.0 may be our default protocol. On that basis, we also consider integrating TCP-based MQTT and UDP-based CoAP. Of course, we also clearly see that with the gradual popularization of 5G networks, we may have to actively follow up on the needs of the market. Anyway, we could provide a flexible wire protocol extension for when we want to support more concrete domain protocols. This is something we must consider carefully.
> >
> > A lightweight streaming engine based on messaging is a very natural thought. I am also an early explorer of streaming, but the so-called streaming we made in previous years is strictly a pseudo-scene. Why is it a pseudo-scene? Actually, we don't need to deploy a streaming engine. Instead, we could use messaging alone to reach the same effect in most cases. In addition, in the stream computing scenario, messaging and storage are very important, so why don't we let messaging support the scheduling and calculation of task nodes naturally, where our built-in storage can help us better? We only need to provide a lib package, which makes it easy for messaging to have streaming capabilities. As for the subsequent SQL processing, CEP, FaaS, etc., I believe that they are the evolution of this programming model.
> >
> > We have been talking about it before: RocketMQ is a unified messaging platform integrating computing, storage and scheduling. Today I am sharing my rough thoughts on the evolution of the overall architecture of RocketMQ 5.0. I also hope to hear the opinions of the community, including the thoughts of other PMC members and committers. Next, we could call for RIP discussions on the details. I hope more PMC members or committers could act as shepherds of the RIPs, making the landing more reliable in 2019.
> >
> > Best Regards,
> > Von Gosling
