JiekerTime opened a new issue, #29659: URL: https://github.com/apache/shardingsphere/issues/29659
I've conceptualized a high-availability architecture that I'd like to discuss with the community to see if it's feasible for implementation in SS. If given the green light, I'm keen on taking the lead on this module.

Consider a setup with four shards labeled 1, 2, 3, and 4, distributed across four databases (currently not sharded by table, but that's open to future analysis). The data distribution is as follows:

| Database 1 | Database 2 | Database 3 | Database 4 |
|------------|------------|------------|------------|
| 1 | 2 | 3 | 4 |

We then create redundant copies of the data:

| Database 1 | Database 2 | Database 3 | Database 4 |
|------------|------------|------------|------------|
| 1, 2, 3 | 2, 3, 4 | 3, 4, 1 | 4, 1, 2 |

This redundancy provides high availability: if database 2 fails, traffic for shard 2 can be rerouted to databases 1 and 4, which also hold copies of it, so failover is seamless. It also opens up opportunities for optimizing scenarios that typically require cross-database joins.

In transactional scenarios, changes to shards 1 and 2 would be split into two atomic operations, one per shard, with consistency maintained through two-phase commit (2PC) logic. Each atomic operation would be assigned a globally unique timestamp, ensuring a global order of operations even when updates are executed in parallel across different databases.

Atomic operation 1:
- Update db1 table1
- Update db3 table1
- Update db4 table1

Atomic operation 2:
- Update db1 table2
- Update db2 table2
- Update db4 table2

These atomic operations could be orchestrated using a consensus algorithm. Updates would be queued and ordered by timestamp, with all replicas applying updates in this sequence. We could also potentially integrate MVCC to enhance this system.

However, there are clear downsides:
- Performance could suffer, because each logical write becomes multiple physical writes (one per replica) instead of a single one.
- Resharding becomes more complex, requiring innovative algorithmic solutions to work with the redundancy logic during scaling operations.
- Disabling and re-enabling the high-availability feature would necessitate incremental data synchronization.

If we proceed with this feature, my plan is as follows.

First, we'll develop for MySQL with database sharding:
1. Support the creation of backup tables during table creation.
2. Automatically back up data upon insertion.
3. Use consensus algorithms for transactional scenarios to ensure atomic commits.
4. Implement a globally unique timestamp system for a task waiting queue.
5. Integrate MVCC capabilities.
6. Enable incremental synchronization.
7. Provide resharding support.

Second, we'll support all kinds of sharding.

Finally, we'll extend support for multi-language environments.

Of course, other incremental features could be added as needed.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
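To make the replica layout in the proposal concrete, here is a minimal Python sketch of the placement rule implied by the second table (database `d` holds shards `d`, `d+1`, `d+2`, wrapping around) and of failover routing. The function names `shards_on`, `databases_for`, and `route` are hypothetical and are not part of ShardingSphere; the defaults of four databases and three copies are taken from the example, and any real implementation may differ.

```python
# Illustrative sketch only: names and defaults are assumptions, not ShardingSphere APIs.

def shards_on(db, n=4, copies=3):
    """Shards stored on database `db` (1-based), following the ring layout in the table."""
    return [(db - 1 + k) % n + 1 for k in range(copies)]

def databases_for(shard, n=4, copies=3):
    """Databases that hold a copy of `shard` (the inverse of shards_on)."""
    return sorted((shard - 1 - k) % n + 1 for k in range(copies))

def route(shard, failed=frozenset(), n=4, copies=3):
    """Databases still able to serve `shard` when the databases in `failed` are down."""
    return [db for db in databases_for(shard, n, copies) if db not in failed]

if __name__ == "__main__":
    for db in range(1, 5):
        print(f"Database {db}: shards {shards_on(db)}")
    # Shard 2 lives on databases 1, 2 and 4; with database 2 down, 1 and 4 remain.
    print("Shard 2 with database 2 down:", route(2, failed={2}))
```

Under this rule, `databases_for(1)` and `databases_for(2)` give the same write targets as atomic operations 1 and 2 above.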
