LittleHealth opened a new pull request, #17272:
URL: https://github.com/apache/iotdb/pull/17272

   ## Description
   
   This PR introduces a new consensus implementation, `TRaft`, for Apache IoTDB. It adds partition-aware log replication for time-series ingestion and a backward-compatible extension that lets consensus requests expose their timestamp.
   
   ### Content1: TRaft protocol implementation (leader/follower replication + election comparator)
   
   - Added `TRaftConsensus` and `TRaftServerImpl` as the core runtime for TRaft.
   - Added TRaft-specific data structures:
     - `TRaftLogEntry` (includes `timestamp`, `partitionIndex`, intra-partition 
metadata, Raft index/term)
     - `TRaftFollowerInfo` (tracks per-follower replication partition progress 
and in-flight indices)
     - `TRaftLogStore` (shared persistent log for all followers)
     - `TRaftVoteRequest` / `TRaftVoteResult` (term + TRaft freshness 
dimensions)
   - Implemented TRaft write path:
     - The leader parses the request time and builds a partitioned log entry.
     - Hot path: the leader replicates in-memory entries directly to followers that are in the same active partition.
     - Cold path: other followers catch up from the shared on-disk log, moving to the next partition once the current one is complete.
   - Implemented ACK handling:
     - Per-follower in-flight index cleanup on ACK.
     - Partition transition only after current partition in-flight set is 
drained.
   - Implemented election comparison rule:
     - Term first, then `partitionIndex`, then `currentPartitionIndexCount`.
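
The election comparison rule above can be sketched as a chained comparator. This is a minimal illustration, not the PR's code: the class `ElectionFreshnessSketch`, the holder `Freshness`, and the helper `isAtLeastAsFresh` are hypothetical names, and how the comparison result feeds into granting a vote is an assumption. Only the ordering of the three fields comes from the description.

```java
import java.util.Comparator;

/** Illustrative only: class and method names here are hypothetical, not the PR's API. */
final class ElectionFreshnessSketch {

  /** Stand-in for the freshness dimensions carried by a vote request. */
  static final class Freshness {
    final long term;
    final long partitionIndex;
    final long currentPartitionIndexCount;

    Freshness(long term, long partitionIndex, long currentPartitionIndexCount) {
      this.term = term;
      this.partitionIndex = partitionIndex;
      this.currentPartitionIndexCount = currentPartitionIndexCount;
    }
  }

  /** Term first, then partitionIndex, then currentPartitionIndexCount. */
  static final Comparator<Freshness> ORDER =
      Comparator.comparingLong((Freshness f) -> f.term)
          .thenComparingLong(f -> f.partitionIndex)
          .thenComparingLong(f -> f.currentPartitionIndexCount);

  /** Assumption: a voter accepts a candidate that is at least as fresh as itself. */
  static boolean isAtLeastAsFresh(Freshness candidate, Freshness voter) {
    return ORDER.compare(candidate, voter) >= 0;
  }

  public static void main(String[] args) {
    Freshness voter = new Freshness(3, 7, 120);
    Freshness candidate = new Freshness(3, 8, 5);
    System.out.println(isAtLeastAsFresh(candidate, voter));
  }
}
```

Because the comparison is lexicographic, a lower-priority field only matters when all higher-priority fields are equal: a candidate with a higher term wins regardless of its partition progress.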
   
   Design choice note:
   - Chosen design: single shared persisted log + per-follower progress state.
   - Alternative considered: per-follower log queues.  
     Shared log was selected to reduce memory overhead and avoid duplicate 
persistence.
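
To make the chosen design concrete, here is a small sketch of one shared log plus lightweight per-follower progress, including the rule that a follower moves to the next partition only after its in-flight set is drained. All names (`SharedLogSketch`, `SharedLog`, `FollowerProgress`, `nextToSend`, `onAck`) are hypothetical, the log is an in-memory stand-in for the persisted one, and the sketch assumes entries of the same partition are contiguous in the log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Illustrative only: a simplified model, not the PR's TRaftLogStore or TRaftFollowerInfo. */
final class SharedLogSketch {

  static final class Entry {
    final long index;
    final long partitionIndex;

    Entry(long index, long partitionIndex) {
      this.index = index;
      this.partitionIndex = partitionIndex;
    }
  }

  /** One append-only log shared by every follower (in-memory stand-in for the disk log). */
  static final class SharedLog {
    private final List<Entry> entries = new ArrayList<>();

    Entry append(long partitionIndex) {
      Entry e = new Entry(entries.size() + 1L, partitionIndex);
      entries.add(e);
      return e;
    }

    /** Returns the entry at a 1-based index, or null if it does not exist yet. */
    Entry get(long index) {
      return index >= 1 && index <= entries.size() ? entries.get((int) index - 1) : null;
    }
  }

  /** Per-follower state holds only positions and in-flight indices, never entry copies. */
  static final class FollowerProgress {
    long currentPartition = 0;
    long nextIndex = 1;
    final TreeSet<Long> inFlight = new TreeSet<>();

    /** Next entry to send, or null if caught up or waiting for the partition to drain. */
    Entry nextToSend(SharedLog log) {
      Entry e = log.get(nextIndex);
      if (e == null) {
        return null; // follower has caught up with the shared log
      }
      if (e.partitionIndex != currentPartition) {
        if (!inFlight.isEmpty()) {
          return null; // current partition still has unacknowledged entries
        }
        currentPartition = e.partitionIndex; // partition transition
      }
      inFlight.add(e.index);
      nextIndex++;
      return e;
    }

    /** Removes the acknowledged index from the in-flight set. */
    void onAck(long index) {
      inFlight.remove(index);
    }
  }
}
```

With this shape, adding a follower costs one small progress object rather than another copy of the queue, which is the memory argument made in the note above.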
   
   ### Content2: `IConsensusRequest` time capability extension and write-plan compatibility
   
   - Extended `IConsensusRequest` with:
     - `hasTime()` (default `false`)
     - `getTime()` (default throws `UnsupportedOperationException`)
   - This keeps backward compatibility for non-time requests while allowing TRaft to read timestamps through the top-level interface.
   - Added/updated time behavior on time-carrying write nodes (including 
recursive subclasses):
     - `InsertRowNode`, `InsertRowsNode`, `InsertTabletNode`
     - `InsertRowsOfOneDeviceNode`, `InsertMultiTabletsNode`
     - `RelationalInsertRowNode`, `RelationalInsertRowsNode`, 
`RelationalInsertTabletNode`
     - `PipeEnrichedInsertNode` delegates `hasTime()` / `getTime()`
   - Fixed previously unimplemented min-time behavior:
     - `InsertRowsOfOneDeviceNode#getMinTime()`
     - `InsertMultiTabletsNode#getMinTime()`
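
For readers unfamiliar with the composite insert nodes, the min-time behavior can be pictured as taking the minimum over child nodes. The sketch below is an assumption about the general shape, not the actual fix: `TimedInsert`, `RowInsert`, `TabletInsert`, and `MultiInsert` are hypothetical stand-ins for the real node classes, and returning `Long.MAX_VALUE` for an empty node is an illustrative choice.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: simplified stand-ins for the insert node hierarchy. */
final class MinTimeSketch {

  interface TimedInsert {
    long getMinTime();
  }

  /** A single-row insert has exactly one timestamp. */
  static final class RowInsert implements TimedInsert {
    private final long time;

    RowInsert(long time) {
      this.time = time;
    }

    @Override
    public long getMinTime() {
      return time;
    }
  }

  /** A tablet carries many timestamps; the minimum is the smallest of them. */
  static final class TabletInsert implements TimedInsert {
    private final long[] times;

    TabletInsert(long[] times) {
      this.times = times;
    }

    @Override
    public long getMinTime() {
      // Assumption: an empty tablet reports Long.MAX_VALUE.
      return Arrays.stream(times).min().orElse(Long.MAX_VALUE);
    }
  }

  /** A composite node delegates to its children and keeps the smallest value. */
  static final class MultiInsert implements TimedInsert {
    private final List<TimedInsert> children;

    MultiInsert(List<TimedInsert> children) {
      this.children = children;
    }

    @Override
    public long getMinTime() {
      long min = Long.MAX_VALUE;
      for (TimedInsert child : children) {
        min = Math.min(min, child.getMinTime());
      }
      return min;
    }
  }
}
```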
   
   Design choice note:
   - Chosen API: capability check + default fallback (`hasTime`/`getTime`).
   - Alternative considered: changing `getTime()` return type to 
`OptionalLong`.  
     Rejected due to signature conflict with existing `InsertRowNode#getTime()` 
(`long`) and broad compatibility impact.
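
The capability-check pattern described in this note could look roughly like the following. This is a sketch under assumptions: `Request` stands in for `IConsensusRequest`, and `ConfigRequest`, `RowInsert`, `Enriched`, and `timeOrDefault` are hypothetical names used only to show the default fallback, an overriding implementation, and a delegating wrapper.

```java
/** Illustrative only: a reduced model of the hasTime()/getTime() capability check. */
final class TimeCapabilitySketch {

  /** Stand-in for the top-level request interface with safe defaults. */
  interface Request {
    default boolean hasTime() {
      return false;
    }

    default long getTime() {
      throw new UnsupportedOperationException("request carries no time");
    }
  }

  /** A request without a timestamp needs no changes; it inherits the defaults. */
  static final class ConfigRequest implements Request {}

  /** A time-carrying request overrides both methods and keeps the long return type. */
  static final class RowInsert implements Request {
    private final long time;

    RowInsert(long time) {
      this.time = time;
    }

    @Override
    public boolean hasTime() {
      return true;
    }

    @Override
    public long getTime() {
      return time;
    }
  }

  /** A wrapper forwards both calls to the request it wraps. */
  static final class Enriched implements Request {
    private final Request delegate;

    Enriched(Request delegate) {
      this.delegate = delegate;
    }

    @Override
    public boolean hasTime() {
      return delegate.hasTime();
    }

    @Override
    public long getTime() {
      return delegate.getTime();
    }
  }

  /** Callers check the capability first, so the exception path is never hit. */
  static long timeOrDefault(Request request, long fallback) {
    return request.hasTime() ? request.getTime() : fallback;
  }
}
```

Existing implementations compile unchanged because both methods have defaults, and the primitive `long` return type avoids the signature clash that an `OptionalLong` return type would introduce.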
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
     - [ ] concurrent read
     - [x] concurrent write
     - [x] concurrent read and write
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods.
   - [ ] added or updated version, __license__, or notice information
   - [x] added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for code coverage.
   - [ ] added integration tests.
   - [x] been tested in a test IoTDB cluster.
   
   <hr>
   
   ##### Key changed/added classes (or packages if there are too many classes) in this PR
   
   - `iotdb-core/consensus/src/main/java/org/apache/iotdb/consensus/traft/`
     - `TRaftConsensus`
     - `TRaftServerImpl`
     - `TRaftLogEntry`
     - `TRaftFollowerInfo`
     - `TRaftLogStore`
     - `TRaftNodeRegistry`
     - `TRaftRequestParser`
     - `TRaftVoteRequest`
     - `TRaftVoteResult`
     - `TRaftRole`
   - `iotdb-core/consensus/src/main/java/org/apache/iotdb/consensus/common/request/`
     - `IConsensusRequest`
     - `IndexedConsensusRequest`
     - `BatchIndexedConsensusRequest`
     - `DeserializedBatchIndexedConsensusRequest`
   - `iotdb-core/datanode/src/main/java/org/apache/iotdb/db/queryengine/plan/planner/plan/node/write/`
     - `InsertNode`
     - `InsertRowNode`
     - `InsertRowsNode`
     - `InsertRowsOfOneDeviceNode`
     - `InsertTabletNode`
     - `InsertMultiTabletsNode`
     - `RelationalInsertRowNode`
     - `RelationalInsertRowsNode`
     - `RelationalInsertTabletNode`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
