[
https://issues.apache.org/jira/browse/IGNITE-18474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Lapin updated IGNITE-18474:
-------------------------------------
Description:
It is counterintuitive, but at the moment a usual {{select * from table where pk
= 1}} query produces even more raft WriteCommands than an insert query.
The reason is that we have numberOfPartitions*TxCleanupCommand + FinishTxCommand
for each select query. With the default number of partitions (25) and a 1-node
installation we will have 26 synced writes to RocksDB per query. Together with
https://issues.apache.org/jira/browse/IGNITE-18475 it increases our latency
10 times for a simple select by primary key query.
Possible solution:
- Detect that a transaction contains only select queries and use the read-only
path for this type of query
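The arithmetic above can be checked with a small sketch. The class and method names are hypothetical and not part of Ignite; the sketch only models one TxCleanupCommand per partition plus a single FinishTxCommand, as the description states.

```java
/** Hypothetical model of the write-command count described above; not Ignite code. */
final class RaftWriteCommandCount {
    /** One TxCleanupCommand per partition plus a single FinishTxCommand. */
    static int writeCommandsPerSelect(int numberOfPartitions) {
        return numberOfPartitions + 1;
    }

    public static void main(String[] args) {
        // 25 is the default number of partitions mentioned in the description.
        System.out.println(writeCommandsPerSelect(25)); // prints 26
    }
}
```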
h3. Implementation Notes
The core idea is to extend the tx meta with information about whether data
modification was involved during the transaction. In code it means that
we should check ReplicaRequest.requestType (some casts are involved, because
requestType is actually a method of SingleRowReplicaRequest and
MultipleRowReplicaRequest) inside both enlistInTx methods and set some sort of
boolean InternalTransaction#dataModificationIntent to true. Please pay attention
that this should be partition-specific, or in other words primary replica
specific, state, meaning that each enlisted partition/primaryReplica will have
its own dataModificationIntent flag.
These flags should be propagated along with TxFinishReplicaRequest and
txCleanupReplicaRequest in order to adjust
PartitionReplicaListener#processTxCleanupAction, meaning that it's not necessary
to replicate txCleanupCommand if there were no data modifications involved.
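A minimal sketch of the per-partition flag idea follows. It is an illustration under assumptions, not the Ignite implementation: the class, the RequestKind enum and the method names are all hypothetical, and enlist(...) only stands in for enlistInTx. It shows a flag kept per enlisted partition that is raised once a data-modifying request is seen, so that cleanup would be replicated only for partitions that were actually modified.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of per-partition modification tracking; not the Ignite API. */
final class DataModificationIntentSketch {
    /** Assumed request kinds; the real request types in Ignite may differ. */
    enum RequestKind {
        GET(false), INSERT(true), DELETE(true);

        final boolean modifiesData;

        RequestKind(boolean modifiesData) {
            this.modifiesData = modifiesData;
        }
    }

    /** Flag per enlisted partition: true once any data-modifying request was seen. */
    private final Map<Integer, Boolean> dataModificationIntent = new ConcurrentHashMap<>();

    /** Stand-in for enlistInTx: records the partition and raises its flag on a write. */
    void enlist(int partitionId, RequestKind kind) {
        dataModificationIntent.merge(partitionId, kind.modifiesData, Boolean::logicalOr);
    }

    /** Partitions for which a cleanup command would still have to be replicated. */
    List<Integer> partitionsNeedingCleanupReplication() {
        List<Integer> result = new ArrayList<>();
        dataModificationIntent.forEach((partitionId, modified) -> {
            if (modified) {
                result.add(partitionId);
            }
        });
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        DataModificationIntentSketch tx = new DataModificationIntentSketch();
        tx.enlist(0, RequestKind.GET);    // read only, flag stays false
        tx.enlist(1, RequestKind.GET);
        tx.enlist(1, RequestKind.INSERT); // flag for partition 1 becomes true
        tx.enlist(2, RequestKind.DELETE); // flag for partition 2 becomes true
        tx.enlist(2, RequestKind.GET);    // a later read does not reset the flag
        System.out.println(tx.partitionsNeedingCleanupReplication()); // prints [1, 2]
    }
}
```

In this sketch a transaction that only reads ends up with an empty list, which corresponds to skipping txCleanupCommand replication entirely.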
was:
It is counterintuitive, but at the moment a usual {{select * from table where pk
= 1}} query produces even more raft WriteCommands than an insert query.
The reason is that we have numberOfPartitions*TxCleanupCommand + FinishTxCommand
for each select query. With the default number of partitions (25) and a 1-node
installation we will have 26 synced writes to RocksDB per query. Together with
https://issues.apache.org/jira/browse/IGNITE-18475 it increases our latency
10 times for a simple select by primary key query.
Possible solution:
- Detect that a transaction contains only select queries and use the read-only
path for this type of query
> Read sql queries has significant number of RAFT write commands
> --------------------------------------------------------------
>
> Key: IGNITE-18474
> URL: https://issues.apache.org/jira/browse/IGNITE-18474
> Project: Ignite
> Issue Type: Task
> Reporter: Kirill Gusakov
> Priority: Major
>
> It is counterintuitive, but at the moment a usual {{select * from table where
> pk = 1}} query produces even more raft WriteCommands than an insert query.
> The reason is that we have numberOfPartitions*TxCleanupCommand +
> FinishTxCommand for each select query. With the default number of partitions
> (25) and a 1-node installation we will have 26 synced writes to RocksDB per
> query. Together with https://issues.apache.org/jira/browse/IGNITE-18475 it
> increases our latency 10 times for a simple select by primary key query.
>
> Possible solution:
> - Detect that a transaction contains only select queries and use the
> read-only path for this type of query
> h3. Implementation Notes
> The core idea is to extend the tx meta with information about whether data
> modification was involved during the transaction. In code it means that we
> should check ReplicaRequest.requestType (some casts are involved, because
> requestType is actually a method of SingleRowReplicaRequest and
> MultipleRowReplicaRequest) inside both enlistInTx methods and set some sort
> of boolean InternalTransaction#dataModificationIntent to true. Please pay
> attention that this should be partition-specific, or in other words primary
> replica specific, state, meaning that each enlisted partition/primaryReplica
> will have its own dataModificationIntent flag.
> These flags should be propagated along with TxFinishReplicaRequest and
> txCleanupReplicaRequest in order to adjust
> PartitionReplicaListener#processTxCleanupAction, meaning that it's not
> necessary to replicate txCleanupCommand if there were no data modifications
> involved.