[ https://issues.apache.org/jira/browse/IGNITE-19105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aleksandr Polovtcev updated IGNITE-19105:
-----------------------------------------
Description:
When reading a stream of data from Meta Storage (e.g. using the
{{MetaStorageManager#range}} method), a Cursor is created on the server side
(see {{MetaStorageListener}}) and stored in memory for resource management
purposes. This approach allows Cursor resources to be replicated, so no new
Cursors need to be created if the leader changes during an ongoing range
operation. On the other hand, it consumes additional resources on Raft
Followers and requires all Cursor management commands to be Write commands,
which means they can be executed only on the Leader.
It is proposed to replace this approach with the following: make Range commands
Read commands and pass the last received key as a parameter. This will allow
reading the next batch of data without re-using a previous cursor; a new cursor
will simply be created instead. On every batch request, a cursor will be opened,
the batch will be collected into a list, and the cursor will be closed. We will
lose RocksDB snapshot isolation (since a new iterator will be created every
time), but this should not be a problem as every request should be bound to a
particular Meta Storage revision.
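The batch-by-key idea can be illustrated with a minimal, self-contained sketch.
It is not the Ignite API: the class {{StatelessRangeSketch}}, the methods
{{readBatch}} and {{readAllKeys}}, and the use of a {{TreeMap}} in place of
RocksDB are all assumptions made for illustration, and revision binding is
omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical sketch of a stateless range read; not the Ignite API. */
public class StatelessRangeSketch {
    /**
     * Reads up to batchSize entries from [keyFrom, keyTo). When
     * lastReceivedKey is non-null, reading resumes strictly after it,
     * so the server keeps no cursor between requests.
     */
    static List<Map.Entry<String, String>> readBatch(
            NavigableMap<String, String> storage,
            String keyFrom,
            String keyTo,
            String lastReceivedKey,
            int batchSize
    ) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }

        // A fresh view ("cursor") is created for every request.
        NavigableMap<String, String> view = lastReceivedKey == null
                ? storage.subMap(keyFrom, true, keyTo, false)
                : storage.subMap(lastReceivedKey, false, keyTo, false);

        // Collect the batch into a list; nothing outlives this call.
        List<Map.Entry<String, String>> batch = new ArrayList<>();
        for (Map.Entry<String, String> e : view.entrySet()) {
            if (batch.size() == batchSize) {
                break;
            }
            batch.add(Map.entry(e.getKey(), e.getValue()));
        }
        return batch;
    }

    /** Client-side loop: passes the last received key with each request. */
    static List<String> readAllKeys(
            NavigableMap<String, String> storage,
            String keyFrom,
            String keyTo,
            int batchSize
    ) {
        List<String> keys = new ArrayList<>();
        String last = null;
        while (true) {
            List<Map.Entry<String, String>> batch =
                    readBatch(storage, keyFrom, keyTo, last, batchSize);
            for (Map.Entry<String, String> e : batch) {
                keys.add(e.getKey());
            }
            // A short batch means the range is exhausted.
            if (batch.size() < batchSize) {
                return keys;
            }
            last = batch.get(batch.size() - 1).getKey();
        }
    }

    public static void main(String[] args) {
        NavigableMap<String, String> storage = new TreeMap<>();
        for (String k : List.of("a", "b", "c", "d", "e")) {
            storage.put(k, "value-" + k);
        }
        System.out.println(readAllKeys(storage, "a", "e", 2));
    }
}
```

Because each request carries everything needed to resume, any node holding the
data could serve it. A real implementation would presumably also pass the Meta
Storage revision with each request so that all batches observe the same state.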
This approach will also help with migrating Meta Storage to the
{{ReplicaService}} mechanism instead of using Raft for reading data, which will
allow reads from any node. However, this should be done in a different ticket.
was:
When reading a stream of data from Meta Storage (e.g. using the
{{MetaStorageManager#range}} method), a Cursor is created on the server side
(see {{MetaStorageListener}}) and stored in memory for resource management
purposes. This approach allows Cursor resources to be replicated, so no new
Cursors need to be created if the leader changes during an ongoing range
operation. On the other hand, it consumes additional resources on Raft
Followers and requires all Cursor management commands to be Write commands,
which means they can be executed only on the Leader.
It is proposed to replace this approach with the following: make Range commands
Read commands and pass the last received key as a parameter. This will allow
reading the next batch of data without re-using a previous cursor; a new cursor
will simply be created instead. On every batch request, a cursor will be opened,
the batch will be collected into a list, and the cursor will be closed. We will
lose RocksDB snapshot isolation (since a new iterator will be created every
time), but this should not be a problem as every request should be bound to a
particular Meta Storage revision.
This approach will also help with migrating Meta Storage to the
{{ReplicaService}} mechanism instead of using Raft for reading data.
> Get rid of Meta Storage cursor management
> -----------------------------------------
>
> Key: IGNITE-19105
> URL: https://issues.apache.org/jira/browse/IGNITE-19105
> Project: Ignite
> Issue Type: Improvement
> Reporter: Aleksandr Polovtcev
> Assignee: Aleksandr Polovtcev
> Priority: Major
> Labels: ignite-3, tech-debt
>
> When reading a stream of data from Meta Storage (e.g. using the
> {{MetaStorageManager#range}} method), a Cursor is created on the server side
> (see {{MetaStorageListener}}) and stored in memory for resource management
> purposes. This approach allows Cursor resources to be replicated, so no new
> Cursors need to be created if the leader changes during an ongoing range
> operation. On the other hand, it consumes additional resources on Raft
> Followers and requires all Cursor management commands to be Write commands,
> which means they can be executed only on the Leader.
> It is proposed to replace this approach with the following: make Range
> commands Read commands and pass the last received key as a parameter. This
> will allow reading the next batch of data without re-using a previous cursor;
> a new cursor will simply be created instead. On every batch request, a cursor
> will be opened, the batch will be collected into a list, and the cursor will
> be closed. We will lose RocksDB snapshot isolation (since a new iterator will
> be created every time), but this should not be a problem as every request
> should be bound to a particular Meta Storage revision.
> This approach will also help with migrating Meta Storage to the
> {{ReplicaService}} mechanism instead of using Raft for reading data, which
> will allow reads from any node. However, this should be done in a different
> ticket.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)