I agree that such an option is hard to explain and will complicate data
storage tuning (which is already not simple).
The problem is that we don't distinguish overflow pages from
non-overflow pages so far. We need to see benchmark results first -
there's a chance that the negative effects will be insignificant and the
option won't be needed at all. Otherwise, we may come up with a
heuristic that minimizes the negative effect, e.g. apply bytewise
comparison only to data pages with only one payload item.
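A rough sketch of how such a heuristic could be wired in, in Java. All names here (CompareHeuristicSketch, resolveDirty, payloadItems) are hypothetical and not taken from the Ignite code base; the comparison itself is passed in as a callback so that it runs only when the heuristic allows it.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of the "single payload item" heuristic; not Ignite code. */
final class CompareHeuristicSketch {
    /**
     * Decides whether a page should end up dirty after write unlock.
     *
     * @param markDirty      flag passed by the caller (page content MAY have changed)
     * @param payloadItems   number of payload items stored on the data page (assumed to be known)
     * @param contentChanged lazy bytewise comparison; invoked only when the heuristic applies
     */
    static boolean resolveDirty(boolean markDirty, int payloadItems, BooleanSupplier contentChanged) {
        if (!markDirty)
            return false; // caller guarantees nothing changed

        if (payloadItems == 1)
            return contentChanged.getAsBoolean(); // pay the comparison cost only here

        return true; // other pages keep today's behaviour: trust the flag
    }
}
```

With this shape, pages holding several items never pay for the copy and compare, so any CPU overhead would be limited to pages matching the heuristic.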
Best Regards,
Ivan Rakov
On 08.10.2018 18:25, Vladimir Ozerov wrote:
Can we use this mode for overflow pages and not use it for normal entries
which fit in a single page?
In general, users try to avoid fine-grained tuning options, as they are very
complex to understand. We should try to avoid any new configuration options.
On Mon, Oct 8, 2018 at 5:51 PM Ivan Rakov <ivan.glu...@gmail.com> wrote:
Huge +1.
The page dirty flag is set in the PageMemoryImpl#writeUnlockPage body.
The caller passes the "markDirty=true" boolean flag if it assumes that
the page content may have changed (the dirty flag will be set even if
the page content remained intact). Instead of this, we can dump the page
content to a thread-local buffer after a successful write lock and
compare it bytewise with the new content on write unlock.
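To make the idea concrete, here is a minimal sketch in Java of the snapshot-and-compare approach. It is not the actual PageMemoryImpl code: the class CompareOnUnlockPage, its methods and the 4096-byte page size are assumptions made for illustration, and the sketch assumes a thread holds at most one page write lock at a time.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical single-page model; not Ignite's PageMemoryImpl. */
final class CompareOnUnlockPage {
    /** Assumed page size, for illustration only. */
    static final int PAGE_SIZE = 4096;

    /** One scratch buffer per thread, reused so that locking does not allocate. */
    private static final ThreadLocal<byte[]> SNAPSHOT =
        ThreadLocal.withInitial(() -> new byte[PAGE_SIZE]);

    /** Heap buffer, so array() exposes the page bytes starting at offset 0. */
    private final ByteBuffer content = ByteBuffer.allocate(PAGE_SIZE);

    private boolean dirty;

    /** After the write lock is taken: copy the current page bytes into the thread-local buffer. */
    ByteBuffer writeLock() {
        System.arraycopy(content.array(), 0, SNAPSHOT.get(), 0, PAGE_SIZE);
        return content;
    }

    /** On write unlock: set the dirty flag only if the caller asked for it AND the bytes differ. */
    void writeUnlock(boolean markDirty) {
        if (markDirty && !Arrays.equals(SNAPSHOT.get(), content.array()))
            dirty = true;
    }

    boolean isDirty() {
        return dirty;
    }
}
```

Rewriting a field with the value it already holds leaves the page clean in this model, while any real change marks it dirty. A real implementation would also have to handle threads that hold write locks on several pages at once, which a single thread-local buffer does not cover.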
I believe this logic should be introduced as a separate data storage
mode, as it has both positive and negative effects.
Positive:
Small updates of large entries will produce far fewer dirty pages. This
can dramatically boost the performance of updates - especially when an
SQL update of a single field is performed over large objects.
Negative:
CPU consumption and latency will increase. We'll need some time to
copy and compare page content. That said, lack of disk IOPS hits us much
more often than lack of CPU - benchmarks will show whether the impact
is perceptible.
Let's file a ticket for this task unless there are any objections.
Best Regards,
Ivan Rakov
On 08.10.2018 16:18, Dmitriy Pavlov wrote:
Hi Igniters,
I'd like to share a case which was implemented in the previous version of
TC Bot. It is a kind of REST responses cache <RestParms, Response>:
Response {
    Long tsRefreshed; // timestamp of the last call to the real service
    List<Build> builds; // a huge list of builds; most of the time it is not changed
}
And it seems the timestamp (ts) offset within the entry's pages is
constant, and it requires 8 bytes. The data in builds will require a
number of pages in the durable memory, probably >10-20 pages.
So if REST (the real service) responds with the same builds content, only
the TS is updated. After that, I did cache.put(restParms, response).
So my question is: will such an update, which affects only 1 field, cause
1 page to be marked dirty, or 20? Judging by the volume of checkpoints, I
feel that we mark all pages as dirty even if the content is not modified.
If so, I would like to suggest a slight change to Ignite: for data pages,
mark as dirty only those pages whose content has actually been modified.
I understand that the previous implementation in the Bot was quite naive
(it has since been changed), but still, what if we check for modifications
with a memory compare before marking? Marking pages dirty currently seems
to cause additional data to be flushed to disk on the next checkpoint.
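To illustrate the case above, the following Java sketch counts how many fixed-size pages actually differ between two images of the same entry. The class and method names are hypothetical, and the 20-page entry with a 4096-byte page size and the timestamp stored at offset 0 are assumptions chosen to match the example, not measurements from Ignite.

```java
/** Hypothetical helper for the example above; not Ignite code. */
final class ChangedPagesSketch {
    /** Counts pages of pageSize bytes whose content differs between two equally sized images. */
    static int changedPages(byte[] before, byte[] after, int pageSize) {
        if (before.length != after.length)
            throw new IllegalArgumentException("images must have equal length");

        int changed = 0;

        for (int off = 0; off < before.length; off += pageSize) {
            int end = Math.min(off + pageSize, before.length);

            // Stop at the first differing byte inside this page.
            for (int i = off; i < end; i++) {
                if (before[i] != after[i]) {
                    changed++;
                    break;
                }
            }
        }

        return changed;
    }

    public static void main(String[] args) {
        byte[] before = new byte[20 * 4096];   // entry spanning 20 pages (assumed)
        byte[] after = before.clone();

        // Only the 8-byte timestamp changes; here it is assumed to sit at offset 0.
        java.nio.ByteBuffer.wrap(after).putLong(0, System.currentTimeMillis());

        System.out.println("pages that differ: " + changedPages(before, after, 4096));
    }
}
```

In this model a timestamp-only update touches a single page, so a compare-before-mark approach would leave the other 19 pages out of the next checkpoint.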
I would appreciate it if Native Persistence experts could help me find the
place in the code where such updates are performed. (Maybe I am missing
something.)
Sincerely,
Dmitriy Pavlov