Sergey,
Yes, your understanding is similar to mine.
I created a JIRA ticket for this change:
https://issues.apache.org/jira/browse/IGNITE-12099
Denis
On 23 Aug 2019, 14:27 +0300, Sergey Chugunov wrote:
Alexei, If my understanding is correct (Denis please correct me if I'm
wrong) we'll indeed delay only reqs that touch "dirty" metadata (metadata
with unfinished write to disk).
I don't expect a significant performance impact here, because for now we don't
allow other threads to use "dirty" metadata.
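Sergey's scheme can be sketched as follows. This is a hypothetical simulation, not Ignite's actual classes: each binary type id maps to a future that is completed once the metadata write for that type reaches disk, so only requests touching a still-"dirty" type are delayed.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class DirtyMetadataGate {
    // typeId -> future completed once the metadata write for that type hits disk
    private final Map<Integer, CompletableFuture<Void>> pendingWrites = new ConcurrentHashMap<>();

    // Called when an async metadata write for typeId is scheduled.
    public CompletableFuture<Void> markDirty(int typeId) {
        CompletableFuture<Void> fut = new CompletableFuture<>();
        pendingWrites.put(typeId, fut);
        fut.whenComplete((r, e) -> pendingWrites.remove(typeId, fut));
        return fut; // the writer thread completes it after the disk write
    }

    // Request path: blocks only if this particular type is still "dirty".
    public void awaitClean(int typeId) {
        CompletableFuture<Void> fut = pendingWrites.get(typeId);
        if (fut != null)
            fut.join();
    }

    public static void main(String[] args) {
        DirtyMetadataGate gate = new DirtyMetadataGate();
        CompletableFuture<Void> writeFut = gate.markDirty(42);
        gate.awaitClean(7);       // unrelated type: not delayed
        writeFut.complete(null);  // simulate the disk write finishing
        gate.awaitClean(42);      // now proceeds without blocking
        System.out.println("no unrelated request was blocked");
    }
}
```

This illustrates why requests touching unrelated metadata see no extra latency: they never observe a pending future at all.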
Do I understand correctly that only the requests affected by "dirty" metadata
will be delayed, but not all of them?
Doesn't this check hurt performance? Otherwise ALL requests will be blocked
until some unrelated metadata is written, which is highly undesirable.
Otherwise looks good if performance will not
Alexey,
Making only one node write metadata to disk synchronously is a possible and
easy-to-implement solution, but it still has a few drawbacks:
• Discovery will still be blocked on one node. This is better than blocking all
nodes one by one, but disk write may take indefinite time, so
Denis Mekhanikov,
I think at least one node (the coordinator, for example) should still write
metadata synchronously, to protect against the following scenario:
tx creating new metadata is committed <- all nodes in grid are failed
(powered off) <- async writing to disk is completed
where <- means "happens before"
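A minimal sketch of that safeguard, using hypothetical names rather than Ignite's API: at least one node (e.g. the coordinator) persists new metadata synchronously, so even if every node is powered off before the async writes complete, one durable copy survives.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MetadataWritePolicy {
    private final boolean coordinator;
    private final ExecutorService asyncWriter = Executors.newSingleThreadExecutor();

    public MetadataWritePolicy(boolean coordinator) {
        this.coordinator = coordinator;
    }

    // writeToDisk performs the actual (fsync'ed) metadata write.
    public void onNewMetadata(Runnable writeToDisk) {
        if (coordinator)
            writeToDisk.run();               // synchronous: durable before the tx commits
        else
            asyncWriter.submit(writeToDisk); // async: the coordinator's copy protects us
    }

    public void stop() {
        asyncWriter.shutdown();
    }

    public static void main(String[] args) {
        MetadataWritePolicy policy = new MetadataWritePolicy(true);
        policy.onNewMetadata(() -> System.out.println("metadata written synchronously"));
        policy.stop();
    }
}
```

The design choice here is the usual durability trade-off: one blocking write on one node buys cluster-wide recoverability without blocking discovery on every node.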
Alexey,
I’m not suggesting to duplicate anything.
My point is that the proper fix will be implemented in the relatively distant
future. Why not improve the existing mechanism now instead of waiting for the
proper fix?
If we don’t agree on doing this fix in master, I can do it in a fork and use it
Denis Mekhanikov,
If we are still talking about the "proper" solution, the metastore (I meant
the distributed one, of course) is the way to go.
Its contract is to store cluster-wide metadata in the most efficient way, and
it can have any optimizations for concurrent writing inside.
I'm against creating some
Eduard,
Usages will wait for the metadata to be registered and written to disk. No
races should occur with such a flow.
Or do you have some specific case on your mind?
I agree that using a distributed metastorage would be nice here.
But this way we will kind of move to the previous scheme with
Denis,
How would we deal with races between registration and metadata usages with
such a fast fix?
I believe that we need to move it to the distributed metastorage and await
registration completion if we can't find it (wait for the work in progress).
Discovery shouldn't wait for anything here.
On Tue,
Sergey,
Currently metadata is written to disk sequentially on every node. Only one node
at a time is able to write metadata to its storage.
Slowness accumulates when you add more nodes. The delay required to write one
piece of metadata may not be that big, but if you multiply it by, say, 200, then
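The arithmetic behind this point can be made concrete with assumed numbers; the 50 ms per-node latency below is illustrative, not a measured figure from the thread.

```java
public class SequentialWriteCost {
    // Total registration delay when nodes persist the metadata one after another.
    static long sequentialMs(long perNodeWriteMs, int nodes) {
        return perNodeWriteMs * nodes;
    }

    public static void main(String[] args) {
        long perNodeWriteMs = 50; // assumed fsync latency on a single node
        int nodes = 200;
        // 50 ms per node is barely noticeable, but 200 sequential writes
        // turn one type registration into a multi-second discovery stall.
        System.out.println(sequentialMs(perNodeWriteMs, nodes) + " ms"); // 10000 ms
    }
}
```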
Denis,
Thanks for bringing this issue up; the decision to write binary metadata from
the discovery thread was a really tough one to make.
I don't think that moving metadata to the metastorage is a silver bullet here,
as this approach also has its drawbacks and is not an easy change.
In addition to
>
>> 1. Yes, only on OS failures. In such case data will be received from alive
>> nodes later.
What would the behavior be in the case of one node? I suppose someone could
obtain cache data without being able to unmarshal its schema; what would
happen to grid operability in that case?
>
>> 2. Yes, for walmode=FSYNC
Alexey,
I still don’t completely understand whether, by using the metastore, we are
going to stop using discovery for metadata registration. Could you clarify that point?
Is it going to be a distributed metastore or a local one?
Are there any relevant JIRA tickets for this change?
Denis
> On 14
Denis Mekhanikov,
1. Yes, only on OS failures. In such a case the data will be received from
alive nodes later.
2. Yes, for walmode=FSYNC writes to the metastore will be slow. But such a
mode should not be used if you have more than two nodes in the grid, because
it has a huge impact on performance.
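For reference, the WAL mode Alexei mentions is set in the data storage configuration; a minimal Spring XML fragment (bean class names as in Ignite 2.x):

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <!-- FSYNC: every WAL write reaches disk before the operation returns. -->
            <property name="walMode" value="FSYNC"/>
        </bean>
    </property>
</bean>
```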
Wed, 14 Aug.
Folks,
Thanks for showing interest in this issue!
Alexey,
> I think removing fsync could help to mitigate performance issues with current
> implementation
Is my understanding correct that if we remove fsync, then discovery won't be
blocked, and data will be flushed to disk in the background,
Denis,
Several clarifying questions:
1. Do you have an idea why metadata registration takes so long? Poor
disks? Too much data to write? Contention with disk writes by other
subsystems?
2. Do we need persistent metadata for in-memory caches? Or is that
accidental?
Generally, I think
Alexey, but in this case the customer needs to be informed that a whole-cluster
crash (for example, of a 1-node cluster, on power-off) could lead to partial
data unavailability, and maybe to further index corruption.
1. Why does your metadata take up a substantial size? Maybe a context leak?
2. Could the metadata be compressed?
Denis Mekhanikov,
Currently metadata is fsync'ed on write. This might be the cause of
slow-downs in the case of metadata burst writes.
I think removing fsync could help to mitigate the performance issues with the
current implementation until the proper solution is implemented: moving
metadata to the metastore.
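The trade-off can be shown in plain JDK terms (a sketch, not Ignite's actual writer): the only difference between the blocking behavior and the proposed one is whether the channel is forced to stable storage before returning.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MetadataWriter {
    // fsync = true: the caller waits until the bytes reach stable storage.
    // fsync = false: returns once the bytes are in the OS page cache; after a
    // power failure the node would re-request the metadata from alive nodes.
    static void writeMetadata(Path file, byte[] bytes, boolean fsync) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(bytes));
            if (fsync)
                ch.force(true); // durability barrier: this is the slow part
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("meta", ".bin");
        writeMetadata(tmp, new byte[] {1, 2, 3}, false);
        System.out.println(Files.size(tmp)); // 3
        Files.delete(tmp);
    }
}
```

Dropping `force(true)` removes the disk wait from the discovery thread at the cost of a small window in which an OS-level failure loses the local copy.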
I would also like to mention that marshaller mappings are written to disk even
if persistence is disabled.
So, this issue affects purely in-memory clusters as well.
Denis
> On 13 Aug 2019, at 17:06, Denis Mekhanikov wrote:
Hi!
When persistence is enabled, binary metadata is written to disk upon
registration. Currently it happens in the discovery thread, which makes
processing of related messages very slow.
There are cases when a lot of nodes and slow disks can make every binary type
be registered for several