Re: High priority TCP discovery messages

2019-01-30 Thread Anton Vinogradov
Denis,

>> The node availability check is based on the fact that it receives fresh
>> metrics once in metricsUpdateFreq ms.

I see the problem here.
Node availability should be checked using some fast and small ping
message instead of a huge and slow metrics message.



Re: High priority TCP discovery messages

2019-01-30 Thread Denis Mekhanikov
Yakov,

> You can put a hard limit and process an enqueued MetricsUpdate message
> if the last one of the kind was processed more than metricsUpdFreq
> milliseconds ago.
Makes sense. I'll try implementing it.

> I would suggest we allow queue overflow for 1 min, but if the situation
> does not go back to normal, then the node should fire a special event
> and then kill itself.
Let's start with a warning in the log and see how it correlates with
network/GC problems.
I'd like to make sure we don't kill innocents.

Anton,

> Maybe a better option is to have a special "discovery-like" channel (with
> a ring or an analog) for metrics-like messages
I don't think that creating another data channel is reasonable. It will
require additional network connections and a more complex configuration.
But splitting pings and metrics into different types of messages, as it was
before, and moving metrics distribution to communication makes sense to me.
Some kind of a gossip protocol could be used for it.

> Anyway, why are we fighting duplicates inside the queue instead of
> fighting the creation of a new message while the previous one is not yet
> processed on the cluster?

A situation when multiple metrics update messages exist in the cluster is
normal.
The node availability check is based on the fact that a node receives fresh
metrics once in metricsUpdateFreq ms.
If you make the coordinator wait for a previous metrics update message to be
delivered before issuing a new one, then this frequency will depend on the
number of nodes in the cluster, since the time of one round trip will differ
on different topologies.

Alex,

I didn't check it yet. Theoretically, nodes will fail a bit more often
when their discovery worker queues are flooded with messages.
This change definitely requires extensive testing.

I think you can make metrics update messages have a regular priority
separately from fixing the issue that I described.

Denis



Re: High priority TCP discovery messages

2019-01-29 Thread Alexey Goncharuk
Folks,

Did we already check that omitting heartbeat priority does not break
discovery? I am currently working on another issue with discovery, and
skipping heartbeat priority would help a lot in my case.

--AG



Re: High priority TCP discovery messages

2019-01-11 Thread Yakov Zhdanov
> How big the message worker's queue may grow until it becomes a problem?

Denis, you never know. Imagine a node flooded with messages because of
increased timeouts and network problems. I remember cases with hundreds of
messages in the queue on large topologies. Please, no O(n) approaches =)

> So, we may never come to a point, when an actual
TcpDiscoveryMetricsUpdateMessage is processed.

Good catch! You can put a hard limit and process an enqueued MetricsUpdate
message if the last one of the kind was processed more than metricsUpdFreq
milliseconds ago.
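The hard-limit idea above could look roughly like this (a sketch only; the class and field names are illustrative, not the actual Ignite code):

```java
/**
 * Sketch of the hard-limit idea: a queued metrics update is processed
 * only if the previous one of the same kind was processed more than
 * metricsUpdateFreq milliseconds ago; otherwise it is skipped.
 */
class MetricsThrottle {
    private final long metricsUpdateFreq; // Minimal gap between processed metrics messages, ms.

    // Initialized far in the past so the very first message is processed;
    // MIN_VALUE / 2 avoids overflow in the subtraction below.
    private long lastProcessed = Long.MIN_VALUE / 2;

    MetricsThrottle(long metricsUpdateFreq) {
        this.metricsUpdateFreq = metricsUpdateFreq;
    }

    /** @return {@code true} if a metrics message dequeued at time {@code now} should be processed. */
    boolean shouldProcess(long now) {
        if (now - lastProcessed >= metricsUpdateFreq) {
            lastProcessed = now;

            return true;
        }

        return false; // Skip: the last one of the kind was processed too recently.
    }
}
```

The check is O(1) per dequeued message, so it avoids the queue traversals objected to above.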

Denis, also note: the initial problem is message queue growth. When we
choose to skip messages, it means that the node cannot process certain
messages and is most probably experiencing problems. We need to think of
killing such nodes. I would suggest we allow queue overflow for 1 min, but
if the situation does not go back to normal, then the node should fire a
special event and then kill itself. Thoughts?

--Yakov


Re: High priority TCP discovery messages

2019-01-11 Thread Anton Vinogradov
Denis,

+1 to the idea of getting rid of the priority.

Discovery should handle only the messages used to update cluster state and
topology.
They are all small and have high priority.

There is no real reason to use discovery for metrics and similar "read-only"
features.
Maybe a better option is to have a special "discovery-like" channel (with a
ring or an analog) for metrics-like messages, which will not affect priority
features like PME?


Anyway, why are we fighting duplicates inside the queue instead of fighting
the creation of a new message while the previous one is not yet processed on
the cluster?



Re: High priority TCP discovery messages

2019-01-11 Thread Denis Mekhanikov
I like the idea of making all messages be processed with equal priority.
It will make nodes with overgrown discovery message queues die more often,
though. But maybe this is how it's supposed to work.

Denis



Re: High priority TCP discovery messages

2019-01-11 Thread Denis Mekhanikov
Yakov,

Sounds good. But there is a flaw in the procedure that you described.
If we have a TcpDiscoveryMetricsUpdateMessage in the queue and a newer one
arrives, then we will consider the existing one obsolete and won't process
it. The newest metrics update message will be moved to the queue's tail,
thus delaying the moment when it will be processed.
So, we may never come to a point when an actual
TcpDiscoveryMetricsUpdateMessage is processed.
But if we replace the existing message with the newer one and keep the old
position in the queue, then this approach will work. It will require more
complex synchronization, though, so it will still lead to some overhead.
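The "replace in place" variant could be sketched as follows (simplified, single-threaded types; the real worker queue would need the synchronization mentioned above, and the message types here are illustrative):

```java
import java.util.LinkedList;
import java.util.ListIterator;

/**
 * Sketch: when a newer metrics update arrives, it overwrites the already
 * queued one at its original position instead of being appended, so the
 * metrics message keeps its place and cannot be starved at the tail.
 */
class WorkerQueue {
    record Msg(String type, long id) {}

    private final LinkedList<Msg> queue = new LinkedList<>();

    void add(Msg msg) {
        if ("METRICS".equals(msg.type())) {
            ListIterator<Msg> it = queue.listIterator();

            while (it.hasNext()) {
                if ("METRICS".equals(it.next().type())) {
                    it.set(msg); // Replace the stale copy, keep its position.

                    return;
                }
            }
        }

        queue.addLast(msg); // Regular message, or no metrics message queued yet.
    }

    Msg poll() {
        return queue.poll();
    }
}
```

Note that the scan in `add` is still the linear-time cost debated in this thread; keeping a direct reference to the queued metrics message would make the replacement O(1).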

How big may the message worker's queue grow before it becomes a problem? If
it's 20 elements max, then a linear-time check is not that bad.
BTW, the RingMessageWorker#addMessage method checks for duplicate messages in
the queue for some message types, including all custom discovery messages,
which is done in linear time.

Denis



Re: High priority TCP discovery messages

2019-01-10 Thread Yakov Zhdanov
Denis, what if we remove the priority difference for messages and always add
new ones to the end of the queue?

As for traversing the queue - I don't like O(n) approaches =). So, with
adding all messages to the end of the queue (removing the priority
difference), I would suggest that we save the latest 1st-lap message and the
latest 2nd-lap message, and process metrics messages in the message worker
thread in queue order if they are the latest, skipping them otherwise.
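The "latest per lap" bookkeeping could be sketched like this (illustrative names and a numeric message id; the real messages carry their own ids):

```java
/**
 * Sketch: all messages go to the tail (no priority). The worker remembers
 * the newest metrics message id seen for each of the two laps and, on
 * dequeue, processes a metrics message only if it is still the newest one
 * for its lap. Skipping is O(1) - no queue traversal.
 */
class LapFilter {
    private final long[] latestPerLap = {-1, -1}; // Index 0: lap 1, index 1: lap 2.

    /** Record a metrics message for {@code lap} (1 or 2) with {@code id} being enqueued. */
    void onEnqueue(int lap, long id) {
        latestPerLap[lap - 1] = Math.max(latestPerLap[lap - 1], id);
    }

    /** @return {@code true} if the dequeued message is still the latest for its lap. */
    boolean shouldProcess(int lap, long id) {
        return latestPerLap[lap - 1] == id;
    }
}
```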

Does this make sense?

--Yakov


Re: High priority TCP discovery messages

2019-01-10 Thread Anton Vinogradov
Denis,

Correct me in case I got it all wrong.
Since metrics are scheduled, a possible situation is to have more than one
TcpDiscoveryMetricsUpdateMessage inside the queue due to slow processing or
some other reasons.

The proposed solution looks like a fix that hides the real problem.

My proposal is:
1) Write a warning message in case a second metrics message is added to the
queue.
BTW, it looks like not only metrics messages should be checked.
Do we have other scheduled messages?

2) Start a new metrics journey as scheduled, but only in case the previous
journey has finished.
And, again, write a warning message in case the journey was not started.
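The journey-gating part of this proposal could be sketched on the coordinator side like this (names and the warning counter are illustrative assumptions, not the actual Ignite API):

```java
/**
 * Sketch: the coordinator starts a new metrics "journey" on its schedule
 * only if the previous one has completed the ring; otherwise it skips the
 * tick and records a warning instead of piling messages into the queue.
 */
class MetricsScheduler {
    private long inFlightId = -1; // Id of the journey currently on the ring, -1 if none.
    private long nextId;
    private int warnings;

    /** Scheduled tick: @return the id of the started journey, or -1 if skipped. */
    long onTick() {
        if (inFlightId != -1) {
            warnings++; // Previous journey not finished: warn, do not start a new one.

            return -1;
        }

        inFlightId = nextId++;

        return inFlightId;
    }

    /** Called when a journey returns to the coordinator, completing the ring. */
    void onFinished(long id) {
        if (id == inFlightId)
            inFlightId = -1;
    }

    int warnings() {
        return warnings;
    }
}
```

As noted above in the thread, gating the coordinator this way ties the effective metrics frequency to the ring's round-trip time, which is the trade-off Denis objects to.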

On Thu, Jan 10, 2019 at 6:18 PM Denis Mekhanikov 
wrote:

> A bit of background:
> When TcpDiscoverySpi is used, TcpDiscoveryMetricsUpdateMessage is sent by a
> coordinator once in metricsUpdateFrequency, which is 2 seconds by default.
> It serves as a ping message, which ensures that the ring is connected and
> all nodes are alive. These messages have a high priority, i.e., they are
> put into the head of the RingMessageWorker's queue instead of its tail.
>
> Now consider a situation when a single link between two nodes in the ring
> works slowly. It may receive and deliver all messages, but with a certain
> delay. This situation is possible when the network is unstable or one of
> the nodes experiences a lot of short GC pauses.
> It leads to a growing message queue on the node that stands before the
> slow link. The worst part is that if high-priority messages are generated
> faster than they are processed, then other messages won't even get a
> chance to be processed. Thus, no nodes will be kicked out of the cluster,
> but no useful progress will happen. Partition map exchange may hang for
> this reason, and the reason won't be obvious from the logs.
>
> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-10808
> I made a draft of the fix for this problem:
> https://github.com/apache/ignite/pull/5771
> The PR also contains a test, that reproduces this situation.
>
> The patch makes sure that only a single TcpDiscoveryMetricsUpdateMessage
> of one kind is stored in the queue, i.e., if a newer message comes to the
> node, then the old one is discarded. It also ensures that regular messages
> are processed: if the last processed message had a high priority, then a
> new high-priority message won't be put at the head of the queue, but at
> the tail instead.
> The fix addresses the described problem, but increases the utilization of
> the discovery threads a bit.
>
> What do you think of this fix? Do you have better ideas on how to improve
> the heartbeat mechanism to avoid such situations?
>
> Denis
>