Re: Replacing NodeFilter functionality with label approach

2019-08-01 Thread Ilya Kasnacheev
Hello!

I think this is a good idea. We already had problems with ClusterGroups
that wouldn't recompute until PME, or that became invalid after PME. Relying
on string labels would fix all of that.

I can think of a node filter that can't be replaced with a label filter
(e.g. one checking for the presence of some partition), but generally such
filters are a Bad Idea.
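
For illustration, such a dynamic filter could look like the sketch below
(cache names are hypothetical; setNodeFilter and Affinity#primaryPartitions
are existing public API). A static label clearly cannot express runtime state
like this:

{code:java}
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;

public class PartitionFilterSketch {
    public static void main(String[] args) {
        CacheConfiguration<Integer, String> ccfg = new CacheConfiguration<>("myCache");

        // Deploy "myCache" only on nodes that hold at least one primary
        // partition of another cache: runtime state, not a static attribute.
        ccfg.setNodeFilter(node ->
            Ignition.ignite().affinity("otherCache").primaryPartitions(node).length > 0);
    }
}
{code}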

Regards,
-- 
Ilya Kasnacheev


Thu, 1 Aug 2019 at 18:47, Pavel Kovalenko :

> Hello Igniters,
>
> I would like to start a discussion about simplifying the NodeFilter
> functionality.
> At the moment, NodeFilters are used to control the distribution of Caches
> and Services across nodes.
> In most cases, a NodeFilter implementation looks for a specific attribute
> in the NodeAttributes map. If the attribute is found on a node, the Cache
> or Service is deployed on that node.
> However, the current NodeFilter interface gives the user many more ways to
> adjust such distribution. On the one hand this gives more configuration
> flexibility, but it also makes the API more complex and harder to
> understand.
> Because NodeFilter is a functional interface configured on the user side,
> there are problems with serialization, class loading, and consistency
> checking of such objects.
> Here are a couple of the problems we have with NodeFilter:
> 1. User-defined node filter classes must be deployed on all nodes, whether
> or not they are required there. This makes it harder to resolve problems
> like IGNITE-1903.
> 2. Part of the consistency check of CacheConfigurations is based on
> comparing NodeFilter classes, not objects. A user may use the same class
> for a NodeFilter but with different constructor parameters, which can lead
> to inconsistent behavior of the same node filter on different nodes while
> the consistency check still passes.
> 3. We could resolve p.2 with an object-equality approach, but we can't
> force users to properly implement the .equals() method on a NodeFilter
> class. We can only recommend doing so. If the user forgot to implement
> .equals() or did it incorrectly, there is nothing we can do about it.
> All of these problems can lead to cluster instability and unpredictable
> behavior.
>
> Instead of continuing to use NodeFilter, we can provide a safer and
> simpler way to implement the same feature. I propose the following
> approach, which is used in many other distributed systems:
> The user may tag every Ignite node configuration with a specific label
> that is placed in the NodeAttributes map.
> The NodeFilter interface is replaced with just a string label. If a node's
> NodeAttributes map contains that label, a Cache or Service is deployed on
> that node; if the label is absent, it is not deployed.
>
> I would like to add this change to the Ignite 3.0 scope because it's an
> important and major change to the public API.
>
> WDYT?
>


Re: Batch write to data pages

2019-08-01 Thread Pavel Pereslegin
Igniters,

Maxim Muzafarov and Anton Vinogradov have reviewed my PR [1] for the task
described above [2].

Dmitriy Govorukhin, Alexey Goncharuk, please take a look at these changes.

Igniters, please join the review. Your feedback is really appreciated.

[1] https://github.com/apache/ignite/pull/6364
[2] https://issues.apache.org/jira/browse/IGNITE-11584

Fri, 14 Jun 2019 at 17:37, Maxim Muzafarov :
>
> Pavel,
>
> Thank you for your efforts!
>
> I'll take a look, shortly.
>
> On Fri, 14 Jun 2019 at 15:53, Pavel Pereslegin  wrote:
> >
> > Hello Igniters!
> >
> > I'm working on implementing batch updates in PageMemory [1] to improve
> > the performance of batch operations. Details can be found in IEP-32
> > [2].
> >
> > As the first step, I have prepared PR [3] with the implementation of
> > batch insertion of several data rows into the free list [4].
> >
> > Performance is increased by reducing the workload on the free list:
> > after acquiring a memory page, we can write several data rows before
> > putting the page back into the free list. This approach is used in data
> > preloading. The preloader cannot lock multiple cache entries at once due
> > to a possible deadlock with concurrent batch updates, so it pre-creates a
> > batch of data rows in the page memory and then initializes the cache
> > entries sequentially, one by one.
> >
> > Can someone review these changes?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-7935
> > [2] 
> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-32+Batch+updates+in+PageMemory
> > [3] https://github.com/apache/ignite/pull/6364
> > [4] https://issues.apache.org/jira/browse/IGNITE-11584
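
To illustrate the idea from the quoted message, here is a minimal sketch in
which one free-list access is amortized over several rows written to the same
page. All types below are hypothetical stand-ins, not the actual PageMemory
internals:

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Sketch of batch insertion; Page and the free list are hypothetical stand-ins. */
class BatchInsertSketch {
    static final int PAGE_SIZE = 4096;

    /** Stand-in for a data page with some free space. */
    static class Page {
        int free = PAGE_SIZE;

        boolean fits(byte[] row) { return row.length <= free; }

        void write(byte[] row) { free -= row.length; }
    }

    final Deque<Page> freeList = new ArrayDeque<>();

    /** Packs several rows into a page before putting the page into the free list. */
    void insertBatch(List<byte[]> rows) {
        int i = 0;

        while (i < rows.size()) {
            if (rows.get(i).length > PAGE_SIZE)
                throw new IllegalArgumentException("Row fragmentation is out of scope here.");

            Page page = new Page(); // Acquire a page once per chunk of rows.

            while (i < rows.size() && page.fits(rows.get(i)))
                page.write(rows.get(i++)); // Several rows go into the same page...

            freeList.push(page); // ...but the free list is updated only once.
        }
    }
}
{code}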


Replacing NodeFilter functionality with label approach

2019-08-01 Thread Pavel Kovalenko
Hello Igniters,

I would like to start a discussion about simplifying the NodeFilter
functionality.
At the moment, NodeFilters are used to control the distribution of Caches
and Services across nodes.
In most cases, a NodeFilter implementation looks for a specific attribute
in the NodeAttributes map. If the attribute is found on a node, the Cache
or Service is deployed on that node.
However, the current NodeFilter interface gives the user many more ways to
adjust such distribution. On the one hand this gives more configuration
flexibility, but it also makes the API more complex and harder to
understand.
Because NodeFilter is a functional interface configured on the user side,
there are problems with serialization, class loading, and consistency
checking of such objects.
Here are a couple of the problems we have with NodeFilter:
1. User-defined node filter classes must be deployed on all nodes, whether
or not they are required there. This makes it harder to resolve problems
like IGNITE-1903.
2. Part of the consistency check of CacheConfigurations is based on
comparing NodeFilter classes, not objects. A user may use the same class
for a NodeFilter but with different constructor parameters, which can lead
to inconsistent behavior of the same node filter on different nodes while
the consistency check still passes.
3. We could resolve p.2 with an object-equality approach, but we can't
force users to properly implement the .equals() method on a NodeFilter
class. We can only recommend doing so. If the user forgot to implement
.equals() or did it incorrectly, there is nothing we can do about it.
All of these problems can lead to cluster instability and unpredictable
behavior.

Instead of continuing to use NodeFilter, we can provide a safer and
simpler way to implement the same feature. I propose the following
approach, which is used in many other distributed systems:
The user may tag every Ignite node configuration with a specific label
that is placed in the NodeAttributes map.
The NodeFilter interface is replaced with just a string label. If a node's
NodeAttributes map contains that label, a Cache or Service is deployed on
that node; if the label is absent, it is not deployed.
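
For illustration, here is a minimal sketch of the difference. The
attribute-based filter uses the existing public API; setNodeLabel is a
hypothetical name for the proposed replacement:

{code:java}
import java.util.Collections;

import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class LabelVsFilterSketch {
    public static void main(String[] args) {
        // Today: the node is tagged through a user attribute...
        IgniteConfiguration cfg = new IgniteConfiguration()
            .setUserAttributes(Collections.singletonMap("cache.label", "storage"));

        // ...and the cache carries a user-supplied predicate that must be
        // deployed on every node and checked for consistency.
        CacheConfiguration<Integer, String> ccfg = new CacheConfiguration<>("myCache");
        ccfg.setNodeFilter(node -> "storage".equals(node.attribute("cache.label")));

        // Proposed (setNodeLabel is a hypothetical name): only a string is
        // stored, so there is nothing to deploy, serialize, or compare.
        // ccfg.setNodeLabel("storage");
    }
}
{code}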

I would like to add this change to the Ignite 3.0 scope because it's an
important and major change to the public API.

WDYT?


Re: Partition loss event

2019-08-01 Thread Ilya Kasnacheev
Dear fellows!

I think we have a problem: when events were introduced, we were talking
about high-bandwidth events which may flood your nodes if you
accidentally turn them on.

However, now we have a bunch of low-bandwidth events, such as:
EVT_CHECKPOINT_SAVED
EVT_CHECKPOINT_LOADED
EVT_CHECKPOINT_REMOVED
EVT_NODE_JOINED
EVT_NODE_LEFT
EVT_NODE_FAILED
EVT_NODE_SEGMENTED
EVT_CACHE_REBALANCE_STARTED
EVT_CACHE_REBALANCE_STOPPED
EVT_CACHE_REBALANCE_PART_LOADED
EVT_CACHE_REBALANCE_PART_UNLOADED
EVT_CACHE_REBALANCE_OBJECT_LOADED
EVT_CACHE_REBALANCE_OBJECT_UNLOADED
EVT_CACHE_REBALANCE_PART_DATA_LOST
EVT_CACHE_REBALANCE_PART_SUPPLIED
EVT_CACHE_REBALANCE_PART_MISSED
EVT_CLIENT_NODE_DISCONNECTED
EVT_CLIENT_NODE_RECONNECTED
EVT_WAL_SEGMENT_ARCHIVED
EVT_WAL_SEGMENT_COMPACTED
EVT_CLUSTER_ACTIVATED
EVT_CLUSTER_DEACTIVATED
EVT_PAGE_REPLACEMENT_STARTED

I suggest we enable these events by default: I fail to see how this could
ever cause problems, and it would definitely decrease the confusion
surrounding events.
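
For context, today each of these event types must be listed explicitly in the
node configuration before it fires. A minimal sketch using the existing
public API:

{code:java}
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.EventType;

public class EnableEventsSketch {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // None of the low-bandwidth events above fire unless enabled like this.
        cfg.setIncludeEventTypes(
            EventType.EVT_NODE_JOINED,
            EventType.EVT_NODE_LEFT,
            EventType.EVT_CLUSTER_ACTIVATED);

        try (Ignite ignite = Ignition.start(cfg)) {
            // The node now records these events for listeners to consume.
        }
    }
}
{code}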

WDYT?

Regards,
-- 
Ilya Kasnacheev


Thu, 1 Aug 2019 at 15:18, balazspeterfi :

> Hi Alexandr,
>
> Thanks, that was the missing part. It would be nice to mention it in the
> docs, I guess, as it's quite easy to miss.
>
> Regards,
> Balazs
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


[jira] [Created] (IGNITE-12033) .Net callbacks from striped pool due to async/await may hang cluster

2019-08-01 Thread Ilya Kasnacheev (JIRA)
Ilya Kasnacheev created IGNITE-12033:


 Summary: .Net callbacks from striped pool due to async/await may 
hang cluster
 Key: IGNITE-12033
 URL: https://issues.apache.org/jira/browse/IGNITE-12033
 Project: Ignite
  Issue Type: Bug
  Components: cache, platforms
Affects Versions: 2.7.5
Reporter: Ilya Kasnacheev


http://apache-ignite-users.70518.x6.nabble.com/Replace-or-Put-after-PutAsync-causes-Ignite-to-hang-td27871.html#a28051

There's a reproducer project. Long story short, .NET can invoke cache
operations with future callbacks, which are invoked from the striped pool. If
such a callback performs another cache operation, that operation may be
scheduled to the same stripe and cause a deadlock.

The code is very simple:

{code}
Console.WriteLine("PutAsync");
await cache.PutAsync(1, "Test"); // continuation resumes on a striped-pool thread

Console.WriteLine("Replace");
cache.Replace(1, "Testing"); // Hangs here: may be scheduled to the same stripe

Console.WriteLine("Wait");
await Task.Delay(Timeout.Infinite);
{code}

async/await should absolutely not allow any client code to be run from stripes.





[jira] [Created] (IGNITE-12032) Server node prints exception when ODBC driver disconnects

2019-08-01 Thread Evgenii Zhuravlev (JIRA)
Evgenii Zhuravlev created IGNITE-12032:
--

 Summary: Server node prints exception when ODBC driver disconnects
 Key: IGNITE-12032
 URL: https://issues.apache.org/jira/browse/IGNITE-12032
 Project: Ignite
  Issue Type: Bug
  Components: odbc
Affects Versions: 2.7.5
Reporter: Evgenii Zhuravlev


Whenever a process using an ODBC client finishes, the node prints this
exception in its logs:

{code:java}
[07:45:19,559][SEVERE][grid-nio-worker-client-listener-1-#30][ClientListenerProcessor]
Failed to process selector key [ses=GridSelectorNioSessionImpl
[worker=ByteBufferNioClientWorker [readBuf=java.nio.HeapByteBuffer[pos=0 lim=8192 cap=8192],
super=AbstractNioClientWorker [idx=1, bytesRcvd=0, bytesSent=0, bytesRcvd0=0, bytesSent0=0,
select=true, super=GridWorker [name=grid-nio-worker-client-listener-1,
igniteInstanceName=null, finished=false, heartbeatTs=1564289118230, hashCode=1829856117,
interrupted=false, runner=grid-nio-worker-client-listener-1-#30]]], writeBuf=null,
readBuf=null, inRecovery=null, outRecovery=null, super=GridNioSessionImpl
[locAddr=/0:0:0:0:0:0:0:1:10800, rmtAddr=/0:0:0:0:0:0:0:1:63697, createTime=1564289116225,
closeTime=0, bytesSent=1346, bytesRcvd=588, bytesSent0=0, bytesRcvd0=0,
sndSchedTime=1564289116235, lastSndTime=1564289116235, lastRcvTime=1564289116235,
readsPaused=false, filterChain=FilterChain[filters=[GridNioAsyncNotifyFilter,
GridNioCodecFilter [parser=ClientListenerBufferedParser, directMode=false]],
accepted=true, markedForClose=false]]]
java.io.IOException: An existing connection was forcibly closed by the remote host
        at sun.nio.ch.SocketDispatcher.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
        at org.apache.ignite.internal.util.nio.GridNioServer$ByteBufferNioClientWorker.processRead(GridNioServer.java:1104)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2389)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2156)
        at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1797)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
        at java.lang.Thread.run(Thread.java:748)
{code}

It's absolutely normal behavior when an ODBC client disconnects from the node,
so we shouldn't print an exception in the log. We should replace it with
something like an INFO message about the ODBC client disconnection.
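
A minimal sketch of the suggested handling (a hypothetical helper, not the
actual ClientListenerProcessor code; the message-matching heuristic is an
assumption):

{code:java}
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of the proposed logging change. */
class ClientDisconnectLogging {
    static final Logger log = Logger.getLogger("ClientListenerProcessor");

    static void onSelectorError(String rmtAddr, IOException e) {
        if (isRemoteClose(e))
            log.info("Client disconnected: " + rmtAddr); // INFO instead of SEVERE
        else
            log.log(Level.SEVERE, "Failed to process selector key", e);
    }

    /** Heuristic for "normal" disconnects like the one in the description. */
    static boolean isRemoteClose(IOException e) {
        String msg = e.getMessage();
        return msg != null
            && (msg.contains("forcibly closed") || msg.contains("Connection reset"));
    }
}
{code}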

Thread from user list: 
http://apache-ignite-users.70518.x6.nabble.com/exceptions-in-Ignite-node-when-a-thin-client-process-ends-td28970.html





[jira] [Created] (IGNITE-12031) Node freezes and login is not possible; after a few days, stopping and starting the service gives the message in the description

2019-08-01 Thread Yaser Mohammad Abushaip (JIRA)
Yaser Mohammad Abushaip created IGNITE-12031:


 Summary: Node freezes and login is not possible; after a few days,
stopping and starting the service gives the message in the description
 Key: IGNITE-12031
 URL: https://issues.apache.org/jira/browse/IGNITE-12031
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.7.5
 Environment: Linux 7.5
Reporter: Yaser Mohammad Abushaip
 Fix For: None


During startup I receive the below error

Failed to wait for partition map exchange
[topVer=AffinityTopologyVersion [topVer=2, minorTopVer=1],
node=adfsdfdsfxx]. Dumping pending objects that might be the cause





[jira] [Created] (IGNITE-12030) Node freezes and login is not possible; after a few days, stopping and starting the service gives the message in the description

2019-08-01 Thread Yaser Mohammad Abushaip (JIRA)
Yaser Mohammad Abushaip created IGNITE-12030:


 Summary: Node freezes and login is not possible; after a few days,
stopping and starting the service gives the message in the description
 Key: IGNITE-12030
 URL: https://issues.apache.org/jira/browse/IGNITE-12030
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.7.5
 Environment: Linux 7.5
Reporter: Yaser Mohammad Abushaip
 Fix For: None


During startup I receive the below error

Failed to wait for partition map exchange
[topVer=AffinityTopologyVersion [topVer=2, minorTopVer=1],
node=adfsdfdsfxx]. Dumping pending objects that might be the cause





Re: Threadpools and .WithExecute() for C# clients

2019-08-01 Thread Pavel Tupitsyn
Most probably - yes
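
For reference, a sketch of the Java-side API that IGNITE-6566 would port to
.NET (ExecutorConfiguration and IgniteCompute#withExecutor already exist on
the Java side; the pool name and size below are arbitrary):

{code:java}
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.ExecutorConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class CustomExecutorSketch {
    public static void main(String[] args) {
        // A dedicated pool keeps these jobs from starving the public pool.
        IgniteConfiguration cfg = new IgniteConfiguration()
            .setExecutorConfiguration(new ExecutorConfiguration("callbackPool").setSize(8));

        try (Ignite ignite = Ignition.start(cfg)) {
            ignite.compute().withExecutor("callbackPool")
                .broadcast(() -> System.out.println("Runs in callbackPool"));
        }
    }
}
{code}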

On Fri, Jul 26, 2019 at 1:36 AM Raymond Wilson 
wrote:

> Thanks Pavel!
>
> Does the priority on the Jira ticket suggest this will target IA 2.8?
>
> Thanks,
> Raymond.
>
> On Wed, Jul 24, 2019 at 8:21 PM Pavel Tupitsyn 
> wrote:
>
>> Denis, yes, looks like a simple thing to add.
>>
>> On Tue, Jul 23, 2019 at 10:38 PM Denis Magda  wrote:
>>
>>> Looping in the dev list.
>>>
>>> Pavel, Igor and other C# maintainers, this looks like a valuable
>>> extension of our C# APIs. Shouldn't this be a quick addition to Ignite?
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Mon, Jul 22, 2019 at 3:22 PM Raymond Wilson <
>>> raymond_wil...@trimble.com>
>>> wrote:
>>>
>>> > Alexandr,
>>> >
>>> > If .WithExecute is not planned to be made available in the C# client,
>>> > what is the plan to support custom thread pools from the C# side of
>>> > things?
>>> >
>>> > Thanks,
>>> > Raymond.
>>> >
>>> >
>>> > On Thu, Jul 18, 2019 at 9:28 AM Raymond Wilson <
>>> raymond_wil...@trimble.com>
>>> > wrote:
>>> >
>>> >> The source of inbound requests into Server A is client
>>> >> applications.
>>> >>
>>> >> Server B is really a cluster of servers that are performing clustered
>>> >> transformations and computations across a data set.
>>> >>
>>> >> I originally used IComputeJob and similar functions which work very
>>> well
>>> >> but have the restriction that they return the entire result set from a
>>> >> Server B node in a single response. These result sets can be large
>>> (100's
>>> >> of megabytes and larger), which makes life pretty hard for Server A
>>> if it
>>> >> has to field multiple incoming responses of this size. So, these
>>> types of
>>> >> requests progressively send responses back (using Ignite messaging) to
>>> >> Server A using the Ignite messaging fabric. As Server A receives each
>>> part
>>> >> of the overall response it processes it according the business rules
>>> >> relevant to the request.
>>> >>
>>> >> The cluster config and numbers of nodes are not really material to
>>> this.
>>> >>
>>> >> Raymond.
>>> >>
>>> >> On Thu, Jul 18, 2019 at 12:26 AM Alexandr Shapkin 
>>> >> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>>
>>> >>>
>>> >>> Can you share a more detailed use case, please?
>>> >>>
>>> >>>
>>> >>>
>>> >>> Right now it's not clear why you need a messaging fabric.
>>> >>>
>>> >>> If you are interested in progress tracking, you could try the Cache
>>> >>> API or a ContinuousQuery, for example.
>>> >>>
>>> >>>
>>> >>>
>>> >>> What are the sources of inbound requests? Are they client requests?
>>> >>>
>>> >>>
>>> >>>
>>> >>> What is your cluster config? How many nodes do you have for your
>>> >>> distributed computations?
>>> >>>
>>> >>>
>>> >>>
>>> >>> From: Raymond Wilson 
>>> >>> Sent: Wednesday, July 17, 2019 1:49 PM
>>> >>> To: user 
>>> >>> Subject: Re: Threadpools and .WithExecute() for C# clients
>>> >>>
>>> >>>
>>> >>>
>>> >>> Hi Alexandr,
>>> >>>
>>> >>>
>>> >>>
>>> >>> To summarise from the original thread, say I have server A that
>>> >>> accepts requests. It contacts server B in order to help process those
>>> >>> requests. Server B sends in-progress results to server A using the
>>> >>> Ignite messaging fabric. If the thread pool in server A is saturated
>>> >>> with inbound requests, then there are no available threads to service
>>> >>> the messaging fabric traffic from server B to server A, resulting in a
>>> >>> deadlock condition.
>>> >>>
>>> >>>
>>> >>>
>>> >>> In the original discussion it was suggested that creating a custom
>>> >>> thread pool to handle the Server B to Server A traffic would resolve it.
>>> >>>
>>> >>>
>>> >>>
>>> >>> Thanks,
>>> >>>
>>> >>> Raymond.
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Wed, Jul 17, 2019 at 9:48 PM Alexandr Shapkin 
>>> >>> wrote:
>>> >>>
>>> >>> Hi, Raymond!
>>> >>>
>>> >>>
>>> >>>
>>> >>> As far as I can see, there are no plans for porting the custom
>>> >>> executor configuration to the .NET client right now [1].
>>> >>>
>>> >>>
>>> >>>
>>> >>> Please remind me, why do you need a separate pool instead of the
>>> >>> default public pool?
>>> >>>
>>> >>>
>>> >>>
>>> >>> [1] - https://issues.apache.org/jira/browse/IGNITE-6566
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> From: Raymond Wilson 
>>> >>> Sent: Wednesday, July 17, 2019 10:58 AM
>>> >>> To: user 
>>> >>> Subject: Threadpools and .WithExecute() for C# clients
>>> >>>
>>> >>>
>>> >>>
>>> >>> Some time ago I ran into an issue with thread pool exhaustion and
>>> >>> deadlocking in AI 2.2.
>>> >>>
>>> >>>
>>> >>>
>>> >>> This is the original thread:
>>> >>>
>>> http://apache-ignite-users.70518.x6.nabble.com/Possible-dead-lock-when-number-of-jobs-exceeds-thread-pool-td17262.html
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> At the time .WithExecutor() was not implemented in the C# client, so
>>> >>> there was little option but to expand the size of the public thread
>>> >>> pool sufficiently to prevent the deadlocking.
>>> >>>
>>> >>>
>>> >>>
>>> >>> We 

Re: Deprecate\remove REBALANCE_OBJECT_LOADED cache event

2019-08-01 Thread Pavel Kovalenko
Hello Maxim,

Thank you for researching this.
It seems those events can be used as an interceptor for the rebalance
process, to take some extra actions after an entry is rebalanced.
However, I don't see any real usages apart from tests. Most likely the
functionality that used such rebalance events no longer exists.
I see no reason to keep them anymore.
+1 for removing in 2.8
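
For the record, the interceptor-style usage in question looks roughly like
this (existing public events API; a sketch only, and the event types must
also be enabled via setIncludeEventTypes before they fire):

{code:java}
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.events.Event;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;

public class RebalanceEntryListenerSketch {
    public static void main(String[] args) {
        Ignite ignite = Ignition.ignite(); // assumes a node is already started

        // Fires once per rebalanced entry.
        IgnitePredicate<Event> lsnr = evt -> {
            System.out.println("Rebalanced entry event: " + evt.name());
            return true; // keep listening
        };

        ignite.events().localListen(lsnr,
            EventType.EVT_CACHE_REBALANCE_OBJECT_LOADED,
            EventType.EVT_CACHE_REBALANCE_OBJECT_UNLOADED);
    }
}
{code}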


Wed, 31 Jul 2019 at 20:54, Maxim Muzafarov :

> Igniters,
>
>
> I've come across the EVT_CACHE_REBALANCE_OBJECT_LOADED [1] and
> EVT_CACHE_REBALANCE_OBJECT_UNLOADED [2] cache events and don't fully
> understand their general purpose. I hope someone from the community can
> clarify the initial idea behind adding these events.
>
> The first - it seems to me that these events are a completely
> Ignite-internal thing. Why should the user be able to subscribe to such
> events (beyond tracking cache key metrics)? Once the data is loaded into
> a cache, I see no reason to notify the user about cache keys moving from
> one node to another when the cluster topology changes. It's Ignite's job
> to keep the data consistent in any case.
>
> The second - I haven't found any real usages of these events on GitHub
> or Google. Most of the examples come from our community members and the
> Ignite documentation.
>
> The third - Correct me if I am wrong, but subscribing to Ignite events
> can have a strong influence on cluster performance. So the fewer events
> available to users, the better the performance.
>
>
> I think these events can be easily removed in the next 2.8 release.
> WDYT? Am I missing something?
>
> [1]
> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/events/EventType.html#EVT_CACHE_REBALANCE_OBJECT_LOADED
> [2]
> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/events/EventType.html#EVT_CACHE_REBALANCE_OBJECT_UNLOADED
>