Re: Thin client: compute support

2019-11-25 Thread Alex Plehanov
> And it is fine to use request ID to identify compute tasks (as we do with
query cursors).
I can't see any usage of request IDs in query cursors. We send a query
request and get a cursor ID in the response; after that, we only use the
cursor ID (to fetch next pages and to close the resource). Did I miss
something?

> Looks like I'm missing something - how is topology change relevant to
executing compute tasks from client?
It's not directly relevant, but there are cases where it would be helpful.
For example, if a client sends long-running tasks to nodes and wants to
load-balance them, it will detect a topology change only some time later,
with the first response, so load balancing will not work. Perhaps we can add
an optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request
to solve this problem.
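The optional topology-version field could be encoded as a flag bit plus two
extra fields in the request. The sketch below is a simplified, hypothetical
wire layout: the op code value, flag bit, and field order are assumptions
for illustration, not the actual Ignite thin client binary format.

```python
import struct

OP_COMPUTE_EXECUTE_TASK = 6000   # hypothetical op code value
FLAG_HAS_TOP_VER = 0x01          # hypothetical flag bit

def pack_execute_request(request_id, task_name, top_ver=None):
    """Pack a simplified OP_COMPUTE_EXECUTE_TASK request.

    If the client knows its topology version, it is appended so the server
    can report a topology change immediately instead of with the final
    task response.
    """
    name = task_name.encode("utf-8")
    flags = FLAG_HAS_TOP_VER if top_ver is not None else 0
    buf = struct.pack("<hqB", OP_COMPUTE_EXECUTE_TASK, request_id, flags)
    if top_ver is not None:
        major, minor = top_ver
        buf += struct.pack("<qi", major, minor)
    buf += struct.pack("<i", len(name)) + name
    return buf

def unpack_execute_request(buf):
    """Parse the request back; the topology version is optional."""
    op, req_id, flags = struct.unpack_from("<hqB", buf, 0)
    off = struct.calcsize("<hqB")
    top_ver = None
    if flags & FLAG_HAS_TOP_VER:
        major, minor = struct.unpack_from("<qi", buf, off)
        top_ver = (major, minor)
        off += struct.calcsize("<qi")
    (name_len,) = struct.unpack_from("<i", buf, off)
    off += 4
    name = buf[off:off + name_len].decode("utf-8")
    return op, req_id, top_ver, name
```

Making the field flag-guarded keeps the request backward compatible: an old
client simply never sets the flag.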


пн, 25 нояб. 2019 г. в 22:42, Pavel Tupitsyn :

> Alex,
>
> > we will mix entities from different layers (transport layer and request
> body)
> I would not call our message header (which includes the id) "transport
> layer".
> TCP is our transport layer. And it is fine to use request ID to identify
> compute tasks (as we do with query cursors).
>
> > we still can't be sure that the task is successfully started on a server
> The request to start the task will fail and we'll get a response indicating
> that right away
>
> > we won't ever know about topology change
> Looks like I'm missing something - how is topology change relevant to
> executing compute tasks from client?
>
> On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov 
> wrote:
>
> > Pavel, in this case, we will mix entities from different layers
> (transport
> > layer and request body), it's not very good. The same behavior we can
> > achieve with generated on client-side task id, but there will be no
> > inter-layer data intersection and I think it will be easier to implement
> on
> > both client and server-side. But we still can't be sure that the task is
> > successfully started on a server. We won't ever know about topology
> change,
> > because topology changed flag will be sent from server to client only
> with
> > a response when the task will be completed. Are we accept that?
> >
> > пн, 25 нояб. 2019 г. в 19:07, Pavel Tupitsyn :
> >
> > > Alex,
> > >
> > > I have a simpler idea. We already do request id handling in the
> protocol,
> > > so:
> > > - Client sends a normal request to execute compute task. Request ID is
> > > generated as usual.
> > > - As soon as task is completed, a response is received.
> > >
> > > As for cancellation - client can send a new request (with new request
> ID)
> > > and (in the body) pass the request ID from above
> > > as a task identifier. As a result, there are two responses:
> > > - Cancellation response
> > > - Task response (with proper cancelled status)
> > >
> > > That's it, no need to modify the core of the protocol. One request -
> one
> > > response.
> > >
> > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov  >
> > > wrote:
> > >
> > > > Pavel, we need to inform the client when the task is completed, we
> need
> > > the
> > > > ability to cancel the task. I see several ways to implement this:
> > > >
> > > > 1. Сlient sends a request to the server to start a task, server
> return
> > > task
> > > > id in response. Server notifies client when task is completed with a
> > new
> > > > request (from server to client). Client can cancel the task by
> sending
> > a
> > > > new request with operation type "cancel" and task id. In this case,
> we
> > > > should implement 2-ways requests.
> > > > 2. Client generates unique task id and sends a request to the server
> to
> > > > start a task, server don't reply immediately but wait until task is
> > > > completed. Client can cancel task by sending new request with
> operation
> > > > type "cancel" and task id. In this case, we should decouple request
> and
> > > > response on the server-side (currently response is sent right after
> > > request
> > > > was processed). Also, we can't be sure that task is successfully
> > started
> > > on
> > > > a server.
> > > > 3. Client sends a request to the server to start a task, server
> return
> > id
> > > > in response. Client periodically asks the server about task status.
> > > Client
> > > > can cancel the task by sending new request with operation type
> "cancel"
> > > and
> > > > task id. This case brings some overhead to the communication channel.
> > > >
> > > > Personally, I think that the case with 2-ways requests is better, but
> > I'm
> > > > open to any other ideas.
> > > >
> > > > Aleksandr,
> > > >
> > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks
> > overcomplicated.
> > > Do
> > > > we need server-side filtering at all? Wouldn't it be better to send
> > basic
> > > > info (ids, order, flags) for all nodes (there is relatively small
> > amount
> > > of
> > > > data) and extended info (attributes) for selected list of nodes? In
> > this
> > > > case, we can do basic node filtration on client-side (forClients(),
> > > > forServers(), forNodeIds(), forOthers(), etc).

Re: Thin client: compute support

2019-11-25 Thread Pavel Tupitsyn
Alex,

> we will mix entities from different layers (transport layer and request
body)
I would not call our message header (which includes the id) "transport
layer".
TCP is our transport layer. And it is fine to use request ID to identify
compute tasks (as we do with query cursors).

> we still can't be sure that the task is successfully started on a server
The request to start the task will fail, and we'll get a response indicating
that right away.

> we won't ever know about topology change
Looks like I'm missing something - how is a topology change relevant to
executing compute tasks from a client?

On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov 
wrote:

> Pavel, in this case, we will mix entities from different layers (transport
> layer and request body), it's not very good. The same behavior we can
> achieve with generated on client-side task id, but there will be no
> inter-layer data intersection and I think it will be easier to implement on
> both client and server-side. But we still can't be sure that the task is
> successfully started on a server. We won't ever know about topology change,
> because topology changed flag will be sent from server to client only with
> a response when the task will be completed. Are we accept that?
>
> пн, 25 нояб. 2019 г. в 19:07, Pavel Tupitsyn :
>
> > Alex,
> >
> > I have a simpler idea. We already do request id handling in the protocol,
> > so:
> > - Client sends a normal request to execute compute task. Request ID is
> > generated as usual.
> > - As soon as task is completed, a response is received.
> >
> > As for cancellation - client can send a new request (with new request ID)
> > and (in the body) pass the request ID from above
> > as a task identifier. As a result, there are two responses:
> > - Cancellation response
> > - Task response (with proper cancelled status)
> >
> > That's it, no need to modify the core of the protocol. One request - one
> > response.
> >
> > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov 
> > wrote:
> >
> > > Pavel, we need to inform the client when the task is completed, we need
> > the
> > > ability to cancel the task. I see several ways to implement this:
> > >
> > > 1. Сlient sends a request to the server to start a task, server return
> > task
> > > id in response. Server notifies client when task is completed with a
> new
> > > request (from server to client). Client can cancel the task by sending
> a
> > > new request with operation type "cancel" and task id. In this case, we
> > > should implement 2-ways requests.
> > > 2. Client generates unique task id and sends a request to the server to
> > > start a task, server don't reply immediately but wait until task is
> > > completed. Client can cancel task by sending new request with operation
> > > type "cancel" and task id. In this case, we should decouple request and
> > > response on the server-side (currently response is sent right after
> > request
> > > was processed). Also, we can't be sure that task is successfully
> started
> > on
> > > a server.
> > > 3. Client sends a request to the server to start a task, server return
> id
> > > in response. Client periodically asks the server about task status.
> > Client
> > > can cancel the task by sending new request with operation type "cancel"
> > and
> > > task id. This case brings some overhead to the communication channel.
> > >
> > > Personally, I think that the case with 2-ways requests is better, but
> I'm
> > > open to any other ideas.
> > >
> > > Aleksandr,
> > >
> > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks
> overcomplicated.
> > Do
> > > we need server-side filtering at all? Wouldn't it be better to send
> basic
> > > info (ids, order, flags) for all nodes (there is relatively small
> amount
> > of
> > > data) and extended info (attributes) for selected list of nodes? In
> this
> > > case, we can do basic node filtration on client-side (forClients(),
> > > forServers(), forNodeIds(), forOthers(), etc).
> > >
> > > Do you use standard ClusterNode serialization? There are also metrics
> > > serialized with ClusterNode, do we need it on thin client? There are
> > other
> > > interfaces exist to show metrics, I think it's redundant to export
> > metrics
> > > to thin clients too.
> > >
> > > What do you think?
> > >
> > >
> > >
> > >
> > > пт, 22 нояб. 2019 г. в 20:15, Aleksandr Shapkin :
> > >
> > > > Alex,
> > > >
> > > >
> > > >
> > > > I think you can create a new IEP page and I will fill it with the
> > Cluster
> > > > API details.
> > > >
> > > >
> > > >
> > > > In short, I’ve introduced several new codes:
> > > >
> > > >
> > > >
> > > > Cluster API is pretty straightforward:
> > > >
> > > >
> > > >
> > > > OP_CLUSTER_IS_ACTIVE = 5000
> > > >
> > > > OP_CLUSTER_CHANGE_STATE = 5001
> > > >
> > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > >
> > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > > >
> > > >
> > > >
> > > > Cluster group codes:
> > > >
> > > >
> > > >
> > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > 

Re: Thin client: compute support

2019-11-25 Thread Alex Plehanov
Pavel, in this case we will mix entities from different layers (the
transport layer and the request body), which is not very good. We can
achieve the same behavior with a client-side-generated task ID, but then
there will be no inter-layer data intersection, and I think it will be
easier to implement on both the client and the server side. But we still
can't be sure that the task has started successfully on the server. We won't
ever know about a topology change, because the topology-changed flag is sent
from server to client only with the response, when the task completes. Do we
accept that?

пн, 25 нояб. 2019 г. в 19:07, Pavel Tupitsyn :

> Alex,
>
> I have a simpler idea. We already do request id handling in the protocol,
> so:
> - Client sends a normal request to execute compute task. Request ID is
> generated as usual.
> - As soon as task is completed, a response is received.
>
> As for cancellation - client can send a new request (with new request ID)
> and (in the body) pass the request ID from above
> as a task identifier. As a result, there are two responses:
> - Cancellation response
> - Task response (with proper cancelled status)
>
> That's it, no need to modify the core of the protocol. One request - one
> response.
>
> On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov 
> wrote:
>
> > Pavel, we need to inform the client when the task is completed, we need
> the
> > ability to cancel the task. I see several ways to implement this:
> >
> > 1. Сlient sends a request to the server to start a task, server return
> task
> > id in response. Server notifies client when task is completed with a new
> > request (from server to client). Client can cancel the task by sending a
> > new request with operation type "cancel" and task id. In this case, we
> > should implement 2-ways requests.
> > 2. Client generates unique task id and sends a request to the server to
> > start a task, server don't reply immediately but wait until task is
> > completed. Client can cancel task by sending new request with operation
> > type "cancel" and task id. In this case, we should decouple request and
> > response on the server-side (currently response is sent right after
> request
> > was processed). Also, we can't be sure that task is successfully started
> on
> > a server.
> > 3. Client sends a request to the server to start a task, server return id
> > in response. Client periodically asks the server about task status.
> Client
> > can cancel the task by sending new request with operation type "cancel"
> and
> > task id. This case brings some overhead to the communication channel.
> >
> > Personally, I think that the case with 2-ways requests is better, but I'm
> > open to any other ideas.
> >
> > Aleksandr,
> >
> > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated.
> Do
> > we need server-side filtering at all? Wouldn't it be better to send basic
> > info (ids, order, flags) for all nodes (there is relatively small amount
> of
> > data) and extended info (attributes) for selected list of nodes? In this
> > case, we can do basic node filtration on client-side (forClients(),
> > forServers(), forNodeIds(), forOthers(), etc).
> >
> > Do you use standard ClusterNode serialization? There are also metrics
> > serialized with ClusterNode, do we need it on thin client? There are
> other
> > interfaces exist to show metrics, I think it's redundant to export
> metrics
> > to thin clients too.
> >
> > What do you think?
> >
> >
> >
> >
> > пт, 22 нояб. 2019 г. в 20:15, Aleksandr Shapkin :
> >
> > > Alex,
> > >
> > >
> > >
> > > I think you can create a new IEP page and I will fill it with the
> Cluster
> > > API details.
> > >
> > >
> > >
> > > In short, I’ve introduced several new codes:
> > >
> > >
> > >
> > > Cluster API is pretty straightforward:
> > >
> > >
> > >
> > > OP_CLUSTER_IS_ACTIVE = 5000
> > >
> > > OP_CLUSTER_CHANGE_STATE = 5001
> > >
> > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > >
> > > OP_CLUSTER_GET_WAL_STATE = 5003
> > >
> > >
> > >
> > > Cluster group codes:
> > >
> > >
> > >
> > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > >
> > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > >
> > >
> > >
> > > The underlying implementation is based on the thick client logic.
> > >
> > >
> > >
> > > For every request, we provide a known topology version and if it has
> > > changed,
> > >
> > > a client updates it firstly and then re-sends the filtering request.
> > >
> > >
> > >
> > > Alongside the topVer a client sends a serialized nodes projection
> object
> > >
> > > that could be considered as a code to value mapping.
> > >
> > > Consider: [{Code = 1, Value= [“DotNet”, “MyAttribute”}, {Code=2,
> > Value=1}]
> > >
> > > Where “1” stands for Attribute filtering and “2” – serverNodesOnly
> flag.
> > >
> > >
> > >
> > > As a result of request processing, a server sends nodeId UUIDs and a
> > > current topVer.
> > >
> > >
> > >
> > > When a client obtains nodeIds, it can perform a NODE_INFO call to get a
> > >
> > > serialized ClusterNode object. In addition there should be a different
> > > API method for accessing/updating node metrics.

Re: Review needed for IGNITE-11410 Sandbox for user-defined code

2019-11-25 Thread Denis Garus
Hello, Igniters!


Alex Plehanov did the review, and I've reworked the implementation of the
issue [1] that is part of the IEP-38 [2].

Could somebody else do a review of the PR [3]?


Alex, thank you for the review!



   1. https://issues.apache.org/jira/browse/IGNITE-11410
   2. https://cwiki.apache.org/confluence/display/IGNITE/IEP-38%3A+Sandbox
   3. https://github.com/apache/ignite/pull/6707


чт, 17 окт. 2019 г. в 16:21, Denis Garus :

> Hello, Pavel!
>
> Thank you for the feedback!
>
> I've created IEP-38 that describes the Ignite Sandbox [1].
>
> Yes, the issue requires documentation (there is the flag "Docs Required"),
> but common practice is to write documentation in the end.
>
> >> 1) Why do you run resource injection through security, and how is it tested?
> >> 2) Why do you check security at *dumpThreads* and *wrapThreadLoader
> *methods?
> >>  These methods are needed only for internal node processes.
> >> 4) There are suspicious security checks at:
> >> CacheOperationContext:37
> >> GridCacheDefaultAffinityKeyMapper:86
> >> PageMemoryImpl:874
> >> I'm not following why they are needed.
>
> These questions have a common answer.
> User-defined code can call any operation through the public API of
> Ignite, but it may not have the permissions to execute that operation
> successfully.
> For example, putting a value into a cache requires permissions for
> accessing the reflection API and reading the system property
> IGNITE_ALLOW_ATOMIC_OPS_IN_TX.
> In that case, we have to use an AccessController#doPrivileged call to
> exclude the user-defined code from permission checks.
> SecurityUtils#doPrivileged calls AccessController#doPrivileged in a more
> convenient way.
>
> >> 3) Have you tested security if a compute job is canceled?
> You are right; we should add a test for the cancel case.
> But, for now, we have the issue [2] with the current SecurityContext for
> the canceling of ComputeJob.
>
> 1. https://cwiki.apache.org/confluence/display/IGNITE/IEP-38%3A+Sandbox
> 2. https://issues.apache.org/jira/browse/IGNITE-12300
>
> пн, 14 окт. 2019 г. в 16:16, Pavel Kovalenko :
>
>> Denis,
>>
>> The idea of having a sandbox for running a user-defined code is useful,
>> but
>> I don't fully understand the implementation approach.
>> There is no detailed description in the ticket of what public API
>> methods or configuration parameters should be covered.
>> There is no description of what has been done in the initial PR and how.
>> First of all, there should be an umbrella ticket that contains all
>> public API points and configuration parameters where user-defined code may
>> be run.
>> Without a full list of all possible user-defined code injections, we can't
>> track what has been covered and where possible security gaps remain.
>> I've checked the PR and I have the following questions:
>> 1) Why do you run resource injection through security, and how is it
>> tested?
>> 2) Why do you check security at *dumpThreads* and *wrapThreadLoader
>> *methods?
>> These methods are needed only for internal node processes.
>> 3) Have you tested security if a compute job is canceled?
>> 4) There are suspicious security checks at:
>> CacheOperationContext:37
>> GridCacheDefaultAffinityKeyMapper:86
>> PageMemoryImpl:874
>> I'm not following why they are needed.
>>
>>
>>
>>
>> пн, 14 окт. 2019 г. в 12:19, Anton Vinogradov :
>>
>> > Fully agree with the benchmark's importance.
>> > Currently, we're not able to perform proper benchmarking.
>> > Slava, Is it possible to ask you to check the solution using GridGain's
>> > benchmarking environment?
>> >
>> > On Mon, Oct 14, 2019 at 12:07 PM Вячеслав Коптилин <
>> > slava.kopti...@gmail.com>
>> > wrote:
>> >
>> > > Hello Anton,
>> > >
>> > > > We should avoid heavy merges if possible.
>> > > Why should it be avoided? To be honest, I don't see any reason for
>> that.
>> > > Every pull request can be and should be reviewed when it is done and
>> > ready
>> > > to be merged into the epic branch (IEP branch).
>> > > So, the final review of the entire IEP is just a technical/trivial
>> task,
>> > in
>> > > my opinion.
>> > >
>> > > If I am not mistaken, we are at the stage of preparing a new release
>> > (2.8),
>> > > right?
>> > > And we are trying to add a new feature that may impact the
>> performance.
>> > > For example, affinity function, which can be overridden by the
>> end-user,
>> > > and therefore should be covered by `sandbox`.
>> > > On the other hand, affinity function is a crucial component that is
>> used
>> > > very often.
>> > > Are we really sure that the proposed change does not affect the
>> > > performance? Do we have a benchmark?
>> > >
>> > > Please don't get me wrong, guys. I am not against the feature itself.
>> > > Moreover, it is a great feature and improvement of security.
>> > > I just want to say that we need to be sure that we are on the right
>> way
>> > of
>> > > implementing this without affecting other developers.
>> > >
>> > > PS: This is just my 

Re: Thin client: compute support

2019-11-25 Thread Pavel Tupitsyn
Alex,

I have a simpler idea. We already do request id handling in the protocol,
so:
- Client sends a normal request to execute compute task. Request ID is
generated as usual.
- As soon as task is completed, a response is received.

As for cancellation - the client can send a new request (with a new request
ID) and pass the request ID from above in the body as a task identifier. As
a result, there are two responses:
- Cancellation response
- Task response (with proper cancelled status)

That's it, no need to modify the core of the protocol. One request - one
response.
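A minimal sketch of this scheme, where the request ID doubles as the task
identifier and cancellation is just another request whose body carries the
original ID. All class and method names here are illustrative, not the real
thin client API:

```python
# Sketch: the request ID doubles as the task handle; cancellation sends a
# new request that references the original one, so exactly two responses
# arrive - one for the cancel, one for the (now cancelled) task.

class ComputeClient:
    def __init__(self):
        self._next_id = 0
        self._pending = {}   # request_id -> result (None until a response)

    def execute_task(self, task_name):
        """Send an execute request; the request ID serves as the task id."""
        self._next_id += 1
        req_id = self._next_id
        self._pending[req_id] = None
        # wire: (req_id, OP_EXECUTE, task_name) would be sent here
        return req_id

    def cancel_task(self, task_req_id):
        """Send a *new* request whose body carries the original request ID."""
        self._next_id += 1
        cancel_req_id = self._next_id
        self._pending[cancel_req_id] = None
        # wire: (cancel_req_id, OP_CANCEL, body=task_req_id)
        return cancel_req_id

    def on_response(self, req_id, status):
        """Invoked for every response; two arrive after a cancellation."""
        self._pending[req_id] = status

client = ComputeClient()
task_id = client.execute_task("MyTask")
cancel_id = client.cancel_task(task_id)
# the server answers the cancel, then completes the task as 'cancelled'
client.on_response(cancel_id, "CANCEL_OK")
client.on_response(task_id, "CANCELLED")
```

The point of the design is visible here: no new correlation mechanism is
needed, because every outstanding request already has a unique ID.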

On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov 
wrote:

> Pavel, we need to inform the client when the task is completed, we need the
> ability to cancel the task. I see several ways to implement this:
>
> 1. Сlient sends a request to the server to start a task, server return task
> id in response. Server notifies client when task is completed with a new
> request (from server to client). Client can cancel the task by sending a
> new request with operation type "cancel" and task id. In this case, we
> should implement 2-ways requests.
> 2. Client generates unique task id and sends a request to the server to
> start a task, server don't reply immediately but wait until task is
> completed. Client can cancel task by sending new request with operation
> type "cancel" and task id. In this case, we should decouple request and
> response on the server-side (currently response is sent right after request
> was processed). Also, we can't be sure that task is successfully started on
> a server.
> 3. Client sends a request to the server to start a task, server return id
> in response. Client periodically asks the server about task status. Client
> can cancel the task by sending new request with operation type "cancel" and
> task id. This case brings some overhead to the communication channel.
>
> Personally, I think that the case with 2-ways requests is better, but I'm
> open to any other ideas.
>
> Aleksandr,
>
> Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do
> we need server-side filtering at all? Wouldn't it be better to send basic
> info (ids, order, flags) for all nodes (there is relatively small amount of
> data) and extended info (attributes) for selected list of nodes? In this
> case, we can do basic node filtration on client-side (forClients(),
> forServers(), forNodeIds(), forOthers(), etc).
>
> Do you use standard ClusterNode serialization? There are also metrics
> serialized with ClusterNode, do we need it on thin client? There are other
> interfaces exist to show metrics, I think it's redundant to export metrics
> to thin clients too.
>
> What do you think?
>
>
>
>
> пт, 22 нояб. 2019 г. в 20:15, Aleksandr Shapkin :
>
> > Alex,
> >
> >
> >
> > I think you can create a new IEP page and I will fill it with the Cluster
> > API details.
> >
> >
> >
> > In short, I’ve introduced several new codes:
> >
> >
> >
> > Cluster API is pretty straightforward:
> >
> >
> >
> > OP_CLUSTER_IS_ACTIVE = 5000
> >
> > OP_CLUSTER_CHANGE_STATE = 5001
> >
> > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> >
> > OP_CLUSTER_GET_WAL_STATE = 5003
> >
> >
> >
> > Cluster group codes:
> >
> >
> >
> > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> >
> > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> >
> >
> >
> > The underlying implementation is based on the thick client logic.
> >
> >
> >
> > For every request, we provide a known topology version and if it has
> > changed,
> >
> > a client updates it firstly and then re-sends the filtering request.
> >
> >
> >
> > Alongside the topVer a client sends a serialized nodes projection object
> >
> > that could be considered as a code to value mapping.
> >
> > Consider: [{Code = 1, Value= [“DotNet”, “MyAttribute”}, {Code=2,
> Value=1}]
> >
> > Where “1” stands for Attribute filtering and “2” – serverNodesOnly flag.
> >
> >
> >
> > As a result of request processing, a server sends nodeId UUIDs and a
> > current topVer.
> >
> >
> >
> > When a client obtains nodeIds, it can perform a NODE_INFO call to get a
> >
> > serialized ClusterNode object. In addition there should be a different
> API
> >
> > method for accessing/updating node metrics.
> >
> > чт, 21 нояб. 2019 г. в 12:32, Sergey Kozlov :
> >
> > > Hi Pavel
> > >
> > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn 
> > > wrote:
> > >
> > > > 1. I believe that Cluster operations for Thin Client protocol are
> > already
> > > > in the works
> > > > by Alexandr Shapkin. Can't find the ticket though.
> > > > Alexandr, can you please confirm and attach the ticket number?
> > > >
> > > > 2. Proposed changes will work only for Java tasks that are already
> > > deployed
> > > > on server nodes.
> > > > This is mostly useless for other thin clients we have (Python, PHP,
> > .NET,
> > > > C++).
> > > >
> > >
> > > I don't guess so. The task (execution) is a way to implement own layer
> > for
> > > the thin client application.
> > >
> > >
> > > We should think of a way to make this useful for all clients.

Re: Thin client: compute support

2019-11-25 Thread Alex Plehanov
Pavel, we need to inform the client when the task is completed, and we need
the ability to cancel the task. I see several ways to implement this:

1. The client sends a request to the server to start a task; the server
returns a task ID in the response. The server notifies the client when the
task is completed with a new request (from server to client). The client can
cancel the task by sending a new request with operation type "cancel" and
the task ID. In this case, we should implement two-way requests.
2. The client generates a unique task ID and sends a request to the server
to start a task; the server doesn't reply immediately but waits until the
task is completed. The client can cancel the task by sending a new request
with operation type "cancel" and the task ID. In this case, we should
decouple request and response on the server side (currently the response is
sent right after the request is processed). Also, we can't be sure that the
task has started successfully on the server.
3. The client sends a request to the server to start a task; the server
returns an ID in the response. The client periodically asks the server about
the task status. The client can cancel the task by sending a new request
with operation type "cancel" and the task ID. This approach brings some
overhead to the communication channel.

Personally, I think the option with two-way requests is better, but I'm open
to any other ideas.
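Option 1 (two-way requests) can be sketched roughly as follows. The channel,
op name, and task-ID handling are purely illustrative, since server-to-client
requests are exactly the part the current protocol lacks:

```python
import queue

class ServerToClientChannel:
    """Stand-in for the connection: lets the server push its own requests."""
    def __init__(self):
        self._inbox = queue.Queue()

    def push(self, msg):
        self._inbox.put(msg)

    def poll(self):
        return self._inbox.get_nowait()

class Server:
    def __init__(self, channel):
        self._channel = channel
        self._next_task_id = 0
        self._running = set()

    def start_task(self, name):
        # the task id would be returned in the response to the start request
        self._next_task_id += 1
        self._running.add(self._next_task_id)
        return self._next_task_id

    def complete_task(self, task_id, status="COMPLETED"):
        # server -> client request: the part that requires protocol changes,
        # since today only clients initiate requests
        self._running.discard(task_id)
        self._channel.push(("OP_TASK_COMPLETED", task_id, status))

channel = ServerToClientChannel()
server = Server(channel)
task_id = server.start_task("MyTask")
server.complete_task(task_id)
```

The trade-off is clear in the sketch: no polling and no blocked responses,
at the cost of teaching both sides to handle requests arriving in either
direction.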

Aleksandr,

Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do
we need server-side filtering at all? Wouldn't it be better to send basic
info (IDs, order, flags) for all nodes (there is a relatively small amount
of data) and extended info (attributes) only for a selected list of nodes?
In this case, we can do basic node filtering on the client side
(forClients(), forServers(), forNodeIds(), forOthers(), etc.).
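The client-side projection methods mentioned above reduce to simple
predicates once the client has the basic info for every node. A sketch,
where the flag bit and field names are assumptions for illustration:

```python
from dataclasses import dataclass

FLAG_CLIENT = 0x01   # hypothetical node flag bit marking client nodes

@dataclass(frozen=True)
class NodeInfo:
    """Basic per-node info the server would return for the whole topology."""
    node_id: str
    order: int
    flags: int

def for_servers(nodes):
    return [n for n in nodes if not n.flags & FLAG_CLIENT]

def for_clients(nodes):
    return [n for n in nodes if n.flags & FLAG_CLIENT]

def for_node_ids(nodes, ids):
    wanted = set(ids)
    return [n for n in nodes if n.node_id in wanted]

topology = [
    NodeInfo("a", 1, 0),
    NodeInfo("b", 2, FLAG_CLIENT),
    NodeInfo("c", 3, 0),
]
```

With this shape, only the attribute-based projections would still need a
round trip for extended node info; everything else is local list filtering.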

Do you use the standard ClusterNode serialization? Metrics are also
serialized with ClusterNode - do we need them on the thin client? Other
interfaces already exist to expose metrics, so I think it's redundant to
export metrics to thin clients as well.

What do you think?




пт, 22 нояб. 2019 г. в 20:15, Aleksandr Shapkin :

> Alex,
>
>
>
> I think you can create a new IEP page and I will fill it with the Cluster
> API details.
>
>
>
> In short, I’ve introduced several new codes:
>
>
>
> Cluster API is pretty straightforward:
>
>
>
> OP_CLUSTER_IS_ACTIVE = 5000
>
> OP_CLUSTER_CHANGE_STATE = 5001
>
> OP_CLUSTER_CHANGE_WAL_STATE = 5002
>
> OP_CLUSTER_GET_WAL_STATE = 5003
>
>
>
> Cluster group codes:
>
>
>
> OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
>
> OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
>
>
>
> The underlying implementation is based on the thick client logic.
>
>
>
> For every request, we provide a known topology version and if it has
> changed,
>
> a client updates it firstly and then re-sends the filtering request.
>
>
>
> Alongside the topVer a client sends a serialized nodes projection object
>
> that could be considered as a code to value mapping.
>
> Consider: [{Code = 1, Value= [“DotNet”, “MyAttribute”}, {Code=2, Value=1}]
>
> Where “1” stands for Attribute filtering and “2” – serverNodesOnly flag.
>
>
>
> As a result of request processing, a server sends nodeId UUIDs and a
> current topVer.
>
>
>
> When a client obtains nodeIds, it can perform a NODE_INFO call to get a
>
> serialized ClusterNode object. In addition there should be a different API
>
> method for accessing/updating node metrics.
>
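The code-to-value projection encoding quoted above can be applied on the
server roughly like this. The filter codes follow the example in the thread,
while the node representation and field names are assumptions for
illustration:

```python
# Sketch of the (code, value) projection list: code 1 filters by required
# attributes, code 2 keeps server nodes only, per the example in the thread.

PROJ_ATTRIBUTES = 1         # value: list of required attribute names
PROJ_SERVER_NODES_ONLY = 2  # value: 1 to drop client nodes

def apply_projection(nodes, projection):
    """Apply each known filter in order; unknown codes are ignored here."""
    result = nodes
    for item in projection:
        code, value = item["Code"], item["Value"]
        if code == PROJ_ATTRIBUTES:
            result = [n for n in result
                      if all(a in n["attributes"] for a in value)]
        elif code == PROJ_SERVER_NODES_ONLY and value == 1:
            result = [n for n in result if not n["is_client"]]
    return result

nodes = [
    {"id": "a", "attributes": {"DotNet", "MyAttribute"}, "is_client": False},
    {"id": "b", "attributes": {"DotNet"}, "is_client": False},
    {"id": "c", "attributes": {"DotNet", "MyAttribute"}, "is_client": True},
]
projection = [
    {"Code": PROJ_ATTRIBUTES, "Value": ["DotNet", "MyAttribute"]},
    {"Code": PROJ_SERVER_NODES_ONLY, "Value": 1},
]
```

An open question in the thread is whether this filtering belongs on the
server at all, or whether the same predicates should run on the client over
the basic node list.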
> чт, 21 нояб. 2019 г. в 12:32, Sergey Kozlov :
>
> > Hi Pavel
> >
> > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn 
> > wrote:
> >
> > > 1. I believe that Cluster operations for Thin Client protocol are
> already
> > > in the works
> > > by Alexandr Shapkin. Can't find the ticket though.
> > > Alexandr, can you please confirm and attach the ticket number?
> > >
> > > 2. Proposed changes will work only for Java tasks that are already
> > deployed
> > > on server nodes.
> > > This is mostly useless for other thin clients we have (Python, PHP,
> .NET,
> > > C++).
> > >
> >
> > I don't think so. Task execution is a way to implement your own layer
> > for the thin client application.
> >
> >
> > > We should think of a way to make this useful for all clients.
> > > For example, we may allow sending tasks in some scripting language like
> > > Javascript.
> > > Thoughts?
> > >
> >
> > Arbitrary code execution from a remote client must be protected
> > against malicious code.
> > I don't know how it should be designed, but without that we open a hole
> > that can kill the cluster.
> >
> >
> > >
> > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov 
> > > wrote:
> > >
> > > > Hi Alex
> > > >
> > > > The idea is great. But I have some concerns that probably should be
> > taken
> > > > into account for design:
> > > >
> > > >1. We need to have the ability to stop a task execution, smth like
> > > >OP_COMPUTE_CANCEL_TASK  operation (client to server)
> > > >2. What's about task execution timeout? It may help to the cluster
> > > >survival for buggy tasks
> > > >3. Ignite doesn't have roles/authorization 

Re: The Spark 2.4 support

2019-11-25 Thread Alexey Zinoviev
Yes, as I mentioned above, you can see the real changes after the copying of
the Spark 2.3 module here, in a dedicated commit.
The last commit
https://github.com/apache/ignite/pull/7058/commits/60386802299deedc6ed60bf4736e922201a67fb8
contains the real changes relative to Spark 2.3.


пн, 25 нояб. 2019 г. в 17:28, Николай Ижиков :

> Hello, Alexey.
>
> Can we somehow highlight changes in Spark-2.4 module comparing to 2.3 one?
> For now the changes look too huge for me (+11,681 −1).
>
> Are we sure we want to add such a huge piece of code to support two
> versions?
> Can we extract the unchanged parts (based on the Spark public API) and
> keep them in one copy?
>
> > 18 нояб. 2019 г., в 23:47, Denis Magda  написал(а):
> >
> > Alexey, thanks for the details and for reaching out this milestone with
> the
> > 2.4 support.
> >
> > Generally, I would advise us to merge the changes to the master only
> after
> > we confirm the failing tests are not regressions. We should either remove
> > them or replace them with some others or just fix.
> >
> > -
> > Denis
> >
> >
> > On Mon, Nov 18, 2019 at 10:06 AM Alexey Zinoviev  >
> > wrote:
> >
> >> Right, a few tests out of 200 fail due to a known issue and can't be
> >> fixed immediately; they relate to rare cases. These tests are copies of
> >> the 2.3 tests, and some of them may have no meaning for 2.4 due to
> >> changed Spark behaviour.
> >>
> >> пн, 18 нояб. 2019 г., 19:42 Denis Magda :
> >>
> >>> Alexey,
> >>>
> >>> Please help to understand what it means that 2.4 integration supports
> >> "95%
> >>> of tests of 2.3". Does it mean that 5% of existing tests are failing
> and,
> >>> basically, need to be fixed?
> >>>
> >>> -
> >>> Denis
> >>>
> >>>
> >>> On Mon, Nov 18, 2019 at 6:52 AM Alexey Zinoviev <
> zaleslaw@gmail.com>
> >>> wrote:
> >>>
>  Dear Nikolay Izhikov, I've recreated the PR for 2.4 initial support
> 
>  The last commit
> 
> 
> >>>
> >>
> https://github.com/apache/ignite/pull/7058/commits/60386802299deedc6ed60bf4736e922201a67fb8
>  contains
>  real changes from Spark 2.3
> 
>  I suggest merging this initial solution with 95% support of
>  Spark 2.4 to master and continuing work on the known issues listed in JIRA
> 
>  This solution supports the new Spark version for all examples and 95%
> >> of
>  tests of 2.3.
> 
> вт, 1 окт. 2019 г. в 08:48 translated: On Tue, Oct 1, 2019 at 08:48, Ivan Pavlukhin :
> 
> > Alexey, Nikolay,
> >
> > Thank you for sharing details!
> >
> >> On Tue, Oct 1, 2019 at 07:42, Alexey Zinoviev  >>> :
> >>
> >> Great talk and paper, I've learnt it last year
> >>
> >> On Mon, Sep 30, 2019, 21:42 Nikolay Izhikov :
> >>
> >>> Yes, I can :)
> >>>
> >>> On Mon, 30/09/2019 at 11:40 -0700, Denis Magda wrote:
>  Nikolay,
> 
>  Would you be able to review the changes? I'm not sure there is
> >> a
> > better
> >>> candidate for now.
> 
>  -
>  Denis
> 
> 
>  On Mon, Sep 30, 2019 at 11:01 AM Nikolay Izhikov <
> > nizhi...@apache.org>
> >>> wrote:
> > Hello, Ivan.
> >
> > I had a talk about internals of Spark integration in Ignite.
> > It answers the question of why we should use Spark internals.
> >
> > You can take a look at my meetup talk(in Russian) [1] or read
> >>> an
> >>> article if you prefer text [2].
> >
> > [1] https://www.youtube.com/watch?v=CzbAweNKEVY
> > [2] https://habr.com/ru/company/sberbank/blog/427297/
> >
> > On Mon, 30/09/2019 at 20:29 +0300, Alexey Zinoviev wrote:
> >> Yes, as I understand it uses Spark internals from the first
> > commit)))
> >> The reason - we take Spark SQL query execution plan and try
> >>> to
> >>> execute it
> >> on Ignite cluster
> >> Also we inherit a lot of Developer API related classes that
> > could be
> >> unstable. Spark has no good point for extension and this
> >> is a
> > reason
> >>> why we
> >> should go deeper
> >>
> >> On Mon, Sep 30, 2019 at 20:17, Ivan Pavlukhin <
> > vololo...@gmail.com>:
> >>
> >>> Hi Alexey,
> >>>
> >>> As an external watcher very far from Ignite Spark
>  integration I
> >>> would
> >>> like to ask a humble question for my understanding. Why
> >>> this
> >>> integration uses Spark internals? Is it a common approach
> >>> for
> >>> integrating with Spark?
> >>>
> >>> On Mon, Sep 30, 2019 at 16:17, Alexey Zinoviev <
> >>> zaleslaw@gmail.com>:
> 
>  Hi, Igniters
>  I've started the work on the Spark 2.4 support
> 
>  We started the discussion here, in
>  https://issues.apache.org/jira/browse/IGNITE-12054
> 
>  The Spark internals were totally refactored between 2.3
> >>> and
> > 2.4
> 

Re: The Spark 2.4 support

2019-11-25 Thread Николай Ижиков
Hello, Alexey.

Can we somehow highlight the changes in the Spark 2.4 module compared to the 2.3 one?
For now the changes look too huge to me (+11,681 −1).

Are we sure we want to add such a huge piece of code to support two versions?
Can we extract the unchanged parts (based on the Spark public API) and keep them in
one copy?

> On Nov 18, 2019, at 23:47, Denis Magda  wrote:
> 
> Alexey, thanks for the details and for reaching out this milestone with the
> 2.4 support.
> 
> Generally, I would advise us to merge the changes to the master only after
> we confirm the failing tests are not regressions. We should either remove
> them, replace them with some others, or just fix them.
> 
> -
> Denis
> 
> 
> On Mon, Nov 18, 2019 at 10:06 AM Alexey Zinoviev 
> wrote:
> 
>> Right, a few of the 200 tests are failing due to a known issue and couldn't be
>> fixed immediately; they relate to rare cases. These tests are copies of the 2.3
>> tests, and some of them may have no meaning for 2.4 due to changed Spark
>> behaviour.
>> 
>> On Mon, Nov 18, 2019, 19:42 Denis Magda :
>> 
>>> Alexey,
>>> 
>>> Please help to understand what it means that 2.4 integration supports
>> "95%
>>> of tests of 2.3". Does it mean that 5% of existing tests are failing and,
>>> basically, need to be fixed?
>>> 
>>> -
>>> Denis
>>> 
>>> 
>>> On Mon, Nov 18, 2019 at 6:52 AM Alexey Zinoviev 
>>> wrote:
>>> 
 Dear Nikolay Izhikov, I've recreated the PR for 2.4 initial support
 
 The last commit
 
 
>>> 
>> https://github.com/apache/ignite/pull/7058/commits/60386802299deedc6ed60bf4736e922201a67fb8
 contains
 real changes from Spark 2.3
 
 I suggest merging this initial solution with 95% support of
 Spark 2.4 to master and continuing work on the known issues listed in JIRA
 
 This solution supports the new Spark version for all examples and 95%
>> of
 tests of 2.3.
 
 On Tue, Oct 1, 2019 at 08:48, Ivan Pavlukhin :
 
> Alexey, Nikolay,
> 
> Thank you for sharing details!
> 
> On Tue, Oct 1, 2019 at 07:42, Alexey Zinoviev >> :
>> 
>> Great talk and paper, I've learnt it last year
>> 
>> On Mon, Sep 30, 2019, 21:42 Nikolay Izhikov :
>> 
>>> Yes, I can :)
>>> 
>>> On Mon, 30/09/2019 at 11:40 -0700, Denis Magda wrote:
 Nikolay,
 
 Would you be able to review the changes? I'm not sure there is
>> a
> better
>>> candidate for now.
 
 -
 Denis
 
 
 On Mon, Sep 30, 2019 at 11:01 AM Nikolay Izhikov <
> nizhi...@apache.org>
>>> wrote:
> Hello, Ivan.
> 
> I had a talk about internals of Spark integration in Ignite.
> It answers the question of why we should use Spark internals.
> 
> You can take a look at my meetup talk(in Russian) [1] or read
>>> an
>>> article if you prefer text [2].
> 
> [1] https://www.youtube.com/watch?v=CzbAweNKEVY
> [2] https://habr.com/ru/company/sberbank/blog/427297/
> 
> On Mon, 30/09/2019 at 20:29 +0300, Alexey Zinoviev wrote:
>> Yes, as I understand it uses Spark internals from the first
> commit)))
>> The reason - we take Spark SQL query execution plan and try
>>> to
>>> execute it
>> on Ignite cluster
>> Also we inherit a lot of Developer API related classes that
> could be
>> unstable. Spark has no good point for extension and this
>> is a
> reason
>>> why we
>> should go deeper
>> 
>> On Mon, Sep 30, 2019 at 20:17, Ivan Pavlukhin <
> vololo...@gmail.com>:
>> 
>>> Hi Alexey,
>>> 
>>> As an external watcher very far from Ignite Spark
 integration I
>>> would
>>> like to ask a humble question for my understanding. Why
>>> this
>>> integration uses Spark internals? Is it a common approach
>>> for
>>> integrating with Spark?
>>> 
>>> On Mon, Sep 30, 2019 at 16:17, Alexey Zinoviev <
>>> zaleslaw@gmail.com>:
 
 Hi, Igniters
 I've started the work on the Spark 2.4 support
 
 We started the discussion here, in
 https://issues.apache.org/jira/browse/IGNITE-12054
 
 The Spark internals were totally refactored between 2.3
>>> and
> 2.4
>>> versions,
 main changes touches
 
   - External catalog and listeners refactoring
   - Changes of HAVING operator semantic support
   - Push-down NULL filters generation in JOIN plans
   - minor changes in Plan Generation that should be
 adopted
> in
>>> our
   integration module
 
 I propose the initial solution here via creation of new
> module
>>> spark-2.4
 here
>> https://issues.apache.org/jira/browse/IGNITE-12247
 and
>>> addition of

[jira] [Created] (IGNITE-12394) Confusing message and thread dumps about ignored failures

2019-11-25 Thread Mirza Aliev (Jira)
Mirza Aliev created IGNITE-12394:


 Summary: Confusing message and thread dumps about ignored failures
 Key: IGNITE-12394
 URL: https://issues.apache.org/jira/browse/IGNITE-12394
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 2.7
Reporter: Mirza Aliev
 Fix For: 2.8


Currently, FailureProcessor shows the following message in the logs:

Critical system error detected. Will be handled accordingly to configured 
handler

even if the failure type is configured as ignored in {{AbstractFailureHandler}}.

The message should be changed in cases when failure types are ignored.

*Solution:*
 * Change the current message (but keep the corresponding handler and context 
info as before) to 
Possible failure suppressed accordingly to a configured handler

for cases when failures are ignored, and show the message at the quiet-mode 
and WARN logging levels
 * Show the original message at the ERROR level when the failure is not from 
the ignored set

The thread dump should also be logged at a level matching the failure processor 
message log level.
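A minimal sketch of the proposed level selection (the failure type values and the ignored set below are simplified stand-ins, not Ignite's actual classes):

```python
import logging

# Simplified stand-in for AbstractFailureHandler's ignored set.
IGNORED_FAILURE_TYPES = {"SYSTEM_WORKER_BLOCKED"}

def failure_log_record(failure_type: str):
    """Pick the log level and message the failure processor should use."""
    if failure_type in IGNORED_FAILURE_TYPES:
        # Suppressed failure: warn instead of error; the thread dump
        # would be logged at the same level.
        return (logging.WARNING,
                "Possible failure suppressed accordingly to a configured handler")
    return (logging.ERROR,
            "Critical system error detected. Will be handled accordingly to "
            "configured handler")
```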



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12393) Thread pool queue system view

2019-11-25 Thread Nikolay Izhikov (Jira)
Nikolay Izhikov created IGNITE-12393:


 Summary: Thread pool queue system view
 Key: IGNITE-12393
 URL: https://issues.apache.org/jira/browse/IGNITE-12393
 Project: Ignite
  Issue Type: Sub-task
Reporter: Nikolay Izhikov


When cluster performance issues exist in a production environment, they usually 
lead to a large striped executor queue size.

The number of tasks in the queue can be observed via the 
{{StripedExecutorMXBean#getTotalQueueSize}} metric. When the queue size becomes 
large, it's useful to be able to see which tasks are waiting for execution in 
the thread pool.

This is especially helpful for dealing with failover scenarios.

We should create system views to expose information about the striped executor 
service's queues.
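The proposed view could flatten the per-stripe queues into rows, roughly like this sketch (the column and function names are assumptions for illustration, not Ignite internals):

```python
from collections import deque

def striped_queue_view(stripes):
    """Flatten per-stripe task queues into system-view-like rows."""
    rows = []
    for stripe_idx, queue in enumerate(stripes):
        for pos, task in enumerate(queue):
            rows.append({"STRIPE": stripe_idx, "POSITION": pos, "TASK": task})
    return rows
```

A query against such a view would immediately show which stripe is backed up and what kind of tasks are stuck there.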




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12392) Faster transaction rolled back when one of backup node failed

2019-11-25 Thread Kirill Tkalenko (Jira)
Kirill Tkalenko created IGNITE-12392:


 Summary: Faster transaction rolled back when one of backup node 
failed 
 Key: IGNITE-12392
 URL: https://issues.apache.org/jira/browse/IGNITE-12392
 Project: Ignite
  Issue Type: Improvement
Reporter: Kirill Tkalenko
Assignee: Kirill Tkalenko
 Fix For: 2.8


In the case of a massive rollback of prepared transactions when a node fails, 
the rollback proceeds in a linear (sequential) fashion:

{noformat}2019-09-26 
18:48:21.034[ERROR][sys-stripe-16-#17%DPL_GRID%DplGridNodeName%[o.a.i.s.c.tcp.TcpCommunicationSpi]
 Failed to send message to remote node [node=TcpDiscoveryNode 
[id=1dc0c76a-8e72-48e7-9718-b157eea1b812, addrs=ArrayList [10.124.133.201], 
sockAddrs=HashSet [marica63.ca.sbrf.ru/10.124.133.201:47500], discPort=47500, 
order=524, intOrder=311, lastExchangeTime=1569430937898, loc=false, 
ver=2.5.1#20190327-sha1:6edfea1b, isClient=false], msg=GridIoMessage [plc=2, 
topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, 
msg=GridCacheIdMessage [cacheId=0]GridDistributedBaseMessage 
[ver=GridCacheVersion [topVer=176921134, order=1634060411645, nodeOrder=1], 
committedVers=EmptyList [], rolledbackVers=EmptyList [], cnt=0, 
super=]GridDistributedTxFinishRequest [topVer=AffinityTopologyVersion 
[topVer=524, minorTopVer=2], 
futId=fb44a686e61-9a074a8c-dca4--84fe-e9a93818fbd2, threadId=2098, 
commitVer=GridCacheVersion [topVer=176921134, order=1634060411645, nodeOrder=1],

org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed to 
send message (node left topology): TcpDiscoveryNode 
[id=1dc0c76a-8e72-48e7-9718-b157eea1b812, addrs=ArrayList [10.124.133.201], 
sockAddrs=HashSet [marica63.ca.sbrf.ru/10.124.133.201:47500], discPort=47500, 
order=524, intOrder=311, lastExchangeTime=1569430937898, loc=false, 
ver=2.5.1#20190327-sha1:6edfea1b, isClient=false]
 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3276)
 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2998)
 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2878)
 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2721)
 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2680)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1643)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1715)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1177)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxFinishFuture.finish(GridDhtTxFinishFuture.java:462)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxFinishFuture.finish(GridDhtTxFinishFuture.java:291)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:495)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.rollbackDhtLocalAsync(GridDhtTxLocal.java:571)
 at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:1005)
 at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:876)
 at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:832)
 at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:101)
 at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:193)
 at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:191)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1061)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:586)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:385)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:311)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:300)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1556)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1184)
 at 

Re: [DISCUSSION] Single point in API for changing cluster state.

2019-11-25 Thread Sergey Antonov
Alexei Scherbakov,

> After activation (in read-only mode or not) rebalancing is possible to
begin and the grid will not be free of updates until it's finished. So the
grid will not be in truly read-only mode even if cache updates are
prohibited. Probably it would be enough to just wait until rebalancing is
finished before releasing the future.
I'm afraid we can't guarantee a truly read-only mode due to TTL on cache
entries.

> I do not understand the necessity of handling states comparison on
transition. Why not just return the current (previous) state? Could you give
a more detailed explanation for this?
The user could face unexpected behaviour if we always return the current state
(the target state of the transition) or the previous state (the initial state
of the transition). For example, consider a transition from INACTIVE to ACTIVE
where we return the current state. The user gets the cluster state (ACTIVE) and
tries to get a cache, but activation has not completed yet. The user will get
an exception even though the cluster state returns ACTIVE. So we can't always
return the current state.
Another example: a transition from ACTIVE to READ_ONLY where we return the
previous state. The user gets the cluster state (ACTIVE) and tries to update a
value in a cache. The user could get a ClusterReadOnlyModeException in this
case.
So we should return the state with the lower functionality of the previous and
current states to avoid unexpected behaviour.
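The rule "return the state with the lower functionality" reduces to taking the minimum under the order ACTIVE > READ_ONLY > INACTIVE. A sketch (names assumed for illustration, not the actual Ignite API):

```python
from enum import IntEnum

# Higher value = more functionality, per the proposed order.
class ClusterState(IntEnum):
    INACTIVE = 0
    READ_ONLY = 1
    ACTIVE = 2

def reported_state(prev: ClusterState, target: ClusterState) -> ClusterState:
    """State to report while a transition from prev to target is in flight."""
    return min(prev, target)
```

This reproduces every row of the transition table from the earlier message in this thread.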



On Thu, Oct 31, 2019 at 18:24, Alexei Scherbakov :

> Sergey Antonov,
>
> > Read-only mode doesn't affect the rebalance process.
> After activation (in read-only mode or not) rebalancing is possible to
> begin and the grid will not be free of updates until it's finished.
> So the grid will not be in truly read-only mode even if cache updates are
> prohibited. Probably it would be enough to just wait until rebalancing is
> finished before releasing the future.
>
> > How about INACTIVE, ACTIVE, ACTIVE_READ-ONLY states?
> INACTIVE, READ_WRITE, READ_ONLY  seems more appropriate for ClusterState.
>
> I do not understand the necessity of handling states comparison on
> transition. Why not just return the current (previous) state? Could you give
> a more detailed explanation for this?
>
> On Tue, Oct 29, 2019 at 14:25, Sergey Antonov :
>
> > Hi, Igniters!
> >
> > I'd like to share some points encountered during work on ticket [1]:
> >
> >- I added a property clusterStateOnStart of type ClusterState to
> >IgniteConfiguration. The property is an analogue of activeOnStart.
> >The default value of the property will be ACTIVE to keep default-value
> >consistency. I also marked the activeOnStart property as deprecated.
> >- I introduced an order on the values of ClusterState. It is needed for
> >user-friendly behaviour during a cluster state transition. If the cluster
> > is changing
> >state from state_A to state_B and the user requests the current cluster
> >state without waiting for the end of the transition, we must return the
> >lesser of the two states: state_A and state_B. I think the order must be:
> >ACTIVE > READ_ONLY > INACTIVE. Examples (state_A -> state_B = response to
> >the user's cluster state request during the transition):
> >   - ACTIVE -> INACTIVE = INACTIVE (Now we have this behavior)
> >   - INACTIVE -> ACTIVE = INACTIVE (Now we have this behavior)
> >   - ACTIVE -> READ_ONLY = READ_ONLY
> >   - READ_ONLY -> ACTIVE = READ_ONLY
> >   - READ_ONLY -> INACTIVE = INACTIVE
> >   - INACTIVE -> READ_ONLY = INACTIVE
> >
> > I'd like to know your opinion about my points. What do you think about
> it?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-12225
> >
> >
> > On Tue, Oct 15, 2019 at 14:56, Sergey Antonov :
> >
> > > Hi, Alexei!
> > >
> > > Thank you for reply!
> > >
> > > > The states ACTIVE, INACTIVE, READ-ONLY look confusing. Actually
> > > read-only cluster is active too.
> > > How about INACTIVE, ACTIVE, ACTIVE_READ-ONLY states?
> > >
> > > > Also it would be useful to allow users to wait for the rebalance which
> could
> > > happen after activation in read-only mode to achieve a really idle grid.
> > > Read-only mode doesn't affect the rebalance process.
> > >
> > > > I would suggest adding new property to Ignite configuration like
> > > setActivationOptions(ActivationOption... options) which should be
> mutable
> > > in runtime.
> > > I'm not sure that it's a good idea. @Alexey Goncharuk
> > >  I'd like to know your opinion about activation
> > > options and storing them in PDS.
> > >
> > > > This proposal also better regarding backward compatibility.
> > > Which kind of compatibility did you mean? The new cluster mode doesn't
> > affect
> > > PDS compatibility.
> > >
> > > On Wed, Sep 25, 2019 at 13:26, Alexei Scherbakov <
> > > alexey.scherbak...@gmail.com>:
> > >
> > >> Sergey Antonov,
> > >>
> > >> The states ACTIVE, INACTIVE, READ-ONLY look confusing.
> > >> Actually read-only cluster is active too.
> > >>
> > >> I would suggest adding new property to Ignite configuration like
> > >> setActivationOptions(ActivationOption... options) which should be
> >