[jira] [Created] (IGNITE-13371) Sporadic partition inconsistency after historical rebalancing of updates with same key put-remove pattern

2020-08-18 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-13371:
---

 Summary: Sporadic partition inconsistency after historical 
rebalancing of updates with same key put-remove pattern
 Key: IGNITE-13371
 URL: https://issues.apache.org/jira/browse/IGNITE-13371
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
Assignee: Ivan Rakov
 Fix For: 2.10


h4. Scenario
# start 3 servers and 3 clients, create caches
# clients start a combined workload of puts + 1% removes in 
PESSIMISTIC/REPEATABLE_READ transactions (a workload sketch is given after the log below)
## kill one node
## restart one node
# ensure all transactions have completed
# run idle_verify

Expected: no conflicts found
Actual:
{noformat}
[12:03:18][:55 :230] Control utility --cache idle_verify --skip-zeros 
--cache-filter PERSISTENT
[12:03:20][:55 :230] Control utility [ver. 8.7.13#20200228-sha1:7b016d63]
[12:03:20][:55 :230] 2020 Copyright(C) GridGain Systems, Inc. and Contributors
[12:03:20][:55 :230] User: prtagent
[12:03:20][:55 :230] Time: 2020-03-03T12:03:19.836
[12:03:20][:55 :230] Command [CACHE] started
[12:03:20][:55 :230] Arguments: --host 172.25.1.11 --port 11211 --cache 
idle_verify --skip-zeros --cache-filter PERSISTENT 
[12:03:20][:55 :230] 

[12:03:20][:55 :230] idle_verify task was executed with the following args: 
caches=[], excluded=[], cacheFilter=[PERSISTENT]
[12:03:20][:55 :230] idle_verify check has finished, found 1 conflict 
partitions: [counterConflicts=0, hashConflicts=1]
[12:03:20][:55 :230] Hash conflicts:
[12:03:20][:55 :230] Conflict partition: PartitionKeyV2 [grpId=1338167321, 
grpName=cache_group_3_088_1, partId=24]
[12:03:20][:55 :230] Partition instances: [PartitionHashRecordV2 
[isPrimary=false, consistentId=node_1_2, updateCntr=172349, 
partitionState=OWNING, size=6299, partHash=157875238], PartitionHashRecordV2 
[isPrimary=true, consistentId=node_1_1, updateCntr=172349, 
partitionState=OWNING, size=6299, partHash=157875238], PartitionHashRecordV2 
[isPrimary=false, consistentId=node_1_4, updateCntr=172349, 
partitionState=OWNING, size=6300, partHash=-944532882]]
[12:03:20][:55 :230] Command [CACHE] finished with code: 0
[12:03:20][:55 :230] Control utility has completed execution at: 
2020-03-03T12:03:20.593
[12:03:20][:55 :230] Execution time: 757 ms
{noformat}
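
For reference, a minimal sketch of the kind of client workload described in step 2 (transactional puts with ~1% removes). The cache name, key range and iteration count below are illustrative assumptions, not taken from the actual reproducer:
{noformat}
import java.util.concurrent.ThreadLocalRandom;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.transactions.Transaction;
import org.apache.ignite.transactions.TransactionConcurrency;
import org.apache.ignite.transactions.TransactionIsolation;

public class PutRemoveWorkload {
    /** Runs the put + ~1% remove workload from an already started client node. */
    public static void run(Ignite client) {
        IgniteCache<Integer, Integer> cache = client.getOrCreateCache(
            new CacheConfiguration<Integer, Integer>("tx-cache") // hypothetical cache name
                .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL)
                .setBackups(2));

        ThreadLocalRandom rnd = ThreadLocalRandom.current();

        for (int i = 0; i < 100_000; i++) {
            int key = rnd.nextInt(10_000);

            try (Transaction tx = client.transactions().txStart(
                TransactionConcurrency.PESSIMISTIC, TransactionIsolation.REPEATABLE_READ)) {
                if (rnd.nextInt(100) == 0)
                    cache.remove(key); // ~1% of operations are removes
                else
                    cache.put(key, i);

                tx.commit();
            }
        }
    }
}
{noformat}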



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSSION] Add index rebuild time metrics

2020-08-12 Thread Ivan Rakov
I seem to be in the minority here :)
Fine, let's make it as clear as possible which metric method
(localCacheSize) should be called in order to retrieve a 100% progress
milestone.
I've left comments in the PR.

On Tue, Aug 11, 2020 at 4:31 PM Nikolay Izhikov  wrote:

> > I propose to stick with a cache-group level metric (e.g.
> getIndexBuildProgress)
>
> +1
>
> > that returns a float from 0 to 1, which is calculated as [processedKeys]
> / [localCacheSize].
>
> From my point of view, we shouldn’t do calculations on the Ignite side if
> we can avoid it.
> I’d rather provide two separate metrics - processedKeys and localCacheSize.
>
> > On 11 Aug 2020, at 16:26, Ivan Rakov  wrote:
> >
> >>
> >> As a compromise, I can add jmx methods (rebuilding indexes in the
> process
> >> and the percentage of rebuilding) for the entire node, but I tried to
> find
> >> a suitable place and did not find it, tell me where to add it?
> >
> > I have checked existing JMX beans. To be honest, I struggle to find a
> > suitable place as well.
> > We have ClusterMetrics that may represent the state of a local node, but
> > this class is also used for aggregated cluster metrics. I can't propose a
> > reasonable way to merge percentages from different nodes.
> > On the other hand, total index rebuild for all caches isn't a common
> > scenario. It's either performed after manual index.bin removal or after
> > index creation, both operations are performed on cache / cache-group
> level.
> > Also, all other similar metrics are provided on cache-group level.
> >
> > I propose to stick with a cache-group level metric (e.g.
> > getIndexBuildProgress) that returns a float from 0 to 1, which is
> > calculated as [processedKeys] / [localCacheSize]. Even if a user handles
> > metrics through Zabbix, I anticipate that he'll perform this calculation
> on
> > his own in order to estimate progress. Let's help him a bit and perform
> it
> > on the system side.
> > If a per-group percentage metric is present, I
> > think getIndexRebuildKeyProcessed becomes redundant.
> >
> > On Tue, Aug 11, 2020 at 8:20 AM ткаленко кирилл 
> > wrote:
> >
> >> Hi, Ivan!
> >>
> >> What precision would be sufficient?
> >>> If the progress is very slow, I don't see issues with tracking it if
> the
> >>> percentage float has enough precision.
> >>
> >> I think we can add a mention getting cache size.
> >>> 1. Gain an understanding that local cache size
> >>> (CacheMetricsImpl#getCacheSize) should be used as a 100% milestone (it
> >>> isn't mentioned neither in javadoc nor in JMX method description).
> >>
> >> Do you think users collect metrics with their hands? I think this is
> done
> >> by other systems, such as zabbix.
> >>> 2. Manually calculate sum of all metrics and divide to sum of all cache
> >>> sizes.
> >>
> >> As a compromise, I can add jmx methods (rebuilding indexes in the
> process
> >> and the percentage of rebuilding) for the entire node, but I tried to
> find
> >> a suitable place and did not find it, tell me where to add it?
> >>> On the other hand, % of index rebuild progress is self-descriptive. I
> >> don't
> >>> understand why we tend to make user's life harder.
> >>
> >> 10.08.2020, 21:57, "Ivan Rakov" :
> >>>> This metric can be used only for local node, to get size of cache use
> >>>>
> >>
> org.apache.ignite.internal.processors.cache.CacheMetricsImpl#getCacheSize.
> >>>
> >>> Got it, agree.
> >>>
> >>> If there is a lot of data in node that can be rebuilt, percentage may
> >>>> change very rarely and may not give an estimate of how much time is
> >> left.
> >>>> If we see for example that 50_000 keys are rebuilt once a minute, and
> >> we
> >>>> have 1_000_000_000 keys, then we can have an approximate estimate.
> >> What do
> >>>> you think of that?
> >>>
> >>> If the progress is very slow, I don't see issues with tracking it if
> the
> >>> percentage float has enough precision.
> >>> Still, usability of the metric concerns me. In order to estimate
> >> remaining
> >>> time of index rebuild, user should:
> >>> 1. Gain an understanding that local cache size
> >>> (CacheMetricsImpl#getCacheSize) should be used as a 1

Re: [DISCUSSION] Add index rebuild time metrics

2020-08-11 Thread Ivan Rakov
>
> As a compromise, I can add jmx methods (rebuilding indexes in the process
> and the percentage of rebuilding) for the entire node, but I tried to find
> a suitable place and did not find it, tell me where to add it?

I have checked existing JMX beans. To be honest, I struggle to find a
suitable place as well.
We have ClusterMetrics that may represent the state of a local node, but
this class is also used for aggregated cluster metrics. I can't propose a
reasonable way to merge percentages from different nodes.
On the other hand, total index rebuild for all caches isn't a common
scenario. It's either performed after manual index.bin removal or after
index creation, both operations are performed on cache / cache-group level.
Also, all other similar metrics are provided on cache-group level.

I propose to stick with a cache-group level metric (e.g.
getIndexBuildProgress) that returns a float from 0 to 1, which is
calculated as [processedKeys] / [localCacheSize]. Even if a user handles
metrics through Zabbix, I anticipate that he'll perform this calculation on
his own in order to estimate progress. Let's help him a bit and perform it
on the system side.
If a per-group percentage metric is present, I
think getIndexRebuildKeyProcessed becomes redundant.
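
For illustration, a minimal sketch of how the proposed progress value could be
computed on the system side; the method and its inputs are hypothetical, not an
existing API:

/** Proposed cache-group level index rebuild progress as a float in [0, 1]. */
static float indexBuildProgress(long processedKeys, long localCacheSize) {
    if (localCacheSize == 0)
        return 1.0f; // nothing to rebuild on this node

    // Clamp to 1.0: the local cache size may change while the rebuild is in progress.
    return Math.min(1.0f, (float) processedKeys / localCacheSize);
}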

On Tue, Aug 11, 2020 at 8:20 AM ткаленко кирилл 
wrote:

> Hi, Ivan!
>
> What precision would be sufficient?
> > If the progress is very slow, I don't see issues with tracking it if the
> > percentage float has enough precision.
>
> I think we can add a mention getting cache size.
> > 1. Gain an understanding that local cache size
> > (CacheMetricsImpl#getCacheSize) should be used as a 100% milestone (it
> > isn't mentioned neither in javadoc nor in JMX method description).
>
> Do you think users collect metrics with their hands? I think this is done
> by other systems, such as zabbix.
> > 2. Manually calculate sum of all metrics and divide to sum of all cache
> > sizes.
>
> As a compromise, I can add jmx methods (rebuilding indexes in the process
> and the percentage of rebuilding) for the entire node, but I tried to find
> a suitable place and did not find it, tell me where to add it?
> > On the other hand, % of index rebuild progress is self-descriptive. I
> don't
> > understand why we tend to make user's life harder.
>
> 10.08.2020, 21:57, "Ivan Rakov" :
> >>  This metric can be used only for local node, to get size of cache use
> >>
>  org.apache.ignite.internal.processors.cache.CacheMetricsImpl#getCacheSize.
> >
> >  Got it, agree.
> >
> > If there is a lot of data in node that can be rebuilt, percentage may
> >>  change very rarely and may not give an estimate of how much time is
> left.
> >>  If we see for example that 50_000 keys are rebuilt once a minute, and
> we
> >>  have 1_000_000_000 keys, then we can have an approximate estimate.
> What do
> >>  you think of that?
> >
> > If the progress is very slow, I don't see issues with tracking it if the
> > percentage float has enough precision.
> > Still, usability of the metric concerns me. In order to estimate
> remaining
> > time of index rebuild, user should:
> > 1. Gain an understanding that local cache size
> > (CacheMetricsImpl#getCacheSize) should be used as a 100% milestone (it
> > isn't mentioned neither in javadoc nor in JMX method description).
> > 2. Manually calculate sum of all metrics and divide to sum of all cache
> > sizes.
> > On the other hand, % of index rebuild progress is self-descriptive. I
> don't
> > understand why we tend to make user's life harder.
> >
> > --
> > Best regards,
> > Ivan
> >
> > On Mon, Aug 10, 2020 at 8:53 PM ткаленко кирилл 
> > wrote:
> >
> >>  Hi, Ivan!
> >>
> >>  For this you can use
> >>  org.apache.ignite.cache.CacheMetrics#IsIndexRebuildInProgress
> >>  > How can a local number of processed keys can help us to understand
> when
> >>  > index rebuild will be finished?
> >>
> >>  This metric can be used only for local node, to get size of cache use
> >>
>  org.apache.ignite.internal.processors.cache.CacheMetricsImpl#getCacheSize.
> >>  > We can't compare metric value with cache.size(). First one is
> node-local,
> >>  > while cache size covers all partitions in the cluster.
> >>
> >>  If there is a lot of data in node that can be rebuilt, percentage may
> >>  change very rarely and may not give an estimate of how much time is
> left.
> >>  If we see for example that 50_000 keys are rebuilt once a minute, and
> we
> >>  have 1_000_000_000 keys, then 

Re: [DISCUSSION] Add index rebuild time metrics

2020-08-10 Thread Ivan Rakov
>
> This metric can be used only for local node, to get size of cache use
> org.apache.ignite.internal.processors.cache.CacheMetricsImpl#getCacheSize.

 Got it, agree.

If there is a lot of data in node that can be rebuilt, percentage may
> change very rarely and may not give an estimate of how much time is left.
> If we see for example that 50_000 keys are rebuilt once a minute, and we
> have 1_000_000_000 keys, then we can have an approximate estimate. What do
> you think of that?

If the progress is very slow, I don't see issues with tracking it if the
percentage float has enough precision.
Still, the usability of the metric concerns me. In order to estimate the remaining
time of an index rebuild, a user should:
1. Gain an understanding that the local cache size
(CacheMetricsImpl#getCacheSize) should be used as the 100% milestone (it
is mentioned neither in the javadoc nor in the JMX method description).
2. Manually calculate the sum of all metrics and divide it by the sum of all cache
sizes.
On the other hand, a % of index rebuild progress is self-descriptive. I don't
understand why we tend to make the user's life harder.
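
For what it's worth, a back-of-the-envelope sketch of the estimate discussed
above; the helper and its inputs are hypothetical. With the quoted numbers
(~50_000 keys rebuilt per minute, 1_000_000_000 keys in total) it yields roughly
20_000 minutes, i.e. about two weeks:

/** Rough remaining time of a local index rebuild, in minutes. */
static long remainingRebuildMinutes(long totalKeys, long processedKeys, long keysPerMinute) {
    if (keysPerMinute <= 0)
        return Long.MAX_VALUE; // no observable progress yet

    return Math.max(0L, totalKeys - processedKeys) / keysPerMinute;
}

// Example: remainingRebuildMinutes(1_000_000_000L, 0L, 50_000L) == 20_000.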

--
Best regards,
Ivan


On Mon, Aug 10, 2020 at 8:53 PM ткаленко кирилл 
wrote:

> Hi, Ivan!
>
> For this you can use
> org.apache.ignite.cache.CacheMetrics#IsIndexRebuildInProgress
> > How can a local number of processed keys can help us to understand when
> > index rebuild will be finished?
>
> This metric can be used only for local node, to get size of cache use
> org.apache.ignite.internal.processors.cache.CacheMetricsImpl#getCacheSize.
> > We can't compare metric value with cache.size(). First one is node-local,
> > while cache size covers all partitions in the cluster.
>
> If there is a lot of data in node that can be rebuilt, percentage may
> change very rarely and may not give an estimate of how much time is left.
> If we see for example that 50_000 keys are rebuilt once a minute, and we
> have 1_000_000_000 keys, then we can have an approximate estimate. What do
> you think of that?
> > I find one single metric much more usable. It would be perfect if metric
> > value is represented in percentage, e.g. current progress of local node
> > index rebuild is 60%.
>
> 10.08.2020, 19:11, "Ivan Rakov" :
> > Folks,
> >
> > Sorry for coming late to the party. I've taken a look at this issue
> during
> > review.
> >
> > How can a local number of processed keys can help us to understand when
> > index rebuild will be finished?
> > We can't compare metric value with cache.size(). First one is node-local,
> > while cache size covers all partitions in the cluster.
> > Also, I don't understand why we need to keep separate metrics for all
> > caches. Of course, the metric becomes more fair, but obviously harder to
> > make conclusions on whether "the index rebuild" process is over (and the
> > cluster is ready to process queries quickly).
> >
> > I find one single metric much more usable. It would be perfect if metric
> > value is represented in percentage, e.g. current progress of local node
> > index rebuild is 60%.
> >
> > --
> > Best regards,
> > Ivan
> >
> > On Fri, Jul 24, 2020 at 1:35 PM Stanislav Lukyanov <
> stanlukya...@gmail.com>
> > wrote:
> >
> >>  Got it. I thought that index building and index rebuilding are
> essentially
> >>  the same,
> >>  but now I see that they are different: index rebuilding cares about all
> >>  indexes at once while index building cares about particular ones.
> >>
> >>  Kirill's approach sounds good.
> >>
> >>  Stan
> >>
> >>  > On 20 Jul 2020, at 14:54, Alexey Goncharuk <
> alexey.goncha...@gmail.com>
> >>  wrote:
> >>  >
> >>  > Stan,
> >>  >
> >>  > Currently we never build indexes one-by-one - we always use a cache
> data
> >>  > row visitor which either updates all indexes (see
> >>  IndexRebuildFullClosure)
> >>  > or updates a set of all indexes that need to catch up (see
> >>  > IndexRebuildPartialClosure). GIven that, I do not see any need for
> >>  > per-index rebuild status as this status will be updated for all
> outdated
> >>  > indexes simultaneously.
> >>  >
> >>  > Kirill's approach for the total number of processed keys per cache
> seems
> >>  > reasonable to me.
> >>  >
> >>  > --AG
> >>  >
> >>  > Fri, 3 Jul 2020 at 10:12, ткаленко кирилл :
> >>  >
> >>  >> Hi, Stan!
> >>  >>
> >>  >> Perhaps it is worth cl

Re: [DISCUSSION] Add index rebuild time metrics

2020-08-10 Thread Ivan Rakov
Folks,

Sorry for coming late to the party. I've taken a look at this issue during
review.

How can a local number of processed keys help us to understand when
the index rebuild will be finished?
We can't compare the metric value with cache.size(). The first one is node-local,
while the cache size covers all partitions in the cluster.
Also, I don't understand why we need to keep separate metrics for all
caches. Of course, the metrics become more precise, but it is obviously harder to
conclude whether "the index rebuild" process is over (and the
cluster is ready to process queries quickly).

I find one single metric much more usable. It would be perfect if the metric
value were represented as a percentage, e.g. the current progress of the local node
index rebuild is 60%.

--
Best regards,
Ivan

On Fri, Jul 24, 2020 at 1:35 PM Stanislav Lukyanov 
wrote:

> Got it. I thought that index building and index rebuilding are essentially
> the same,
> but now I see that they are different: index rebuilding cares about all
> indexes at once while index building cares about particular ones.
>
> Kirill's approach sounds good.
>
> Stan
>
> > On 20 Jul 2020, at 14:54, Alexey Goncharuk 
> wrote:
> >
> > Stan,
> >
> > Currently we never build indexes one-by-one - we always use a cache data
> > row visitor which either updates all indexes (see
> IndexRebuildFullClosure)
> > or updates a set of all indexes that need to catch up (see
> > IndexRebuildPartialClosure). GIven that, I do not see any need for
> > per-index rebuild status as this status will be updated for all outdated
> > indexes simultaneously.
> >
> > Kirill's approach for the total number of processed keys per cache seems
> > reasonable to me.
> >
> > --AG
> >
> > Fri, 3 Jul 2020 at 10:12, ткаленко кирилл :
> >
> >> Hi, Stan!
> >>
> >> Perhaps it is worth clarifying what exactly I wanted to say.
> >> Now we have 2 processes: building and rebuilding indexes.
> >>
> >> At moment, we have some metrics for rebuilding indexes:
> >> "IsIndexRebuildInProgress", "IndexBuildCountPartitionsLeft".
> >>
> >> I suggest adding another metric "Indexrebuildkeyprocessed", which will
> >> allow you to determine how many records are left to rebuild for cache.
> >>
> >> I think your comments are more about building an index that may need
> more
> >> metrics, but I think you should do it in a separate ticket.
> >>
> >> 03.07.2020, 03:09, "Stanislav Lukyanov" :
> >>> If multiple indexes are to be built "number of indexed keys" metric may
> >> be misleading.
> >>>
> >>> As a cluster admin, I'd like to know:
> >>> - Are all indexes ready on a node?
> >>> - How many indexes are to be built?
> >>> - How much resources are used by the index building (how many threads
> >> are used)?
> >>> - Which index(es?) is being built right now?
> >>> - How much time until the current (single) index building finishes?
> Here
> >> "time" can be a lot of things: partitions, entries, percent of the
> cache,
> >> minutes and hours
> >>> - How much time until all indexes are built?
> >>> - How much does it take to build each of my indexes / a single index of
> >> my cache on average?
> >>>
> >>> I think we need a set of metrics and/or log messages to solve all of
> >> these questions.
> >>> I imaging something like:
> >>> - numberOfIndexesToBuild
> >>> - a standard set of metrics on the index building thread pool (do we
> >> already have it?)
> >>> - currentlyBuiltIndexName (assuming we only build one at a time which
> is
> >> probably not true)
> >>> - for the "time" metrics I think percentage might be the best as it's
> >> the easiest to understand; we may add multiple metrics though.
> >>> - For "time per each index" I'd add detailed log messages stating how
> >> long did it take to build a particular index
> >>>
> >>> Thanks,
> >>> Stan
> >>>
>  On 26 Jun 2020, at 12:49, ткаленко кирилл 
> >> wrote:
> 
>  Hi, Igniters.
> 
>  I would like to know if it is possible to estimate how much the index
> >> rebuild will take?
> 
>  At the moment, I have found the following metrics [1] and [2] and
> >> since the rebuild is based on caches, I think it would be useful to know
> >> how many records are processed in indexing. This way we can estimate how
> >> long we have to wait for the index to be rebuilt by subtracting [3] and
> how
> >> many records are indexed.
> 
>  I think we should add this metric [4].
> 
>  Comments, suggestions?
> 
>  [1] - https://issues.apache.org/jira/browse/IGNITE-12184
>  [2] -
> >>
> org.apache.ignite.internal.processors.cache.CacheGroupMetricsImpl#idxBuildCntPartitionsLeft
>  [3] - org.apache.ignite.cache.CacheMetrics#getCacheSize
>  [4] - org.apache.ignite.cache.CacheMetrics#getNumberIndexedKeys
> >>
>
>


Re: Re[2]: Apache Ignite 2.9.0 RELEASE [Time, Scope, Manager]

2020-07-31 Thread Ivan Rakov
Hi Alex,

https://issues.apache.org/jira/browse/IGNITE-13306 is merged to master.
Can you please cherry-pick to 2.9?

On Thu, Jul 30, 2020 at 7:42 PM Ilya Kasnacheev 
wrote:

> Hello!
>
> I don't think that IGNITE-13006
> <https://issues.apache.org/jira/browse/IGNITE-13006> is a blocker in any
> way. It is a good candidate for 3.0.
>
> ignite-spring will work with 4.x Spring as well as 5.x and the user is free
> to bump Spring version. I think bumping this dependency explicitly is
> infeasible since it may break existing code.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Wed, 22 Jul 2020 at 10:22, Alex Plehanov :
>
> > Guys,
> >
> > We are in code-freeze phase now. I've moved almost all non-blocker
> > unresolved tickets from 2.9 to the next release. If you think that
> > some ticket is a blocker and should be included into 2.9 release, please
> > write a note in this thread.
> >
> > There are some tickets with "blocker" priority targeted to 2.9, some of
> > them in "open" state and still unassigned, and I'm not sure we need all
> of
> > these tickets in 2.9:
> >
> > IGNITE-13006 [1] (Apache Ignite spring libs upgrade from version 4x to
> > spring 5.2 version or later) - Is it really a blocker for 2.9 release? If
> > yes, can somebody help with resolving this ticket?
> >
> > IGNITE-11942 [2] (IGFS and Hadoop Accelerator Discontinuation) - ticket
> in
> > "Patch available" state. There is a thread on dev-list related to this
> > ticket ([6]), but as far as I understand we still don't have consensus
> > about version for this patch (2.9, 2.10, 3.0).
> >
> > IGNITE-12489 [3] (Error during purges by expiration: Unknown page type) -
> > perhaps issue is already resolved by some related tickets, there is still
> > no reproducer, no additional details and no work in progress. I propose
> to
> > move this ticket to the next release.
> >
> > IGNITE-12911 [4] (B+Tree Corrupted exception when using a key extracted
> > from a BinaryObject value object --- and SQL enabled) - ticket in "Patch
> > available" state, but there is no activity since May 2020. Anton
> > Kalashnikov, Ilya Kasnacheev, do we have any updates on this ticket? Is
> it
> > still in progress?
> >
> > IGNITE-12553 [5] ([IEP-35] public Java metric API) - since the new
> metrics
> > framework is already released in 2.8 and it's still marked with
> > @IgniteExperemental annotation, I think this ticket is not a blocker. I
> > propose to change the ticket priority and move it to the next release.
> >
> >
> > [1]: https://issues.apache.org/jira/browse/IGNITE-13006
> > [2]: https://issues.apache.org/jira/browse/IGNITE-11942
> > [3]: https://issues.apache.org/jira/browse/IGNITE-12489
> > [4]: https://issues.apache.org/jira/browse/IGNITE-12911
> > [5]: https://issues.apache.org/jira/browse/IGNITE-12553
> > [6]:
> >
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Complete-Discontinuation-of-IGFS-and-Hadoop-Accelerator-td42282.html
> >
> > Fri, 17 Jul 2020 at 11:50, Alex Plehanov :
> >
> >> Ivan,
> >>
> >> Merged to 2.9.
> >>
> >> Thanks
> >>
> >> Fri, 17 Jul 2020 at 01:35, Ivan Rakov :
> >>
> >>> Alex,
> >>>
> >>> Tracing is merged to master:
> >>> https://issues.apache.org/jira/browse/IGNITE-13060
> >>>
> >>> Can you please port it to 2.9?
> >>> For your convenience, there's a PR versus 2.9 with conflicts resolved:
> >>> https://github.com/apache/ignite/pull/8046/files
> >>>
> >>> --
> >>> Best Regards,
> >>> Ivan Rakov
> >>>
> >>> On Wed, Jul 15, 2020 at 5:33 PM Alex Plehanov  >
> >>> wrote:
> >>>
> >>>> Ivan,
> >>>>
> >>>> Looks like master is broken after IGNITE-13246 (but everything is ok
> in
> >>>> 2.9
> >>>> branch)
> >>>>
> >>>> Wed, 15 Jul 2020 at 18:54, Alex Plehanov :
> >>>>
> >>>> > Zhenya, Ivan,
> >>>> >
> >>>> > I've cherry-picked IGNITE-13229 and IGNITE-13246 to ignite-2.9
> branch.
> >>>> > Thank you.
> >>>> >
> >>>> > Wed, 15 Jul 2020 at 18:31, Ivan Bessonov :
> >>>> >
> >>>> >> Guys,
> >>>> >>
> >>>> >> can you please backport
> >>&

Re: Re[2]: Apache Ignite 2.9.0 RELEASE [Time, Scope, Manager]

2020-07-16 Thread Ivan Rakov
Alex,

Tracing is merged to master:
https://issues.apache.org/jira/browse/IGNITE-13060

Can you please port it to 2.9?
For your convenience, there's a PR versus 2.9 with conflicts resolved:
https://github.com/apache/ignite/pull/8046/files

--
Best Regards,
Ivan Rakov

On Wed, Jul 15, 2020 at 5:33 PM Alex Plehanov 
wrote:

> Ivan,
>
> Looks like master is broken after IGNITE-13246 (but everything is ok in 2.9
> branch)
>
> Wed, 15 Jul 2020 at 18:54, Alex Plehanov :
>
> > Zhenya, Ivan,
> >
> > I've cherry-picked IGNITE-13229 and IGNITE-13246 to ignite-2.9 branch.
> > Thank you.
> >
> > Wed, 15 Jul 2020 at 18:31, Ivan Bessonov :
> >
> >> Guys,
> >>
> >> can you please backport
> >> https://issues.apache.org/jira/browse/IGNITE-13246
> >> to ignite-2.9? Me and Alexey Kuznetsov really want these new events in
> >> release.
> >>
> >> This time I prepared PR with resolved conflicts:
> >> https://github.com/apache/ignite/pull/8042
> >>
> >> Thank you!
> >>
> >> Tue, 14 Jul 2020 at 19:39, Zhenya Stanilovsky
> >>  >> >:
> >>
> >> >
> >> >
> >> >
> >> > Alex, i also suggest to merge this
> >> > https://issues.apache.org/jira/browse/IGNITE-13229 too, GridClient
> >> > leakage and further TC OOM preventing.
> >> >
> >> > >Ivan,
> >> > >
> >> > >It was already in release scope as discussed in this thread.
> >> > >
> >> > >Tue, 14 Jul 2020 at 14:31, Ivan Rakov < ivan.glu...@gmail.com >:
> >> > >
> >> > >> Hi,
> >> > >>
> >> > >> We are still waiting for a final review of Tracing functionality
> [1]
> >> > until
> >> > >> the end of tomorrow (July 15).
> >> > >> We anticipate that it will be merged to Ignite master no later than
> >> July
> >> > >> 16.
> >> > >>
> >> > >> Sorry for being a bit late here. Alex P., can you include [1] to
> the
> >> > >> release scope?
> >> > >>
> >> > >> [1]:  https://issues.apache.org/jira/browse/IGNITE-13060
> >> > >>
> >> > >> --
> >> > >> Best Regards,
> >> > >> Ivan Rakov
> >> > >>
> >> > >> On Tue, Jul 14, 2020 at 6:16 AM Alexey Kuznetsov <
> >> > akuznet...@gridgain.com >
> >> > >> wrote:
> >> > >>
> >> > >>> Alex,
> >> > >>>
> >> > >>> Can you cherry-pick to Ignite 2.9 this issue:
> >> > >>>  https://issues.apache.org/jira/browse/IGNITE-13246 ?
> >> > >>>
> >> > >>> This issue is about BASELINE events and it is very useful for
> >> > notification
> >> > >>> external tools about changes in baseline.
> >> > >>>
> >> > >>> Thank you!
> >> > >>>
> >> > >>> ---
> >> > >>> Alexey Kuznetsov
> >> > >>>
> >> > >>
> >> >
> >> >
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Sincerely yours,
> >> Ivan Bessonov
> >>
> >
>


Re: Choosing historical rebalance heuristics

2020-07-16 Thread Ivan Rakov
>
>  I think we can modify the heuristic so
> 1) Exclude partitions by threshold (IGNITE_PDS_WAL_REBALANCE_THRESHOLD -
> reduce it to 500)
> 2) Select only that partition for historical rebalance where difference
> between counters less that partition size.

Agreed, let's go this way.

On Thu, Jul 16, 2020 at 11:03 AM Vladislav Pyatkov 
wrote:

> I completely forget about another promise to favor of using historical
> rebalance where it is possible. When cluster decided to use a full balance,
> demander nodes should clear not empty partitions.
> This can to consume a long time, in some cases that may be compared with a
> time of rebalance.
> It also accepts a side of heuristics above.
>
> On Thu, Jul 16, 2020 at 12:09 AM Vladislav Pyatkov 
> wrote:
>
> > Ivan,
> >
> > I agree with a combined approach: threshold for small partitions and
> count
> > of update for partition that outgrew it.
> > This helps to avoid partitions that update not frequently.
> >
> > Reading of a big WAL piece (more than 100Gb) it can happen, when a client
> > configured it intentionally.
> > There are no doubts we can to read it, otherwise WAL space was not
> > configured that too large.
> >
> > I don't see a connection optimization of iterator and issue in atomic
> > protocol.
> > Reordering in WAL, that happened in checkpoint where counter was not
> > changing, is an extremely rare case and the issue will not solve for
> > generic case, this should be fixed in bound of protocol.
> >
> > I think we can modify the heuristic so
> > 1) Exclude partitions by threshold (IGNITE_PDS_WAL_REBALANCE_THRESHOLD -
> > reduce it to 500)
> > 2) Select only that partition for historical rebalance where difference
> > between counters less that partition size.
> >
> > Also implement mentioned optimization for historical iterator, that may
> > reduce a time on reading large WAL interval.
> >
> > On Wed, Jul 15, 2020 at 3:15 PM Ivan Rakov 
> wrote:
> >
> >> Hi Vladislav,
> >>
> >> Thanks for raising this topic.
> >> Currently present IGNITE_PDS_WAL_REBALANCE_THRESHOLD (default is
> 500_000)
> >> is controversial. Assuming that the default number of partitions is
> 1024,
> >> cache should contain a really huge amount of data in order to make WAL
> >> delta rebalancing possible. In fact, it's currently disabled for most
> >> production cases, which makes rebalancing of persistent caches
> >> unreasonably
> >> long.
> >>
> >> I think, your approach [1] makes much more sense than the current
> >> heuristic, let's move forward with the proposed solution.
> >>
> >> Though, there are some other corner cases, e.g. this one:
> >> - Configured size of WAL archive is big (>100 GB)
> >> - Cache has small partitions (e.g. 1000 entries)
> >> - Infrequent updates (e.g. ~100 in the whole WAL history of any node)
> >> - There is another cache with very frequent updates which allocate >99%
> of
> >> WAL
> >> In such scenario we may need to iterate over >100 GB of WAL in order to
> >> fetch <1% of needed updates. Even though the amount of network traffic
> is
> >> still optimized, it would be more effective to transfer partitions with
> >> ~1000 entries fully instead of reading >100 GB of WAL.
> >>
> >> I want to highlight that your heuristic definitely makes the situation
> >> better, but due to possible corner cases we should keep the fallback
> lever
> >> to restrict or limit historical rebalance as before. Probably, it would
> be
> >> handy to keep IGNITE_PDS_WAL_REBALANCE_THRESHOLD property with a low
> >> default value (1000, 500 or even 0) and apply your heuristic only for
> >> partitions with bigger size.
> >>
> >> Regarding case [2]: it looks like an improvement that can mitigate some
> >> corner cases (including the one that I have described). I'm ok with it
> as
> >> long as it takes data updates reordering on backup nodes into account.
> We
> >> don't track skipped updates for atomic caches. As a result, detection of
> >> the absence of updates between two checkpoint markers with the same
> >> partition counter can be false positive.
> >>
> >> --
> >> Best Regards,
> >> Ivan Rakov
> >>
> >> On Tue, Jul 14, 2020 at 3:03 PM Vladislav Pyatkov  >
> >> wrote:
> >>
> >> > Hi guys,
> >> >
> >> > I want to implement a more honest heuristic for historical re

Re: Choosing historical rebalance heuristics

2020-07-15 Thread Ivan Rakov
Hi Vladislav,

Thanks for raising this topic.
The currently present IGNITE_PDS_WAL_REBALANCE_THRESHOLD (default is 500_000)
is controversial. Assuming that the default number of partitions is 1024, a
cache should contain a really huge amount of data in order to make WAL
delta rebalancing possible. In fact, it's currently disabled for most
production cases, which makes rebalancing of persistent caches unreasonably
long.

I think your approach [1] makes much more sense than the current
heuristic; let's move forward with the proposed solution.

Though, there are some other corner cases, e.g. this one:
- Configured size of WAL archive is big (>100 GB)
- Cache has small partitions (e.g. 1000 entries)
- Infrequent updates (e.g. ~100 in the whole WAL history of any node)
- There is another cache with very frequent updates which allocate >99% of
WAL
In such a scenario we may need to iterate over >100 GB of WAL in order to
fetch <1% of the needed updates. Even though the amount of network traffic is
still optimized, it would be more efficient to transfer partitions with
~1000 entries fully instead of reading >100 GB of WAL.

I want to highlight that your heuristic definitely makes the situation
better, but due to possible corner cases we should keep the fallback lever
to restrict or limit historical rebalance as before. Probably, it would be
handy to keep IGNITE_PDS_WAL_REBALANCE_THRESHOLD property with a low
default value (1000, 500 or even 0) and apply your heuristic only for
partitions with bigger size.
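
For illustration, the combined heuristic could look roughly like the following
sketch; the names and the exact condition are illustrative, not the actual
implementation:

/**
 * Decides whether a partition should be rebalanced historically (from WAL) or fully.
 * cntrDelta is the difference between the partition update counters on the demander
 * and the supplier, i.e. the number of updates to replay from WAL.
 */
static boolean useHistoricalRebalance(long partSize, long cntrDelta, long walRebalanceThreshold) {
    // Keep the threshold as a fallback lever: small partitions are always rebalanced fully.
    if (partSize < walRebalanceThreshold)
        return false;

    // Prefer the WAL delta only when fewer updates have to be replayed than rows
    // a full rebalance would transfer.
    return cntrDelta < partSize;
}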

Regarding case [2]: it looks like an improvement that can mitigate some
corner cases (including the one that I have described). I'm ok with it as
long as it takes reordering of data updates on backup nodes into account. We
don't track skipped updates for atomic caches. As a result, detecting the
absence of updates between two checkpoint markers with the same
partition counter can produce a false positive.

--
Best Regards,
Ivan Rakov

On Tue, Jul 14, 2020 at 3:03 PM Vladislav Pyatkov 
wrote:

> Hi guys,
>
> I want to implement a more honest heuristic for historical rebalance.
> Before, a cluster makes a choice between the historical rebalance or not it
> only from a partition size. This threshold more known by a name of property
> IGNITE_PDS_WAL_REBALANCE_THRESHOLD.
> It might prevent a historical rebalance when a partition is too small, but
> not if WAL contains more updates than a size of partition, historical
> rebalance still can be chosen.
> There is a ticket where need to implement more fair heuristic[1].
>
> My idea for implementation is need to estimate a size of data which will be
> transferred owe network. In other word if need to rebalance a part of WAL
> that contains N updates, for recover a partition on another node, which
> have to contain M rows at all, need chooses a historical rebalance on the
> case where N < M (WAL history should be presented as well).
>
> This approach is easy implemented, because a coordinator node has the size
> of partitions and counters' interval. But in this case cluster still can
> find not many updates in too long WAL history. I assume a possibility to
> work around it, if rebalance historical iterator will not handle
> checkpoints where not contains updates of particular cache. Checkpoints can
> skip if counters for the cache (maybe even a specific partitions) was not
> changed between it and next one.
>
> Ticket for improvement rebalance historical iterator[2]
>
> I want to hear a view of community on the thought above.
> Maybe anyone has another opinion?
>
> [1]: https://issues.apache.org/jira/browse/IGNITE-13253
> [2]: https://issues.apache.org/jira/browse/IGNITE-13254
>
> --
> Vladislav Pyatkov
>


Re: Apache Ignite 2.9.0 RELEASE [Time, Scope, Manager]

2020-07-14 Thread Ivan Rakov
Hi,

We are still waiting for a final review of Tracing functionality [1] until
the end of tomorrow (July 15).
We anticipate that it will be merged to Ignite master no later than July 16.

Sorry for being a bit late here. Alex P., can you include [1] to the
release scope?

[1]: https://issues.apache.org/jira/browse/IGNITE-13060

--
Best Regards,
Ivan Rakov

On Tue, Jul 14, 2020 at 6:16 AM Alexey Kuznetsov 
wrote:

> Alex,
>
> Can you cherry-pick to Ignite 2.9 this issue:
> https://issues.apache.org/jira/browse/IGNITE-13246 ?
>
> This issue is about BASELINE events and it is very useful for notification
> external tools about changes in baseline.
>
> Thank you!
>
> ---
> Alexey Kuznetsov
>


Re: [DISCUSSION] Tracing: IGNITE-13060

2020-07-14 Thread Ivan Rakov
Igniters,

The PR is ready to be merged; all comments from my side have been addressed.
If anyone has more comments, please let me know today.

Best Regards,
Ivan Rakov



On Tue, Jun 30, 2020 at 10:43 AM Alexander Lapin 
wrote:

> Hello Igniters,
>
> I'd like to discuss with you and then donate changes related to
> IGNITE-13060
> <https://issues.apache.org/jira/browse/IGNITE-13060>
> In very brief it's an initial tracing implementation that allows to trace
> Communication, Exchange, Discovery and Transactions. Spi concept is used
> with OpenCensus as one of implementations. For more details about tracing
> engine, tracing configuration, etc please see IEP-48
> <https://cwiki.apache.org/confluence/display/IGNITE/IEP-48%3A+Tracing>.
>
> Best regards,
> Alexander
>


[jira] [Created] (IGNITE-13211) Improve public exceptions for case when user attempts to access data from a lost partition

2020-07-03 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-13211:
---

 Summary: Improve public exceptions for case when user attempts to 
access data from a lost partition
 Key: IGNITE-13211
 URL: https://issues.apache.org/jira/browse/IGNITE-13211
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Rakov


After IGNITE-13003, an attempt to access a lost partition from the public API throws 
CacheException with CacheInvalidStateException as the root cause. We can 
improve the user experience a bit:
1. Create a new type of public exception (a subclass of CacheException) that will 
be thrown in lost-data access scenarios (a sketch is given below).
2. In case a partition is lost in a persistent cache, the error message should be 
changed from "partition data has been lost" to "partition data temporarily 
unavailable".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Extended logging for rebalance performance analysis

2020-06-29 Thread Ivan Rakov
+1 to Alex G.

From my experience, the most interesting cases with Ignite rebalancing
happen exactly in production. Given that we already have detailed
rebalancing logging, adding information about rebalance performance looks
like a reasonable improvement. With the new logs we'll be able to detect and
investigate situations when rebalancing is slow due to uneven supplier
distribution or network issues.
The option to disable the feature at runtime shouldn't be used often, but it
will keep us on the safe side in case something goes wrong.
The format described in
https://issues.apache.org/jira/browse/IGNITE-12080 looks
good to me.

On Tue, Jun 23, 2020 at 7:01 PM ткаленко кирилл 
wrote:

> Hello, Alexey!
>
> Currently there is no way to disable / enable it, but it seems that the
> logs will not be overloaded, since Alexei Scherbakov offer seems reasonable
> and compact. Of course, you can add disabling / enabling statistics
> collection via jmx for example.
>
> 23.06.2020, 18:47, "Alexey Goncharuk" :
> > Hello Maxim, folks,
> >
> > Wed, 6 May 2020 at 21:01, Maxim Muzafarov :
> >
> >>  We won't do performance analysis on the production environment. Each
> >>  time we need performance analysis it will be done on a test
> >>  environment with verbose logging enabled. Thus I suggest moving these
> >>  changes to a separate `profiling` module and extend the logging much
> >>  more without any size limitations. The same as these [2] [3]
> >>  activities do.
> >
> >  I strongly disagree with this statement. I am not sure who is meant here
> > by 'we', but I see a strong momentum in increasing observability tooling
> > that helps people to understand what exactly happens in the production
> > environment [1]. Not everybody can afford two identical environments for
> > testing. We should make sure users have enough information to understand
> > the root cause after the incident happened, and not force them to
> reproduce
> > it, let alone make them add another module to the classpath and restart
> the
> > nodes.
> > I think having this functionality in the core module with the ability to
> > disable/enable it is the right approach. Having the information printed
> to
> > log is ok, having it in an event that can be sent to a monitoring/tracing
> > subsystem is even better.
> >
> > Kirill, can we enable and disable this feature in runtime to avoid the
> very
> > same nodes restart?
> >
> > [1]
> https://www.honeycomb.io/blog/yes-i-test-in-production-and-so-do-you/
>


Re: Various shutdown guaranties

2020-06-09 Thread Ivan Rakov
Vlad,

+1, that's what I mean.
We don't need a dedicated USE_STATIC_CONFIGURATION value either, as long as
the user is able to retrieve the current shutdown policy and apply the one
he needs.
My only requirement is that ignite.cluster().getShutdownPolicy() should
return the statically configured value ({@link
IgniteConfiguration#shutdownPolicy}) in case no override has been specified.
So, static configuration will be applied only on cluster start, like it
currently works for SQL schemas.

On Tue, Jun 9, 2020 at 7:09 PM V.Pyatkov  wrote:

> Hi,
>
> ignite.cluster().setShutdownPolicy(null); // Clear dynamic value and switch
> to statically configured.
>
> I do not understand why we need it. if user want to change configuration to
> any other value he set it explicitly.
> We can to add warning on start when static option does not math to dynamic
> (dynamic always prefer if it initiated).
>
> shutdownPolicy=IMMEDIATE|GRACEFUL
>
> Looks better that DEFAULT and WAIT_FOR_BACKUP.
>
> I general I consider job cancellation need to added in these policies'
> enumeration.
> But we can do it in the future.
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>


Re: Apache Ignite 2.9.0 RELEASE [Time, Scope, Manager]

2020-06-09 Thread Ivan Rakov
Hi,

Indeed, the tracing feature is almost ready. Discovery, communication and
transactions tracing will be introduced, as well as an option to configure
tracing in runtime. Right now we are working on final performance
optimizations, but it's very likely that we'll complete this activity
before the code freeze date.
Let's include tracing to the 2.9 release scope.

More info:
https://cwiki.apache.org/confluence/display/IGNITE/IEP-48%3A+Tracing
https://issues.apache.org/jira/browse/IGNITE-13060

--
Best Regards,
Ivan Rakov

On Sat, Jun 6, 2020 at 4:30 PM Denis Magda  wrote:

> Hi folks,
>
> The timelines proposed by Alex Plekhanov sounds reasonable to me. I'd like
> only to hear inputs of @Ivan Rakov , who is about to
> finish with the tracing support, and @Ivan Bessonov
> , who is fixing a serious limitation for K8
> deployments [1]. Most likely, both features will be ready by the code
> freeze date (July 10), but the guys should know it better.
>
> [1]
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-New-Ignite-settings-for-IGNITE-12438-and-IGNITE-13013-td47586.html
>
> -
> Denis
>
>
> On Wed, Jun 3, 2020 at 4:45 AM Alex Plehanov 
> wrote:
>
>> Hello Igniters,
>>
>> AI 2.8.1 is finally released and as we discussed here [1] its time to
>> start
>> the discussion about 2.9 release.
>>
>> I want to propose myself to be the release manager of the 2.9 release.
>>
>> What about release time, I agree with Maxim that we should deliver
>> features
>> as frequently as possible. If some feature doesn't fit into release dates
>> we should better include it into the next release and schedule the next
>> release earlier then postpone the current release.
>>
>> I propose the following dates for 2.9 release:
>>
>> Scope Freeze: June 26, 2020
>> Code Freeze: July 10, 2020
>> Voting Date: July 31, 2020
> >> Release Date: August 7, 2020
>>
>> WDYT?
>>
>> [1] :
>>
>> http://apache-ignite-developers.2346864.n4.nabble.com/Ignite-Releases-Plan-td47360.html#a47575
>>
>


Re: Various shutdown guaranties

2020-06-09 Thread Ivan Rakov
Alex,

Also shutdown policy must be always consistent on the grid or unintentional
> data loss is possible if two nodes are stopping simultaneously with
> different policies.

 Totally agree.

Let's use shutdownPolicy=DEFAULT|GRACEFUL, as was proposed by me earlier.

 I'm ok with GRACEFUL instead of WAIT_FOR_BACKUPS.

5. Let's keep a static property for simplifying setting of initial
> behavior.
> In most cases the policy will never be changed during grid's lifetime.
> No need for an explicit call to API on grid start.
> A joining node should check a local configuration value to match the grid.
> If a dynamic value is already present in a metastore, it should override
> static value with a warning.

To sum it up:
- ShutdownPolicy can be set with static configuration
(IgniteConfiguration#setShutdownPolicy), on join we validate that
statically configured policies on different server nodes are the same
- It's possible to override statically configured value by adding
distributed metastorage value, which can be done by
calling ignite.cluster().setShutdownPolicy(plc) or control.sh method
- Dynamic property is persisted

Generally, I don't mind if we have both dynamic and static configuration
properties. The necessity to call ignite.cluster().setShutdownPolicy(plc) on
every new cluster creation is a usability issue in itself.
What bothers me here are the possible conflicts between static and dynamic
configuration. User may be surprised if he has shutdown policy X in
IgniteConfiguration, but the cluster behaves according to policy Y (because
several months ago another admin had called
IgniteCluster#setShutdownPolicy).
We can handle it by adding a separate enum field to the shutdown policy:

> public enum ShutdownPolicy {
>   /* Default value of dynamic shutdown policy property. If it's set, the
> shutdown policy is resolved according to value of static {@link
> IgniteConfiguration#shutdownPolicy} configuration parameter. */
>   USE_STATIC_CONFIGURATION,
>
>   /* Node leaves the cluster even if it's the last owner of some
> partitions. Only partitions of caches with backups > 0 are taken into
> account. */
>   IMMEDIATE,
>
>   /* Shutdown is blocked until node is safe to leave without the data
> loss. */
>   GRACEFUL
> }
>
This way:
1) A user may easily understand whether the static parameter is overridden by
the dynamic one. If ignite.cluster().getShutdownPolicy() returns anything except
USE_STATIC_CONFIGURATION, the behavior is overridden.
2) A user may clear a previous override by calling
ignite.cluster().setShutdownPolicy(USE_STATIC_CONFIGURATION). After that,
the behavior will be resolved based on IgniteConfiguration#shutdownPolicy again.
If we agree on this mechanism, I propose to use the name IMMEDIATE instead of
DEFAULT for the non-safe policy in order not to confuse the user.
Meanwhile, static configuration will accept the same enum, but
USE_STATIC_CONFIGURATION will be restricted:

> public class IgniteConfiguration {
>   public static final ShutdownPolicy DFLT_STATIC_SHUTDOWN_POLICY =
> IMMEDIATE;
>   private ShutdownPolicy shutdownPolicy = DFLT_STATIC_SHUTDOWN_POLICY;
>   ...
>   public void setShutdownPolicy(ShutdownPolicy shutdownPlc) {
> if (shutdownPlc ==  USE_STATIC_CONFIGURATION)
>   throw new IllegalArgumentException("USE_STATIC_CONFIGURATION can
> only be passed as dynamic property value via
> ignite.cluster().setShutdownPolicy");
> ...
>   }
> ...
> }
>

What do you think?


On Tue, Jun 9, 2020 at 11:46 AM Alexei Scherbakov <
alexey.scherbak...@gmail.com> wrote:

> Ivan Rakov,
>
> Your proposal overall looks good to me. My comments:
>
> 1. I would avoid adding such a method, because it will be impossible to
> change it in the future if some more shutdown policies will be introduced
> later.
> Also shutdown policy must be always consistent on the grid or unintentional
> data loss is possible if two nodes are stopping simultaneously with
> different policies.
>
> This behavior can be achieved by changing policy globally when stopping a
> node:
> ignite.cluster().setShutdownPolicy(DEFAULT);
> ignore.stop();
>
> 2. defaultShutdownPolicy with DEFAULT value is a mess. WAIT_FOR_BACKUPS is
> not very clear either.
> Let's use shutdownPolicy=DEFAULT|GRACEFUL, as was proposed by me earlier.
>
> 3. OK
>
> 4. OK
>
> 5. Let's keep a static property for simplifying setting of initial
> behavior.
> In most cases the policy will never be changed during grid's lifetime.
> No need for an explicit call to API on grid start.
> A joining node should check a local configuration value to match the grid.
> If a dynamic value is already present in a metastore, it should override
> static value with a warning.
>
>
>
>
> Mon, 8 Jun 2020 at 19:06, Ivan Rakov :
>
> > Vlad, thanks for s

Re: Various shutdown guaranties

2020-06-09 Thread Ivan Rakov
Alex,

I'm not sure there is a problem at all, because user can always query the
> current policy, and a javadoc can describe such behavior clearly.

What will the query method return if the static policy is not overridden?
If we decide to avoid adding dedicated USE_STATIC_CONFIGURATION value,
semantics can be as follows:

> // Returns shutdown policy that is currently used by the cluster
> // If ignite.cluster().setShutdownPolicy() was never called, returns value
> from static configuration {@link IgniteConfiguration#shutdownPolicy}, which
> is consistent across all server nodes
> // If shutdown policy was overridden by user via
> ignite.cluster().setShutdownPolicy(), returns corresponding value

ignite.cluster().getShutdownPolicy();
>

It seems like there will be no need to reset the distributed metastorage value.
A user can always check which policy is used right now (regardless of whether
it has been overridden) and just set the policy that he needs if he wants
to change it.
The behavior is simple; the only magic is mapping a missing value in the
distributed metastorage to the value from IgniteConfiguration#shutdownPolicy.
Can we agree on this?
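
In other words, a minimal sketch of the lookup described above; the helper is
hypothetical, with dynamicPlc standing for the value stored in the distributed
metastorage (null if it was never set):

/** Resolves the effective shutdown policy: a dynamic override wins over the static configuration. */
static ShutdownPolicy resolveShutdownPolicy(ShutdownPolicy dynamicPlc, ShutdownPolicy staticPlc) {
    return dynamicPlc != null ? dynamicPlc : staticPlc;
}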

On Tue, Jun 9, 2020 at 3:48 PM Alexei Scherbakov <
alexey.scherbak...@gmail.com> wrote:

> Ivan,
>
> Using an additional enum on public API for resetting dynamic value looks a
> little bit dirty for me.
> I'm not sure there is a problem at all, because user can always query the
> current policy, and a javadoc can describe such behavior clearly.
> If you really insist maybe use null to reset policy value:
>
> ignite.cluster().setShutdownPolicy(null); // Clear dynamic value and switch
> to statically configured.
>
> On top of this, we already have a bunch of over properties, which are set
> statically and can be changed dynamically later,  for example [1]
> I think all such properties should behave the same way as shutdown policy
> and we need a ticket for this.
> In such a case we probably should go with something like
>
> ignite.cluster().resetDynamicProperValuey(propName); // Resets a property
> to statically configured default value.
>
> Right now I would prefer for shutdown policy behave as other dynamic
> properties to make things consistent and fix them all later to be
> resettable to static configuration value.
>
> [1]
> org.apache.ignite.IgniteCluster#setTxTimeoutOnPartitionMapExchange(timeout)
>
>
>
> > Tue, 9 Jun 2020 at 15:12, Ivan Rakov :
>
> > Something went wrong with gmail formatting. Resending my reply.
> >
> > Alex,
> >
> > Also shutdown policy must be always consistent on the grid or
> unintentional
> > > data loss is possible if two nodes are stopping simultaneously with
> > > different policies.
> >
> >  Totally agree.
> >
> > Let's use shutdownPolicy=DEFAULT|GRACEFUL, as was proposed by me earlier.
> >
> >  I'm ok with GRACEFUL instead of WAIT_FOR_BACKUPS.
> >
> > 5. Let's keep a static property for simplifying setting of initial
> > > behavior.
> > > In most cases the policy will never be changed during grid's lifetime.
> > > No need for an explicit call to API on grid start.
> > > A joining node should check a local configuration value to match the
> > grid.
> > > If a dynamic value is already present in a metastore, it should
> override
> > > static value with a warning.
> >
> > To sum it up:
> > - ShutdownPolicy can be set with static configuration
> > (IgniteConfiguration#setShutdownPolicy), on join we validate that
> > statically configured policies on different server nodes are the same
> > - It's possible to override statically configured value by adding
> > distributed metastorage value, which can be done by
> > calling ignite.cluster().setShutdownPolicy(plc) or control.sh method
> > - Dynamic property is persisted
> >
> > Generally, I don't mind if we have both dynamic and static configuration
> > properties. Necessity to call ignite.cluster().setShutdownPolicy(plc); on
> > every new cluster creation is a usability issue itself.
> > What bothers me here are the possible conflicts between static and
> dynamic
> > configuration. User may be surprised if he has shutdown policy X in
> > IgniteConfiguration, but the cluster behaves according to policy Y
> (because
> > several months ago another admin had called
> > IgniteCluster#setShutdownPolicy).
> > We can handle it by adding a separate enum field to the shutdown policy:
> >
> > > public enum ShutdownPolicy {
> > >   /* Default value of dynamic shutdown policy property. If it's set,
> the
> > > shutdown policy is resolved according to value of static {@link
> > 

Re: Various shutdown guaranties

2020-06-09 Thread Ivan Rakov
Something went wrong with gmail formatting. Resending my reply.

Alex,

Also shutdown policy must be always consistent on the grid or unintentional
> data loss is possible if two nodes are stopping simultaneously with
> different policies.

 Totally agree.

Let's use shutdownPolicy=DEFAULT|GRACEFUL, as was proposed by me earlier.

 I'm ok with GRACEFUL instead of WAIT_FOR_BACKUPS.

5. Let's keep a static property for simplifying setting of initial
> behavior.
> In most cases the policy will never be changed during grid's lifetime.
> No need for an explicit call to API on grid start.
> A joining node should check a local configuration value to match the grid.
> If a dynamic value is already present in a metastore, it should override
> static value with a warning.

To sum it up:
- ShutdownPolicy can be set with static configuration
(IgniteConfiguration#setShutdownPolicy), on join we validate that
statically configured policies on different server nodes are the same
- It's possible to override statically configured value by adding
distributed metastorage value, which can be done by
calling ignite.cluster().setShutdownPolicy(plc) or control.sh method
- Dynamic property is persisted

Generally, I don't mind if we have both dynamic and static configuration
properties. The necessity to call ignite.cluster().setShutdownPolicy(plc) on
every new cluster creation is a usability issue in itself.
What bothers me here are the possible conflicts between static and dynamic
configuration. User may be surprised if he has shutdown policy X in
IgniteConfiguration, but the cluster behaves according to policy Y (because
several months ago another admin had called
IgniteCluster#setShutdownPolicy).
We can handle it by adding a separate enum field to the shutdown policy:

> public enum ShutdownPolicy {
>   /* Default value of dynamic shutdown policy property. If it's set, the
> shutdown policy is resolved according to value of static {@link
> IgniteConfiguration#shutdownPolicy} configuration parameter. */
>   USE_STATIC_CONFIGURATION,
>
>   /* Node leaves the cluster even if it's the last owner of some
> partitions. Only partitions of caches with backups > 0 are taken into
> account. */
>   IMMEDIATE,
>
>   /* Shutdown is blocked until node is safe to leave without the data
> loss. */
>   GRACEFUL
> }
>
This way:
1) A user may easily understand whether the static parameter is overridden by
the dynamic one. If ignite.cluster().getShutdownPolicy() returns anything except
USE_STATIC_CONFIGURATION, the behavior is overridden.
2) A user may clear a previous override by calling
ignite.cluster().setShutdownPolicy(USE_STATIC_CONFIGURATION). After that,
the behavior will be resolved based on IgniteConfiguration#shutdownPolicy again.
If we agree on this mechanism, I propose to use the name IMMEDIATE instead of
DEFAULT for the non-safe policy in order not to confuse the user.
Meanwhile, static configuration will accept the same enum, but
USE_STATIC_CONFIGURATION will be restricted:

> public class IgniteConfiguration {
>   public static final ShutdownPolicy DFLT_STATIC_SHUTDOWN_POLICY =
> IMMEDIATE;
>   private ShutdownPolicy shutdownPolicy = DFLT_STATIC_SHUTDOWN_POLICY;
>   ...
>   public void setShutdownPolicy(ShutdownPolicy shutdownPlc) {
> if (shutdownPlc ==  USE_STATIC_CONFIGURATION)
>   throw new IllegalArgumentException("USE_STATIC_CONFIGURATION can
> only be passed as dynamic property value via
> ignite.cluster().setShutdownPolicy");
> ...
>   }
> ...
> }
>

What do you think?

On Tue, Jun 9, 2020 at 3:09 PM Ivan Rakov  wrote:

> Alex,
>
> Also shutdown policy must be always consistent on the grid or unintentional
>> data loss is possible if two nodes are stopping simultaneously with
>> different policies.
>
>  Totally agree.
>
> Let's use shutdownPolicy=DEFAULT|GRACEFUL, as was proposed by me earlier.
>
>  I'm ok with GRACEFUL instead of WAIT_FOR_BACKUPS.
>
> 5. Let's keep a static property for simplifying setting of initial
>> behavior.
>> In most cases the policy will never be changed during grid's lifetime.
>> No need for an explicit call to API on grid start.
>> A joining node should check a local configuration value to match the grid.
>> If a dynamic value is already present in a metastore, it should override
>> static value with a warning.
>
> To sum it up:
> - ShutdownPolicy can be set with static configuration
> (IgniteConfiguration#setShutdownPolicy), on join we validate that
> statically configured policies on different server nodes are the same
> - It's possible to override statically configured value by adding
> distributed metastorage value, which can be done by
> calling ignite.cluster().setShutdownPolicy(plc) or control.sh method
> - Dynamic property is persisted
>
> Generally, I don't mind if 

Re: Various shutdown guaranties

2020-06-09 Thread Ivan Rakov
Alex,

Also shutdown policy must be always consistent on the grid or unintentional
> data loss is possible if two nodes are stopping simultaneously with
> different policies.

 Totally agree.

Let's use shutdownPolicy=DEFAULT|GRACEFUL, as was proposed by me earlier.

 I'm ok with GRACEFUL instead of WAIT_FOR_BACKUPS.

5. Let's keep a static property for simplifying setting of initial
> behavior.
> In most cases the policy will never be changed during grid's lifetime.
> No need for an explicit call to API on grid start.
> A joining node should check a local configuration value to match the grid.
> If a dynamic value is already present in a metastore, it should override
> static value with a warning.

To sum it up:
- ShutdownPolicy can be set with static configuration
(IgniteConfiguration#setShutdownPolicy), on join we validate that
statically configured policies on different server nodes are the same
- It's possible to override statically configured value by adding
distributed metastorage value, which can be done by
calling ignite.cluster().setShutdownPolicy(plc) or control.sh method
- Dynamic property is persisted

Generally, I don't mind if we have both dynamic and static configuration
properties. Having to call ignite.cluster().setShutdownPolicy(plc) on
every new cluster creation is a usability issue in itself.
What bothers me here are the possible conflicts between static and dynamic
configuration. A user may be surprised if they have shutdown policy X in
IgniteConfiguration, but the cluster behaves according to policy Y (because
several months ago another admin called
IgniteCluster#setShutdownPolicy).
We can handle it by adding a separate enum value to the shutdown policy:

> public enum ShutdownPolicy {
>   /* Default value of dynamic shutdown policy property. If it's set, the
> shutdown policy is resolved according to value of static {@link
> IgniteConfiguration#shutdownPolicy} configuration parameter. */
>   USE_STATIC_CONFIGURATION,
>
>   /* Node leaves the cluster even if it's the last owner of some
> partitions. Only partitions of caches with backups > 0 are taken into
> account. */
>   IMMEDIATE,
>
>   /* Shutdown is blocked until node is safe to leave without the data
> loss. */
>   GRACEFUL
> }
>
This way:
1) User may easily understand whether the static parameter is overridden by
the dynamic one. If ignite.cluster().getShutdownPolicy() returns anything except
USE_STATIC_CONFIGURATION, the behavior is overridden.
2) User may clear the previous override by calling
ignite.cluster().setShutdownPolicy(USE_STATIC_CONFIGURATION). After that,
behavior will be resolved based on IgniteConfiguration#shutdownPolicy again.
If we agree on this mechanism, I propose to use the name IMMEDIATE instead of
DEFAULT for the non-safe policy in order not to confuse the user.
Meanwhile, static configuration will accept the same enum, but
USE_STATIC_CONFIGURATION will be restricted:

> public class IgniteConfiguration {
>   public static final ShutdownPolicy DFLT_STATIC_SHUTDOWN_POLICY =
> IMMEDIATE;
>   private ShutdownPolicy shutdownPolicy = DFLT_STATIC_SHUTDOWN_POLICY;
>   ...
>   public void setShutdownPolicy(ShutdownPolicy shutdownPlc) {
> if (shutdownPlc ==  USE_STATIC_CONFIGURATION)
>   throw new IllegalArgumentException("USE_STATIC_CONFIGURATION can
> only be passed as dynamic property value via
> ignite.cluster().setShutdownPolicy");
> ...
>   }
> ...
> }
>

What do you think?

On Tue, Jun 9, 2020 at 11:46 AM Alexei Scherbakov <
alexey.scherbak...@gmail.com> wrote:

> Ivan Rakov,
>
> Your proposal overall looks good to me. My comments:
>
> 1. I would avoid adding such a method, because it will be impossible to
> change it in the future if more shutdown policies are introduced later.
> Also shutdown policy must be always consistent on the grid or unintentional
> data loss is possible if two nodes are stopping simultaneously with
> different policies.
>
> This behavior can be achieved by changing policy globally when stopping a
> node:
> ignite.cluster().setShutdownPolicy(DEFAULT);
> ignite.stop();
>
> 2. defaultShutdownPolicy with DEFAULT value is a mess. WAIT_FOR_BACKUPS is
> not very clear either.
> Let's use shutdownPolicy=DEFAULT|GRACEFUL, as was proposed by me earlier.
>
> 3. OK
>
> 4. OK
>
> 5. Let's keep a static property for simplifying setting of initial
> behavior.
> In most cases the policy will never be changed during grid's lifetime.
> No need for an explicit call to API on grid start.
> A joining node should check a local configuration value to match the grid.
> If a dynamic value is already present in a metastore, it should override
> static value with a warning.
>
>
>
>
> пн, 8 июн. 2020 г. в 19:06, Ivan Rakov :
>
> > Vlad, thanks for starting thi

Re: Various shutdown guaranties

2020-06-08 Thread Ivan Rakov
Vlad, thanks for starting this discussion.

I'll try to clarify the motivation for this change as I see it.
In general, Ignite clusters are vulnerable to data loss. Of course, we
have a configurable PartitionLossPolicy, which allows handling data loss
safely and mitigating its consequences. But being able to avoid critical
situations is always better than being able to recover from them.

The most common issue from my perspective is the absence of a way to perform
a rolling cluster restart safely. Scenario:
1. Backup count is 1
2. Admin wants to perform rolling restart in order to deploy new version of
business code that uses Ignite in embedded mode
3. Admin shuts down first node, replaces needed binaries and returns the
node back to the topology
4. Node joins the cluster successfully
5. Admin shuts down second node
6. Data loss happens: the second node was the only owner of a certain
partition, which was being rebalanced from the second node to the first

We can prevent such situations by introducing "safe shutdown by default"
mode, which blocks stopping node while it remains the only owner for at
least one partition. It should be applied to "common" ways of stopping
nodes - Ignite.close() and kill .
I think the option to enable or disable this behavior at runtime should be a
requirement. Safe shutdown mode has weird side effects.
For example, the admin won't be able to stop the whole cluster: the stop of the last
node will be blocked, because the last node is the only remaining owner of
all its partitions. Sure, kill -9 will resolve it, but it's still a
usability issue.

With the described dynamic property, the scenario changes as follows:
1. Admin enables "safe shutdown" mode
2. Admin shuts down first node, replaces needed binaries and returns the
node back to the topology
3. Admin shuts down second node (with either ignite.close() or kill <pid>),
shutdown is blocked until the first node returns to the topology and
completes the rebalancing process
4. Admin proceeds with the rolling restart procedure
5. Admin disables "safe shutdown" mode

This logic will also simplify the rolling restart scenario in K8S. A pod with
an Ignite node won't be terminated while its termination would cause data loss.

Aside from waiting for backups, the Ignition interface provides lots of options
to perform various kinds of node stop:
- Whether or not to cancel pending compute jobs
- Whether or not to perform instant halt() instead of any graceful stop
logic
- Whether or not to wait for some timeout before halt()
- Whether or not the stopped grid should be restarted
All these "stop" methods provide very custom logic. I don't see a need to
make them part of dynamic cluster-wide configuration. They still can be
invoked directly via Java API. Later we can extract some of them to dynamic
cluster-wide parameters of default stop if it will become necessary. That's
why I think we should create an enum for default shutdown policy, but only
with two options so far (we can add more later): DEFAULT and
WAIT_FOR_BACKUPS.
Regarding the "NORMAL" option that you propose (where the node is not
stopped until the rebalance is finished): I don't think that we should add
it. It doesn't provide any strict guarantees: the data can still be lost
with it.

To sum it up, I propose:
1. Add a new method to the Ignition interface to make it possible to stop with
"wait for backups" logic directly via the Java API, like Ignition.stop(boolean
cancel, boolean waitForBackups); a usage sketch follows this list
2. Introduce "defaultShutdownPolicy" as a dynamic cluster configuration,
two values are available so far: DEFAULT and WAIT_FOR_BACKUPS
3. This property is stored in the distributed metastorage (thus persisted),
can be changed via Java API and ./control.sh
4. Behavior configured with this property will be applied only to the common
ways of stopping the node - Ignite.close() and kill <pid>.
5. *Don't* add new options to the static IgniteConfiguration to avoid
conflicts between dynamic and static configuration
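
For illustration, a single rolling-restart step under this proposal could look as
follows (a minimal sketch: the two-argument Ignition.stop overload, the
IgniteCluster#setShutdownPolicy method and the ShutdownPolicy enum are proposed API
from this thread, not existing methods):

Ignite ignite = Ignition.ignite(); // Node started earlier in this JVM.

// Enable safe shutdown cluster-wide once, before the rolling restart starts
// (dynamic property, stored in the distributed metastorage).
ignite.cluster().setShutdownPolicy(ShutdownPolicy.WAIT_FOR_BACKUPS);

// Stop the local node. With waitForBackups = true the call blocks until every
// partition owned by this node has at least one more owner in the cluster.
Ignition.stop(false /* cancel */, true /* waitForBackups */);

// After the whole rolling restart is done, the admin may switch back:
// ignite.cluster().setShutdownPolicy(ShutdownPolicy.DEFAULT);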

-- 
Best Regards,
Ivan Rakov

On Mon, Jun 8, 2020 at 6:44 PM V.Pyatkov  wrote:

> Hi
>
> We need the ability to call shutdown with various guarantees.
> For example:
> Need to reboot a node, but after that the node should be available for
> historical rebalance (all partitions in MOVING state should have gone to
> OWNING).
>
> Implemented a circled reboot of the cluster, but all data should be available
> at that time (at least one copy of each partition should be available in the cluster).
>
> Need to wait not only for data to be available, but for all jobs as well (previously
> this behavior was available through a stop(false) method invocation).
>
> All these reasons require various behaviors before shutting down a node.
> I propose to slightly modify the public API and add a method which states the
> shutdown behavior directly:
> Ignite.close(Shutdown)
>
> public enum Shutdown {
> /**
>  * Stop immediately as s

Re: Re[2]: Proposal: set default transaction timeout to 5 minutes

2020-05-26 Thread Ivan Rakov
Zhenya,

Can you please elaborate?
Why do we need to change the default TX timeout via JMX? It looks feasible and
perhaps may work as a hotfix for live deployments experiencing issues with
long transactions, but it's definitely a separate issue.

On Fri, May 22, 2020 at 6:20 PM Zhenya Stanilovsky
 wrote:

>
> Ivan, is changing the global timeout through JMX in the scope of this ticket? If
> so, can you add it? Otherwise we need an additional ticket, I suppose. We
> still have no store for parameters changed via JMX; everyone needs to
> remember that a cluster restart will reset this setting to the default, in which
> case a system parameter needs to be set as well.
>
>
>
> >https://issues.apache.org/jira/browse/IGNITE-13064 is raised with label
> >"newbie".
> >
> >On Tue, May 19, 2020 at 4:10 PM Ivan Rakov < ivan.glu...@gmail.com >
> wrote:
> >
> >> Support this idea in general but why 5 minutes and not less?
> >>
> >> This value looks to me greater than any value that can possibly affect
> >> existing deployments (existing long transactions may suddenly start to
> >> rollback), but less than reaction time of users that are only starting
> to
> >> get along with Ignite and suddenly experience TX deadlock.
> >>
> >> --
> >> Best Regards,
> >> Ivan Rakov
> >>
> >> On Tue, May 19, 2020 at 10:31 AM Anton Vinogradov < a...@apache.org >
> wrote:
> >>
> >>> +1
> >>>
> >>> On Mon, May 18, 2020 at 9:45 PM Sergey Antonov <
> antonovserge...@gmail.com
> >>> >
> >>> wrote:
> >>>
> >>> > +1
> >>> >
> >>> > пн, 18 мая 2020 г. в 21:26, Andrey Mashenkov <
> >>>  andrey.mashen...@gmail.com >:
> >>> >
> >>> > > +1
> >>> > >
> >>> > > On Mon, May 18, 2020 at 9:19 PM Ivan Rakov < ivan.glu...@gmail.com
> >
> >>> > wrote:
> >>> > >
> >>> > > > Hi Igniters,
> >>> > > >
> >>> > > > I have a very simple proposal. Let's set default TX timeout to 5
> >>> > minutes
> >>> > > > (right now it's 0 = no timeout).
> >>> > > > Pros:
> >>> > > > 1. Deadlock detection procedure is triggered on timeout. In case
> >>> user
> >>> > > will
> >>> > > > get into key-level deadlock, he'll be able to discover root cause
> >>> from
> >>> > > the
> >>> > > > logs (even though load will hang for a while) and skip step with
> >>> > googling
> >>> > > > and debugging.
> >>> > > > 2. Almost every system with transactions has timeout enabled by
> >>> > default.
> >>> > > >
> >>> > > > WDYT?
> >>> > > >
> >>> > > > --
> >>> > > > Best Regards,
> >>> > > > Ivan Rakov
> >>> > > >
> >>> > >
> >>> > >
> >>> > > --
> >>> > > Best regards,
> >>> > > Andrey V. Mashenkov
> >>> > >
> >>> >
> >>> >
> >>> > --
> >>> > BR, Sergey Antonov
> >>> >
> >>>
> >>
>
>
>
>


Re: Proposal: set default transaction timeout to 5 minutes

2020-05-22 Thread Ivan Rakov
https://issues.apache.org/jira/browse/IGNITE-13064 is raised with label
"newbie".

On Tue, May 19, 2020 at 4:10 PM Ivan Rakov  wrote:

> Support this idea in general but why 5 minutes and not less?
>
> This value looks to me greater than any value that can possibly affect
> existing deployments (existing long transactions may suddenly start to
> rollback), but less than reaction time of users that are only starting to
> get along with Ignite and suddenly experience TX deadlock.
>
> --
> Best Regards,
> Ivan Rakov
>
> On Tue, May 19, 2020 at 10:31 AM Anton Vinogradov  wrote:
>
>> +1
>>
>> On Mon, May 18, 2020 at 9:45 PM Sergey Antonov > >
>> wrote:
>>
>> > +1
>> >
>> > пн, 18 мая 2020 г. в 21:26, Andrey Mashenkov <
>> andrey.mashen...@gmail.com>:
>> >
>> > > +1
>> > >
>> > > On Mon, May 18, 2020 at 9:19 PM Ivan Rakov 
>> > wrote:
>> > >
>> > > > Hi Igniters,
>> > > >
>> > > > I have a very simple proposal. Let's set default TX timeout to 5
>> > minutes
>> > > > (right now it's 0 = no timeout).
>> > > > Pros:
>> > > > 1. Deadlock detection procedure is triggered on timeout. In case
>> user
>> > > will
>> > > > get into key-level deadlock, he'll be able to discover root cause
>> from
>> > > the
>> > > > logs (even though load will hang for a while) and skip step with
>> > googling
>> > > > and debugging.
>> > > > 2. Almost every system with transactions has timeout enabled by
>> > default.
>> > > >
>> > > > WDYT?
>> > > >
>> > > > --
>> > > > Best Regards,
>> > > > Ivan Rakov
>> > > >
>> > >
>> > >
>> > > --
>> > > Best regards,
>> > > Andrey V. Mashenkov
>> > >
>> >
>> >
>> > --
>> > BR, Sergey Antonov
>> >
>>
>


[jira] [Created] (IGNITE-13064) Set default transaction timeout to 5 minutes

2020-05-22 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-13064:
---

 Summary: Set default transaction timeout to 5 minutes
 Key: IGNITE-13064
 URL: https://issues.apache.org/jira/browse/IGNITE-13064
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Rakov


Let's set default TX timeout to 5 minutes (right now it's 0 = no timeout).
Pros:
1. The deadlock detection procedure is triggered on timeout. In case a user gets 
into a key-level deadlock, they'll be able to discover the root cause from the logs 
(even though the load will hang for a while) and skip the googling and 
debugging step.
2. Almost every system with transactions has timeout enabled by default.
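
For reference, the timeout in question is the one in TransactionConfiguration. A minimal
sketch of setting it explicitly to the proposed default on node startup (values are
illustrative; this uses the existing public configuration API):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.configuration.TransactionConfiguration;

public class TxTimeoutExample {
    public static void main(String[] args) {
        TransactionConfiguration txCfg = new TransactionConfiguration();
        txCfg.setDefaultTxTimeout(5 * 60 * 1000L); // 5 minutes, the proposed default.

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setTransactionConfiguration(txCfg);

        // Transactions started without an explicit timeout now time out after 5 minutes.
        Ignite ignite = Ignition.start(cfg);
    }
}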



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Best way to re-encrypt existing data (TDE cache key rotation).

2020-05-22 Thread Ivan Rakov
Folks,

Just keeping you informed: my colleagues and I are highly interested in TDE
in general and in key rotation specifically, but we haven't had enough time
so far.
We'll dive into this feature and participate in reviews next month.

--
Best Regards,
Ivan Rakov

On Sun, May 17, 2020 at 10:51 PM Pavel Pereslegin  wrote:

> Hello, Alexey.
>
> > is the encryption key for the data the same on all nodes in the cluster?
> Yes, each encrypted cache group has its own encryption key, the key is
> the same on all nodes.
>
> > Clearly, during the re-encryption there will exist pages
> > encrypted with both new and old keys at the same time.
> Yes, there will be pages encrypted with different keys at the same time.
> Currently, we only store one key for one cache group. To rotate a key,
> at a certain point in time it is necessary to support several keys (at
> least for reading the WAL).
> For the "in place" strategy, we'll store the encryption key identifier
> on each encrypted page (we currently have some unused space on
> encrypted page, so I don't expect any memory overhead here). Thus, we
> will have several keys for reading and one key for writing. I assume
> that the old key will be automatically deleted when a specific WAL
> segment is deleted (and re-encryption is finished).
>
> > Will a node continue to re-encrypt the data after it restarts?
> Yes.
>
> > If a node goes down during the re-encryption, but the rest of the
> > cluster finishes re-encryption, will we consider the procedure complete?
> I'm not sure, but it looks like the key rotation is complete when we
> set the new key on all nodes so that the updates will be encrypted
> with the new key (as required by PCI DSS).
> Status of re-encryption can be obtained separately (locally or cluster
> wide).
>
> I forgot to mention that with “in place” re-encryption it will be
> impossible to quickly cancel re-encryption, because by canceling we
> mean re-encryption with the old key.
>
> > How do you see the whole key rotation procedure will work?
> Initial design for re-encryption with "partition copying" is described
> here [1]. I'll prepare detailed design for "in place" re-encryption if
> we'll go this way. In short, send the new encryption key cluster-wide,
> each node adds a new key and starts background re-encryption.
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652384#TDE.Phase-3.Cachekeyrotation.-Copywithre-encryptiondesign
> .
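
To visualize the multi-key state described above, here is a purely hypothetical sketch of a
per-cache-group key ring (names and structure are assumptions made for illustration, not
the actual Ignite implementation):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.crypto.SecretKey;

/** Hypothetical per-cache-group key ring used during "in place" re-encryption. */
class GroupKeyRing {
    /** All keys that may still be needed for reads (old pages, WAL records), by key id. */
    private final Map<Integer, SecretKey> keysById = new ConcurrentHashMap<>();

    /** Id of the key used to encrypt newly written pages. */
    private volatile int writeKeyId;

    /** Resolves a key by the id stored on the encrypted page. */
    SecretKey forRead(int keyId) { return keysById.get(keyId); }

    /** Key used for all new writes. */
    SecretKey forWrite() { return keysById.get(writeKeyId); }

    /** Rotation: register the new key and switch writes to it. Old keys remain readable
     *  until background re-encryption finishes and the related WAL segments are removed. */
    void rotate(int newKeyId, SecretKey newKey) {
        keysById.put(newKeyId, newKey);
        writeKeyId = newKeyId;
    }
}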
>
> Sun, May 17, 2020 at 18:35, Alexey Goncharuk :
> >
> > Pavel, Anton,
> >
> > How do you see the whole key rotation procedure will work? Clearly,
> during
> > the re-encryption there will exist pages encrypted with both new and old
> > keys at the same time. Will a node continue to re-encrypt the data after
> it
> > restarts? If a node goes down during the re-encryption, but the rest of
> the
> > cluster finishes re-encryption, will we consider the procedure complete?
> By
> > the way, is the encryption key for the data the same on all nodes in the
> > cluster?
> >
> > Thu, May 14, 2020 at 11:30, Anton Vinogradov :
> >
> > > +1 to "In place re-encryption".
> > >
> > > - It has a simple design.
> > > - Clusters under load may require just load to re-encrypt the data.
> > > (Friendly to load).
> > > - Easy to throttle.
> > > - Easy to continue.
> > > - Design compatible with the multi-key architecture.
> > > - It can be optimized to use own WAL buffer and to re-encrypt pages
> without
> > > restoring them to on-heap.
> > >
> > > On Thu, May 14, 2020 at 1:54 AM Pavel Pereslegin 
> wrote:
> > >
> > > > Hello Igniters.
> > > >
> > > > Recently, master key rotation for Apache Ignite Transparent Data
> > > > Encryption was implemented [1], but some security standards (PCI DSS
> > > > at least) require rotation of all encryption keys [2]. Currently,
> > > > encryption occurs when reading/writing pages to disk, cache
> encryption
> > > > keys are stored in metastore.
> > > >
> > > > I'm going to contribute cache encryption key rotation and want to
> > > > consult what is the best way to re-encrypting existing data, I see
> two
> > > > different strategies.
> > > >
> > > > 1. In place re-encryption:
> > > > Using the old key, sequentially read all the pages from the
> datastore,
> > > > mark as dirty and log them into the WAL. After checkpoint pages will
> > > > be stored to disk encrypted with the new

[jira] [Created] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-13052:
---

 Summary: Calculate result of reserveHistoryForExchange in advance
 Key: IGNITE-13052
 URL: https://issues.apache.org/jira/browse/IGNITE-13052
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Rakov


Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with a possible retrieval of GroupState from the WAL (it's stored on heap with a 
SoftReference). On some deployments this operation can take several minutes.

The idea of the optimization is to calculate its result only on the first PME 
(ideally, even before the first PME, at the recovery stage), keep the resulting map {grpId, 
partId -> earliestCheckpoint} on heap and update it when necessary. At first 
glance, the map should be updated:
1) On checkpoint. If a new partition appears on the local node, it should be 
registered in the map with the current checkpoint. If a partition is evicted from the 
local node, or changes its state to non-OWNING, it should be removed from the map. 
If a checkpoint is marked as inapplicable for a certain group, the whole group 
should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), the previous earliest 
checkpoint should be changed with setIfGreater to the new earliest checkpoint.

The memory overhead of storing the described map on heap is insignificant. Its size 
isn't greater than the size of the map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.
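
A minimal sketch of the bookkeeping described above (class and method names are
hypothetical, not existing Ignite code):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical holder of the earliest applicable checkpoint per (grpId, partId). */
class EarliestCheckpointMap {
    /** Key is (grpId << 32 | partId), value is the earliest checkpoint timestamp. */
    private final Map<Long, Long> earliest = new ConcurrentHashMap<>();

    private static long key(int grpId, int partId) {
        return ((long)grpId << 32) | (partId & 0xFFFFFFFFL);
    }

    /** On checkpoint: a new OWNING partition appeared on the local node. */
    void onPartitionOwned(int grpId, int partId, long cpTs) {
        earliest.putIfAbsent(key(grpId, partId), cpTs);
    }

    /** On checkpoint: a partition was evicted or is no longer OWNING. */
    void onPartitionGone(int grpId, int partId) {
        earliest.remove(key(grpId, partId));
    }

    /** On checkpoint history cleanup: move the earliest checkpoint forward, never backward (setIfGreater). */
    void onHistoryCleanup(int grpId, int partId, long newEarliestCpTs) {
        earliest.computeIfPresent(key(grpId, partId), (k, old) -> Math.max(old, newEarliestCpTs));
    }
}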



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Proposal: set default transaction timeout to 5 minutes

2020-05-19 Thread Ivan Rakov
>
> Support this idea in general but why 5 minutes and not less?

This value looks to me greater than any value that could possibly affect
existing deployments (existing long transactions may suddenly start to
roll back), but shorter than the reaction time of users who are only starting to
get along with Ignite and suddenly experience a TX deadlock.

--
Best Regards,
Ivan Rakov

On Tue, May 19, 2020 at 10:31 AM Anton Vinogradov  wrote:

> +1
>
> On Mon, May 18, 2020 at 9:45 PM Sergey Antonov 
> wrote:
>
> > +1
> >
> > пн, 18 мая 2020 г. в 21:26, Andrey Mashenkov  >:
> >
> > > +1
> > >
> > > On Mon, May 18, 2020 at 9:19 PM Ivan Rakov 
> > wrote:
> > >
> > > > Hi Igniters,
> > > >
> > > > I have a very simple proposal. Let's set default TX timeout to 5
> > minutes
> > > > (right now it's 0 = no timeout).
> > > > Pros:
> > > > 1. Deadlock detection procedure is triggered on timeout. In case user
> > > will
> > > > get into key-level deadlock, he'll be able to discover root cause
> from
> > > the
> > > > logs (even though load will hang for a while) and skip step with
> > googling
> > > > and debugging.
> > > > 2. Almost every system with transactions has timeout enabled by
> > default.
> > > >
> > > > WDYT?
> > > >
> > > > --
> > > > Best Regards,
> > > > Ivan Rakov
> > > >
> > >
> > >
> > > --
> > > Best regards,
> > > Andrey V. Mashenkov
> > >
> >
> >
> > --
> > BR, Sergey Antonov
> >
>


Proposal: set default transaction timeout to 5 minutes

2020-05-18 Thread Ivan Rakov
Hi Igniters,

I have a very simple proposal. Let's set default TX timeout to 5 minutes
(right now it's 0 = no timeout).
Pros:
1. The deadlock detection procedure is triggered on timeout. In case a user
gets into a key-level deadlock, they'll be able to discover the root cause from the
logs (even though the load will hang for a while) and skip the googling
and debugging step.
2. Almost every system with transactions has timeout enabled by default.

WDYT?

--
Best Regards,
Ivan Rakov


Re: [ANNOUNCE] New Committer: Taras Ledkov

2020-05-12 Thread Ivan Rakov
Taras,

Congratulations and welcome!

On Tue, May 12, 2020 at 8:26 PM Denis Magda  wrote:

> Taras,
>
> Welcome, that was long overdue on our part! Hope to see you soon among the
> PMC group.
>
> -
> Denis
>
>
> On Tue, May 12, 2020 at 9:09 AM Dmitriy Pavlov  wrote:
>
> > Hello Ignite Community,
> >
> >
> >
> > The Project Management Committee (PMC) for Apache Ignite has invited
> Taras
> > Ledkov to become a committer and we are pleased to announce that he has
> > accepted.
> >
> >
> > Taras is an Ignite SQL veteran who knows in detail current Ignite - H2
> > integration and binary serialization, actively participates in JDBC and
> > thin client protocol development, he is eager to help users on the user
> > list within his area of expertise.
> >
> >
> >
> > Being a committer enables easier contribution to the project since there
> is
> > no need to go via the patch submission process. This should enable better
> > productivity.
> >
> >
> >
> > Taras, thank you for all your efforts, congratulations and welcome on
> > board!
> > .
> >
> >
> >
> > Best Regards,
> >
> > Dmitriy Pavlov
> >
> > on behalf of Apache Ignite PMC
> >
>


Re: [DISCUSS] Apache URL for TC bot

2020-05-12 Thread Ivan Rakov
Ivan,

Agree.
Mail notifications can be temporarily turned off in configuration of the
new bot.

On Tue, May 12, 2020 at 3:12 PM Ivan Pavlukhin  wrote:

> Having bot deployed in open/free (and reliable) infrastructure sounds
> great! One precaution which seems important to me though is avoidance
> of duplicate (or even controversial) notifications from 2 bots at the
> same time.
>
> Best regards,
> Ivan Pavlukhin
>
> вт, 12 мая 2020 г. в 15:06, Ivan Rakov :
> >
> > Hi,
> >
> > I've created an INFRA ticket [1] for forwarding requests from "
> > mtcga.ignite.apache.org" to the server where TC bot is hosted [1].
> > Definitely, I wouldn't object if anyone will deploy TC bot to the public
> > cloud. We can live with two bots for a while, and then start using a
> public
> > bot after it accumulates enough build history to grant VISAs. If anyone
> is
> > interested, please check TC bot homepage on github with setup guide [2].
> > <https://github.com/apache/ignite-teamcity-bot>
> >
> > [1]: https://issues.apache.org/jira/browse/INFRA-20257
> > [2]: https://github.com/apache/ignite-teamcity-bot
> >
> > --
> > Best Regards,
> > Ivan Rakov
> >
> > On Tue, May 12, 2020 at 12:44 PM Ilya Kasnacheev <
> ilya.kasnach...@gmail.com>
> > wrote:
> >
> > > Hello!
> > >
> > > It would be nice if somebody would try to bring up a parallel
> deployment of
> > > MTCGA bot on Apache domain.
> > >
> > > This way people will have a choice of using "old" or "new" bot, and
> they we
> > > may decide of sticking to one of them.
> > >
> > > Regards,
> > > --
> > > Ilya Kasnacheev
> > >
> > >
> > > пн, 11 мая 2020 г. в 18:37, Maxim Muzafarov :
> > >
> > > > Ivan,
> > > >
> > > >
> > > > Good idea.
> > > > +1 to have the right domain name.
> > > >
> > > > I can imagine that we can go even further and completely move TC.Bot
> > > > to some public cloud storage. For example, Amazon can provide
> > > > promotional credits for open source projects [1].
> > > >
> > > >
> > > > [1]
> > > >
> > >
> https://aws.amazon.com/blogs/opensource/aws-promotional-credits-open-source-projects/
> > > >
> > > > On Mon, 11 May 2020 at 11:35, Ivan Pavlukhin 
> > > wrote:
> > > > >
> > > > > Igniters,
> > > > >
> > > > > As you might know currently TC bot has a domain name in a GridGain
> > > > > domain [1]. What do you think should we assign a name in an Apache
> > > > > domain to the bot?
> > > > >
> > > > > [1] https://mtcga.gridgain.com/
> > > > >
> > > > > Best regards,
> > > > > Ivan Pavlukhin
> > > >
> > >
>


Re: [DISCUSS] Apache URL for TC bot

2020-05-12 Thread Ivan Rakov
Hi,

I've created an INFRA ticket [1] for forwarding requests from
"mtcga.ignite.apache.org" to the server where the TC bot is hosted.
Definitely, I wouldn't object if anyone deploys the TC bot to a public
cloud. We can live with two bots for a while, and then start using the public
bot after it accumulates enough build history to grant VISAs. If anyone is
interested, please check the TC bot homepage on GitHub with the setup guide [2].

[1]: https://issues.apache.org/jira/browse/INFRA-20257
[2]: https://github.com/apache/ignite-teamcity-bot

--
Best Regards,
Ivan Rakov

On Tue, May 12, 2020 at 12:44 PM Ilya Kasnacheev 
wrote:

> Hello!
>
> It would be nice if somebody would try to bring up a parallel deployment of
> MTCGA bot on Apache domain.
>
> This way people will have a choice of using "old" or "new" bot, and they we
> may decide of sticking to one of them.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Mon, May 11, 2020 at 18:37, Maxim Muzafarov :
>
> > Ivan,
> >
> >
> > Good idea.
> > +1 to have the right domain name.
> >
> > I can imagine that we can go even further and completely move TC.Bot
> > to some public cloud storage. For example, Amazon can provide
> > promotional credits for open source projects [1].
> >
> >
> > [1]
> >
> https://aws.amazon.com/blogs/opensource/aws-promotional-credits-open-source-projects/
> >
> > On Mon, 11 May 2020 at 11:35, Ivan Pavlukhin 
> wrote:
> > >
> > > Igniters,
> > >
> > > As you might know currently TC bot has a domain name in a GridGain
> > > domain [1]. What do you think should we assign a name in an Apache
> > > domain to the bot?
> > >
> > > [1] https://mtcga.gridgain.com/
> > >
> > > Best regards,
> > > Ivan Pavlukhin
> >
>


Re: Extended logging for rebalance performance analysis

2020-05-06 Thread Ivan Rakov
Hi,

> IGNITE_WRITE_REBALANCE_PARTITION_DISTRIBUTION_THRESHOLD - threshold
> rebalance duration of a cache group after which the partition distribution is
> output, set in milliseconds, default value is 10 minutes.

 Does it mean that if the rebalancing process took less than 10 minutes,
only a short version of the message (with supplier statistics) will show up?

In general, I have no objections.


On Mon, May 4, 2020 at 10:38 AM ткаленко кирилл 
wrote:

> Hi, Igniters!
>
> I'd like to share a new small feature in AI [1].
>
> Current rebalance logging does not allow you to quickly answer the following
> questions:
> 1) How long did the rebalance take (per supplier)?
> 2) How many records and bytes per supplier were rebalanced?
> 3) How many times did the rebalance restart?
> 4) Which partitions were rebalanced, and from which nodes were they received?
> 5) When did the rebalance for all cache groups end?
>
> What you can see in logs now:
>
> 1)Starting rebalance with order of cache groups.
> Rebalancing scheduled [order=[ignite-sys-cache, grp1, grp0],
> top=AffinityTopologyVersion [topVer=2, minorTopVer=0], force=false,
> evt=NODE_JOINED, node=c2146a04-dc23-4bc9-870d-dfbb55c1]
>
> 2)Start rebalance of cache group from a specific supplier, specifying
> partition ids and mode - historical or full.
> Starting rebalance routine [ignite-sys-cache,
> topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0],
> supplier=8c525892-703b-4fc4-b28b-b2f13970, fullPartitions=[0-99],
> histPartitions=[]]
>
> 3)Getting partial or complete partitions of cache group.
> Completed rebalancing [grp=ignite-sys-cache,
> supplier=8c525892-703b-4fc4-b28b-b2f13970,
> topVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], progress=1/2]
> Completed (final) rebalancing [grp=ignite-sys-cache,
> supplier=c2146a04-dc23-4bc9-870d-dfbb55c1,
> topVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], progress=2/2]
>
> 4)End rebalance of cache group.
> Completed rebalance future: RebalanceFuture [grp=CacheGroupContext
> [grp=ignite-sys-cache], topVer=AffinityTopologyVersion [topVer=2,
> minorTopVer=0], rebalanceId=1, routines=1, receivedBytes=1200,
> receivedKeys=0, partitionsLeft=0, startTime=1588519707607, endTime=-1,
> lastCancelledTime=-1]
>
> Rebalance statistics:
>
> To speed up rebalance analysis, statistics will be output for each cache
> group and in total for all cache groups.
> If the rebalance duration for a cache group is greater than the threshold value,
> the partition distribution is output.
> Statistics will allow you to analyze the rebalance duration for each supplier
> to understand which of them has been transmitting data for the longest time.
>
> System properties are used to output statistics:
>
> IGNITE_QUIET - to output statistics, the value must be false;
> IGNITE_WRITE_REBALANCE_PARTITION_DISTRIBUTION_THRESHOLD - threshold
> rebalance duration of a cache group after which the partition distribution is
> output, set in milliseconds; the default value is 10 minutes.
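
As an illustration, the proposed output could presumably be enabled programmatically before
node start as follows (IGNITE_QUIET is an existing Ignite system property; the threshold
property name is taken from the proposal above):

// Illustrative only: enable verbose output and set the distribution threshold to 5 minutes.
System.setProperty("IGNITE_QUIET", "false");
System.setProperty("IGNITE_WRITE_REBALANCE_PARTITION_DISTRIBUTION_THRESHOLD", "300000");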
>
> Statistics examples:
>
> Successful full and historical rebalance of group cache, without
> partitions distribution.
> Rebalance information per cache group (successful rebalance): [id=3181548,
> name=grp1, startTime=2020-04-13 10:55:16,117, finishTime=2020-04-13
> 10:55:16,127, d=10 ms, restarted=0] Supplier statistics: [nodeId=0, p=5,
> d=10 ms] [nodeId=1, p=5, d=10 ms] Aliases: p - partitions, e - entries, b -
> bytes, d - duration, h - historical, nodeId mapping
> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest1]
> [1=rebalancing.RebalanceStatisticsTest0]
> Rebalance information per cache group (successful rebalance): [id=3181547,
> name=grp0, startTime=2020-04-13 15:01:44,000, finishTime=2020-04-13
> 15:01:44,116, d=116 ms, restarted=0] Supplier statistics: [nodeId=0, hp=10,
> he=300, hb=30267, d=116 ms] Aliases: p - partitions, e - entries, b -
> bytes, d - duration, h - historical, nodeId mapping
> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest0]
>
> Successful full and historical rebalance of group cache, with partitions
> distribution.
> Rebalance information per cache group (successful rebalance): [id=3181548,
> name=grp1, startTime=2020-04-13 10:55:16,117, finishTime=2020-04-13
> 10:55:16,127, d=10 ms, restarted=0] Supplier statistics: [nodeId=0, p=5,
> d=10 ms] [nodeId=1, p=5, d=10 ms] Aliases: p - partitions, e - entries, b -
> bytes, d - duration, h - historical, nodeId mapping
> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest1]
> [1=rebalancing.RebalanceStatisticsTest0] Rebalance duration was greater
> than 5 ms, printing detailed information about partitions distribution
> (threshold can be changed by setting number of milliseconds into
> IGNITE_WRITE_REBALANCE_PARTITION_DISTRIBUTION_THRESHOLD) 0 =
> [0,bu,su],[1,bu],[2,pr,su] 1 = [0,bu,su],[1,bu],[2,pr,su] 2 =
> [0,bu,su],[1,bu],[2,pr,su] 3 = [0,bu,su],[1,bu],[2,pr,su] 4 =
> [0,bu,su],[1,bu],[2,pr,su] 5 = [0,bu,su],[1,bu],[2,pr,su] 6 =
> [0,bu,su],[1,bu],[2,pr,su] 7 = 

Re: Apache Ignite 2.8.1 RELEASE [Time, Scope, Manager]

2020-04-17 Thread Ivan Rakov
Hi,

I suggest including these fixes in the 2.8.1 release:
https://issues.apache.org/jira/browse/IGNITE-12101
https://issues.apache.org/jira/browse/IGNITE-12651

On Fri, Apr 17, 2020 at 11:32 AM Ivan Pavlukhin  wrote:

> Hi folks,
>
> A side note from an external spectator. Should not we reflect on the
> release page [1] who is a release manager?
>
> [1] https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8.1
>
> Best regards,
> Ivan Pavlukhin
>
> Fri, Apr 17, 2020 at 11:11, Nikolay Izhikov :
> >
> > Hello, Igniters.
> >
> > I’ve added all tickets proposed in this thread to 2.8.1 scope [1]
> > For now we have
> >
> > 61 resolved tickets.
> > 19 unresolved tickets.
> >
> >
> >
> > [1]
> https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8.1
> >
> > > 17 апр. 2020 г., в 01:38, Alex Plehanov 
> написал(а):
> > >
> > > Hello guys,
> > >
> > > I propose to cherry-pick to 2.8.1 following bug-fixes too, which are
> > > already in master:
> > > Scan query over an evicted partition can cause node failure
> (IGNITE-12734
> > > [1])
> > > Java thin client: There were problems with deserialization of some
> types on
> > > the client-side, these types can't be used (IGNITE-12624
> > > [2], IGNITE-12468 [3])
> > > Java thin client: Thread doesn't stop properly on client close when
> > > partition awareness is enabled, this prevents main() method from
> exiting
> > > (IGNITE-12743 [4])
> > >
> > > Also, there is a performance fix for checkpoint read lock, which I
> propose
> > > to cherry-pick too (IGNITE-12491 [5]). This fix brings significant
> > > performance boost on environments with a large number of CPUs (there
> was
> > > some drop on such environments introduced in 2.8.0 for all
> transactional
> > > operations after IGNITE-12593 fixing)
> > >
> > > WDYT?
> > >
> > > [1]: https://issues.apache.org/jira/browse/IGNITE-12734
> > > [2]: https://issues.apache.org/jira/browse/IGNITE-12624
> > > [3]: https://issues.apache.org/jira/browse/IGNITE-12468
> > > [4]: https://issues.apache.org/jira/browse/IGNITE-12743
> > > [5]: https://issues.apache.org/jira/browse/IGNITE-12491
> > >
> > > чт, 16 апр. 2020 г. в 18:48, Maxim Muzafarov :
> > >
> > >> Nikolay,
> > >>
> > >> Probably, we should not wait for all blocker issues in minor bug-fix
> > >> releases except very special cases. I think we should release all
> > >> accumulated bug-fixes `as is` and schedule the next 2.8.2 release.
> > >> This will allow us to have shorter minor releases.
> > >>
> > >> On Thu, 16 Apr 2020 at 18:17, Nikolay Izhikov 
> wrote:
> > >>>
> > >>> Hello, Igniters.
> > >>>
> > >>> I’ve started to work on this 2.8.1 release [1]
> > >>>
> > >>> Resolved issues for release(28) - [2]
> > >>> Unresolved issues for release(30) - [3]
> > >>>
> > >>> My next step:
> > >>>
> > >>> 1. I want to double-check that all commits for the tickets with the
> > >> fixVersion=2.8.1 are present in the corresponding release branch.
> > >>> And cherry-pick the lost changes.
> > >>>
> > >>> 2. I want to reduce the scope of the release and exclude tickets
> that is
> > >> not ready for now.
> > >>>
> > >>> As you may know, 2.8.1 is a bug fix release.
> > >>> Therefore, I think we can wait only for a blocker issues.
> > >>>
> > >>> What do you think?
> > >>>
> > >>> [1]
> > >>
> https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8.1
> > >>> [2]
> > >>
> https://issues.apache.org/jira/issues/?jql=(project%20%3D%20%27Ignite%27%20AND%20fixVersion%20is%20not%20empty%20AND%20fixVersion%20in%20(%272.8.1%27))%20AND%20(component%20is%20EMPTY%20OR%20component%20not%20in%20(documentation))%20and%20status%20in%20(%27CLOSED%27%2C%20%27RESOLVED%27)%20ORDER%20BY%20priority%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20
> > >>> [3]
> > >>
> https://issues.apache.org/jira/issues/?jql=(project%20%3D%20%27Ignite%27%20AND%20fixVersion%20is%20not%20empty%20AND%20fixVersion%20in%20(%272.8.1%27))%20AND%20(component%20is%20EMPTY%20OR%20component%20not%20in%20(documentation))%20%20and%20status%20not%20in%20(%27CLOSED%27%2C%20%27RESOLVED%27)%20ORDER%20BY%20priority%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20
> > >>>
> > >>>
> >  8 апр. 2020 г., в 20:15, Вячеслав Коптилин <
> slava.kopti...@gmail.com>
> > >> написал(а):
> > 
> >  Folks,
> > 
> >  I'd like to add ticket IGNITE-12805 "NullPointerException on node
> > >> restart
> >  when 3rd party persistence and Ignite native persistence are used"
> to
> >  ignite-2.8.1 scope.
> > 
> >  [1]  https://issues.apache.org/jira/browse/IGNITE-12805
> > 
> >  Thanks,
> >  S.
> > 
> >  вт, 7 апр. 2020 г. в 19:57, Ilya Kasnacheev <
> ilya.kasnach...@gmail.com
> > >>> :
> > 
> > > Hello!
> > >
> > > Done!
> > >
> > > Regards,
> > > --
> > > Ilya Kasnacheev
> > >
> > >
> > > вт, 7 апр. 2020 г. в 12:31, Sergey :
> > >
> > >> Hi,
> > >>
> > >> I'm proposing to add
> > >> 

Re: [DISCUSSION] Major changes in Ignite in 2020

2020-04-10 Thread Ivan Rakov
Hi everyone!

Major changes that are going to be contributed from our side:
- https://issues.apache.org/jira/browse/IGNITE-11704 - keeping tombstones
for removed entries to make rebalance consistent (this problem is solved by
on-heap deferred deletes queue so far).
- https://issues.apache.org/jira/browse/IGNITE-11147  - don't cancel
ongoing rebalance if affinity assignment for the rebalancing group wasn't
changed during the PME.
- A batch of other updates related to historical rebalance. The goal is to
make historical rebalance stable and to ensure that, if the WAL history is
configured properly, the cluster will be able to recover data consistency
via historical rebalance in case of any topology changes (including a cycling
restart).
- Overhaul of partition loss handling. It has several flaws so far; the
most critical one is that by default (with PartitionLossPolicy.IGNORE)
Ignite may silently lose data. Also, PartitionLossPolicy.IGNORE is
totally inapplicable to scenarios where persistence is enabled and the BLT is
established. Also, even safe policies have bugs: the LOST state is reset when
a node rejoins the cluster, so data can actually be lost even with a safe
policy. We are going to make a safe policy the default and fix the related bugs.
- Distributed tracing (via OpenCensus). Discovery, communication and
transactions will be covered.

On Fri, Apr 10, 2020 at 11:43 AM Anton Kalashnikov 
wrote:

> My top priorities:
> * Cache warm-up - loading data from disk to memory before the join to
> cluster -
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-40+Cache+warm-up
> * PDS Defragmentation - possibility to free up space on disc after
> removing entries
>
>
> --
> Best regards,
> Anton Kalashnikov
>
>
>
> 20.03.2020, 10:19, "Pavel Tupitsyn" :
>
> My top priorities:
>
>- Thin Client API extension: Compute, Continuous Queries, Services
>- .NET Near Cache: soon to come in Thick API, to be investigated for
>Thin Clients
>- .NET Modernization for Ignite 3.0: drop legacy .NET Framework
>support, target .NET Standard 2.0, add nullable annotations to the API
>
>
> On Fri, Mar 20, 2020 at 5:23 AM Saikat Maitra 
> wrote:
>
> Hi Denis,
>
> Thank you for sharing the list of top changes. The list looks good.
>
> I wanted to share that the efforts regarding IEP-36 are already underway and
> there are also open PRs under review, with review feedback being worked through.
> One of the areas we are focusing on is merging changes into the
> ignite-extensions repo first, before removing the corresponding migrated module from
> the ignite repo.
>
> There are also contributions from the community with bug fixes in the
> ignite-extensions repo, which we are verifying and merging into the
> ignite-extensions repo after running them through the CI pipeline in TeamCity.
>
> I like the focus area on docs and I really like the Apache Ignite Usecases
> page https://ignite.apache.org/provenusecases.html. I would like to
> suggest adding a "Powered by Apache Ignite" page that lists a few organizations
> who are already using Apache Ignite in production.
>
> Something similar to this page https://flink.apache.org/poweredby.html
>
> Regards,
> Saikat
>
>
>
>
>
>
> On Thu, Mar 19, 2020 at 1:44 PM Denis Magda  wrote:
>
> My top list of changes is as follows:
>
>- Feature: New lightweight Apache Ignite website with advanced search
>engine optimizations and updated technical content. Why? Much better
>discoverability of Ignite via search engines like Google to let many more
>application developers learn about Ignite's existence. This change is to be
>brought to life soon:
>
> http://apache-ignite-developers.2346864.n4.nabble.com/Ignite-Website-New-Look-td46324.html
>
>
>- Feature: New Ignite documentation on a new platform and with a new
>structure. Why? Ignite documentation has to help new application developers
>to get up and running as quickly as possible, it also has to become a
>primary source that answers most of the questions. Our current docs have a
>lot of gaps: https://issues.apache.org/jira/browse/IGNITE-7595
>
>
>- Process Change: to be successful with the point above, documentation
>should be created/updated before we close a JIRA ticket for
>code/API/feature contribution. Why? First, application developers learn
>Ignite and create their Ignite-apps referring to API reference and
>technical documentation (and not to the source code), thus, documentation
>needs to be treated as an integral part of the whole project. Second, while
>writing a new documentation paragraph we could discover incompleteness of a
>fix/feature or usability issues before the change is released publicly.
>
>
>- Feature: complete the modularization project by defining the Ignite
>core that will be released separately from Ignite extensions. The 'why' is
>written here:
>https://cwiki.apache.org/confluence/display/IGNITE/IEP-36%3A+Modularization
>
> -
> Denis
>
>
> On Thu, Mar 19, 2020 at 11:21 AM Denis Magda  

Re: Data vanished from cluster after INACTIVE/ACTIVE switch

2020-04-01 Thread Ivan Rakov
I don't think that making javadocs more descriptive can be considered as
harmful code base enlargement.
I'd recommend to extend the docs, but the last word is yours ;)

On Tue, Mar 31, 2020 at 2:44 PM Vladimir Steshin  wrote:

> Ivan, hi.
>
> I absolutely agree that this particular description is not enough to see the
> deactivation issue. I also vote for brief code.
>
> There are about 15 places in the internal logic with this description. I
> propose a balance between code base size and comment completeness.
>
> Should we enlarge the code even if we already have several full descriptions?
>
>
> On 30.03.2020 20:02, Ivan Rakov wrote:
> > Vladimir,
> >
> > @param forceDeactivation If {@code true}, cluster deactivation will be
> >> forced.
> > It's true that it's possible to infer semantics of forced deactivation
> from
> > other parts of API. I just wanted to highlight that exactly this
> > description explains something that can be guessed by the parameter name.
> > I suppose to shorten the lookup path and shed a light on deactivation
> > semantics a bit:
> >
> >> @param forceDeactivation If {@code true}, cluster will be deactivated
> even
> >> if running in-memory caches are present. All data in the corresponding
> >> caches will be vanished as a result.
> > Does this make sense?
> >
> > On Fri, Mar 27, 2020 at 12:00 PM Vladimir Steshin 
> > wrote:
> >
> >> Ivan, hi.
> >>
> >>
> >> 1) >>> Is it correct? If we are on the same page, let's proceed this way
> >>
> >> It is correct.
> >>
> >>
> >> 2) - In many places in the code I can see the following javadoc
> >>
> >>>@param forceDeactivation If {@code true}, cluster deactivation will
> be
> >> forced.
> >>
> >> In the internal params/flags. You can also find /@see
> >> ClusterState#INACTIVE/ and full description with several public APIs (
> >> like /Ignite.active(boolean)/ ):
> >>
> >> NOTE:
> >> Deactivation clears in-memory caches (without persistence) including
> >> the system caches.
> >>
> >> Should be enough. Is not?
> >>
> >>
> >> 27.03.2020 10:51, Ivan Rakov пишет:
> >>> Vladimir, Igniters,
> >>>
> >>> Let's emphasize our final plan.
> >>>
> >>> We are going to add --force flags that will be necessary to pass for a
> >>> deactivation if there are in-memory caches to:
> >>> 1) Rest API (already implemented in [1])
> >>> 2) Command line utility (already implemented in [1])
> >>> 3) JMX bean (going to be implemented in [2])
> >>> We are *not* going to change IgniteCluster or any other thick Java API,
> >>> thus we are *not* going to merge [3].
> >>> We plan to *fully rollback* [1] and [2] once cache data survival after
> >>> activation-deactivation cycle will be implemented.
> >>>
> >>> Is it correct? If we are on the same page, let's proceed this way.
> >>> I propose to:
> >>> - Create a JIRA issue for in-memory-data-safe deactivation (possibly,
> >>> without IEP and detailed design so far)
> >>> - Describe in the issue description what exact parts of API should be
> >>> removed under the issue scope.
> >>>
> >>> Also, a few questions on already merged [1]:
> >>> - We have removed GridClientClusterState#state(ClusterState) from Java
> >>> client API. Is it a legitimate thing to do? Don't we have to support
> API
> >>> compatibility for thin clients as well?
> >>> - In many places in the code I can see the following javadoc
> >>>
> >>>>@param forceDeactivation If {@code true}, cluster deactivation will
> >> be forced.
> >>>> As for me, this javadoc doesn't clarify anything. I'd suggest to
> >> describe
> >>> in which cases deactivation won't happen unless it's forced and which
> >>> impact forced deactivation will bring on the system.
> >>>
> >>> [1]: https://issues.apache.org/jira/browse/IGNITE-12701
> >>> [2]: https://issues.apache.org/jira/browse/IGNITE-12779
> >>> [3]: https://issues.apache.org/jira/browse/IGNITE-12614
> >>>
> >>> --
> >>> Ivan
> >>>
> >>> On Tue, Mar 24, 2020 at 7:18 PM Vladimir Steshin 
> >> wrote:
&

Re: Data vanished from cluster after INACTIVE/ACTIVE switch

2020-03-30 Thread Ivan Rakov
Vladimir,

@param forceDeactivation If {@code true}, cluster deactivation will be
> forced.

It's true that it's possible to infer the semantics of forced deactivation from
other parts of the API. I just wanted to highlight that this
description explains only something that can already be guessed from the parameter name.
I propose to shorten the lookup path and shed some light on deactivation
semantics:

> @param forceDeactivation If {@code true}, cluster will be deactivated even
> if running in-memory caches are present. All data in the corresponding
> caches will be vanished as a result.

Does this make sense?

On Fri, Mar 27, 2020 at 12:00 PM Vladimir Steshin 
wrote:

> Ivan, hi.
>
>
> 1) >>> Is it correct? If we are on the same page, let's proceed this way
>
> It is correct.
>
>
> 2) - In many places in the code I can see the following javadoc
>
> >   @param forceDeactivation If {@code true}, cluster deactivation will be
> forced.
>
> In the internal params/flags. You can also find /@see
> ClusterState#INACTIVE/ and full description with several public APIs (
> like /Ignite.active(boolean)/ ):
>
> NOTE:
> Deactivation clears in-memory caches (without persistence) including
> the system caches.
>
> Should be enough. Is not?
>
>
> On 27.03.2020 10:51, Ivan Rakov wrote:
> > Vladimir, Igniters,
> >
> > Let's emphasize our final plan.
> >
> > We are going to add --force flags that will be necessary to pass for a
> > deactivation if there are in-memory caches to:
> > 1) Rest API (already implemented in [1])
> > 2) Command line utility (already implemented in [1])
> > 3) JMX bean (going to be implemented in [2])
> > We are *not* going to change IgniteCluster or any other thick Java API,
> > thus we are *not* going to merge [3].
> > We plan to *fully rollback* [1] and [2] once cache data survival after
> > activation-deactivation cycle will be implemented.
> >
> > Is it correct? If we are on the same page, let's proceed this way.
> > I propose to:
> > - Create a JIRA issue for in-memory-data-safe deactivation (possibly,
> > without IEP and detailed design so far)
> > - Describe in the issue description what exact parts of API should be
> > removed under the issue scope.
> >
> > Also, a few questions on already merged [1]:
> > - We have removed GridClientClusterState#state(ClusterState) from Java
> > client API. Is it a legitimate thing to do? Don't we have to support API
> > compatibility for thin clients as well?
> > - In many places in the code I can see the following javadoc
> >
> >>   @param forceDeactivation If {@code true}, cluster deactivation will
> be forced.
> >>
> >> As for me, this javadoc doesn't clarify anything. I'd suggest to
> describe
> > in which cases deactivation won't happen unless it's forced and which
> > impact forced deactivation will bring on the system.
> >
> > [1]: https://issues.apache.org/jira/browse/IGNITE-12701
> > [2]: https://issues.apache.org/jira/browse/IGNITE-12779
> > [3]: https://issues.apache.org/jira/browse/IGNITE-12614
> >
> > --
> > Ivan
> >
> > On Tue, Mar 24, 2020 at 7:18 PM Vladimir Steshin 
> wrote:
> >
> >> Hi, Igniters.
> >>
> >> I'd like to remind you that cluster can be deactivated by user with 3
> >> utilities: control.sh, *JMX and the REST*. Proposed in [1] solution is
> >> not about control.sh. It suggests same approach regardless of the
> >> utility user executes. The task touches *only* *API of the user calls*,
> >> not the internal APIs.
> >>
> >> The reasons why flag “--yes” and confirmation prompt hasn’t taken into
> >> account for control.sh are:
> >>
> >> -Various commands widely use “--yes” just to start. Even not dangerous
> >> ones require “--yes” to begin. “--force” is dedicated for *harmless
> >> actions*.
> >>
> >> -Checking of probable data erasure works after command start and
> >> “--force” may not be required at all.
> >>
> >> -There are also JMX and REST. They have no “--yes” but should work alike.
> >>
> >>   To get the deactivation safe I propose to merge last ticket with
> >> the JMX fixes [2]. In future releases, I believe, we should estimate
> >> jobs and fix memory erasure in general. For now, let’s prevent it. WDYT?
> >>
> >>
> >> [1] https://issues.apache.org/jira/browse/IGNITE-12614
> >>
> >> [2] https://issues.apache.org/jira/browse/IGNITE-12779
> >>

Re: Security Subject of thin client on remote nodes

2020-03-27 Thread Ivan Rakov
Denis,

In general, the code changes look good to me. If we decide to keep the security API
in its current state for a while, I highly recommend extending its
documentation. We don't have descriptive javadocs or articles about the
security API so far, so I expect that future contributors will face
difficulties in untangling the security logic. Let's help them a bit.
See more details in my JIRA comment:
https://issues.apache.org/jira/browse/IGNITE-12759

On Thu, Mar 26, 2020 at 5:54 PM Ivan Rakov  wrote:

> Denis,
>
> I'll review your PR. If this issue is subject to be included in 2.8.1 as an
> emergency fix, I'm ok with the current API changes.
> Please think about driving the creation of an IEP on a security API overhaul prior to
> 2.9. I believe that you are the most suitable Ignite community member to
> drive this activity. I'd love to share some ideas as well.
>
> On Tue, Mar 24, 2020 at 2:04 PM Denis Garus  wrote:
>
>> Hi, guys!
>>
>>
>> I agree that we should rework the security API, but it can take a long
>> time.
>>
>> And currently, our users have certain impediments that are blockers for
>> their job.
>>
>> I think we have to fix bugs that IEP-41 [1] contains as soon as possible
>> to
>> support our users.
>>
>>  From my point of view, IEP-41 is the best place to track bug fixing.
>>
>>
>>
>>1.
>>
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-41%3A+Security+Context+of+thin+client+on+remote+nodes
>>
>>
>> вт, 24 мар. 2020 г. в 12:26, Ivan Rakov :
>>
>> > Alexey,
>> >
>> > That can be another version of our plan. If everyone agrees that
>> > SecurityContext and SecuritySubject should be merged, such fix of thin
>> > clients' issue will bring us closer to the final solution.
>> > Denis, what do you think?
>> >
>> > On Tue, Mar 24, 2020 at 10:38 AM Alexei Scherbakov <
>> > alexey.scherbak...@gmail.com> wrote:
>> >
>> > > Why can't we start gradually changing security API right now ?
>> > > I see no point in delaying with.
>> > > All changes will go to next 2.9 release anyway.
>> > >
>> > > My proposal:
>> > > 1. Get rid of security context. Doing this will bring security API to
>> > more
>> > > or less consistent state.
>> > > 2. Remove IEP-41 because it's no longer needed because of change [1]
>> > > 3. Propose an IEP to make security API avoid using internals.
>> > >
>> > >
>> > >
>> > > пн, 23 мар. 2020 г. в 19:53, Denis Garus :
>> > >
>> > > > Hello, Alexei, Ivan!
>> > > >
>> > > > >> Seems like security API is indeed a bit over-engineered
>> > > >
>> > > > Nobody has doubt we should do a reworking of GridSecurityProcessor.
>> > > > But this point is outside of scope thin client's problem that we are
>> > > > solving.
>> > > > I think we can create new IEP that will accumulate all ideas of
>> > Ignite's
>> > > > security improvements.
>> > > >
>> > > > >> Presence of the separate #securityContext(UUID) highlights that
>> user
>> > > > indeed should care
>> > > > >> about propagation of thin clients' contexts between the cluster
>> > nodes.
>> > > >
>> > > > I agree with Ivan. I've implemented both variants,
>> > > > and I like one with #securityContext(UUID) more.
>> > > >
>> > > > Could you please take a look at PR [1] for the issue [2]?
>> > > >
>> > > > 1. https://github.com/apache/ignite/pull/7523
>> > > > 2. https://issues.apache.org/jira/browse/IGNITE-12759
>> > > >
>> > > > пн, 23 мар. 2020 г. в 11:45, Ivan Rakov :
>> > > >
>> > > > > Alex, Denis,
>> > > > >
>> > > > > Seems like security API is indeed a bit over-engineered.
>> > > > >
>> > > > > Let's get rid of SecurityContext and use SecuritySubject instead.
>> > > > > > SecurityContext is just a POJO wrapper over
>> > > > > > SecuritySubject's
>> > > > > > org.apache.ignite.plugin.security.SecuritySubject#permissions.
>> > > > > > It's functionality can be easily moved to SecuritySubject.
>> > > > >
>> > > > > I totally agree. Both subject and context are implemented by
>> plugin
>

Re: Data vanished from cluster after INACTIVE/ACTIVE switch

2020-03-27 Thread Ivan Rakov
Vladimir, Igniters,

Let's recap our final plan.

We are going to add --force flags, which will have to be passed to deactivate
a cluster that contains in-memory caches, to:
1) Rest API (already implemented in [1])
2) Command line utility (already implemented in [1])
3) JMX bean (going to be implemented in [2])
We are *not* going to change IgniteCluster or any other thick Java API,
thus we are *not* going to merge [3].
We plan to *fully roll back* [1] and [2] once cache data survival across the
activation-deactivation cycle is implemented.

Is it correct? If we are on the same page, let's proceed this way.
I propose to:
- Create a JIRA issue for in-memory-data-safe deactivation (possibly,
without IEP and detailed design so far)
- Describe in the issue description what exact parts of API should be
removed under the issue scope.

Also, a few questions on already merged [1]:
- We have removed GridClientClusterState#state(ClusterState) from Java
client API. Is it a legitimate thing to do? Don't we have to support API
compatibility for thin clients as well?
- In many places in the code I can see the following javadoc

>  @param forceDeactivation If {@code true}, cluster deactivation will be 
> forced.
>
As for me, this javadoc doesn't clarify anything. I'd suggest describing
in which cases deactivation won't happen unless it's forced and what
impact forced deactivation will have on the system.
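For illustration, something along these lines would answer both questions
(just a sketch of the wording, to be aligned with the actual behavior):

    /**
     * @param forceDeactivation If {@code true}, deactivation proceeds even if the cluster
     *        contains caches in pure in-memory data regions; their data is not persisted
     *        and will be lost. If {@code false} and such caches exist, deactivation is
     *        rejected with an exception and the cluster stays active.
     */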

[1]: https://issues.apache.org/jira/browse/IGNITE-12701
[2]: https://issues.apache.org/jira/browse/IGNITE-12779
[3]: https://issues.apache.org/jira/browse/IGNITE-12614

--
Ivan

On Tue, Mar 24, 2020 at 7:18 PM Vladimir Steshin  wrote:

> Hi, Igniters.
>
> I'd like to remind you that cluster can be deactivated by user with 3
> utilities: control.sh, *JMX and the REST*. Proposed in [1] solution is
> not about control.sh. It suggests same approach regardless of the
> utility user executes. The task touches *only* *API of the user calls*,
> not the internal APIs.
>
> The reasons why flag “--yes” and confirmation prompt hasn’t taken into
> account for control.sh are:
>
> -Various commands widely use “--yes” just to start. Even not dangerous
> ones require “--yes” to begin. “--force” is dedicated for *harmless
> actions*.
>
> -Checking of probable data erasure works after command start and
> “--force” may not be required at all.
>
> -There are also JMX and REST. They have no “—yes” but should work alike.
>
>  To get the deactivation safe I propose to merge last ticket with
> the JMX fixes [2]. In future releases, I believe, we should estimate
> jobs and fix memory erasure in general. For now, let’s prevent it. WDYT?
>
>
> [1] https://issues.apache.org/jira/browse/IGNITE-12614
>
> [2] https://issues.apache.org/jira/browse/IGNITE-12779
>
>
> 24.03.2020 15:55, Вячеслав Коптилин пишет:
> > Hello Nikolay,
> >
> > I am talking about the interactive mode of the control utility, which
> > requires explicit confirmation from the user.
> > Please take a look at DeactivateCommand#prepareConfirmation and its
> usages.
> > It seems to me, this mode has the same aim as the forceDeactivation flag.
> > We can change the message returned by
> DeactivateCommand#confirmationPrompt
> > as follows:
> >  "Warning: the command will deactivate the cluster nnn and clear
> > in-memory caches (without persistence) including system caches."
> >
> > What do you think?
> >
> > Thanks,
> > S.
> >
> > вт, 24 мар. 2020 г. в 13:07, Nikolay Izhikov :
> >
> >> Hello, Slava.
> >>
> >> Are you talking about this commit [1] (sorry for commit message it’s due
> >> to the Github issue)?
> >>
> >> The message for this command for now
> >>
> >> «Deactivation stopped. Deactivation clears in-memory caches (without
> >> persistence) including the system caches.»
> >>
> >> Is it clear enough?
> >>
> >> [1]
> >>
> https://github.com/apache/ignite/commit/4921fcf1fecbd8a1ab02099e09cc2adb0b3ff88a
> >>
> >>
> >>> 24 марта 2020 г., в 13:02, Вячеслав Коптилин  >
> >> написал(а):
> >>> Hi Nikolay,
> >>>
>  1. We should add —force flag to the command.sh deactivation command.
> >>> I just checked and it seems that the deactivation command
> >>> (control-utility.sh) already has a confirmation option.
> >>> Perhaps, we need to clearly state the consequences of using this
> command
> >>> with in-memory caches.
> >>>
> >>> Thanks,
> >>> S.
> >>>
> >>> вт, 24 мар. 2020 г. в 12:51, Nikolay Izhikov :
> >>>
>  Hello, Alexey.
> 
>  I just repeat our agreement to be on the same page
> 
> > The confirmation should only present in the user-facing interfaces.
>  1. We should add —force flag to the command.sh deactivation command.
>  2. We should throw the exception if cluster has in-memory caches and
>  —force=false.
>  3. We shouldn’t change Java API for deactivation.
> 
>  Is it correct?
> 
> > The DROP TABLE command does not have a "yes I am sure" clause in it
>  I think it because the command itself has a «DROP» word in 

Re: Security Subject of thin client on remote nodes

2020-03-26 Thread Ivan Rakov
Denis,

I'll review your PR. If this issue is a candidate for inclusion in 2.8.1 in
emergency mode, I'm OK with the current API changes.
Please think about driving the creation of an IEP on a security API overhaul prior to
2.9. I believe that you are the most suitable Ignite community member to
drive this activity. I'd love to share some ideas as well.

On Tue, Mar 24, 2020 at 2:04 PM Denis Garus  wrote:

> Hi, guys!
>
>
> I agree that we should rework the security API, but it can take a long
> time.
>
> And currently, our users have certain impediments that are blockers for
> their job.
>
> I think we have to fix bugs that IEP-41 [1] contains as soon as possible to
> support our users.
>
>  From my point of view, IEP-41 is the best place to track bug fixing.
>
>
>
>1.
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-41%3A+Security+Context+of+thin+client+on+remote+nodes
>
>
> вт, 24 мар. 2020 г. в 12:26, Ivan Rakov :
>
> > Alexey,
> >
> > That can be another version of our plan. If everyone agrees that
> > SecurityContext and SecuritySubject should be merged, such fix of thin
> > clients' issue will bring us closer to the final solution.
> > Denis, what do you think?
> >
> > On Tue, Mar 24, 2020 at 10:38 AM Alexei Scherbakov <
> > alexey.scherbak...@gmail.com> wrote:
> >
> > > Why can't we start gradually changing security API right now ?
> > > I see no point in delaying with.
> > > All changes will go to next 2.9 release anyway.
> > >
> > > My proposal:
> > > 1. Get rid of security context. Doing this will bring security API to
> > more
> > > or less consistent state.
> > > 2. Remove IEP-41 because it's no longer needed because of change [1]
> > > 3. Propose an IEP to make security API avoid using internals.
> > >
> > >
> > >
> > > пн, 23 мар. 2020 г. в 19:53, Denis Garus :
> > >
> > > > Hello, Alexei, Ivan!
> > > >
> > > > >> Seems like security API is indeed a bit over-engineered
> > > >
> > > > Nobody has doubt we should do a reworking of GridSecurityProcessor.
> > > > But this point is outside of scope thin client's problem that we are
> > > > solving.
> > > > I think we can create new IEP that will accumulate all ideas of
> > Ignite's
> > > > security improvements.
> > > >
> > > > >> Presence of the separate #securityContext(UUID) highlights that
> user
> > > > indeed should care
> > > > >> about propagation of thin clients' contexts between the cluster
> > nodes.
> > > >
> > > > I agree with Ivan. I've implemented both variants,
> > > > and I like one with #securityContext(UUID) more.
> > > >
> > > > Could you please take a look at PR [1] for the issue [2]?
> > > >
> > > > 1. https://github.com/apache/ignite/pull/7523
> > > > 2. https://issues.apache.org/jira/browse/IGNITE-12759
> > > >
> > > > пн, 23 мар. 2020 г. в 11:45, Ivan Rakov :
> > > >
> > > > > Alex, Denis,
> > > > >
> > > > > Seems like security API is indeed a bit over-engineered.
> > > > >
> > > > > Let's get rid of SecurityContext and use SecuritySubject instead.
> > > > > > SecurityContext is just a POJO wrapper over
> > > > > > SecuritySubject's
> > > > > > org.apache.ignite.plugin.security.SecuritySubject#permissions.
> > > > > > It's functionality can be easily moved to SecuritySubject.
> > > > >
> > > > > I totally agree. Both subject and context are implemented by plugin
> > > > > provider, and I don't see any reason to keep both abstractions,
> > > > especially
> > > > > if we are going to get rid of transferring subject in node
> attributes
> > > > > (argument that subject is more lightweight won't work anymore).
> > > > >
> > > > > Also, there's kind of mess in node authentication logic. There are
> at
> > > > least
> > > > > two components responsible for it: DiscoverySpiNodeAuthenticator
> > (which
> > > > is
> > > > > forcibly set by GridDiscoveryManager, but in fact public) and
> > > > > GridSecurityProcessor (which performs actual node auth logic, but
> > > > private).
> > > > > I also don't understand why we need both
> > > > > #authenticate(AuthenticationContex

Re: Data vanished from cluster after INACTIVE/ACTIVE switch

2020-03-24 Thread Ivan Rakov
>
> I can’t agree with the «temporary» design.
> We have neither design nor IEP or contributor who can fix current behavior.
> And, if I understand Alexey Goncharyuk correctly, current behavior was
> implemented intentionally.

Alex, what do you think? Are we on the same page that the desired behavior for
deactivation is to keep the data of all in-memory caches, even though it
was intentionally implemented another way in 2.0?

On Tue, Mar 24, 2020 at 12:21 PM Nikolay Izhikov 
wrote:

> Hello, Ivan.
>
> > I believe we should fix the issue instead of adapting API to temporary
> flaws.
>
> Agree. Let’s fix it.
>
> >  I think that clear description of active(false) impact in the
> documentation is more than enough
>
> I can’t agree with this point.
>
> We shouldn’t imply the assumption that every user reads the whole
> documentation and completely understand the consequences of the
> deactivation command.
>
> This whole thread shows that even active core developers don't understand
> it.
>
> So my proposal is to remove --force flag only after we fix deactivation.
>
> > To sum it up, the question is whether we should reflect temporary system
> design flaws in the API
>
> I can’t agree with the «temporary» design.
> We have neither design nor IEP or contributor who can fix current behavior.
> And, if I understand Alexey Goncharyuk correctly, current behavior was
> implemented intentionally.
>
> So, my understanding that current implementation would be here for a while.
> And after we fix it I totally support removing —force flag.
>
> > 24 марта 2020 г., в 12:06, Ivan Rakov 
> написал(а):
> >
> >>
> >> I think the only question is - Do we need —force flag in Java API or
> not.
> >
> > From my perspective, there's also no agreement that it should be present
> > in the thin clients' API. For instance, I think it shouldn't.
> >
> > As far as I know, IGNITE_REUSE_MEMORY_ON_DEACTIVATE is for *other*
> purpose.
> >> Can you provide a simple reproducer when in-memory data not cleared on
> >> deactivation?
> >
> > Preserving in-memory data isn't implemented so far, so I can't provide a
> > reproducer. My point is that we are halfway through it: we can build a
> > solution based on IGNITE_REUSE_MEMORY_ON_DEACTIVATE and additional logic
> > with reusing memory pages.
> >
> > For me, the ultimate value of Ignite into real production environment is
> >> user data.
> >> If we have some cases when data is lost - we should avoid it as hard as
> we
> >> can.
> >>
> >> So, for now, this flag required.
> >
> > Totally agree that sudden vanishing of user data is unacceptable. But I
> > don't see how it implies that we have to solve this issue by tangling
> > public API. If we see that system behaves incorrectly, I believe we
> should
> > fix the issue instead of adapting API to temporary flaws. I think that
> > clear description of active(false) impact in the documentation is more
> than
> > enough: on the one hand, if user didn't read documentation for the method
> > he calls, he can't complain about the consequences; on the other hand, if
> > user decided to deactivate the cluster for no matter what, -force flag
> will
> > barely stop him.
> > We anyway have enough time before 2.9 to implement a proper solution.
> >
> > To sum it up, the question is whether we should reflect temporary system
> > design flaws in the API. I think, we surely shouldn't: API certainly
> lives
> > longer and is not intended to collect workarounds for all bugs that are
> > already fixed or planned to be fixed.
> > We can collect more opinions on this.
> >
> > On Tue, Mar 24, 2020 at 10:22 AM Nikolay Izhikov 
> > wrote:
> >
> >> Alexey.
> >>
> >> Having the way to silently vanish user data is even worse.
> >> So I’m strictly against removing —force flag.
> >>
> >>> 24 марта 2020 г., в 10:16, Alexei Scherbakov <
> >> alexey.scherbak...@gmail.com> написал(а):
> >>>
> >>> Nikolay,
> >>>
> >>> I'm on the same page with Ivan.
> >>>
> >>> Having "force" flag in public API as preposterous as having it in
> >>> System.exit.
> >>> For me it looks like badly designed API.
> >>> If a call to some method is dangerous it should be clearly specified in
> >> the
> >>> javadoc.
> >>> I'm also against some "temporary" API.
> >>>
> >>> We should:
> >>>
> >>> 1. Parti

Re: Security Subject of thin client on remote nodes

2020-03-24 Thread Ivan Rakov
Alexey,

That can be another version of our plan. If everyone agrees that
SecurityContext and SecuritySubject should be merged, such a fix of the thin
clients' issue will bring us closer to the final solution.
Denis, what do you think?

On Tue, Mar 24, 2020 at 10:38 AM Alexei Scherbakov <
alexey.scherbak...@gmail.com> wrote:

> Why can't we start gradually changing security API right now ?
> I see no point in delaying with.
> All changes will go to next 2.9 release anyway.
>
> My proposal:
> 1. Get rid of security context. Doing this will bring security API to more
> or less consistent state.
> 2. Remove IEP-41 because it's no longer needed because of change [1]
> 3. Propose an IEP to make security API avoid using internals.
>
>
>
> пн, 23 мар. 2020 г. в 19:53, Denis Garus :
>
> > Hello, Alexei, Ivan!
> >
> > >> Seems like security API is indeed a bit over-engineered
> >
> > Nobody has doubt we should do a reworking of GridSecurityProcessor.
> > But this point is outside of scope thin client's problem that we are
> > solving.
> > I think we can create new IEP that will accumulate all ideas of Ignite's
> > security improvements.
> >
> > >> Presence of the separate #securityContext(UUID) highlights that user
> > indeed should care
> > >> about propagation of thin clients' contexts between the cluster nodes.
> >
> > I agree with Ivan. I've implemented both variants,
> > and I like one with #securityContext(UUID) more.
> >
> > Could you please take a look at PR [1] for the issue [2]?
> >
> > 1. https://github.com/apache/ignite/pull/7523
> > 2. https://issues.apache.org/jira/browse/IGNITE-12759
> >
> > пн, 23 мар. 2020 г. в 11:45, Ivan Rakov :
> >
> > > Alex, Denis,
> > >
> > > Seems like security API is indeed a bit over-engineered.
> > >
> > > Let's get rid of SecurityContext and use SecuritySubject instead.
> > > > SecurityContext is just a POJO wrapper over
> > > > SecuritySubject's
> > > > org.apache.ignite.plugin.security.SecuritySubject#permissions.
> > > > It's functionality can be easily moved to SecuritySubject.
> > >
> > > I totally agree. Both subject and context are implemented by plugin
> > > provider, and I don't see any reason to keep both abstractions,
> > especially
> > > if we are going to get rid of transferring subject in node attributes
> > > (argument that subject is more lightweight won't work anymore).
> > >
> > > Also, there's kind of mess in node authentication logic. There are at
> > least
> > > two components responsible for it: DiscoverySpiNodeAuthenticator (which
> > is
> > > forcibly set by GridDiscoveryManager, but in fact public) and
> > > GridSecurityProcessor (which performs actual node auth logic, but
> > private).
> > > I also don't understand why we need both
> > > #authenticate(AuthenticationContext) and #authenticateNode(ClusterNode,
> > > SecurityCredentials) methods while it's possible to set explicit
> > > SecuritySubjectType.REMOTE_NODE in AuthenticationContext (this is
> > arguable;
> > > perhaps there are strong reasons).
> > >
> > > Finally, areas of responsibility between IgniteSecurity and
> > > GridSecurityProcessor are kind of mixed. As far as I understand, the
> > first
> > > is responsible for Ignite-internal management of security logic
> (keeping
> > > thread-local context, caching security contexts, etc; we don't expect
> > > IgniteSecurity to be replaced by plugin provider) and the latter is
> > > responsible for user-custom authentication / authorization logic. To be
> > > honest, it took plenty of time to figure this out for me.
> > >
> > > From my point of view, we should make GridSecurityProcessor interface
> > > public, rename it (it requires plenty of time to find the difference
> from
> > > IgniteSecurity), make its API as simple and non-duplicating as possible
> > and
> > > clarify its area of responsibility (e.g. should it be responsible for
> > > propagation of successfully authenticated subject among all nodes or
> > not?)
> > > to make it easy to embed custom security logic in Ignite.
> > >
> > > Regarding thin clients fix: implementation made by Denis suits better
> to
> > > the very implicit contract that it's better to change API contracts of
> an
> > > internal IgniteSecurity than of internal GridSecurityProcessor (which
> > > actually mustn't be internal).
> > &

Re: Data vanished from cluster after INACTIVE/ACTIVE switch

2020-03-24 Thread Ivan Rakov
>
> I think the only question is - Do we need —force flag in Java API or not.

 From my perspective, there's also no agreement that it should be present
in the thin clients' API. For instance, I think it shouldn't.

As far as I know, IGNITE_REUSE_MEMORY_ON_DEACTIVATE is for *other* purpose.
> Can you provide a simple reproducer when in-memory data not cleared on
> deactivation?

 Preserving in-memory data isn't implemented so far, so I can't provide a
reproducer. My point is that we are halfway through it: we can build a
solution based on IGNITE_REUSE_MEMORY_ON_DEACTIVATE and additional logic
with reusing memory pages.

For me, the ultimate value of Ignite into real production environment is
> user data.
> If we have some cases when data is lost - we should avoid it as hard as we
> can.
>
> So, for now, this flag required.

Totally agree that sudden vanishing of user data is unacceptable. But I
don't see how it implies that we have to solve this issue by tangling the
public API. If we see that the system behaves incorrectly, I believe we should
fix the issue instead of adapting the API to temporary flaws. I think that a
clear description of the active(false) impact in the documentation is more than
enough: on the one hand, if a user didn't read the documentation for the method
he calls, he can't complain about the consequences; on the other hand, if a
user has decided to deactivate the cluster no matter what, a -force flag will
barely stop him.
We have enough time before 2.9 to implement a proper solution anyway.

To sum it up, the question is whether we should reflect temporary system
design flaws in the API. I think we surely shouldn't: the API certainly lives
longer and is not intended to collect workarounds for all bugs that are
already fixed or planned to be fixed.
We can collect more opinions on this.

On Tue, Mar 24, 2020 at 10:22 AM Nikolay Izhikov 
wrote:

> Alexey.
>
> Having the way to silently vanish user data is even worse.
> So I’m strictly against removing —force flag.
>
> > 24 марта 2020 г., в 10:16, Alexei Scherbakov <
> alexey.scherbak...@gmail.com> написал(а):
> >
> > Nikolay,
> >
> > I'm on the same page with Ivan.
> >
> > Having "force" flag in public API as preposterous as having it in
> > System.exit.
> > For me it looks like badly designed API.
> > If a call to some method is dangerous it should be clearly specified in
> the
> > javadoc.
> > I'm also against some "temporary" API.
> >
> > We should:
> >
> > 1. Partially remove IGNITE-12701 except javadoc part. Note control.sh
> for a
> > long time has support for a confirmation on deactivation (interactive
> mode).
> > 2. IGNITE_REUSE_MEMORY_ON_DEACTIVATE=true already preserves memory
> content
> > after deactivation. We should start working on restoring page memory
> state
> > after subsequent reactivation.
> > 3. Describe current behavior for in-memory cache on deactivation in
> Ignite
> > documentation.
> >
> >
> > пн, 23 мар. 2020 г. в 21:22, Nikolay Izhikov :
> >
> >> Hello, Ivan.
> >>
> >>> Seems like we don't have a final agreement on whether we should add
> force
> >> flag to deactivation API.
> >>
> >> I think the only question is - Do we need —force flag in Java API or
> not.
> >>
> >>
> >>> As a final solution, I'd want to see behavior when all in-memory data
> is
> >> available after deactivation and further activation.
> >>
> >> Agree.
> >>
> >>> I believe it’s possible to don't deallocate memory
> >>> (like mentioned before, we already can achieve that with
> >> IGNITE_REUSE_MEMORY_ON_DEACTIVATE=true) and carefully reuse all loaded
> data
> >> pages on next activation and caches start.
> >>
> >> As far as I know, IGNITE_REUSE_MEMORY_ON_DEACTIVATE is for *other*
> purpose.
> >> Can you provide a simple reproducer when in-memory data not cleared on
> >> deactivation?
> >>
> >>> Considering this, do we really need to introduce force flag as a
> >> temporary precaution?
> >>
> >> My answer is yes we need it.
> >> Right now, we can’t prevent data loss on deactivation for in-memory
> caches.
> >>
> >> For me, the ultimate value of Ignite into real production environment is
> >> user data.
> >> If we have some cases when data is lost - we should avoid it as hard as
> we
> >> can.
> >>
> >> So, for now, this flag required.
> >>
> >>> I suggest to rollback [2] from AI master, stop working on [1] and focus
> >> on how to implement keeping in-memory d

Re: Data vanished from cluster after INACTIVE/ACTIVE switch

2020-03-23 Thread Ivan Rakov
Folks,

Let's revive this discussion until it's too late and all API changes are
merged to master [1].
Seems like we don't have a final agreement on whether we should add force
flag to deactivation API.

First of all, I think we are all on the same page that in-memory cache
data vanishing on deactivation is counter-intuitive and dangerous. As a
final solution, I'd want to see behavior where all in-memory data is
available after deactivation and subsequent activation. I believe it's
possible to not deallocate memory (as mentioned before, we can already
achieve that with IGNITE_REUSE_MEMORY_ON_DEACTIVATE=true) and carefully
reuse all loaded data pages on the next activation and cache start.
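For reference, a minimal sketch of how that property is applied today (as far as I
understand, it only keeps the memory regions allocated; restoring the data pages on
reactivation is exactly the missing part):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class DeactivationMemoryReuseSketch {
    public static void main(String[] args) {
        // Must be set before the node starts; the property name is taken from this thread.
        System.setProperty("IGNITE_REUSE_MEMORY_ON_DEACTIVATE", "true");

        Ignite ignite = Ignition.start(new IgniteConfiguration());

        ignite.cluster().active(false); // memory regions stay allocated
        ignite.cluster().active(true);  // but in-memory data is not restored yet
    }
}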

Also, this is a wider question, but: do we understand what cluster
deactivation is actually intended for? I can only think of two cases:
- graceful cluster shutdown: an ability to cut checkpoints and to finish the
transactional load consistently prior to stopping all nodes
- blocking all API calls (both reads and writes) during some maintenance
Neither of them requires forcefully clearing all in-memory data on
deactivation. If everyone agrees, from now on we should treat data
clearing as a system design flaw that should be fixed, not as a possible
scenario which we should support at the API level.

Considering this, do we really need to introduce a force flag as a temporary
precaution? I have at least two reasons against it:
1) Once the API is changed and released, we have to support it until the next
major release. If we all understand that the data vanishing issue is fixable, I
believe we shouldn't engrave flags in the API that will become pointless.
2) A more personal one, but I'm against any force flags in the API. They
make the API harder to understand; more than that, the presence of such flags
just highlights that the API is poorly designed.

I suggest rolling back [2] from AI master, stopping work on [1], and focusing on
how to keep in-memory data after deactivation.
I think we can still require user consent for deactivation via control.sh
(it already requires --yes) and JMX.

Thoughts?

[1]: https://issues.apache.org/jira/browse/IGNITE-12614
[2]: https://issues.apache.org/jira/browse/IGNITE-12701

--
Ivan


On Tue, Mar 17, 2020 at 2:26 PM Vladimir Steshin  wrote:

> Nikolay, I think we should reconsider clearing at least system caches
> when deactivating.
>
> 17.03.2020 14:18, Nikolay Izhikov пишет:
> > Hello, Vladimir.
> >
> > I don’t get it.
> >
> > What is your proposal?
> > What we should do?
> >
> >> 17 марта 2020 г., в 14:11, Vladimir Steshin 
> написал(а):
> >>
> >> Nikolay, hi.
> >>
> > And should be covered with the  —force parameter we added.
> >> As fix for user cases - yes. My idea is to emphasize overall ability to
> lose various objects, not only data. Probably might be reconsidered in
> future.
> >>
> >>
> >> 17.03.2020 13:49, Nikolay Izhikov пишет:
> >>> Hello, Vladimir.
> >>>
> >>> If there is at lease one persistent data region then system data
> region also becomes persistent.
> >>> Your example applies only to pure in-memory clusters.
> >>>
> >>> And should be covered with the —force parameter we added.
> >>>
> >>> What do you think?
> >>>
>  17 марта 2020 г., в 13:45, Vladimir Steshin 
> написал(а):
> 
>   Hi, all.
> 
>  Fixes for control.sh and the REST have been merged. Could anyone take
> a look to the previous email with an issue? Isn't this conductvery wierd?
> 
>


Re: Re[2]: Discuss idle_verify with moving partitions changes.

2020-03-23 Thread Ivan Rakov
Partial results are consistent though.
I'd add something like "Possible results are not full" instead.

On Mon, Mar 23, 2020 at 12:47 PM Zhenya Stanilovsky
 wrote:

>
> Guys thank for quick response, Ivan what do you think about Vlad`s
> proposal to add additional info like :
> "Possible results are not consistent due to rebalance still in progress" ?
> Thanks !
>
> >Понедельник, 23 марта 2020, 12:30 +03:00 от Ivan Rakov <
> ivan.glu...@gmail.com>:
> >
> >Zhenya,
> >
> >As for me, the current behavior of idle_verify looks correct.
> >There's no sense in checking MOVING partitions (on which we explicitly
> >inform user), however checking consistency between the rest of owners
> still
> >makes sense: they still can diverge and we can be aware of the presence of
> >the conflicts sooner.
> >In case cluster is not idle (in terms of user activities, not in terms of
> >internal cluster processes like rebalancing), utility will fail as
> expected.
> >
> >On Mon, Mar 23, 2020 at 11:23 AM Vladislav Pyatkov <
> vpyat...@gridgain.com >
> >wrote:
> >
> >> Hi Zhenya,
> >>
> >> I see your point. Need to show some message, because cluster is not idle
> >> (rebalance is going).
> >> When cluster not idle we cannot validate partitions honestly. After
> several
> >> minutes we can to get absolutely different result, without any client's
> >> operation of cache happened.
> >>
> >> May be enough showing some message more clear for end user. For example:
> >> "Result has not valid, rebalance is going."
> >>
> >> Another thing you meaning - issue in indexes, when rebalance is
> following.
> >> I think idex_validate should fail in this case, because indexes always
> in
> >> load during rebalance.
> >>
> >>
> >> On Mon, Mar 23, 2020 at 10:20 AM Zhenya Stanilovsky
> >> < arzamas...@mail.ru.invalid > wrote:
> >>
> >> >
> >> > Igniters, i found that near idle check commands only shows partitions
> in
> >> > MOVING states as info in log and not take into account this fact as
> >> > erroneous idle cluster state.
> >> > control.sh --cache idle_verify, control.sh --cache validate_indexes
> >> > --check-crc
> >> >
> >> > for example command would show something like :
> >> >
> >> > Arguments: --cache idle_verify --yes
> >> >
> >> >
> >>
> 
> >> > idle_verify task was executed with the following args: caches=[],
> >> > excluded=[], cacheFilter=[DEFAULT]
> >> > idle_verify check has finished, no conflicts have been found.
> >> > Verification was skipped for 21 MOVING partitions:
> >> > Skipped partition: PartitionKeyV2 [grpId=1544803905, grpName=default,
> >> > partId=7]
> >> > Partition instances: [PartitionHashRecordV2 [isPrimary=false,
> >> > consistentId=gridCommandHandlerTest2, updateCntr=3,
> >> partitionState=MOVING,
> >> > state=MOVING]] .. and so on
> >> >
> >> > I found this erroneous and can lead to further cluster index
> corruption,
> >> > for example in case when only command OK result checked.
> >> >
> >> > If no objections would be here, i plan to inform about moving states
> as
> >> > not OK exit code too.
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Vladislav Pyatkov
> >> Architect-Consultant "GridGain Rus" Llc.
> >>  +7-929-537-79-60
> >>
>
>
>
>


Re: Discuss idle_verify with moving partitions changes.

2020-03-23 Thread Ivan Rakov
Zhenya,

As for me, the current behavior of idle_verify looks correct.
There's no sense in checking MOVING partitions (about which we explicitly
inform the user); however, checking consistency between the rest of the owners
still makes sense: they can still diverge, and this way we become aware of
conflicts sooner.
In case the cluster is not idle (in terms of user activity, not in terms of
internal cluster processes like rebalancing), the utility will fail as expected.

On Mon, Mar 23, 2020 at 11:23 AM Vladislav Pyatkov 
wrote:

> Hi Zhenya,
>
> I see your point. Need to show some message, because cluster is not idle
> (rebalance is going).
> When cluster not idle we cannot validate partitions honestly. After several
> minutes we can to get absolutely different result, without any client's
> operation of cache happened.
>
> May be enough showing some message more clear for end user. For example:
> "Result has not valid, rebalance is going."
>
> Another thing you meaning - issue in indexes, when rebalance is following.
> I think idex_validate should fail in this case, because indexes always in
> load during rebalance.
>
>
> On Mon, Mar 23, 2020 at 10:20 AM Zhenya Stanilovsky
>  wrote:
>
> >
> > Igniters, i found that near idle check commands only shows partitions in
> > MOVING states as info in log and not take into account this fact as
> > erroneous idle cluster state.
> > control.sh --cache idle_verify, control.sh --cache validate_indexes
> > --check-crc
> >
> > for example command would show something like :
> >
> > Arguments: --cache idle_verify --yes
> >
> >
> 
> > idle_verify task was executed with the following args: caches=[],
> > excluded=[], cacheFilter=[DEFAULT]
> > idle_verify check has finished, no conflicts have been found.
> > Verification was skipped for 21 MOVING partitions:
> > Skipped partition: PartitionKeyV2 [grpId=1544803905, grpName=default,
> > partId=7]
> > Partition instances: [PartitionHashRecordV2 [isPrimary=false,
> > consistentId=gridCommandHandlerTest2, updateCntr=3,
> partitionState=MOVING,
> > state=MOVING]] .. and so on
> >
> > I found this erroneous and can lead to further cluster index corruption,
> > for example in case when only command OK result checked.
> >
> > If no objections would be here, i plan to inform about moving states as
> > not OK exit code too.
> >
> >
>
>
>
> --
> Vladislav Pyatkov
> Architect-Consultant "GridGain Rus" Llc.
> +7-929-537-79-60
>


Re: Security Subject of thin client on remote nodes

2020-03-23 Thread Ivan Rakov
Alex, Denis,

Seems like security API is indeed a bit over-engineered.

Let's get rid of SecurityContext and use SecuritySubject instead.
> SecurityContext is just a POJO wrapper over
> SecuritySubject's
> org.apache.ignite.plugin.security.SecuritySubject#permissions.
> It's functionality can be easily moved to SecuritySubject.

I totally agree. Both subject and context are implemented by the plugin
provider, and I don't see any reason to keep both abstractions, especially
if we are going to get rid of transferring the subject in node attributes
(the argument that the subject is more lightweight won't hold anymore).

Also, there's a kind of mess in the node authentication logic. There are at least
two components responsible for it: DiscoverySpiNodeAuthenticator (which is
forcibly set by GridDiscoveryManager, but is in fact public) and
GridSecurityProcessor (which performs the actual node auth logic, but is private).
I also don't understand why we need both
#authenticate(AuthenticationContext) and #authenticateNode(ClusterNode,
SecurityCredentials) methods while it's possible to set an explicit
SecuritySubjectType.REMOTE_NODE in AuthenticationContext (this is arguable;
perhaps there are strong reasons).

Finally, the areas of responsibility between IgniteSecurity and
GridSecurityProcessor are kind of mixed. As far as I understand, the former
is responsible for Ignite-internal management of security logic (keeping the
thread-local context, caching security contexts, etc.; we don't expect
IgniteSecurity to be replaced by a plugin provider) and the latter is
responsible for user-custom authentication / authorization logic. To be
honest, it took me plenty of time to figure this out.

From my point of view, we should make the GridSecurityProcessor interface
public, rename it (it takes plenty of time to figure out how it differs from
IgniteSecurity), make its API as simple and non-duplicating as possible, and
clarify its area of responsibility (e.g. should it be responsible for
propagating a successfully authenticated subject among all nodes or not?)
to make it easy to embed custom security logic in Ignite.

Regarding the thin clients fix: the implementation made by Denis fits better with
the very implicit contract that it's better to change the API contracts of the
internal IgniteSecurity than of the internal GridSecurityProcessor (which
actually shouldn't be internal).

> My approach doesn't require any IEPs, just minor change in code and to
>
> org.apache.ignite.internal.processors.security.IgniteSecurity#authenticate(AuthenticationContext)
> contract.

Looks like a misuse of the #authenticate method to me. It should perform the
initial authentication based on credentials (this may include queries to an
external authentication subsystem, e.g. LDAP). The user may want to avoid
authenticating a thin client on every node (this would increase the number of
requests to the auth subsystem unless the user implements propagation of
thin clients' contexts between nodes and makes #authenticate cluster-wide
idempotent: the first call performs the actual authentication, and subsequent
calls retrieve the context of the already authenticated client). The presence of
the separate #securityContext(UUID) highlights that the user indeed should care
about propagating thin clients' contexts between the cluster nodes.
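To make that concrete, here is a purely illustrative sketch of the kind of contract I
mean (the class and method names are made up; this is not the actual internal interface):

import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative only: shows why the plugin has to care about subjectId -> context
// resolution on nodes other than the one that performed the initial authentication.
public class IllustrativeSecurityPlugin {
    /** Already authenticated thin clients: subject id -> opaque plugin-specific context. */
    private final ConcurrentMap<UUID, Object> ctxCache = new ConcurrentHashMap<>();

    /** Expensive, credential-based authentication (e.g. an LDAP round-trip). */
    public Object authenticate(UUID subjId, Object credentials) {
        Object ctx = expensiveAuthentication(credentials);

        ctxCache.put(subjId, ctx);

        // A real plugin would also have to make this entry visible cluster-wide
        // (replicated cache, custom discovery message, etc.).
        return ctx;
    }

    /** Cheap lookup by subject id, called on remote nodes instead of re-authenticating. */
    public Object securityContext(UUID subjId) {
        return ctxCache.get(subjId);
    }

    private Object expensiveAuthentication(Object credentials) {
        return new Object(); // placeholder for the plugin-specific logic
    }
}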

--
Ivan

On Fri, Mar 20, 2020 at 12:22 PM Veena Mithare 
wrote:

> Hi Alexei, Denis,
>
> One of the main usecases of thin client authentication is to be able to
> audit the changes done using the thin client user.
> To enable that :
> We really need to resolve this concern as well :
> https://issues.apache.org/jira/browse/IGNITE-12781
>
> ( Incorrect security subject id is  associated with a cache_put event
> when the originator of the event is a thin client. )
>
> Regards,
> Veena
>
>
> -Original Message-
> From: Alexei Scherbakov 
> Sent: 18 March 2020 08:11
> To: dev 
> Subject: Re: Security Subject of thin client on remote nodes
>
> Denis Garus,
>
> Both variants are capable of solving the thin client security context
> problem.
>
> My approach doesn't require any IEPs, just minor change in code and to
>
> org.apache.ignite.internal.processors.security.IgniteSecurity#authenticate(AuthenticationContext)
> contract.
> We can add appropriate documentation to emphasize this.
> The argument "fragile" is not very convincing for me.
>
> I think we should collect more opinions before proceeding with IEP.
>
> Considering a fact we actually *may not care* about compatibility (I've
> already explained why), I'm thinking of another approach.
> Let's get rid of SecurityContext and use SecuritySubject instead.
> SecurityContext is just a POJO wrapper over SecuritySubject's
> org.apache.ignite.plugin.security.SecuritySubject#permissions.
> It's functionality can be easily moved to SecuritySubject.
>
> What do you think?
>
>
>
> пн, 16 мар. 2020 г. в 15:47, Denis Garus :
>
> >  Hello, Alexei!
> >
> > I agree with you if we may not care about compatibility at all, then
> > we can solve the problem much more straightforward 

Re: [DISCUSSION] Deprecation of obsolete rebalancing functionality

2020-02-13 Thread Ivan Rakov
Hello,

+1 from me for rebalance delay deprecation.
I can imagine only one actual use case for this option: preventing excessive load
on the cluster in case of temporary short-term topology changes (e.g. a node
is stopped for a while and then returns).
Now it's handled by baseline auto adjustment in a much more correct way:
partitions are not reassigned within a maintenance interval (unlike with
the rebalance delay).
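For comparison, this is roughly how the same protection is expressed with baseline auto
adjustment (a sketch; 'ignite' is a started node, and the method names are as I remember
them in current master):

// Do not change the baseline (and thus the partition assignment) unless the
// topology change lasts longer than 5 minutes.
ignite.cluster().baselineAutoAdjustEnabled(true);
ignite.cluster().baselineAutoAdjustTimeout(5 * 60 * 1000L); // 5 minutes

// A node that leaves and comes back within that interval does not change the baseline,
// hence no partition reassignment and no rebalancing - the effect rebalanceDelay
// used to approximate per cache group.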
I also don't think that the ability to configure the rebalance delay per cache is
crucial.

> rebalanceOrder is also useless, agreed.
+1
Except for one case: we may want to rebalance caches with
CacheRebalanceMode.SYNC first. But anyway, this behavior doesn't require a
separate property to be enabled.
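A sketch of what I mean, using the existing CacheConfiguration properties
(CacheConfiguration and CacheRebalanceMode are the usual org.apache.ignite classes):

CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("important-cache");

// Cache operations wait until the initial rebalancing of this cache has finished.
cacheCfg.setRebalanceMode(CacheRebalanceMode.SYNC);

// The same "rebalance me first" intent is what rebalanceOrder expresses today;
// this is one of the properties proposed for deprecation.
cacheCfg.setRebalanceOrder(1);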

On Wed, Feb 12, 2020 at 4:54 PM Alexei Scherbakov <
alexey.scherbak...@gmail.com> wrote:

> Maxim,
>
> rebalanceDelay was introduced before the BLT appear in the product to solve
> scenarios which are now solved by BLT.
>
> It's pointless for me having it in the product since BLT was introduced.
>
> I do not think delaying rebalancing per cache group has any meaning. I
> cannot image any reason for it.
>
> rebalanceOrder is also useless, agreed.
>
>
>
>
> ср, 12 февр. 2020 г. в 16:19, Maxim Muzafarov :
>
> > Alexey,
> >
> > Why do you think delaying of historical rebalance (on BLT node join)
> > for particular cache groups is not the real world use case? Probably
> > the same topic may be started on user-list to collect more use cases
> > from real users.
> >
> > In general, I support reducing the number of available rebalance
> > configuration parameters, but we should do it really carefully.
> > I can also propose - rebalanceOrder param for removing.
> >
> > On Wed, 12 Feb 2020 at 15:50, Alexei Scherbakov
> >  wrote:
> > >
> > > Maxim,
> > >
> > > In general rebalanceDelay is used to delay/disable rebalance then
> > topology
> > > is changed.
> > > Right now we have BLT to avoid unnecesary rebalancing when topology is
> > > changed.
> > > If a node left from cluster topology no rebalancing happens until the
> > node
> > > explicitly removed from baseline topology.
> > >
> > > I would like to know real world scenarios which can not be covered by
> BLT
> > > configuration.
> > >
> > >
> > >
> > > ср, 12 февр. 2020 г. в 15:16, Maxim Muzafarov :
> > >
> > > > Alexey,
> > > >
> > > > > All scenarios where rebalanceDelay has meaning are handled by
> > baseline
> > > > topology now.
> > > >
> > > > Can you, please, provide more details here e.g. the whole list of
> > > > scenarios where rebalanceDelay is used and how these handled by
> > > > baseline topology?
> > > >
> > > > Actually, I doubt that it covers exactly all the cases due to
> > > > rebalanceDelay is a "per cache group property" rather than "baseline"
> > > > is meaningful for the whole topology.
> > > >
> > > > On Wed, 12 Feb 2020 at 12:58, Alexei Scherbakov
> > > >  wrote:
> > > > >
> > > > > I've meant baseline topology.
> > > > >
> > > > > ср, 12 февр. 2020 г. в 12:41, Alexei Scherbakov <
> > > > > alexey.scherbak...@gmail.com>:
> > > > >
> > > > > >
> > > > > > V.Pyatkov
> > > > > >
> > > > > > Doesn't rebalance topology solves it ?
> > > > > >
> > > > > > ср, 12 февр. 2020 г. в 12:31, V.Pyatkov :
> > > > > >
> > > > > >> Hi,
> > > > > >>
> > > > > >> I am sure we can to reduce this ability, but do not completely.
> > > > > >> We can use rebalance delay for disable it until manually
> > triggered.
> > > > > >>
> > > > > >> CacheConfiguration#setRebalanceDelay(-1)
> > > > > >>
> > > > > >> It may helpful for cluster where can not allow performance drop
> > from
> > > > > >> rebalance at any time.
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >> Sent from:
> http://apache-ignite-developers.2346864.n4.nabble.com/
> > > > > >>
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Best regards,
> > > > > > Alexei Scherbakov
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Best regards,
> > > > > Alexei Scherbakov
> > > >
> > >
> > >
> > > --
> > >
> > > Best regards,
> > > Alexei Scherbakov
> >
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>


Re: [VOTE] Allow or prohibit a joint use of @deprecated and @IgniteExperimental

2020-02-10 Thread Ivan Rakov
-1 Prohibit

From my point of view, deprecating an existing API will confuse users
when the API suggested as a replacement is marked with @IgniteExperimental.
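To be explicit about the pattern I'm voting against (the names below are made up and
don't refer to any concrete Ignite API):

import org.apache.ignite.lang.IgniteExperimental;

public interface SomeIgnitePublicApi {
    /** @deprecated Superseded by {@link #newOperation()}. */
    @Deprecated
    void oldOperation();

    /** The suggested replacement - itself still marked as experimental. */
    @IgniteExperimental
    void newOperation();
}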

On Mon, Feb 10, 2020 at 12:20 PM Nikolay Izhikov 
wrote:

> +1
>
> > 10 февр. 2020 г., в 11:57, Andrey Mashenkov 
> написал(а):
> >
> > -1 Prohibit.
> >
> > We must not deprecate old API without have a new stable well-documented
> > alternative and a way to migrate to new one.
> >
> >
> > On Mon, Feb 10, 2020 at 11:02 AM Alexey Goncharuk  >
> > wrote:
> >
> >> Dear Apache Ignite community,
> >>
> >> We would like to conduct a formal vote on the subject of whether to
> allow
> >> or prohibit a joint existence of @deprecated annotation for an old API
> >> and @IgniteExperimental [1] for a new (replacement) API. The result of
> this
> >> vote will be formalized as an Apache Ignite development rule to be used
> in
> >> future.
> >>
> >> The discussion thread where you can address all non-vote messages is
> [2].
> >>
> >> The votes are:
> >> *[+1 Allow]* Allow to deprecate the old APIs even when new APIs are
> marked
> >> with @IgniteExperimental to explicitly notify users that an old APIs
> will
> >> be removed in the next major release AND new APIs are available.
> >> *[-1 Prohibit]* Never deprecate the old APIs unless the new APIs are
> stable
> >> and released without @IgniteExperimental. The old APIs javadoc may be
> >> updated with a reference to new APIs to encourage users to evaluate new
> >> APIs. The deprecation and new API release may happen simultaneously if
> the
> >> new API is not marked with @IgniteExperimental or the annotation is
> removed
> >> in the same release.
> >>
> >> Neither of the choices prohibits deprecation of an API without a
> >> replacement if community decides so.
> >>
> >> The vote will hold for 72 hours and will end on February 13th 2020 08:00
> >> UTC:
> >>
> >>
> https://www.timeanddate.com/countdown/to?year=2020=2=13=8=0=0=utc-1
> >>
> >> All votes count, there is no binding/non-binding status for this.
> >>
> >> [1]
> >>
> >>
> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/lang/IgniteExperimental.java
> >> [2]
> >>
> >>
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSS-Public-API-deprecation-rules-td45647.html
> >>
> >> Thanks,
> >> --AG
> >>
> >
> >
> > --
> > Best regards,
> > Andrey V. Mashenkov
>
>


Re: Forbid mixed cache groups with both atomic and transactional caches

2020-02-05 Thread Ivan Rakov
Ivan,

Thanks for pointing this out. Less than one day is indeed too early to
treat this discussion thread as a "community conclusion". Still, the
consensus among the current participants made me feel that a conclusion
would be reached.
We'll surely get back to the discussion if opposing opinions arise.

On Wed, Feb 5, 2020 at 1:11 PM Ivan Pavlukhin  wrote:

> Folks,
>
> A bit of offtop. Do we have some recommendations in the community how
> long should we wait until treating something as "a Community
> conclusion"? It worries me a little bit that I see a discussion for a
> first time and there is already a conclusion. And the discussion was
> started lesser than 24 hours ago. I suppose we should allow everyone
> interested to share an opinion (here I agree with the proposal) and it
> usually requires some time in open-source communities.
>
> ср, 5 февр. 2020 г. в 10:58, Ivan Rakov :
> >
> > Folks,
> >
> > Thanks for your feedback.
> > I've created a JIRA issue on this change:
> > https://issues.apache.org/jira/browse/IGNITE-12622
> >
> > On Tue, Feb 4, 2020 at 10:43 PM Denis Magda  wrote:
> >
> > > +1 from my end. It doesn't sound like a big deal if Ignite users need
> to
> > > define separate groups for atomic and transactional caches.
> > >
> > > -
> > > Denis
> > >
> > >
> > > On Tue, Feb 4, 2020 at 3:28 AM Ivan Rakov 
> wrote:
> > >
> > > > Igniters,
> > > >
> > > > Apparently it's possible in Ignite to configure a cache group with
> both
> > > > ATOMIC and TRANSACTIONAL caches.
> > > > Proof: IgniteCacheGroupsTest#testContinuousQueriesMultipleGroups*
> tests.
> > > > In my opinion, it would be better to remove such possibility from the
> > > > product. There are several reasons:
> > > >
> > > > 1) The original idea of grouping caches was optimizing storage
> overhead
> > > and
> > > > PME time by joining data of similar caches into the same partitions.
> > > ATOMIC
> > > > and TRANSACTIONAL caches provide different guarantees and are
> designed
> > > for
> > > > different use cases, thus they can hardly be called "similar".
> > > >
> > > > 2) Diving deeper: synchronization protocols and possible reasons for
> > > > primary-backup divergences are conceptually different for ATOMIC and
> > > > TRANSACTIONAL cases. In TRANSACTIONAL case, transactions recovery
> > > protocol
> > > > allows to recover consistency if any participating node will fail,
> but
> > > for
> > > > ATOMIC caches there's possible scenario with failure of primary node
> > > where
> > > > neither of backups will contain the most recent state of the data.
> > > Example:
> > > > one backup have received updates 1, 3, 5 while another have received
> 2, 4
> > > > (which is possible due to message reordering), and even tracking
> counters
> > > > [1] won't restore the consistency. The problem is that we can't
> > > distinguish
> > > > what kind of conflict we have faced in case update counters have
> diverged
> > > > in a mixed group.
> > > >
> > > > 3) Mixed groups are poorly tested. I can't find any tests except a
> couple
> > > > of smoke tests in IgniteCacheGroupsTest. We can't be sure that
> different
> > > > synchronization protocols will work correctly for such
> configurations,
> > > > especially under load and with a variety of dependent configuration
> > > > parameters.
> > > >
> > > > 4) I have never heard of any feedback on mixed groups. I have asked
> > > > different people on this and no one recalled any attempts to
> configure
> > > such
> > > > groups. I believe that in fact no one has ever tried to do it.
> > > >
> > > > Please let me know if you are aware of any cases where mixed groups
> are
> > > > used or reasons to keep them. Otherwise I'll create a ticket to
> prohibit
> > > > mixed configurations.
> > > >
> > > > [1]: https://issues.apache.org/jira/browse/IGNITE-11797
> > > >
> > > > --
> > > > Best Regards,
> > > > Ivan Rakov
> > > >
> > >
>
>
>
> --
> Best regards,
> Ivan Pavlukhin
>


Re: Forbid mixed cache groups with both atomic and transactional caches

2020-02-04 Thread Ivan Rakov
Folks,

Thanks for your feedback.
I've created a JIRA issue on this change:
https://issues.apache.org/jira/browse/IGNITE-12622

On Tue, Feb 4, 2020 at 10:43 PM Denis Magda  wrote:

> +1 from my end. It doesn't sound like a big deal if Ignite users need to
> define separate groups for atomic and transactional caches.
>
> -
> Denis
>
>
> On Tue, Feb 4, 2020 at 3:28 AM Ivan Rakov  wrote:
>
> > Igniters,
> >
> > Apparently it's possible in Ignite to configure a cache group with both
> > ATOMIC and TRANSACTIONAL caches.
> > Proof: IgniteCacheGroupsTest#testContinuousQueriesMultipleGroups* tests.
> > In my opinion, it would be better to remove such possibility from the
> > product. There are several reasons:
> >
> > 1) The original idea of grouping caches was optimizing storage overhead
> and
> > PME time by joining data of similar caches into the same partitions.
> ATOMIC
> > and TRANSACTIONAL caches provide different guarantees and are designed
> for
> > different use cases, thus they can hardly be called "similar".
> >
> > 2) Diving deeper: synchronization protocols and possible reasons for
> > primary-backup divergences are conceptually different for ATOMIC and
> > TRANSACTIONAL cases. In TRANSACTIONAL case, transactions recovery
> protocol
> > allows to recover consistency if any participating node will fail, but
> for
> > ATOMIC caches there's possible scenario with failure of primary node
> where
> > neither of backups will contain the most recent state of the data.
> Example:
> > one backup have received updates 1, 3, 5 while another have received 2, 4
> > (which is possible due to message reordering), and even tracking counters
> > [1] won't restore the consistency. The problem is that we can't
> distinguish
> > what kind of conflict we have faced in case update counters have diverged
> > in a mixed group.
> >
> > 3) Mixed groups are poorly tested. I can't find any tests except a couple
> > of smoke tests in IgniteCacheGroupsTest. We can't be sure that different
> > synchronization protocols will work correctly for such configurations,
> > especially under load and with a variety of dependent configuration
> > parameters.
> >
> > 4) I have never heard of any feedback on mixed groups. I have asked
> > different people on this and no one recalled any attempts to configure
> such
> > groups. I believe that in fact no one has ever tried to do it.
> >
> > Please let me know if you are aware of any cases where mixed groups are
> > used or reasons to keep them. Otherwise I'll create a ticket to prohibit
> > mixed configurations.
> >
> > [1]: https://issues.apache.org/jira/browse/IGNITE-11797
> >
> > --
> > Best Regards,
> > Ivan Rakov
> >
>


[jira] [Created] (IGNITE-12622) Forbid mixed cache groups with both atomic and transactional caches

2020-02-04 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12622:
---

 Summary: Forbid mixed cache groups with both atomic and 
transactional caches
 Key: IGNITE-12622
 URL: https://issues.apache.org/jira/browse/IGNITE-12622
 Project: Ignite
  Issue Type: Improvement
  Components: cache
Reporter: Ivan Rakov
 Fix For: 2.9


Apparently it's possible in Ignite to configure a cache group with both ATOMIC 
and TRANSACTIONAL caches.
Proof: IgniteCacheGroupsTest#testContinuousQueriesMultipleGroups* tests.
As discussed on the dev list 
(http://apache-ignite-developers.2346864.n4.nabble.com/Forbid-mixed-cache-groups-with-both-atomic-and-transactional-caches-td45586.html),
the community has concluded that such configurations should be prohibited.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Forbid mixed cache groups with both atomic and transactional caches

2020-02-04 Thread Ivan Rakov
Anton,

Indeed, that's +1 point for forbidding mixed configurations.

On Tue, Feb 4, 2020 at 2:36 PM Anton Vinogradov  wrote:

> Seems, we already started the separation by atomic operations restriction
> inside the transactions [1].
> See no reason to allow mixes in this case.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-2313
>
> On Tue, Feb 4, 2020 at 2:28 PM Ivan Rakov  wrote:
>
> > Igniters,
> >
> > Apparently it's possible in Ignite to configure a cache group with both
> > ATOMIC and TRANSACTIONAL caches.
> > Proof: IgniteCacheGroupsTest#testContinuousQueriesMultipleGroups* tests.
> > In my opinion, it would be better to remove such possibility from the
> > product. There are several reasons:
> >
> > 1) The original idea of grouping caches was optimizing storage overhead
> and
> > PME time by joining data of similar caches into the same partitions.
> ATOMIC
> > and TRANSACTIONAL caches provide different guarantees and are designed
> for
> > different use cases, thus they can hardly be called "similar".
> >
> > 2) Diving deeper: synchronization protocols and possible reasons for
> > primary-backup divergences are conceptually different for ATOMIC and
> > TRANSACTIONAL cases. In TRANSACTIONAL case, transactions recovery
> protocol
> > allows to recover consistency if any participating node will fail, but
> for
> > ATOMIC caches there's possible scenario with failure of primary node
> where
> > neither of backups will contain the most recent state of the data.
> Example:
> > one backup have received updates 1, 3, 5 while another have received 2, 4
> > (which is possible due to message reordering), and even tracking counters
> > [1] won't restore the consistency. The problem is that we can't
> distinguish
> > what kind of conflict we have faced in case update counters have diverged
> > in a mixed group.
> >
> > 3) Mixed groups are poorly tested. I can't find any tests except a couple
> > of smoke tests in IgniteCacheGroupsTest. We can't be sure that different
> > synchronization protocols will work correctly for such configurations,
> > especially under load and with a variety of dependent configuration
> > parameters.
> >
> > 4) I have never heard of any feedback on mixed groups. I have asked
> > different people on this and no one recalled any attempts to configure
> such
> > groups. I believe that in fact no one has ever tried to do it.
> >
> > Please let me know if you are aware of any cases where mixed groups are
> > used or reasons to keep them. Otherwise I'll create a ticket to prohibit
> > mixed configurations.
> >
> > [1]: https://issues.apache.org/jira/browse/IGNITE-11797
> >
> > --
> > Best Regards,
> > Ivan Rakov
> >
>


Forbid mixed cache groups with both atomic and transactional caches

2020-02-04 Thread Ivan Rakov
Igniters,

Apparently it's possible in Ignite to configure a cache group with both
ATOMIC and TRANSACTIONAL caches.
Proof: IgniteCacheGroupsTest#testContinuousQueriesMultipleGroups* tests.
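For clarity, this is the kind of configuration in question (a minimal sketch; 'ignite'
stands for a started node):

import java.util.Arrays;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.configuration.CacheConfiguration;

CacheConfiguration<Integer, String> atomicCfg = new CacheConfiguration<Integer, String>("atomic-cache")
    .setGroupName("mixed-group")
    .setAtomicityMode(CacheAtomicityMode.ATOMIC);

CacheConfiguration<Integer, String> txCfg = new CacheConfiguration<Integer, String>("tx-cache")
    .setGroupName("mixed-group")
    .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);

// Currently this is accepted; the proposal is to reject it with a clear error.
ignite.createCaches(Arrays.asList(atomicCfg, txCfg));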
In my opinion, it would be better to remove such possibility from the
product. There are several reasons:

1) The original idea of grouping caches was optimizing storage overhead and
PME time by joining data of similar caches into the same partitions. ATOMIC
and TRANSACTIONAL caches provide different guarantees and are designed for
different use cases, thus they can hardly be called "similar".

2) Diving deeper: synchronization protocols and possible reasons for
primary-backup divergence are conceptually different for the ATOMIC and
TRANSACTIONAL cases. In the TRANSACTIONAL case, the transaction recovery protocol
allows recovering consistency if any participating node fails, but for
ATOMIC caches there's a possible scenario with a failure of the primary node where
neither of the backups contains the most recent state of the data. Example:
one backup has received updates 1, 3, 5 while another has received 2, 4
(which is possible due to message reordering), and even tracking counters
[1] won't restore consistency. The problem is that we can't distinguish
what kind of conflict we have faced when update counters have diverged
in a mixed group.

3) Mixed groups are poorly tested. I can't find any tests except a couple
of smoke tests in IgniteCacheGroupsTest. We can't be sure that different
synchronization protocols will work correctly for such configurations,
especially under load and with a variety of dependent configuration
parameters.

4) I have never heard of any feedback on mixed groups. I have asked
different people on this and no one recalled any attempts to configure such
groups. I believe that in fact no one has ever tried to do it.

Please let me know if you are aware of any cases where mixed groups are
used or reasons to keep them. Otherwise I'll create a ticket to prohibit
mixed configurations.

[1]: https://issues.apache.org/jira/browse/IGNITE-11797

-- 
Best Regards,
Ivan Rakov


[jira] [Created] (IGNITE-12607) PartitionsExchangeAwareTest is flaky

2020-01-30 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12607:
---

 Summary: PartitionsExchangeAwareTest is flaky
 Key: IGNITE-12607
 URL: https://issues.apache.org/jira/browse/IGNITE-12607
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
Assignee: Ivan Rakov
 Fix For: 2.9


Proof: 
https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_Cache6/4972239
Seems like cache update sometimes is not possible even before topologies are 
locked.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Wrong results on Scan queries on REPLICATED caches during rebalance

2020-01-16 Thread Ivan Rakov
Hi Sergey,

Just FYI: a similar problem with replicated caches has been fixed in SQL
[1].
If you have a reproducer, you may check whether your issue is still relevant.

[1]: https://issues.apache.org/jira/browse/IGNITE-12482

On Thu, Jan 16, 2020 at 1:51 PM Sergey-A Kosarev 
wrote:

> Classification: Public
> Hello, Igniters,
>
> Recently I've came across a problem with REPLICATED caches, so I've
> created an issue:
> https://issues.apache.org/jira/browse/IGNITE-12549
>
> Please look at this. I believe, it's a bug.
>
> Not sure I could fix it quickly, feel free to take it if you like.
>
> And as workaround I think PARTITIONED caches with Integer.MAX_VALUE
> backups can be used instead of REPLICATED caches.
>
> Will be glad for any feedback.
>
> Kind regards,
> Sergey Kosarev
>
>
>
>
> ---
> This e-mail may contain confidential and/or privileged information. If you
> are not the intended recipient (or have received this e-mail in error)
> please notify the sender immediately and delete this e-mail. Any
> unauthorized copying, disclosure or distribution of the material in this
> e-mail is strictly forbidden.
>
> Please refer to https://www.db.com/disclosures for additional EU
> corporate and regulatory disclosures and to
> http://www.db.com/unitedkingdom/content/privacy.htm for information about
> privacy.
>


-- 
Best Regards,
Ivan Rakov


[jira] [Created] (IGNITE-12545) Introduce listener interface for components to react to partition map exchange events

2020-01-15 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12545:
---

 Summary: Introduce listener interface for components to react to 
partition map exchange events
 Key: IGNITE-12545
 URL: https://issues.apache.org/jira/browse/IGNITE-12545
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov


It would be handy to have a listener interface for components that should react 
to PME instead of adding more and more calls to 
GridDhtPartitionsExchangeFuture.
In general, there are four possible moments when a component can be notified: on 
exchange init (before and after topologies are updated and the exchange latch is 
acquired) and on exchange done (before and after readyTopVer is incremented and 
user operations are unlocked).
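
For illustration, a rough sketch of what such a listener interface might look 
like (the interface and method names below are hypothetical and may differ from 
the final implementation):

{code:java}
public interface PartitionsExchangeAware {
    /** Called on exchange init, before topologies are updated. */
    default void onInitBeforeTopologyLock(GridDhtPartitionsExchangeFuture fut) {}

    /** Called on exchange init, after topologies are updated and the exchange latch is acquired. */
    default void onInitAfterTopologyLock(GridDhtPartitionsExchangeFuture fut) {}

    /** Called on exchange done, before readyTopVer is incremented. */
    default void onDoneBeforeTopologyUnlock(GridDhtPartitionsExchangeFuture fut) {}

    /** Called on exchange done, after user operations are unlocked. */
    default void onDoneAfterTopologyUnlock(GridDhtPartitionsExchangeFuture fut) {}
}
{code}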



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Apache Ignite 2.8 RELEASE [Time, Scope, Manager]

2020-01-13 Thread Ivan Rakov
> > > > >>
> > > > >> >
> > > > >> >
> > > > >> > Agree with Nikolay, -1 from me, too.
> > > > >> >
> > > > >> > >Hello, Igniters.
> > > > >> > >
> > > > >> > >I’m -1 to include the read-only patch to 2.8.
> > > > >> > >I think we shouldn’t accept any patches to 2.8 except bug fixes
> > for
> > > > >> > blockers and major issues.
> > > > >> > >
> > > > >> > >Guys, we don’t release Apache Ignite for 13 months!
> > > > >> > >We should focus on the release and make it ASAP.
> > > > >> > >
> > > > >> > >We can’t extend the scope anymore.
> > > > >> > >
> > > > >> > >> 10 янв. 2020 г., в 04:29, Sergey Antonov <
> > > > antonovserge...@gmail.com >
> > > > >> > написал(а):
> > > > >> > >>
> > > > >> > >> Hello, Maxim!
> > > > >> > >>
> > > > >> > >>> This PR [2] doesn't look a very simple +5,517 −2,038, 111
> > files
> > > > >> > >> changed.
> > > > >> > >> Yes, PR is huge, but I wrote a lot of new tests and reworked
> > > > already
> > > > >> > >> presented. Changes in product code are minimal - only 30
> > changed
> > > > files
> > > > >> > in
> > > > >> > >> /src/main/ part. And most of them are new control.sh commands
> > and
> > > > >> > >> configuration.
> > > > >> > >>
> > > > >> > >>> Do we have customer requests for this feature or maybe users
> > who
> > > > are
> > > > >> > >> waiting for exactly that ENUM values exactly in 2.8 release
> > (not
> > > > the
> > > > >> > 2.8.1
> > > > >> > >> for instance)?
> > > > >> > >> Can we introduce in new features in maintanance release
> > (2.8.1)?
> > > > Cluster
> > > > >> > >> read-only mode will be new feature, if we remove
> > > > IgniteCluster#readOnly
> > > > >> > in
> > > > >> > >> 2.8 release. If all ok with that, lets remove
> > > > IgniteCluster#readOnly and
> > > > >> > >> move ticket [1] to 2.8.1 release.
> > > > >> > >>
> > > > >> > >>> Do we have extended test results report (on just only TC.Bot
> > green
> > > > >> > visa)
> > > > >> > >> on this feature to be sure that we will not add any blocker
> > issues
> > > > to
> > > > >> > the
> > > > >> > >> release?
> > > > >> > >> I'm preparing patch for 2.8 release and I will get new TC Bot
> > visa
> > > > vs
> > > > >> > >> release branch.
> > > > >> > >>
> > > > >> > >> [1]  https://issues.apache.org/jira/browse/IGNITE-12225
> > > > >> > >>
> > > > >> > >>
> > > > >> > >>
> > > > >> > >> чт, 9 янв. 2020 г. в 19:38, Maxim Muzafarov <
> > mmu...@apache.org
> > > > >:
> > > > >> > >>
> > > > >> > >>> Folks,
> > > > >> > >>>
> > > > >> > >>>
> > > > >> > >>> Let me remind you that we are working on the 2.8 release
> > branch
> > > > >> > >>> stabilization currently (please, keep it in mind).
> > > > >> > >>>
> > > > >> > >>>
> > > > >> > >>> Do we have a really STRONG reason for adding such a change
> > [1] to
> > > > the
> > > > >> > >>> ignite-2.8 branch? This PR [2] doesn't look a very simple
> > +5,517
> > > > >> > >>> −2,038, 111 files changed.
> > > > >> > >>> Do we have customer requests for this feature or maybe users
> > who
> > > > are
> > > > >> > >>

[jira] [Created] (IGNITE-12531) Cluster is unable to change BLT on 2.8 if storage was initially created on 2.7 or less

2020-01-13 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12531:
---

 Summary: Cluster is unable to change BLT on 2.8 if storage was 
initially created on 2.7 or less
 Key: IGNITE-12531
 URL: https://issues.apache.org/jira/browse/IGNITE-12531
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.8
Reporter: Ivan Rakov
 Fix For: 2.8


Due to a bug in https://issues.apache.org/jira/browse/IGNITE-10348, after storage 
migration from 2.7 or earlier to 2.8, metastorage updates are not persisted.

S2R:
(on 2.7)
- Activate persistent cluster with 2 nodes
- Shutdown the cluster

(on 2.8)
- Start cluster with 2 nodes based on persistent storage from 2.7
- Start 3rd node
- Change baseline
- Shutdown the cluster
- Start initial two nodes
- Start 3rd node (join is rejected: the first two nodes have the old BLT of two nodes, 
the 3rd node has the new BLT of three nodes)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Hint for user that baseline topology should be changed in order to trigger rebalance

2020-01-09 Thread Ivan Rakov
Folks,

Since 2.4, Ignite cluster requires baseline topology in persistent mode.
That means that if the user wants to scale the cluster and add more nodes, data
won't be redistributed among the whole node set until
IgniteCluster#setBaselineTopology is called manually.

Surely this behavior is well documented, but don't we need to give the user a
hint that baseline topology should be managed manually? I think a log
message with something like "Current set of nodes differs from baseline
topology, please call XXX in order to trigger rebalance and redistribute
your data" would make the situation a bit more transparent.

Right now we have only this message

> [2020-01-07T19:36:45,997][INFO
> ][exchange-worker-#39%blue-54.158.100.161%][GridCachePartitionExchangeManager]
>  Skipping
> rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=2,
> minorTopVer=0], force=false, evt=NODE_JOINED, node=57bc10fe-1505-4e8e-9987-
> 52c9c903c6ef]

which doesn't properly explain what's going on.
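
For reference, a minimal sketch of the manual step such a hint would point the
user to (using the public IgniteCluster API):

{code:java}
// After the new node has joined, reset the baseline to the current topology
// so that rebalancing starts and data gets redistributed.
Ignite ignite = Ignition.ignite();

long topVer = ignite.cluster().topologyVersion();

ignite.cluster().setBaselineTopology(topVer);
{code}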


Re: Apache Ignite 2.8 RELEASE [Time, Scope, Manager]

2020-01-09 Thread Ivan Rakov
Maxim M. and anyone who is interested,

I suggest including this fix in the 2.8 release:
https://issues.apache.org/jira/browse/IGNITE-12225
Basically, it's a result of the following discussion:
http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-Single-point-in-API-for-changing-cluster-state-td43665.html

The fix affects the public API: the IgniteCluster#readOnly methods that work with
a boolean are replaced with ones that work with an enum.
If we include it, we won't be obliged to keep the deprecated boolean version of
the API in the code (which is currently present in the 2.8 branch), as it wasn't
published in any release.
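
For clarity, the enum-based API is expected to be used roughly like this
(sketch only; the exact method and enum constant names are defined by
IGNITE-12225 and may differ):

{code:java}
// Instead of ignite.cluster().readOnly(true):
ignite.cluster().state(ClusterState.ACTIVE_READ_ONLY);

// And back to normal read-write operation:
ignite.cluster().state(ClusterState.ACTIVE);
{code}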

On Tue, Dec 31, 2019 at 3:54 PM Ilya Kasnacheev 
wrote:

> Hello!
>
> I have ran dependency checker plugin and quote the following:
>
> One or more dependencies were identified with known vulnerabilities in
> ignite-urideploy:
> One or more dependencies were identified with known vulnerabilities in
> ignite-spring:
> One or more dependencies were identified with known vulnerabilities in
> ignite-spring-data:
> One or more dependencies were identified with known vulnerabilities in
> ignite-aop:
> One or more dependencies were identified with known vulnerabilities in
> ignite-visor-console:
>
> spring-core-4.3.18.RELEASE.jar
> (pkg:maven/org.springframework/spring-core@4.3.18.RELEASE,
> cpe:2.3:a:pivotal_software:spring_framework:4.3.18.release:*:*:*:*:*:*:*,
> cpe:2.3:a:springsource:spring_framework:4.3.18.release:*:*:*:*:*:*:*,
> cpe:2.3:a:vmware:springsource_spring_framework:4.3.18:*:*:*:*:*:*:*) :
> CVE-2018-15756
>
> One or more dependencies were identified with known vulnerabilities in
> ignite-spring-data_2.0:
>
> spring-core-5.0.8.RELEASE.jar
> (pkg:maven/org.springframework/spring-core@5.0.8.RELEASE,
> cpe:2.3:a:pivotal_software:spring_framework:5.0.8.release:*:*:*:*:*:*:*,
> cpe:2.3:a:springsource:spring_framework:5.0.8.release:*:*:*:*:*:*:*,
> cpe:2.3:a:vmware:springsource_spring_framework:5.0.8:*:*:*:*:*:*:*) :
> CVE-2018-15756
>
> One or more dependencies were identified with known vulnerabilities in
> ignite-rest-http:
>
> jetty-server-9.4.11.v20180605.jar
> (pkg:maven/org.eclipse.jetty/jetty-server@9.4.11.v20180605,
> cpe:2.3:a:eclipse:jetty:9.4.11:20180605:*:*:*:*:*:*,
> cpe:2.3:a:jetty:jetty:9.4.11.v20180605:*:*:*:*:*:*:*,
> cpe:2.3:a:mortbay_jetty:jetty:9.4.11:20180605:*:*:*:*:*:*) :
> CVE-2018-12545, CVE-2019-10241, CVE-2019-10247
> jackson-databind-2.9.6.jar
> (pkg:maven/com.fasterxml.jackson.core/jackson-databind@2.9.6,
> cpe:2.3:a:fasterxml:jackson:2.9.6:*:*:*:*:*:*:*,
> cpe:2.3:a:fasterxml:jackson-databind:2.9.6:*:*:*:*:*:*:*) :
> CVE-2018-1000873, CVE-2018-14718, CVE-2018-14719, CVE-2018-14720,
> CVE-2018-14721, CVE-2018-19360, CVE-2018-19361, CVE-2018-19362,
> CVE-2019-12086, CVE-2019-12384, CVE-2019-12814, CVE-2019-14379,
> CVE-2019-14439, CVE-2019-14540, CVE-2019-16335, CVE-2019-16942,
> CVE-2019-16943, CVE-2019-17267, CVE-2019-17531
>
> One or more dependencies were identified with known vulnerabilities in
> ignite-kubernetes:
> One or more dependencies were identified with known vulnerabilities in
> ignite-aws:
>
> jackson-databind-2.9.6.jar
> (pkg:maven/com.fasterxml.jackson.core/jackson-databind@2.9.6,
> cpe:2.3:a:fasterxml:jackson:2.9.6:*:*:*:*:*:*:*,
> cpe:2.3:a:fasterxml:jackson-databind:2.9.6:*:*:*:*:*:*:*) :
> CVE-2018-1000873, CVE-2018-14718, CVE-2018-14719, CVE-2018-14720,
> CVE-2018-14721, CVE-2018-19360, CVE-2018-19361, CVE-2018-19362,
> CVE-2019-12086, CVE-2019-12384, CVE-2019-12814, CVE-2019-14379,
> CVE-2019-14439, CVE-2019-14540, CVE-2019-16335, CVE-2019-16942,
> CVE-2019-16943, CVE-2019-17267, CVE-2019-17531
> bcprov-ext-jdk15on-1.54.jar
> (pkg:maven/org.bouncycastle/bcprov-ext-jdk15on@1.54) : CVE-2015-6644,
> CVE-2016-1000338, CVE-2016-1000339, CVE-2016-1000340, CVE-2016-1000341,
> CVE-2016-1000342, CVE-2016-1000343, CVE-2016-1000344, CVE-2016-1000345,
> CVE-2016-1000346, CVE-2016-1000352, CVE-2016-2427, CVE-2017-13098,
> CVE-2018-1000180, CVE-2018-1000613
>
> One or more dependencies were identified with known vulnerabilities in
> ignite-gce:
>
> httpclient-4.0.1.jar (pkg:maven/org.apache.httpcomponents/httpclient@4.0.1
> ,
> cpe:2.3:a:apache:httpclient:4.0.1:*:*:*:*:*:*:*) : CVE-2011-1498,
> CVE-2014-3577, CVE-2015-5262
> guava-jdk5-17.0.jar (pkg:maven/com.google.guava/guava-jdk5@17.0,
> cpe:2.3:a:google:guava:17.0:*:*:*:*:*:*:*) : CVE-2018-10237
>
> One or more dependencies were identified with known vulnerabilities in
> ignite-cloud:
>
> openstack-keystone-2.0.0.jar
> (pkg:maven/org.apache.jclouds.api/openstack-keystone@2.0.0,
> cpe:2.3:a:openstack:keystone:2.0.0:*:*:*:*:*:*:*,
> cpe:2.3:a:openstack:openstack:2.0.0:*:*:*:*:*:*:*) : CVE-2013-2014,
> CVE-2013-4222, CVE-2013-6391, CVE-2014-0204, CVE-2014-3476, CVE-2014-3520,
> CVE-2014-3621, CVE-2015-3646, CVE-2015-7546, CVE-2018-14432, CVE-2018-20170
> cloudstack-2.0.0.jar (pkg:maven/org.apache.jclouds.api/cloudstack@2.0.0,
> cpe:2.3:a:apache:cloudstack:2.0.0:*:*:*:*:*:*:*) : CVE-2013-2136,
> 

[jira] [Created] (IGNITE-12510) In-memory page eviction may fail in case very large entries are stored in the cache

2019-12-27 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12510:
---

 Summary: In-memory page eviction may fail in case very large 
entries are stored in the cache
 Key: IGNITE-12510
 URL: https://issues.apache.org/jira/browse/IGNITE-12510
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.7.6
Reporter: Ivan Rakov


In-memory page eviction (both DataPageEvictionMode#RANDOM_LRU and 
DataPageEvictionMode#RANDOM_2_LRU) has a limited number of attempts to choose a 
candidate page for data removal:

{code:java}
if (sampleSpinCnt > SAMPLE_SPIN_LIMIT) { // 5000
    LT.warn(log, "Too many attempts to choose data page: " + SAMPLE_SPIN_LIMIT);

    return;
}
{code}
Large data entries are stored in several data pages that are sequentially 
linked to each other. Only "head" pages are suitable for eviction, because the 
whole entry is reachable only from the "head" page (the list of pages is singly 
linked; there are no reverse links from tail to head).
The problem is that if we put large enough entries into an evictable cache (e.g. 
each entry needs more than 5000 pages to be stored), there are too few head 
pages and the "Too many attempts to choose data page" error is likely to show up.
We should perform something like a full scan if we fail to find a head page 
within SAMPLE_SPIN_LIMIT attempts, instead of just failing the node with an error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12509) CACHE_REBALANCE_STOPPED event raises for wrong caches in case of specified RebalanceDelay

2019-12-27 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12509:
---

 Summary: CACHE_REBALANCE_STOPPED event raises for wrong caches in 
case of specified RebalanceDelay
 Key: IGNITE-12509
 URL: https://issues.apache.org/jira/browse/IGNITE-12509
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
 Fix For: 2.9


Steps to reproduce:
1. Start in-memory cluster with 2 server nodes
2. Start 3 caches with different rebalance delays (e.g. 5, 10 and 15 seconds) 
and upload some data
3. Start localListener for EVT_CACHE_REBALANCE_STOPPED event on one of the 
nodes.
4. Start one more server node.
5. Wait for 5 seconds, till rebalance delay is reached.
6. The EVT_CACHE_REBALANCE_STOPPED event is received 3 times (once for each 
cache), but in fact only 1 cache was rebalanced. The same happens for the rest 
of the caches.
As a result, when rebalancing finishes we get the event [CACHE_COUNT] times for 
each cache instead of once.
Reproducer attached.
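
For reference, a minimal sketch of the listener registration from step 3 
(standard events API; EVT_CACHE_REBALANCE_STOPPED must be present in the node's 
included event types):

{code:java}
IgnitePredicate<Event> lsnr = evt -> {
    CacheRebalancingEvent rebEvt = (CacheRebalancingEvent)evt;

    System.out.println("Rebalance stopped for cache: " + rebEvt.cacheName());

    return true; // Keep listening.
};

ignite.events().localListen(lsnr, EventType.EVT_CACHE_REBALANCE_STOPPED);
{code}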



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12508) GridCacheProcessor#cacheDescriptor(int) has O(N) complexity

2019-12-27 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12508:
---

 Summary: GridCacheProcessor#cacheDescriptor(int) has O(N) 
complexity
 Key: IGNITE-12508
 URL: https://issues.apache.org/jira/browse/IGNITE-12508
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
 Fix For: 2.9


See the method code:
{code}
@Nullable public DynamicCacheDescriptor cacheDescriptor(int cacheId) {
    for (DynamicCacheDescriptor cacheDesc : cacheDescriptors().values()) {
        CacheConfiguration ccfg = cacheDesc.cacheConfiguration();

        assert ccfg != null : cacheDesc;

        if (CU.cacheId(ccfg.getName()) == cacheId)
            return cacheDesc;
    }

    return null;
}
{code}

This method is invoked on several hot paths (for example, logical recovery and 
the security check for indexing), which causes a significant performance 
regression when the number of caches is large.

The method should be improved to use a hash map or a similar data structure to 
achieve better complexity.
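
A minimal sketch of the proposed improvement: an id-to-descriptor map maintained 
alongside the existing name-keyed registry (the field and the way it is 
populated are hypothetical):

{code:java}
/** Cache id to descriptor map, updated on cache descriptor registration/removal. */
private final ConcurrentMap<Integer, DynamicCacheDescriptor> cacheDescById =
    new ConcurrentHashMap<>();

@Nullable public DynamicCacheDescriptor cacheDescriptor(int cacheId) {
    // O(1) lookup instead of scanning all registered descriptors.
    return cacheDescById.get(cacheId);
}
{code}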



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12507) Implement cache size metric in bytes

2019-12-27 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12507:
---

 Summary: Implement cache size metric in bytes
 Key: IGNITE-12507
 URL: https://issues.apache.org/jira/browse/IGNITE-12507
 Project: Ignite
  Issue Type: Improvement
  Components: cache
Reporter: Ivan Rakov
 Fix For: 2.9


There is a need for a cache-size-in-bytes metric for the pure in-memory case.

When all data is in RAM, it is hard to find out exactly how much space is 
consumed by cache data on a running node, as the only things that can be 
watched are the number of keys per partition on a specific node and the memory 
usage metrics of the machine.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12451) Introduce deadlock detection for cache entry reentrant locks

2019-12-13 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12451:
---

 Summary: Introduce deadlock detection for cache entry reentrant 
locks
 Key: IGNITE-12451
 URL: https://issues.apache.org/jira/browse/IGNITE-12451
 Project: Ignite
  Issue Type: Improvement
Affects Versions: 2.7.6
Reporter: Ivan Rakov
 Fix For: 2.9


Aside from IGNITE-12365, we still have a possible threat of cache-entry-level 
deadlock in case of careless usage of JCache mass operations (putAll, 
removeAll):
1. If two different user threads perform putAll on the same two keys in reverse 
order (with the same primary node), there's a chance that sys-stripe threads 
will be deadlocked.
2. Even without a direct contract violation on the user side, a HashMap can be 
passed as the argument for putAll. Even if user threads have called mass 
operations with the two keys in the same order, HashMap iteration order is not 
strictly defined, which may cause the same deadlock.

Local deadlock detection should mitigate this issue. We can create a wrapper 
for ReentrantLock with logic that performs cycle detection in the wait-for 
graph in case we have been waiting for lock acquisition for too long. An 
exception will be thrown from one of the threads in such a case, failing the 
user operation but letting the system make progress.
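
To illustrate the hazard from point 1, a sketch of two user threads that can 
deadlock the sys-stripe threads (cache name and keys are arbitrary):

{code:java}
IgniteCache<Integer, String> cache = ignite.cache("someCache");

// Thread A: keys in order 1, 2.
new Thread(() -> {
    Map<Integer, String> batch = new LinkedHashMap<>();
    batch.put(1, "a");
    batch.put(2, "b");
    cache.putAll(batch);
}).start();

// Thread B: the same keys in reverse order 2, 1.
new Thread(() -> {
    Map<Integer, String> batch = new LinkedHashMap<>();
    batch.put(2, "b");
    batch.put(1, "a");
    cache.putAll(batch);
}).start();
{code}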



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-12429) Rework bytes-based WAL archive size management logic to make historical rebalance more predictable

2019-12-09 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12429:
---

 Summary: Rework bytes-based WAL archive size management logic to 
make historical rebalance more predictable
 Key: IGNITE-12429
 URL: https://issues.apache.org/jira/browse/IGNITE-12429
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Rakov


Since 2.7, DataStorageConfiguration allows specifying the size of the WAL 
archive in bytes (see DataStorageConfiguration#maxWalArchiveSize), which is much 
more transparent to the user.
Unfortunately, the new logic may be unpredictable when it comes to historical 
rebalance. The WAL archive is truncated when one of the following conditions 
occurs:
1. The total number of checkpoints in the WAL archive is bigger than 
DataStorageConfiguration#walHistSize
2. The total size of the WAL archive is bigger than 
DataStorageConfiguration#maxWalArchiveSize
Independently, the in-memory checkpoint history contains only a fixed number of 
the last checkpoints (can be changed with 
IGNITE_PDS_MAX_CHECKPOINT_MEMORY_HISTORY_SIZE, 100 by default).
All these particular qualities make it hard for the user to control the usage 
of historical rebalance. Imagine the case when the user has a light load (WAL 
gets rotated very slowly) and the default checkpoint frequency. After 
100 * 3 = 300 minutes, all updates in the WAL will become unreachable via 
historical rebalance even if:
1. The user has configured a large DataStorageConfiguration#maxWalArchiveSize
2. The user has configured a large DataStorageConfiguration#walHistSize
At the same time, setting a large IGNITE_PDS_MAX_CHECKPOINT_MEMORY_HISTORY_SIZE 
will help (only combined with the previous two points), but Ignite node heap 
usage may increase dramatically.
I propose to change the WAL history management logic in the following way:
1. *Don't* cut the WAL archive when the number of checkpoints exceeds 
DataStorageConfiguration#walHistSize. WAL history should be managed based only 
on DataStorageConfiguration#maxWalArchiveSize.
2. Checkpoint history should contain a fixed number of entries, but should 
cover the whole stored WAL archive (not only its most recent part with the 
IGNITE_PDS_MAX_CHECKPOINT_MEMORY_HISTORY_SIZE last checkpoints). This can be 
achieved by making the checkpoint history sparse: some intermediate checkpoints 
*may be absent from the history*, while the fixed number of checkpoints is 
positioned either uniformly (trying to keep a fixed number of bytes between two 
neighbour checkpoints) or exponentially (trying to keep a fixed ratio between 
the size of WAL from checkpoint(N-1) to the current write pointer and the size 
of WAL from checkpoint(N) to the current write pointer).
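
For reference, the two configuration knobs discussed above as they appear in 
the data storage configuration (values are arbitrary examples):

{code:java}
DataStorageConfiguration dsCfg = new DataStorageConfiguration();

// Bytes-based WAL archive management (available since 2.7).
dsCfg.setMaxWalArchiveSize(20L * 1024 * 1024 * 1024); // 20 GB.

// Checkpoint-count-based history; per point 1 above it should no longer drive WAL truncation.
dsCfg.setWalHistorySize(100);
{code}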



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Apache Ignite PMC Chair

2019-10-29 Thread Ivan Rakov

+1 for Dmitry Pavlov

Best Regards,
Ivan Rakov

On 29.10.2019 10:50, Ilya Kasnacheev wrote:

+1 for Nikolay Izhikov (binding)

Regards,


Re: Metric showing how many nodes may safely leave the cluster

2019-10-10 Thread Ivan Rakov

https://issues.apache.org/jira/browse/IGNITE-12278

Best Regards,
Ivan Rakov

On 07.10.2019 15:08, Ivan Rakov wrote:

Denis, Alex,

Sure, new metric will be integrated into new metrics framework.
Let's not expose its value to control.sh right now. I'll create an 
issue for aggregated "getMinimumNumberOfPartitionCopies" if everyone 
agrees.


Best Regards,
Ivan Rakov

On 04.10.2019 20:06, Denis Magda wrote:

I'm for the proposal to add new JMX metrics and enhance the existing
tooling. But I would encourage us to integrate this into the new metrics
framework Nikolay has been working on. Otherwise, we will be deprecating
these JMX metrics in a short time frame in favor of the new 
monitoring APIs.


-
Denis


On Fri, Oct 4, 2019 at 9:33 AM Alexey Goncharuk 


wrote:


I agree that we should have the ability to read any metric using simple
Ignite tooling. I am not sure if visor.sh is a good fit - if I
remember correctly, it will start a daemon node which will bump the
topology version with all related consequences. I believe in the 
long term

it will beneficial to migrate all visor.sh functionality to a more
lightweight protocol, such as used in control.sh.

As for the metrics, the metric suggested by Ivan totally makes sense 
to me

- it is a simple and, actually, quite critical metric. It will be
completely unusable to select a minimum of some metric for all cache 
groups
manually. A monitoring system, on the other hand, might not be 
available

when the metric is needed, or may not support aggregation.

--AG

пт, 4 окт. 2019 г. в 18:58, Ivan Rakov :


Nikolay,

Many users start to use Ignite with a small project without
production-level monitoring. When proof-of-concept appears to be 
viable,

they tend to expand Ignite usage by growing cluster and adding needed
environment (including monitoring systems).
Inability to find such basic thing as survival in case of next node
crash may affect overall product impression. We all want Ignite to be
successful and widespread.


Can you clarify, what do you mean, exactly?
Right now user can access metric mentioned by Alex and choose 
minimum of

all cache groups. I want to highlight that not every user understands
Ignite and its internals so much to find out that exactly these 
sequence

of actions will bring him to desired answer.


Can you clarify, what do you mean, exactly?
We have a ticket[1] to support metrics output via visor.sh.

My understanding: we should have an easy way to output metric values

for

each node in cluster.

[1] https://issues.apache.org/jira/browse/IGNITE-12191

I propose to add metric method for aggregated
"getMinimumNumberOfPartitionCopies" and expose it to control.sh.
My understanding: it's result is critical enough to be accessible in a
short path. I've started this topic due to request from user list, and
I've heard many similar complaints before.

Best Regards,
Ivan Rakov

On 04.10.2019 17:18, Nikolay Izhikov wrote:

Ivan.


We shouldn't force users to configure external tools and write extra

code for basic things.

Actually, I don't agree with you.
Having external monitoring system for any production cluster is a

*basic* thing.

Can you, please, define "basic things"?


single method for the whole cluster

Can you clarify, what do you mean, exactly?
We have a ticket[1] to support metrics output via visor.sh.

My understanding: we should have an easy way to output metric values

for

each node in cluster.

[1] https://issues.apache.org/jira/browse/IGNITE-12191


В Пт, 04/10/2019 в 17:09 +0300, Ivan Rakov пишет:

Max,

What if user simply don't have configured monitoring system?
Knowing whether cluster will survive node shutdown is critical 
for any

administrator that performs any manipulations with cluster topology.
Essential information should be easily accessed. We shouldn't force
users to configure external tools and write extra code for basic

things.

Alex,

Thanks, that's exact metric we need.
My point is that we should make it more accessible: via control.sh
command and single method for the whole cluster.

Best Regards,
Ivan Rakov

On 04.10.2019 16:34, Alex Plehanov wrote:

Ivan, there already exist metric
CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which

shows

the

current redundancy level for the cache group.
We can lose up to ( getMinimumNumberOfPartitionCopies-1) nodes

without

data

loss in this cache group.

пт, 4 окт. 2019 г. в 16:17, Ivan Rakov :


Igniters,

I've seen numerous requests to find out an easy way to check 
whether

is

it safe to turn off cluster node. As we know, in Ignite protection

from

sudden node shutdown is implemented through keeping several backup
copies of each partition. However, this guarantee can be weakened

for

a

while in case cluster has recently experienced node restart and
rebalancing process is still in progress.
Example scenario is restarting nodes one by one in order to 
update a

local configuration parameter. User restarts one

[jira] [Created] (IGNITE-12278) Add metric showing how many nodes may safely leave the cluster without partition loss

2019-10-10 Thread Ivan Rakov (Jira)
Ivan Rakov created IGNITE-12278:
---

 Summary: Add metric showing how many nodes may safely leave the 
cluster without partition loss
 Key: IGNITE-12278
 URL: https://issues.apache.org/jira/browse/IGNITE-12278
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Rakov
 Fix For: 2.8


We already have the getMinimumNumberOfPartitionCopies metric that shows the 
partition redundancy number for a specific cache group.
It would be handy if the user had a single aggregated metric over all cache 
groups showing how many nodes may leave the cluster without partition loss in 
any cache.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Metric showing how many nodes may safely leave the cluster

2019-10-07 Thread Ivan Rakov

Denis, Alex,

Sure, new metric will be integrated into new metrics framework.
Let's not expose its value to control.sh right now. I'll create an issue 
for aggregated "getMinimumNumberOfPartitionCopies" if everyone agrees.


Best Regards,
Ivan Rakov

On 04.10.2019 20:06, Denis Magda wrote:

I'm for the proposal to add new JMX metrics and enhance the existing
tooling. But I would encourage us to integrate this into the new metrics
framework Nikolay has been working on. Otherwise, we will be deprecating
these JMX metrics in a short time frame in favor of the new monitoring APIs.

-
Denis


On Fri, Oct 4, 2019 at 9:33 AM Alexey Goncharuk 
wrote:


I agree that we should have the ability to read any metric using simple
Ignite tooling. I am not sure if visor.sh is a good fit - if I
remember correctly, it will start a daemon node which will bump the
topology version with all related consequences. I believe in the long term
it will beneficial to migrate all visor.sh functionality to a more
lightweight protocol, such as used in control.sh.

As for the metrics, the metric suggested by Ivan totally makes sense to me
- it is a simple and, actually, quite critical metric. It will be
completely unusable to select a minimum of some metric for all cache groups
manually. A monitoring system, on the other hand, might not be available
when the metric is needed, or may not support aggregation.

--AG

пт, 4 окт. 2019 г. в 18:58, Ivan Rakov :


Nikolay,

Many users start to use Ignite with a small project without
production-level monitoring. When proof-of-concept appears to be viable,
they tend to expand Ignite usage by growing cluster and adding needed
environment (including monitoring systems).
Inability to find such basic thing as survival in case of next node
crash may affect overall product impression. We all want Ignite to be
successful and widespread.


Can you clarify, what do you mean, exactly?

Right now user can access metric mentioned by Alex and choose minimum of
all cache groups. I want to highlight that not every user understands
Ignite and its internals so much to find out that exactly these sequence
of actions will bring him to desired answer.


Can you clarify, what do you mean, exactly?
We have a ticket[1] to support metrics output via visor.sh.

My understanding: we should have an easy way to output metric values

for

each node in cluster.

[1] https://issues.apache.org/jira/browse/IGNITE-12191

I propose to add metric method for aggregated
"getMinimumNumberOfPartitionCopies" and expose it to control.sh.
My understanding: it's result is critical enough to be accessible in a
short path. I've started this topic due to request from user list, and
I've heard many similar complaints before.

Best Regards,
Ivan Rakov

On 04.10.2019 17:18, Nikolay Izhikov wrote:

Ivan.


We shouldn't force users to configure external tools and write extra

code for basic things.

Actually, I don't agree with you.
Having external monitoring system for any production cluster is a

*basic* thing.

Can you, please, define "basic things"?


single method for the whole cluster

Can you clarify, what do you mean, exactly?
We have a ticket[1] to support metrics output via visor.sh.

My understanding: we should have an easy way to output metric values

for

each node in cluster.

[1] https://issues.apache.org/jira/browse/IGNITE-12191


В Пт, 04/10/2019 в 17:09 +0300, Ivan Rakov пишет:

Max,

What if user simply don't have configured monitoring system?
Knowing whether cluster will survive node shutdown is critical for any
administrator that performs any manipulations with cluster topology.
Essential information should be easily accessed. We shouldn't force
users to configure external tools and write extra code for basic

things.

Alex,

Thanks, that's exact metric we need.
My point is that we should make it more accessible: via control.sh
command and single method for the whole cluster.

Best Regards,
Ivan Rakov

On 04.10.2019 16:34, Alex Plehanov wrote:

Ivan, there already exist metric
CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which

shows

the

current redundancy level for the cache group.
We can lose up to ( getMinimumNumberOfPartitionCopies-1) nodes

without

data

loss in this cache group.

пт, 4 окт. 2019 г. в 16:17, Ivan Rakov :


Igniters,

I've seen numerous requests to find out an easy way to check whether

is

it safe to turn off cluster node. As we know, in Ignite protection

from

sudden node shutdown is implemented through keeping several backup
copies of each partition. However, this guarantee can be weakened

for

a

while in case cluster has recently experienced node restart and
rebalancing process is still in progress.
Example scenario is restarting nodes one by one in order to update a
local configuration parameter. User restarts one node and

rebalancing

starts: when it will be completed, it will be safe to proceed

(backup

count=1). However, there's no transp

Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Ivan Rakov

Nikolay,

Many users start to use Ignite with a small project without 
production-level monitoring. When the proof of concept appears to be viable, 
they tend to expand Ignite usage by growing the cluster and adding the needed 
environment (including monitoring systems).
The inability to find out such a basic thing as whether the cluster survives 
the next node crash may affect the overall product impression. We all want 
Ignite to be successful and widespread.



Can you clarify, what do you mean, exactly?


Right now the user can access the metric mentioned by Alex and choose the 
minimum over all cache groups. I want to highlight that not every user 
understands Ignite and its internals well enough to find out that exactly this 
sequence of actions will bring them to the desired answer.



Can you clarify, what do you mean, exactly?
We have a ticket[1] to support metrics output via visor.sh.

My understanding: we should have an easy way to output metric values for each 
node in cluster.

[1] https://issues.apache.org/jira/browse/IGNITE-12191
I propose to add a metric method for the aggregated 
"getMinimumNumberOfPartitionCopies" and expose it to control.sh.
My understanding: its result is critical enough to be accessible via a 
short path. I've started this topic due to a request from the user list, and 
I've heard many similar complaints before.


Best Regards,
Ivan Rakov

On 04.10.2019 17:18, Nikolay Izhikov wrote:

Ivan.


We shouldn't force users to configure external tools and write extra code for 
basic things.

Actually, I don't agree with you.
Having external monitoring system for any production cluster is a *basic* thing.

Can you, please, define "basic things"?


single method for the whole cluster

Can you clarify, what do you mean, exactly?
We have a ticket[1] to support metrics output via visor.sh.

My understanding: we should have an easy way to output metric values for each 
node in cluster.

[1] https://issues.apache.org/jira/browse/IGNITE-12191


В Пт, 04/10/2019 в 17:09 +0300, Ivan Rakov пишет:

Max,

What if user simply don't have configured monitoring system?
Knowing whether cluster will survive node shutdown is critical for any
administrator that performs any manipulations with cluster topology.
Essential information should be easily accessed. We shouldn't force
users to configure external tools and write extra code for basic things.

Alex,

Thanks, that's exact metric we need.
My point is that we should make it more accessible: via control.sh
command and single method for the whole cluster.

Best Regards,
Ivan Rakov

On 04.10.2019 16:34, Alex Plehanov wrote:

Ivan, there already exist metric
CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which shows the
current redundancy level for the cache group.
We can lose up to ( getMinimumNumberOfPartitionCopies-1) nodes without data
loss in this cache group.

пт, 4 окт. 2019 г. в 16:17, Ivan Rakov :


Igniters,

I've seen numerous requests to find out an easy way to check whether is
it safe to turn off cluster node. As we know, in Ignite protection from
sudden node shutdown is implemented through keeping several backup
copies of each partition. However, this guarantee can be weakened for a
while in case cluster has recently experienced node restart and
rebalancing process is still in progress.
Example scenario is restarting nodes one by one in order to update a
local configuration parameter. User restarts one node and rebalancing
starts: when it will be completed, it will be safe to proceed (backup
count=1). However, there's no transparent way to determine whether
rebalancing is over.
   From my perspective, it would be very helpful to:
1) Add information about rebalancing and number of free-to-go nodes to
./control.sh --state command.
Examples of output:


Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag




Cluster is active
All partitions are up-to-date.
3 node(s) can safely leave the cluster without partition loss.
Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag




Cluster is active
Rebalancing is in progress.
1 node(s) can safely leave the cluster without partition loss.

2) Provide the same information via ClusterMetrics. For example:
ClusterMetrics#isRebalanceInProgress // boolean
ClusterMetrics#getSafeToLeaveNodesCount // int

Here I need to mention that this information can be calculated from
existing rebalance metrics (see CacheMetrics#*rebalance*). However, I
still think that we need more simple and understandable flag whether
cluster is in danger of data loss. Another point is that current metrics
are bound to specific cache, which makes this information even harder to
analyze.

Thoughts?

--
Best Regards,
Ivan Rakov




Re: Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Ivan Rakov

Max,

What if the user simply doesn't have a configured monitoring system?
Knowing whether the cluster will survive a node shutdown is critical for any 
administrator that performs any manipulations with the cluster topology.
Essential information should be easily accessible. We shouldn't force 
users to configure external tools and write extra code for basic things.


Alex,

Thanks, that's exactly the metric we need.
My point is that we should make it more accessible: via a control.sh 
command and a single method for the whole cluster.


Best Regards,
Ivan Rakov

On 04.10.2019 16:34, Alex Plehanov wrote:

Ivan, there already exist metric
CacheGroupMetricsMXBean#getMinimumNumberOfPartitionCopies, which shows the
current redundancy level for the cache group.
We can lose up to ( getMinimumNumberOfPartitionCopies-1) nodes without data
loss in this cache group.

пт, 4 окт. 2019 г. в 16:17, Ivan Rakov :


Igniters,

I've seen numerous requests to find out an easy way to check whether is
it safe to turn off cluster node. As we know, in Ignite protection from
sudden node shutdown is implemented through keeping several backup
copies of each partition. However, this guarantee can be weakened for a
while in case cluster has recently experienced node restart and
rebalancing process is still in progress.
Example scenario is restarting nodes one by one in order to update a
local configuration parameter. User restarts one node and rebalancing
starts: when it will be completed, it will be safe to proceed (backup
count=1). However, there's no transparent way to determine whether
rebalancing is over.
  From my perspective, it would be very helpful to:
1) Add information about rebalancing and number of free-to-go nodes to
./control.sh --state command.
Examples of output:


Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag




Cluster is active
All partitions are up-to-date.
3 node(s) can safely leave the cluster without partition loss.
Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag




Cluster is active
Rebalancing is in progress.
1 node(s) can safely leave the cluster without partition loss.

2) Provide the same information via ClusterMetrics. For example:
ClusterMetrics#isRebalanceInProgress // boolean
ClusterMetrics#getSafeToLeaveNodesCount // int

Here I need to mention that this information can be calculated from
existing rebalance metrics (see CacheMetrics#*rebalance*). However, I
still think that we need more simple and understandable flag whether
cluster is in danger of data loss. Another point is that current metrics
are bound to specific cache, which makes this information even harder to
analyze.

Thoughts?

--
Best Regards,
Ivan Rakov




Metric showing how many nodes may safely leave the cluster

2019-10-04 Thread Ivan Rakov

Igniters,

I've seen numerous requests to find out an easy way to check whether it 
is safe to turn off a cluster node. As we know, in Ignite protection from 
sudden node shutdown is implemented through keeping several backup 
copies of each partition. However, this guarantee can be weakened for a 
while in case the cluster has recently experienced a node restart and the 
rebalancing process is still in progress.
An example scenario is restarting nodes one by one in order to update a 
local configuration parameter. The user restarts one node and rebalancing 
starts: when it completes, it will be safe to proceed (backup 
count=1). However, there's no transparent way to determine whether 
rebalancing is over.

From my perspective, it would be very helpful to:
1) Add information about rebalancing and number of free-to-go nodes to 
./control.sh --state command.

Examples of output:


Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag

Cluster is active
All partitions are up-to-date.
3 node(s) can safely leave the cluster without partition loss.
Cluster  ID: 125a6dce-74b1-4ee7-a453-c58f23f1f8fc
Cluster tag: new_tag

Cluster is active
Rebalancing is in progress.
1 node(s) can safely leave the cluster without partition loss.

2) Provide the same information via ClusterMetrics. For example:
ClusterMetrics#isRebalanceInProgress // boolean
ClusterMetrics#getSafeToLeaveNodesCount // int

Here I need to mention that this information can be calculated from the 
existing rebalance metrics (see CacheMetrics#*rebalance*). However, I 
still think that we need a simpler and more understandable flag showing 
whether the cluster is in danger of data loss. Another point is that the 
current metrics are bound to a specific cache, which makes this information 
even harder to analyze.
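
As a sketch of how the proposed API could be consumed (the two methods above 
do not exist yet; names are as proposed):

{code:java}
ClusterMetrics m = ignite.cluster().metrics();

if (!m.isRebalanceInProgress() && m.getSafeToLeaveNodesCount() >= 1)
    System.out.println("Safe to restart one node.");
else
    System.out.println("Wait for rebalancing to finish before stopping nodes.");
{code}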


Thoughts?

--
Best Regards,
Ivan Rakov



Re: Apache Ignite 2.7.6 (Time, Scope, and Release manager)

2019-09-11 Thread Ivan Rakov

Alexey,

I've merged https://issues.apache.org/jira/browse/IGNITE-12163 to master 
and 2.7.6.


Best Regards,
Ivan Rakov

On 11.09.2019 18:13, Alexey Goncharuk wrote:

Good,

Please let me know when this is done, I will re-upload the release
artifacts.

ср, 11 сент. 2019 г. в 18:11, Alexandr Shapkin :


Alexey,

The changes already have been tested, so no TC problems expected.
If this is true, then we need just a few hours to merge them.

From: Alexey Goncharuk
Sent: Wednesday, September 11, 2019 6:03 PM
To: dev
Cc: Dmitriy Govorukhin; Anton Kalashnikov
Subject: Re: Re[2]: Apache Ignite 2.7.6 (Time, Scope, and Release manager)

Alexandr,

I almost sent the vote email :) When do you expect the fix to be in master
and 2.7.6?

ср, 11 сент. 2019 г. в 17:38, Alexandr Shapkin :


Folks,

A critical bug was detected in .NET [1].

I understand that it’s a little bit late, but I propose to include this
issue into the release scope.

PR is ready, currently waiting for a TC visa.

Thoughts?

[1] - https://issues.apache.org/jira/browse/IGNITE-12163


From: Alexey Goncharuk
Sent: Monday, September 9, 2019 5:11 PM
To: dev
Cc: Dmitriy Govorukhin; Anton Kalashnikov
Subject: Re: Re[2]: Apache Ignite 2.7.6 (Time, Scope, and Release

manager)

Igniters,

I just pushed the last ticket to ignite-2.7.6 branch; looks like we are
ready for the next iteration.

Given that Dmitriy Pavlov will be unavailable till the end of this week,

I

will take over the release. TC re-run is started.

чт, 5 сент. 2019 г. в 16:14, Dmitriy Govorukhin <
dmitriy.govoruk...@gmail.com>:


Hi Igniters,

I finished work on https://issues.apache.org/jira/browse/IGNITE-12127,

fix

already in master and ignite-2.7.6

On Wed, Sep 4, 2019 at 2:22 PM Dmitriy Govorukhin <
dmitriy.govoruk...@gmail.com> wrote:


Hi Alexey,

I think that I will finish work on the fix tomorrow. Fix already

completed

but I need to get VISA from TC bot.

On Mon, Sep 2, 2019 at 8:27 PM Alexey Goncharuk <
alexey.goncha...@gmail.com> wrote:


Folks, it looks like I was overly optimistic with the estimates for

the

mentioned two tickets.

Dmitriy, Anton,
Can you share your vision when the issues will be fixed? Perhaps, it

makes

sense to release 2.7.6 with the already fixed issues and schedule

2.7.7?

Neither of them is a regression, so it's ok to release 2.7.6 as it

is

now.

Thoughts?

сб, 31 авг. 2019 г. в 11:37, Alexey Goncharuk <

alexey.goncha...@gmail.com

:
Yes, my bad, forgot to include the link. That's the one.

пт, 30 авг. 2019 г. в 15:01, Maxim Muzafarov 
:

Alexey,

Does the issue [1] is related to this [2] discussion on the

user-list?

If yes, I think it is very important to include these fixes to

2.7.6.

[1] https://issues.apache.org/jira/browse/IGNITE-12127
[2]


http://apache-ignite-users.70518.x6.nabble.com/Node-failure-with-quot-Failed-to-write-buffer-quot-error-td29100.html

On Fri, 30 Aug 2019 at 14:26, Alexei Scherbakov
 wrote:

Alexey,

Looks like important fixes, better to include them.

пт, 30 авг. 2019 г. в 12:51, Alexey Goncharuk <

alexey.goncha...@gmail.com>:

Igniters,

Given that the RC1 vote did not succeed and we are still

waiting

for

a few

minor fixes, may I suggest including these two tickest to the

2.7.6

scope?

https://issues.apache.org/jira/browse/IGNITE-12127
https://issues.apache.org/jira/browse/IGNITE-12128

The first one has been already reported on the dev-list [1],

the

second one

may cause a state when an Ignite node cannot start on

existing

persisted

data. Looking at the tickets, the fixes should be reasonably

easy,

so

it

should not shift 2.7.6 release timeline much.

Thoughts?

ср, 28 авг. 2019 г. в 15:25, Nikolay Izhikov <

nizhi...@apache.org

:

Separate repos for different Spark version is a good idea

for

me.

Anyway, can you help with Spark version migration,  for

now?

В Ср, 28/08/2019 в 15:20 +0300, Alexey Zinoviev пишет:

Maybe the best solution today add for each new version of

Spark

the

sub-module (Spark-2.3, Spark-2.4) or the separate

repository

with

modules

for each version or another way with separate repository

and

different

branches like in

https://github.com/datastax/spark-cassandra-connector

3 ways to support different versions with the different

costs

of

support

In the case of separate repository I could help, for

example

ср, 28 авг. 2019 г. в 14:57, Nikolay Izhikov <

nizhi...@apache.org

:

Hello, Alexey.


But the
compatibility with Spark 2.3 will be broken, isn't

it?

Yes.


Do you have any
plans to support the different version of Spark

without

loosing

your

unique

expertise in Spark-Ignite integration?

What do you mean by "my unique expertise"? :)

How do you see support of several Spark version?


В Ср, 28/08/2019 в 14:29 +0300, Alexey Zinoviev пишет:

Dear Nikolay Izhikov
Are you going to update the Ignite-Spark integration

for

Spark 2.4.

But

the

compatibility with Spark 2.3 will be broken, isn't

it?

Do

you

Re: [VOTE] Release Apache Ignite 2.7.6-rc1

2019-08-23 Thread Ivan Rakov

+1
Downloaded binaries, successfully assembled cluster.

Best Regards,
Ivan Rakov

On 23.08.2019 19:07, Dmitriy Pavlov wrote:

+1

Checked: build from sources, startup node on Windows, simple topology,
version and copyright year output,
2.7.6-rc0 is used in the Apache Ignite Teamcity Bot since Sunday, Aug 18
2.7.6-rc1  (ver. 2.7.6#20190821-sha1:6b3acf40) installed as DB for the TC
Bot just now and the bot works well.

пт, 23 авг. 2019 г. в 18:58, Alexey Kuznetsov :


+1
Compiled from sources on Windows, started ignite.bat.

On Fri, Aug 23, 2019 at 10:52 PM Pavel Tupitsyn 
wrote:


+1, checked .NET node start and examples

On Fri, Aug 23, 2019 at 6:49 PM Alexei Scherbakov <
alexey.scherbak...@gmail.com> wrote:


+1

пт, 23 авг. 2019 г. в 18:33, Alexey Goncharuk <

alexey.goncha...@gmail.com

:
+1
Checked the source compilation and release package build, node start,

and a

few examples. Left a comment on the failed TC task in the discussion
thread.

пт, 23 авг. 2019 г. в 18:15, Andrey Gura :


+1

On Fri, Aug 23, 2019 at 3:32 PM Anton Vinogradov 

wrote:

-1 (binding)
Explained at discussion thread.

On Fri, Aug 23, 2019 at 11:17 AM Anton Vinogradov 
wrote:

Dmitriy,

Did you check RC using automated TeamCity task?

On Fri, Aug 23, 2019 at 11:09 AM Zhenya Stanilovsky
 wrote:


Build from sources, run yardstick test.
+1

--- Forwarded message ---
From: "Dmitriy Pavlov" < dpav...@apache.org >
To: dev < dev@ignite.apache.org >
Cc:
Subject: [VOTE] Release Apache Ignite 2.7.6-rc1
Date: Thu, 22 Aug 2019 20:11:58 +0300

Dear Community,

I have uploaded release candidate to
https://dist.apache.org/repos/dist/dev/ignite/2.7.6-rc1/


https://dist.apache.org/repos/dist/dev/ignite/packages_2.7.6-rc1/

The following staging can be used for any dependent project

for

testing:

https://repository.apache.org/content/repositories/orgapacheignite-1466/

This is the second maintenance release for 2.7.x with a

number

of

fixes.

Tag name is 2.7.6-rc1:


https://gitbox.apache.org/repos/asf?p=ignite.git;a=tag;h=refs/tags/2.7.6-rc1

2.7.6 changes:
   * Ignite work directory is now set to the current user's

home

directory

by
default, native persistence files will not be stored in the

Temp

directory

anymore
   * Fixed a bug that caused a SELECT query with an equality

predicate

on a

part of the primary compound key to return a single row even

if

the

query

matched multiple rows
   * Fixed an issue that could cause data corruption during

checkpointing

   * Fixed an issue where a row size was calculated

incorrectly

for

shared

cache groups, which caused a tree corruption
   * Reduced java heap footprint by optimizing

GridDhtPartitionsFullMessage

maps in exchange history
   * .NET: Native persistence now works with a custom

affinity

function

   * Fixed an issue where an outdated node with a destroyed

cache

caused

the

cluster to hang
   * Fixed a bug that made it impossible to change the

inline_size

property

of an existing index after it was dropped and recreated with

a

different

value

RELEASE NOTES:


https://gitbox.apache.org/repos/asf?p=ignite.git;a=blob;f=RELEASE_NOTES.txt;hb=ignite-2.7.6

Complete list of closed issues:


https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%202.7.6

DEVNOTES


https://gitbox.apache.org/repos/asf?p=ignite.git;a=blob_plain;f=DEVNOTES.txt;hb=ignite-2.7.6

The vote is formal, see voting guidelines
https://www.apache.org/foundation/voting.html

+1 - to accept Apache Ignite 2.7.6-rc1
0 - don't care either way
-1 - DO NOT accept Apache Ignite Ignite 2.7.6-rc1 (explain

why)

See notes on how to verify release here
https://www.apache.org/info/verification.html
and


https://cwiki.apache.org/confluence/display/IGNITE/Release+Process#ReleaseProcess-P5.VotingonReleaseandReleaseVerification

This vote will be open for at least 3 days till Sun Aug 25,

18:00

UTC.

https://www.timeanddate.com/countdown/to?year=2019=8=25=18=0=0=utc-1

Best Regards,
Dmitriy Pavlov


--
Zhenya Stanilovsky



--

Best regards,
Alexei Scherbakov



--
Alexey Kuznetsov



Re: Replacing default work dir from tmp to current dir

2019-08-12 Thread Ivan Rakov

Choosing the lesser of two evils, I'll agree with user.dir.
Being able to run without preset environment variables is a strong benefit for 
Ignite as a product.


Best Regards,
Ivan Rakov

On 12.08.2019 19:02, Denis Magda wrote:

+1 for the user.dir as a default one.

Denis

On Monday, August 12, 2019, Dmitriy Pavlov  wrote:


+1 to user home directory. A number of open source products create their
dirs there. For me, it is a kind of expected behavior.

Ivan mentioned an important point: binary meta & marshaller. We should
update documentation and stop require PDS dir setup, but require home setup
(for older versions of Ignite, it is relevant anyway).

пн, 12 авг. 2019 г. в 18:49, Pavel Tupitsyn :


Hi Ivan,


  fail Ignite node in case neither IGNITE_HOME

nor IgniteConfiguration#igniteWorkDir is set
I strongly disagree, this is bad usability.
Ignition.start() should work without any extra configuration as is it

right

now.

Let's come up with reasonable defaults instead, user dir sounds good to

me.

On Mon, Aug 12, 2019 at 6:45 PM Stephen Darlington <
stephen.darling...@gridgain.com> wrote:


Yes, when data is a stake, fail early is the absolutely the right thing

to

do.

Regards,
Stephen


On 12 Aug 2019, at 16:37, Ivan Rakov  wrote:

Hi Anton,

Actually, the issue is even more unpleasant.

Official Ignite documentation says that it's possible to configure

path

where your persistence files will be stored:
https://apacheignite.readme.io/docs/distributed-persistent-store

However, even if you have set all path options (storage, WAL, WAL

archive), Ignite will still store crucial metadata in resolved work
directory (java.io.tmpdir by default). Example is binary metadata

files,

absence of which can make your data unavailable.

I propose to fail Ignite node in case neither IGNITE_HOME nor

IgniteConfiguration#igniteWorkDir is set. It's better to let user know
about missing configuration options during startup than let OS corrupt
storage by cleaning temp dirs.

Thoughts?

Best Regards,
Ivan Rakov

On 12.08.2019 18:10, Anton Kalashnikov wrote:

Hello, Igniters.

Currently, in the case, when work directory wasn't set by user

ignite

can resolve it to tmp directory which leads to some problem - tmp

directory

can be cleared at some unexpected moment by operation system and

different

types of critical data would be lost(ex. binary_meta, persistance

data).

Looks like it is not expected behaviour and maybe it is better

instead

of tmp directory use the current working directory("user.dir")? Or any
other idea?

A little more details you can find in the ticket -

https://issues.apache.org/jira/browse/IGNITE-12057

--
Best regards,
Anton Kalashnikov








Re: Replacing default work dir from tmp to current dir

2019-08-12 Thread Ivan Rakov

Hi Anton,

Actually, the issue is even more unpleasant.

Official Ignite documentation says that it's possible to configure the path 
where your persistence files will be stored: 
https://apacheignite.readme.io/docs/distributed-persistent-store
However, even if you have set all path options (storage, WAL, WAL 
archive), Ignite will still store crucial metadata in the resolved work 
directory (java.io.tmpdir by default). An example is binary metadata files, 
the absence of which can make your data unavailable.
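
For reference, a minimal sketch of setting the work directory and storage 
paths explicitly so that nothing lands in java.io.tmpdir (paths are just 
examples):

{code:java}
IgniteConfiguration cfg = new IgniteConfiguration();

// Work directory: binary_meta, marshaller and other metadata live under it.
cfg.setWorkDirectory("/opt/ignite/work");

DataStorageConfiguration dsCfg = new DataStorageConfiguration();
dsCfg.setStoragePath("/opt/ignite/storage");
dsCfg.setWalPath("/opt/ignite/wal");
dsCfg.setWalArchivePath("/opt/ignite/wal-archive");

cfg.setDataStorageConfiguration(dsCfg);

Ignite ignite = Ignition.start(cfg);
{code}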


I propose to fail the Ignite node in case neither IGNITE_HOME nor 
IgniteConfiguration#igniteWorkDir is set. It's better to let the user know 
about missing configuration options during startup than to let the OS corrupt 
the storage by cleaning temp dirs.


Thoughts?

Best Regards,
Ivan Rakov

On 12.08.2019 18:10, Anton Kalashnikov wrote:

Hello, Igniters.

Currently, in the case, when work directory wasn't set by user ignite can 
resolve it to tmp directory which leads to some problem - tmp directory can be 
cleared at some unexpected moment by operation system and different types of 
critical data would be lost(ex. binary_meta, persistance data).

Looks like it is not expected behaviour and maybe it is better instead of tmp directory 
use the current working directory("user.dir")? Or any other idea?

A little more details you can find in the ticket - 
https://issues.apache.org/jira/browse/IGNITE-12057
--
Best regards,
Anton Kalashnikov



Re: [DISCUSSION][IEP-35] Metrics configuration

2019-08-05 Thread Ivan Rakov

Hi guys,

DataStorageConfiguration#getMetricsSubIntervalCount was added by me as a 
last resort to decrease the number of intervals in HitRateMetrics in case of an 
unexpected negative performance impact. As far as I know, no one has ever 
used it - the precaution appeared to be premature. We can disregard its 
presence in DataStorageConfiguration.
From my point of view, there's no need to change the intervals count at 
runtime - it affects only metric smoothness and should be chosen by a 
developer who understands the details of the metric implementation.

Regarding metrics configuration change management: if we are going to 
add it to the product, it should be user friendly (persistent and 
changeable in the whole cluster by a single toggle, at least). Needing to 
change the configuration on every cluster node after every cluster restart 
would irritate the user more than help. Only a very hacky cluster 
admin will be able to deal with the current solution.
Distributed Metastorage is a good candidate for storing and handling 
such configuration options.


Best Regards,
Ivan Rakov

On 05.08.2019 18:38, Nikolay Izhikov wrote:

Hello, Andrey.


Not necessary if we have exponential bounds' values for histograms.

What do you mean by "exponential bounds"?


Anyway, in current solution it looks ugly and not usable.

Thanks for the feedback, I appreciate your honesty.


No. But we should admit that this is bad decision and do not include this 
change to the code base.

What is your proposal?
How metrics configuration should work?


Yes. But it still will not give enough accuracy.

Enough for what?
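
For what it's worth, my understanding of "exponential bounds" is something like the sketch below (an assumption, not an existing Ignite API): bucket upper bounds that grow by a constant factor, so a fixed, pre-configured set covers both small and large values.

import java.util.Arrays;

// Illustration only: exponentially growing upper bounds for histogram buckets.
public final class ExponentialBounds {
    private ExponentialBounds() {
        // No instances.
    }

    /** Returns {@code count} bounds: min, min*factor, min*factor^2, ... */
    public static long[] bounds(long min, double factor, int count) {
        long[] res = new long[count];
        double cur = min;

        for (int i = 0; i < count; i++) {
            res[i] = Math.round(cur);

            cur *= factor;
        }

        return res;
    }

    public static void main(String[] args) {
        // E.g. latency buckets in milliseconds: 1, 2, 4, ..., 512.
        System.out.println(Arrays.toString(bounds(1, 2.0, 10)));
    }
}

With min=1 and factor=2, ten buckets cover everything from 1 ms to about half a second.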

On Mon, 05/08/2019 at 18:29 +0300, Andrey Gura wrote:

- metric configuration is node local (not cluster wide).

This issue is easy to solve on the user-side and in Ignite core.

It's imaginary simplicity. First, you need some additional
automation on the user side in order to configure all nodes of the
cluster. Second, new nodes can join the cluster and the
configuration will be different on the new node and on other nodes of the
cluster. This complicates the whole functionality. Anyway, I
don't like such a simplified solution because at the moment it brings
more problems than value.


The easiest solution was implemented.
Do we want to make it more complex right now :)?

No. But we should admit that this is bad decision and do not include
this change to the code base.


The reason it exists in PR - we already have this parameter in 
DataStorageConfiguration#getMetricsSubIntervalCount

I believe this method should be deprecated and removed in major release.


I think the user should be able to configure buckets for histogram and 
rateTimeInterval for hitrate.

Not necessary if we have exponential bounds' values for histograms.
Anyway, in current solution it looks ugly and not usable.


Ignite has dozens of use-cases and deployment modes, seems,
we can't cover it all with the single predefined buckets/rateTimeInterval set.

Yes. But it still will not give enough accuracy.

On Mon, Aug 5, 2019 at 5:25 PM Nikolay Izhikov  wrote:

Hello, Andrey.


- metric configuration is node local (not cluster wide).

This issue is easy to solve on the user-side and in Ignite core.


- metric configuration doesn't survive node restart.

We decide to go with the simplest solution, for now.
The easiest solution was implemented.
Do we want to make it more complex right now :)?


- User shouldn't configure hit rate metrics at runtime in most cases.

I agree with you - the size of the counters array looks odd as a configuration 
parameter.
The reason it exists in PR - we already have this parameter in 
DataStorageConfiguration#getMetricsSubIntervalCount


- May be it is enough for user to have histograms with pre-configured buckets
So I think we should drop this change and idea about runtime histrogram and hit 
rate configuration.

I think the user should be able to configure buckets for histogram and 
rateTimeInterval for hitrate.

Ignite has dozens of use-cases and deployment modes, seems,
we can't cover it all with the single predefined buckets/rateTimeInterval set.

On Mon, 05/08/2019 at 16:59 +0300, Andrey Gura wrote:

Igniters,

I've took a look to the PR and I want follow up this discussion again.

Proposed solution has a couple of significant drawbacks:

- metric configuration is node local (not cluster wide).
- metric configuration doesn't survive node restart.

This drawbacks make configuration complex, annoying and useless in most cases.

Moreover, I think that:

- User shouldn't configure hit rate metrics at runtime in most cases.
Especially HitRateMetric.size because it's just details of
implementation. Purpose of size is plots smoothing and this parameter
could be fixed (e.g. 16 is enough). HitRate metric is just LongMetric
but with additional feature.
- May be it is enough for user to have histograms with pre-configured
buckets. The trick here is properly chosen bounds. It seems that
exponentially chosen values will fit for most cases. So we can avo

Re: Partition map exchange metrics

2019-07-24 Thread Ivan Rakov

Nikita and Maxim,


What if we just update current metric getCurrentPmeDuration behaviour
to show durations only for blocking PMEs?
Remain it as a long value and rename it to getCacheOperationsBlockedDuration.

No other changes will be required.

WDYT?
I agree with these two metrics. I also think that current 
getCurrentPmeDuration will become redundant.


Anton,


It looks like we're trying to implement "extended debug" instead of
"monitoring".
It should not be interesting for real admin what phase of PME is in
progress and so on.


PME is mission critical cluster process. I agree that there's a fine 
line between monitoring and debug here. However, it's not good to add 
monitoring capabilities only for scenario when everything is alright.
If PME really hangs, a *real admin* will be extremely interested in how 
to return the cluster back to a working state. Metrics about stage completion 
time may really help here: e.g. if one specific node hasn't completed 
stage X while the rest of the cluster has, it can be a signal that this node 
should be killed.


Of course, it's possible to build monitoring system that extract this 
information from logs, but:

- It's more resource intensive as it requires parsing logs for all the time
- It's less reliable as log messages may change
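
For illustration, a minimal sketch of the two metrics discussed in this thread - accumulated blocking time plus the duration of the blocking in progress; the class and method names are assumptions, not the actual Ignite metric implementation:

import java.util.concurrent.atomic.AtomicLong;

// Illustration only: totalBlockedDuration() grows once a blocking period ends,
// currentBlockedDuration() is non-zero only while cache operations are blocked.
public class PmeBlockingMetrics {
    /** Sum of all finished blocking periods, in milliseconds. */
    private final AtomicLong totalBlockedMs = new AtomicLong();

    /** Start timestamp of the current blocking period, or 0 if not blocked. */
    private volatile long blockStartTs;

    public void onBlockingStarted() {
        blockStartTs = System.currentTimeMillis();
    }

    public void onBlockingFinished() {
        long start = blockStartTs;

        if (start != 0) {
            totalBlockedMs.addAndGet(System.currentTimeMillis() - start);

            blockStartTs = 0;
        }
    }

    /** Current blocking duration in ms; 0 means cache operations are not blocked. */
    public long currentBlockedDuration() {
        long start = blockStartTs;

        return start == 0 ? 0 : System.currentTimeMillis() - start;
    }

    /** Accumulated blocking duration since node start, in ms. */
    public long totalBlockedDuration() {
        return totalBlockedMs.get();
    }
}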

Best Regards,
Ivan Rakov

On 24.07.2019 14:57, Maxim Muzafarov wrote:

Folks,

+1 with Anton post.

What if we just update current metric getCurrentPmeDuration behaviour
to show durations only for blocking PMEs?
Remain it as a long value and rename it to getCacheOperationsBlockedDuration.

No other changes will be required.

WDYT?

On Wed, 24 Jul 2019 at 14:02, Nikita Amelchev  wrote:

Nikolay,

The cacheOperationsBlockedDuration metric will show current blocking
duration or 0 if there is no blocking right now.

The totalCacheOperationsBlockedDuration metric will accumulate all
blocking durations that happen after node starts.

Wed, 24 Jul 2019 at 13:35, Nikolay Izhikov :

Nikita

What is the difference between those two metrics?

Wed, 24 July 2019, 12:45 Nikita Amelchev :


Igniters, thanks for comments.

 From the discussion it can be seen that we need only two metrics for now:
- cacheOperationsBlockedDuration (long)
- totalCacheOperationsBlockedDuration (long)

I will prepare PR at the nearest time.

Wed, 24 Jul 2019 at 09:11, Zhenya Stanilovsky :

+1 with Anton decisions.



Wednesday, 24 July 2019, 8:44 +03:00 from Anton Vinogradov :

Folks,

It looks like we're trying to implement "extended debug" instead of
"monitoring".
It should not be interesting for real admin what phase of PME is in
progress and so on.
Interesting metrics are
- total blocked time (will be used for real SLA counting)
- are we blocked right now (shows we have an SLA degradation right now)
Duration of the current blocking period can be easily presented using any
modern monitoring tool by regular checks.
Initial true will mean "period start", precision will be a result of
checks frequency.
Anyway, I'm ok to have current metric presented with long, where long is a
duration, see no reason, but ok :)

All other features you mentioned are useful for code or
deployment improving and can (should) be taken from logs at the analysis
phase.

On Tue, Jul 23, 2019 at 7:22 PM Ivan Rakov < ivan.glu...@gmail.com >

wrote:

Folks, let me step in.

Nikita, thanks for your suggestions!


1. initialVersion. Topology version that initiates the exchange.
2. initTime. Time PME was started.
3. initEvent. Event that triggered PME.
4. partitionReleaseTime. Time when a node has finished waiting for all
updates and transactions on a previous topology.
5. sendSingleMessageTime. Time when a node sent a single message.
6. recieveFullMessageTime. Time when a node received a full message.
7. finishTime. Time PME was ended.

When a new PME starts, all these metrics reset.

Every metric from Nikita's list looks useful and simple to implement.
I think that it would be better to change format of metrics 4, 5, 6 and
7 a bit: we can keep only difference between time of previous event and
time of corresponding event. Such metrics would be easier to perceive:
they answer to specific questions "how much time did partition release
take?" or "how much time did awaiting of distributed phase end take?".
Also, if results of 4, 5, 6, 7 will be exported to monitoring system,
graphs will show how different stages times change from one PME to another.

When PME cause no blocking, it's a good PME and I see no reason to have
monitoring related to it

Agree with Anton here. These metrics should be measured only for true
distributed exchange. Saving results for client leave/join PMEs will
just complicate monitoring.


I agree with total blocking duration metric but
I still don't understand why instant value indicating that operations are
blocked should be boolean.
Duration time since blocking has started looks more appropriate and useful

Re: Partition map exchange metrics

2019-07-23 Thread Ivan Rakov

Folks, let me step in.

Nikita, thanks for your suggestions!


1. initialVersion. Topology version that initiates the exchange.
2. initTime. Time PME was started.
3. initEvent. Event that triggered PME.
4. partitionReleaseTime. Time when a node has finished waiting for all
updates and transactions on a previous topology.
5. sendSingleMessageTime. Time when a node sent a single message.
6. recieveFullMessageTime. Time when a node received a full message.
7. finishTime. Time PME was ended.

When a new PME starts, all these metrics reset.

Every metric from Nikita's list looks useful and simple to implement.
I think that it would be better to change format of metrics 4, 5, 6 and 
7 a bit: we can keep only difference between time of previous event and 
time of corresponding event. Such metrics would be easier to perceive: 
they answer to specific questions "how much time did partition release 
take?" or "how much time did awaiting of distributed phase end take?".
Also, if results of 4, 5, 6, 7 will be exported to monitoring system, 
graphs will show how different stages times change from one PME to another.



When PME cause no blocking, it's a good PME and I see no reason to have
monitoring related to it
Agree with Anton here. These metrics should be measured only for true 
distributed exchange. Saving results for client leave/join PMEs will 
just complicate monitoring.



I agree with total blocking duration metric but
I still don't understand why instant value indicating that operations are
blocked should be boolean.
Duration time since blocking has started looks more appropriate and useful.
It gives more information while semantic is left the same.
Totally agree with Pavel here. Both "accumulated block time" and 
"current PME block time" metrics are useful. Growth of accumulated 
metric for a specific period of time (should be easy to check via a 
monitoring system graph) will show how long business operations were 
blocked in total, and a non-zero current metric will show that we are 
experiencing issues right now. Boolean metric "are we blocked right now" 
is not needed as it obviously can be inferred from "current PME block 
time".


Best Regards,
Ivan Rakov

On 23.07.2019 16:02, Pavel Kovalenko wrote:

Nikita,

I agree with total blocking duration metric but
I still don't understand why instant value indicating that operations are
blocked should be boolean.
Duration time since blocking has started looks more appropriate and useful.
It gives more information while semantic is left the same.



Tue, 23 Jul 2019 at 11:42, Nikita Amelchev :


Folks,

All previous suggestions have some disadvantages. There can be several
exchanges between two metric updates, and a fast exchange can overwrite a
previous long exchange.

We can introduce a metric of total blocking duration that will
accumulate at the end of the exchange. So, users will get actual
information about how long operations were blocked. Cluster metric
will be a maximum of local nodes' metrics. And we need a boolean metric
that will indicate realtime status. It is needed because the duration
metric updates only at the end of the exchange.

So I propose to change the current metric that not released to the
totalCacheOperationsBlockingDuration metric and to add the
isCacheOperationsBlocked metric.

WDYT?

Mon, 22 Jul 2019 at 09:27, Anton Vinogradov :

Nikolay,

Still see no reason to replace boolean with long.

On Mon, Jul 22, 2019 at 9:19 AM Nikolay Izhikov 

wrote:

Anton.

1. Value exported based on SPI settings, not in the moment it changed.

2. Clock synchronisation - if we export start time, we should also

export

node local timestamp.

Mon, 22 July 2019, 8:33 Anton Vinogradov :


Folks,

What's the reason for duration counting?
AFAIU, it's a monitoring system feature to count the durations.
Since the monitoring system checks metrics periodically, it will know the
duration by its own log.

On Fri, Jul 19, 2019 at 7:32 PM Pavel Kovalenko 
wrote:


Nikita,

Yes, I mean duration not timestamp. For the metric name, I suggest
"cacheOperationsBlockingDuration", I think it more clearly represents what is
blocked during PME.
We can also combine both timestamp "cacheOperationsBlockingStartTs" and
duration to have better correlation of when cache operations were blocked and
how much time it's taken.
For instant view (like in JMX bean) a calculated value as you mentioned
can be used.
For metrics exported to some backend (IEP-35) a counter can be used.
The counter is incremented by blocking time after blocking has ended.

Fri, 19 Jul 2019 at 19:10, Nikita Amelchev :

Pavel,

The main purpose of this metric is

how much time we wait for resuming cache operations

Seems I misunderstood you. Do you mean timestamp or duration here?

What do you think if we change the boolean value of metric to a long
value that represents time in milliseconds when operations were blocked?

This time can 

Re: Improvements for new security approach.

2019-07-18 Thread Ivan Rakov

Hello Max,

Thanks for your analysis!

Have you created a JIRA issue for discovered defects?

Best Regards,
Ivan Rakov

On 17.07.2019 17:08, Maksim Stepachev wrote:

Hello, Igniters.

 The main idea of the new security is propagating the security context to
other nodes and performing actions with the initial permissions. The solution looks
fine but has imperfections.

1. ZookeeperDiscoveryImpl doesn't implement the security support itself.
   As a result: Caused by: class org.apache.ignite.spi.IgniteSpiException:
Security context isn't certain.
2. The visor tasks lose permissions.
The method VisorQueryUtils#scheduleQueryStart starts a new thread and loses the
context.
3. The GridRestProcessor does tasks outside the "withContext" section. As a
result, the context is lost.
4. The GridRestProcessor isn't a client, so we can't read the security subject from
a node attribute.
We should transmit secCtx for fake nodes and secSubjId for real ones.
5. NoOpIgniteSecurityProcessor should include a disabled processor and
validate it too if it is not null. It is important for a client node.
For example:
In IgniteKernal#securityProcessor, the createComponent method returns a
GridSecurityProcessor. For server nodes it is enabled, but for clients it
isn't. The clients aren't able to pass validation for this reason.

6. ATTR_SECURITY_SUBJECT was removed. It broke compatibility.

I am going to fix it.



Re: Tx lock partial happens before

2019-07-15 Thread Ivan Rakov

Anton,


Step-by-step:
1) primary locked on key mention (get/put) at pessimistic/!read-committed tx
2) backups locked on prepare
3) primary unlocked on finish
4) backups unlocked on finish (after the primary)
correct?
Yes, this corresponds to my understanding of transactions protocol. With 
minor exception: steps 3 and 4 are inverted in case of one-phase commit.



Agree, but seems there is no need to acquire the lock, we have just to wait
until entry becomes unlocked.
- entry locked means that previous tx's "finish" phase is in progress
- entry unlocked means reading value is up-to-date (previous "finish" phase
finished)
correct?
Diving deeper, entry is locked if its GridCacheMapEntry.localCandidates 
queue is not empty (first item in queue is actually the transaction that 
owns lock).



we have just to wait
until entry becomes unlocked.

This may work.
If consistency checking code has acquired lock on primary, backup can be 
in two states:

- not locked - and new locks won't appear as we are holding lock on primary
- still locked by transaction that owned lock on primary just before our 
checking code - in such case checking code should just wait for lock release
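
A minimal sketch of that "just wait for unlock" idea; the Entry abstraction below is hypothetical (not GridCacheMapEntry itself), it only illustrates the waiting logic:

// Illustration only: the checking code already holds the lock on the primary,
// so on the backup it is enough to wait until the previous transaction's
// finish phase releases the entry before reading it.
public final class BackupReadBarrier {
    private BackupReadBarrier() {
        // No instances.
    }

    /** Hypothetical view of a cache entry's lock state. */
    public interface Entry {
        /** @return {@code true} while some transaction still owns a lock candidate. */
        boolean lockedByAny();
    }

    /**
     * Waits until the backup entry becomes unlocked or the timeout expires.
     *
     * @return {@code true} if the entry became unlocked within the timeout.
     */
    public static boolean awaitUnlocked(Entry entry, long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;

        while (entry.lockedByAny()) {
            if (System.currentTimeMillis() > deadline)
                return false;

            Thread.sleep(1); // Back off briefly; the finish phase is expected to be short.
        }

        return true;
    }
}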


Best Regards,
Ivan Rakov

On 15.07.2019 9:34, Anton Vinogradov wrote:

Ivan R.

Thanks for joining!

Got an idea, but not sure that got a way of a fix.

AFAIK (can be wrong, please correct if necessary), at 2PC, locks are
acquired on backups during the "prepare" phase and released at "finish"
phase after primary fully committed.
Step-by-step:
1) primary locked on key mention (get/put) at pessimistic/!read-committed tx
2) backups locked on prepare
3) primary unlocked on finish
4) backups unlocked on finish (after the primary)
correct?

So, acquiring locks on backups, not at the "prepare" phase, may cause
unexpected behavior in case of primary fail or other errors.
That's definitely possible to update failover to solve this issue, but it
seems to be an overcomplicated way.
The main question there, it there any simple way?


checking read from backup will just wait for commit if it's in progress.

Agree, but seems there is no need to acquire the lock, we have just to wait
until entry becomes unlocked.
- entry locked means that previous tx's "finish" phase is in progress
- entry unlocked means reading value is up-to-date (previous "finish" phase
finished)
correct?

On Mon, Jul 15, 2019 at 8:37 AM Павлухин Иван  wrote:


Anton,

I did not know mechanics locking entries on backups during prepare
phase. Thank you for pointing that out!

Fri, 12 Jul 2019 at 22:45, Ivan Rakov :

Hi Anton,


Each get method now checks the consistency.
Check means:
1) tx lock acquired on primary
2) gained data from each owner (primary and backups)
3) data compared

Did you consider acquiring locks on backups as well during your check,
just like 2PC prepare does?
If there's HB between steps 1 (lock primary) and 2 (update primary +
lock backup + update backup), you may be sure that there will be no
false-positive results and no deadlocks as well. Protocol won't be
complicated: checking read from backup will just wait for commit if it's
in progress.

Best Regards,
Ivan Rakov

On 12.07.2019 9:47, Anton Vinogradov wrote:

Igniters,

Let me explain problem in detail.
Read Repair at pessimistic tx (locks acquired on primary, full sync, 2pc)
able to see consistency violation because backups are not updated yet.
This seems to be not a good idea to "fix" code to unlock primary only when
backups updated, this definitely will cause a performance drop.
Currently, there is no explicit sync feature allows waiting for backups
updated during the previous tx.
Previous tx just sends GridNearTxFinishResponse to the originating node.

Bad ideas how to handle this:
- retry some times (still possible to gain false positive)
- lock tx entry on backups (will definitely break failover logic)
- wait for same entry version on backups during some timeout (will require
huge changes at "get" logic and false positive still possible)

Is there any simple fix for this issue?
Thanks for tips in advance.

Ivan,
thanks for your interest


4. Very fast and lucky txB writes a value 2 for the key on primary and
backup.
AFAIK, reordering not possible since backups "prepared" before primary
releases lock.
So, consistency guaranteed by failover and by "prepare" feature of 2PC.
Seems, the problem is NOT with consistency at AI, but with consistency
detection implementation (RR) and possible "false positive" results.
BTW, checked 1PC case (only one data node at test) and gained no issues.

On Fri, Jul 12, 2019 at 9:26 AM Павлухин Иван wrote:

Anton,

Is such behavior observed for 2PC or for 1PC optimization? Does not it
mean that the things can be even worse and an inconsistent write is
possible on a backup? E.g. in scenario:
1. txA writes a value 1 for the key on primar

Re: Tx lock partial happens before

2019-07-12 Thread Ivan Rakov

Hi Anton,


Each get method now checks the consistency.
Check means:
1) tx lock acquired on primary
2) gained data from each owner (primary and backups)
3) data compared
Did you consider acquiring locks on backups as well during your check, 
just like 2PC prepare does?
If there's HB between steps 1 (lock primary) and 2 (update primary + 
lock backup + update backup), you may be sure that there will be no 
false-positive results and no deadlocks as well. Protocol won't be 
complicated: checking read from backup will just wait for commit if it's 
in progress.


Best Regards,
Ivan Rakov

On 12.07.2019 9:47, Anton Vinogradov wrote:

Igniters,

Let me explain problem in detail.
Read Repair at pessimistic tx (locks acquired on primary, full sync, 2pc)
able to see consistency violation because backups are not updated yet.
This seems to be not a good idea to "fix" code to unlock primary only when
backups updated, this definitely will cause a performance drop.
Currently, there is no explicit sync feature allows waiting for backups
updated during the previous tx.
Previous tx just sends GridNearTxFinishResponse to the originating node.

Bad ideas how to handle this:
- retry some times (still possible to gain false positive)
- lock tx entry on backups (will definitely break failover logic)
- wait for same entry version on backups during some timeout (will require
huge changes at "get" logic and false positive still possible)

Is there any simple fix for this issue?
Thanks for tips in advance.

Ivan,
thanks for your interest


4. Very fast and lucky txB writes a value 2 for the key on primary and
backup.
AFAIK, reordering not possible since backups "prepared" before primary
releases lock.
So, consistency guaranteed by failover and by "prepare" feature of 2PC.
Seems, the problem is NOT with consistency at AI, but with consistency
detection implementation (RR) and possible "false positive" results.
BTW, checked 1PC case (only one data node at test) and gained no issues.

On Fri, Jul 12, 2019 at 9:26 AM Павлухин Иван  wrote:


Anton,

Is such behavior observed for 2PC or for 1PC optimization? Does not it
mean that the things can be even worse and an inconsistent write is
possible on a backup? E.g. in scenario:
1. txA writes a value 1 for the key on primary.
2. txA unlocks the key on primary.
3. txA freezes before updating backup.
4. Very fast and lucky txB writes a value 2 for the key on primary and
backup.
5. txB wakes up and writes 1 for the key.
6. As result there is 2 on primary and 1 on backup.

Naively it seems that locks should be released after all replicas are
updated.

Wed, 10 Jul 2019 at 16:36, Anton Vinogradov :

Folks,

Investigating now unexpected repairs [1] in case of ReadRepair usage at
testAccountTxNodeRestart.
Updated [2] the test to check is there any repairs happen.
Test's name now is "testAccountTxNodeRestartWithReadRepair".

Each get method now checks the consistency.
Check means:
1) tx lock acquired on primary
2) gained data from each owner (primary and backups)
3) data compared

Sometime, backup may have obsolete value during such check.

Seems, this happen because tx commit on primary going in the following

way

(check code [2] for details):
1) performing localFinish (releases tx lock)
2) performing dhtFinish (commits on backups)
3) transferring control back to the caller

So, seems, the problem here is that "tx lock released on primary" does

not

mean that backups updated, but "commit() method finished at caller's
thread" does.
This means that, currently, there is no happens-before between
1) thread 1 committed data on primary and tx lock can be reobtained
2) thread 2 reads from backup
but still strong HB between "commit() finished" and "backup updated"

So, it seems to be possible, for example, to gain notification by a
continuous query, then read from backup and gain obsolete value.

Is this "partial happens before" behavior expected?

[1] https://issues.apache.org/jira/browse/IGNITE-11973
[2] https://github.com/apache/ignite/pull/6679/files
[3]


org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal#finishTx



--
Best regards,
Ivan Pavlukhin



Re: Lightweight version of partitions map exchange

2019-07-09 Thread Ivan Rakov

Hi Nikita,

I've checked out your branch, looked through the changes and run 
IgniteBaselineNodeLeaveExchangeTest. Some thoughts:


1. First of all, there's fundamental issue that backup and primary 
partitions behave differently:
- On primary, updating transaction needs to own exclusive lock (be on 
top of GridCacheMapEntry#localCandidates queue) on key object for the 
whole prepare-commit cycle. That's how two-phase commit works in Ignite.
- Primary node generates update counter via 
PartitionTxUpdateCounterImpl#reserve, while backup receives update and 
just applies it with provided counter.
So, if we'll perform PME in non-distributed way, we'll lose 
happen-before guarantees between updates of transactions mapped on 
previous topology and ones that are mapped to new topology. This may 
cause the following issues:
- New primary node may start behaving as primary (spawn DHT transaction 
instances and acquire exclusive locks) but still may receive updates 
from previous primary. I don't know how to handle these updates 
correctly as they may conflict with new updates and locks.
- New primary node should start generating update counters, but it 
actually doesn't know last update counter in cluster. If it 
optimistically will start from last known counter, partition consistency 
may break in case updates with actual last update counter will arrive (I 
guess, this issue should be reproduced as LWM > HWM assertion error).


2. According to current state of your test, testBltServerLeaveUnderLoad 
is called only with PickKeyOption#NO_DATA_ON_LEAVING_NODE (which means 
backups that are promoted to primaries without global synchronization 
are not affected by transactional load). However, it still fails with 
LWM > HWM assertion. I guess, there are another details in new partition 
counters implementation that require global happen-before between 
updates of transactions that are mapped to different topology versions.


Alex S,

backups that are promoted to primaries without global synchronization 
are not affected by transactional load
test still fails with LWM > HWM assertion 

Do you have any ideas why this may happen?
New primary node should start generating update counters, but it 
actually doesn't know last update counter in cluster. If it 
optimistically will start from last known counter, partition 
consistency may break in case updates with actual last update counter 
will arrive (I guess, this issue should be reproduced as LWM > HWM 
assertion error). 

How do you think, does this problem looks solvable?

Alex S and Alex G,
New primary node may start behaving as primary (spawn DHT transaction 
instances and acquire exclusive locks) but still may receive updates 
from previous primary. I don't know how to handle these updates 
correctly as they may conflict with new updates and locks.


How do you think, can we overcome this limitation with our existing 
implementation of transactions?
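
To illustrate why a new primary must not guess the last counter, here is a minimal sketch of the two counter roles mentioned above (my simplification, not PartitionTxUpdateCounterImpl):

// Illustration only: the primary reserves counters ahead of applying them (HWM),
// while applied updates advance the low watermark (LWM) only over a contiguous
// prefix. If a new primary starts reserving from a stale value while updates
// from the old primary are still arriving, the LWM <= HWM invariant can break.
public class UpdateCounterSketch {
    private long hwm; // Highest reserved/seen counter.
    private long lwm; // Highest applied counter with no gaps below it.

    /** Primary side: reserve the next update counter for an outgoing update. */
    public synchronized long reserve() {
        return ++hwm;
    }

    /** Any owner: apply an update carrying the given counter. */
    public synchronized void apply(long cntr) {
        if (cntr > hwm)
            hwm = cntr; // A backup learns the HWM from incoming updates.

        if (cntr == lwm + 1)
            lwm = cntr; // Contiguous update: advance the LWM.
        // Otherwise there is a gap; the LWM stays until the missing updates arrive.

        assert lwm <= hwm : "LWM > HWM";
    }

    public synchronized long lwm() {
        return lwm;
    }

    public synchronized long hwm() {
        return hwm;
    }
}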


Best Regards,
Ivan Rakov

On 01.07.2019 11:13, Nikita Amelchev wrote:

Hi, Igniters.

I'm working on the implementation of lightweight PME for a baseline
node leave case. [1] In my implementation, each node recalculates a
new affinity and completes PME locally without distributed
communication. This is possible because all partitions are
distributed according to the baseline topology. And I found two
possible blockers to do it without blocking updates:

1. Finalize partitions counter. It seems that we can't correctly
collect gaps and process them without completing all txs. See the
GridDhtPartitionTopologyImpl#finalizeUpdateCounters method.

2. Apply update counters. We can't correctly set HWM counter if
primary left the cluster and sent updates to part of backups. Such
updates can be processed later and break guarantee that LWM<=HWM.

Is it impossible to leave a baseline node without waiting for all txs completed?

1. https://issues.apache.org/jira/browse/IGNITE-9913

Wed, 5 Jun 2019 at 12:15, Nikita Amelchev :

Maksim,

I agree with you that we should implement current issue and do not
allow lightweight PME if there are MOVING partitions in the cluster.

But now I'm investigating issue about finalizing update counters cause
it assumes that finalizing happens on exchange and all cache updates
are completed. Here we can wrongly process update counter gaps and
break recently merged IGNITE-10078.

And about phase 2, correct me if I misunderstood you.
You suggest do not move primary partitions on rebalancing completing
(do not change affinity assignment)? In this case, nodes recently join
to cluster will not have primary partitions and won't get a load after
rebalancing.

Thu, 30 May 2019 at 19:55, Maxim Muzafarov :

Igniters,


I've looked through Nikita's changes and I think for the current issue
[1] we should not allow the existence of MOVING partitions in the
cluster (it must be stable) to run the lightweight PME on BLT node
leave event occurred to achieve truly unlocked operations and here 

Re: Lightweight version of partitions map exchange

2019-07-09 Thread Ivan Rakov

My bad, I've sent the message accidentally. What I wanted to ask:


Alex S,

backups that are promoted to primaries without global synchronization 
are not affected by transactional load 
test still fails with LWM > HWM assertion

Do you have any ideas why this may happen?
New primary node should start generating update counters, but it 
actually doesn't know last update counter in cluster. If it 
optimistically will start from last known counter, partition 
consistency may break in case updates with actual last update counter 
will arrive (I guess, this issue should be reproduced as LWM > HWM 
assertion error). 


How do you think, does this problem looks solvable?

Alex S and Alex G,

New primary node may start behaving as primary (spawn DHT transaction 
instances and acquire exclusive locks) but still may receive updates 
from previous primary. I don't know how to handle these updates 
correctly as they may conflict with new updates and locks.
How do you think, can we overcome this limitation with existing 
transaction implementation?


Best Regards,
Ivan Rakov

On 10.07.2019 2:25, Ivan Rakov wrote:

Hi Nikita,

I've checked out your branch, looked through the changes and run 
IgniteBaselineNodeLeaveExchangeTest. Some thoughts:


1. First of all, there's fundamental issue that backup and primary 
partitions behave differently:
- On primary, updating transaction needs to own exclusive lock (be on 
top of GridCacheMapEntry#localCandidates queue) on key object for the 
whole prepare-commit cycle. That's how two-phase commit works in Ignite.
- Primary node generates update counter via 
PartitionTxUpdateCounterImpl#reserve, while backup receives update and 
just applies it with provided counter.
So, if we'll perform PME in non-distributed way, we'll lose 
happen-before guarantees between updates of transactions mapped on 
previous topology and updates of transactions that are mapped to new 
topology. This may cause the following issues:
- New primary node may start behaving as primary (spawn DHT 
transaction instances and acquire exclusive locks) but still may 
receive updates from previous primary. I don't know how to handle 
these updates correctly as they may conflict with new updates and locks.
- New primary node should start generating update counters, but it 
actually doesn't know last update counter in cluster. If it 
optimistically will start from last known counter, partition 
consistency may break in case updates with actual last update counter 
will arrive (I guess, this issue should be reproduced as LWM > HWM 
assertion error).


2. According to current state of your test, 
testBltServerLeaveUnderLoad is called only with 
PickKeyOption#NO_DATA_ON_LEAVING_NODE (which means backups that are 
promoted to primaries without global synchronization are not affected 
by transactional load). However, it still fails with LWM > HWM 
assertion. I guess, there are another details in new partition 
counters implementation that require global happen-before between 
updates of transactions that are mapped to different topology versions.


Alex S,

backups that are promoted to primaries without global synchronization 
are not affected by transactional load



Best Regards,
Ivan Rakov

On 01.07.2019 11:13, Nikita Amelchev wrote:

Hi, Igniters.

I'm working on the implementation of lightweight PME for a baseline
node leave case. [1] In my implementation, each node recalculates a
new affinity and completes PME locally without distributed
communication. This is possible because all partitions are
distributed according to the baseline topology. And I found two
possible blockers to do it without blocking updates:

1. Finalize partitions counter. It seems that we can't correctly
collect gaps and process them without completing all txs. See the
GridDhtPartitionTopologyImpl#finalizeUpdateCounters method.

2. Apply update counters. We can't correctly set HWM counter if
primary left the cluster and sent updates to part of backups. Such
updates can be processed later and break guarantee that LWM<=HWM.

Is it impossible to leave a baseline node without waiting for all txs 
completed?


1. https://issues.apache.org/jira/browse/IGNITE-9913

Wed, 5 Jun 2019 at 12:15, Nikita Amelchev :

Maksim,

I agree with you that we should implement current issue and do not
allow lightweight PME if there are MOVING partitions in the cluster.

But now I'm investigating issue about finalizing update counters cause
it assumes that finalizing happens on exchange and all cache updates
are completed. Here we can wrongly process update counter gaps and
break recently merged IGNITE-10078.

And about phase 2, correct me if I misunderstood you.
You suggest do not move primary partitions on rebalancing completing
(do not change affinity assignment)? In this case, nodes recently join
to cluster will not have primary partitions and won't get a load after
rebalancing.

Thu, 30 May 2019 at 19:55, Maxim Muzafarov :

Igniters

Re: Lightweight version of partitions map exchange

2019-07-09 Thread Ivan Rakov

Hi Nikita,

I've checked out your branch, looked through the changes and run 
IgniteBaselineNodeLeaveExchangeTest. Some thoughts:


1. First of all, there's fundamental issue that backup and primary 
partitions behave differently:
- On primary, updating transaction needs to own exclusive lock (be on 
top of GridCacheMapEntry#localCandidates queue) on key object for the 
whole prepare-commit cycle. That's how two-phase commit works in Ignite.
- Primary node generates update counter via 
PartitionTxUpdateCounterImpl#reserve, while backup receives update and 
just applies it with provided counter.
So, if we'll perform PME in non-distributed way, we'll lose 
happen-before guarantees between updates of transactions mapped on 
previous topology and updates of transactions that are mapped to new 
topology. This may cause the following issues:
- New primary node may start behaving as primary (spawn DHT transaction 
instances and acquire exclusive locks) but still may receive updates 
from previous primary. I don't know how to handle these updates 
correctly as they may conflict with new updates and locks.
- New primary node should start generating update counters, but it 
actually doesn't know last update counter in cluster. If it 
optimistically will start from last known counter, partition consistency 
may break in case updates with actual last update counter will arrive (I 
guess, this issue should be reproduced as LWM > HWM assertion error).


2. According to current state of your test, testBltServerLeaveUnderLoad 
is called only with PickKeyOption#NO_DATA_ON_LEAVING_NODE (which means 
backups that are promoted to primaries without global synchronization 
are not affected by transactional load). However, it still fails with 
LWM > HWM assertion. I guess, there are another details in new partition 
counters implementation that require global happen-before between 
updates of transactions that are mapped to different topology versions.


Alex S,

backups that are promoted to primaries without global synchronization 
are not affected by transactional load



Best Regards,
Ivan Rakov

On 01.07.2019 11:13, Nikita Amelchev wrote:

Hi, Igniters.

I'm working on the implementation of lightweight PME for a baseline
node leave case. [1] In my implementation, each node recalculates a
new affinity and completes PME locally without distributed
communication. This is possible because all partitions are
distributed according to the baseline topology. And I found two
possible blockers to do it without blocking updates:

1. Finalize partitions counter. It seems that we can't correctly
collect gaps and process them without completing all txs. See the
GridDhtPartitionTopologyImpl#finalizeUpdateCounters method.

2. Apply update counters. We can't correctly set HWM counter if
primary left the cluster and sent updates to part of backups. Such
updates can be processed later and break guarantee that LWM<=HWM.

Is it impossible to leave a baseline node without waiting for all txs completed?

1. https://issues.apache.org/jira/browse/IGNITE-9913

Wed, 5 Jun 2019 at 12:15, Nikita Amelchev :

Maksim,

I agree with you that we should implement current issue and do not
allow lightweight PME if there are MOVING partitions in the cluster.

But now I'm investigating issue about finalizing update counters cause
it assumes that finalizing happens on exchange and all cache updates
are completed. Here we can wrongly process update counter gaps and
break recently merged IGNITE-10078.

And about phase 2, correct me if I misunderstood you.
You suggest do not move primary partitions on rebalancing completing
(do not change affinity assignment)? In this case, nodes recently join
to cluster will not have primary partitions and won't get a load after
rebalancing.

Thu, 30 May 2019 at 19:55, Maxim Muzafarov :

Igniters,


I've looked through Nikita's changes and I think for the current issue
[1] we should not allow the existence of MOVING partitions in the
cluster (it must be stable) to run the lightweight PME on BLT node
leave event occurred to achieve truly unlocked operations and here are
my thoughts why.

In general, as Nikita mentioned above, the existence of MOVING
partitions in the cluster means that the rebalance procedure is
currently running. It owns cache partitions locally and sends in the
background (with additional timeout) the actual statuses of his local
partitions to the coordinator node. So, we will always have a lag
between local node partition states and all other cluster nodes
partitions states. This lag can be very huge since previous
#scheduleResendPartitions() is cancelled when a new cache group
rebalance finished. Without the fair partition states synchronization
(without full PME) and in case of local affinity recalculation on BLT
node leave event, other nodes will mark such partitions LOST in most
of the cases, which in fact are present in the cluster and saved on
some node under checkpoint. I see that 

Re: "Idle verify" to "Online verify"

2019-05-06 Thread Ivan Rakov

Anton,

Automatic quorum-based partition drop may work as a partial workaround 
for IGNITE-10078, but discussed approach surely doesn't replace 
IGNITE-10078 activity. We still don't know what to do when quorum can't 
be reached (2 partitions have hash X, 2 have hash Y) and keeping 
extended update counters is the only way to resolve such case.
On the other hand, precalculated partition hashes validation on PME can 
be a good addition to IGNITE-10078 logic: we'll be able to detect 
situations when extended update counters are equal, but for some reason 
(bug or whatsoever) partition contents are different.


Best Regards,
Ivan Rakov

On 06.05.2019 12:27, Anton Vinogradov wrote:

Ivan, just to make sure ...
The discussed case will fully solve the issue [1] in case we'll also add
some strategy to reject partitions with missed updates (updateCnt==Ok,
Hash!=Ok).
For example, we may use the Quorum strategy, when the majority wins.
Sounds correct?

[1] https://issues.apache.org/jira/browse/IGNITE-10078

On Tue, Apr 30, 2019 at 3:14 PM Anton Vinogradov  wrote:


Ivan,

Thanks for the detailed explanation.
I'll try to implement the PoC to check the idea.

On Mon, Apr 29, 2019 at 8:22 PM Ivan Rakov  wrote:


But how to keep this hash?

I think, we can just adopt way of storing partition update counters.
Update counters are:
1) Kept and updated in heap, see
IgniteCacheOffheapManagerImpl.CacheDataStoreImpl#pCntr (accessed during
regular cache operations, no page replacement latency issues)
2) Synchronized with page memory (and with disk) on every checkpoint,
see GridCacheOffheapManager#saveStoreMetadata
3) Stored in partition meta page, see PagePartitionMetaIO#setUpdateCounter
4) On node restart, we init onheap counter with value from disk (for the
moment of last checkpoint) and update it to latest value during WAL
logical records replay


2) PME is a rare operation on production cluster, but, seems, we have
to check consistency in a regular way.
Since we have to finish all operations before the check, should we
have fake PME for maintenance check in this case?

  From my experience, PME happens on prod clusters from time to time
(several times per week), which can be enough. In case it's needed to
check consistency more often than regular PMEs occur, we can implement
command that will trigger fake PME for consistency checking.

Best Regards,
Ivan Rakov

On 29.04.2019 18:53, Anton Vinogradov wrote:

Ivan, thanks for the analysis!


With having pre-calculated partition hash value, we can
automatically detect inconsistent partitions on every PME.

Great idea, seems this covers all broken sync cases.

It will check alive nodes in case the primary failed immediately
and will check rejoining node once it finished a rebalance (PME on
becoming an owner).
Recovered cluster will be checked on activation PME (or even before
that?).
Also, warmed cluster will be still warmed after check.

Have I missed some cases leads to broken sync except bugs?

1) But how to keep this hash?
- It should be automatically persisted on each checkpoint (it should
not require recalculation on restore, snapshots should be covered too)
(and covered by WAL?).
- It should be always available at RAM for every partition (even for
cold partitions never updated/readed on this node) to be immediately
used once all operations done on PME.

Can we have special pages to keep such hashes and never allow their
eviction?

2) PME is a rare operation on production cluster, but, seems, we have
to check consistency in a regular way.
Since we have to finish all operations before the check, should we
have fake PME for maintenance check in this case?

 On Mon, Apr 29, 2019 at 4:59 PM Ivan Rakov  wrote:

 Hi Anton,

 Thanks for sharing your ideas.
 I think your approach should work in general. I'll just share my
 concerns about possible issues that may come up.

 1) Equality of update counters doesn't imply equality of
 partitions content under load.
 For every update, primary node generates update counter and then
 update is delivered to backup node and gets applied with the
 corresponding update counter. For example, there are two
 transactions (A and B) that update partition X by the following
 scenario:
 - A updates key1 in partition X on primary node and increments
 counter to 10
 - B updates key2 in partition X on primary node and increments
 counter to 11
 - While A is still updating another keys, B is finally committed
 - Update of key2 arrives to backup node and sets update counter to 11

 Observer will see equal update counters (11), but update of key 1
 is still missing in the backup partition.
 This is a fundamental problem which is being solved here:
 https://issues.apache.org/jira/browse/IGNITE-10078
 "Online verify" should operate with new complex update counters
 which take such "update holes" into 

Re: "Idle verify" to "Online verify"

2019-04-29 Thread Ivan Rakov

But how to keep this hash?

I think, we can just adopt way of storing partition update counters.
Update counters are:
1) Kept and updated in heap, see 
IgniteCacheOffheapManagerImpl.CacheDataStoreImpl#pCntr (accessed during 
regular cache operations, no page replacement latency issues)
2) Synchronized with page memory (and with disk) on every checkpoint, 
see GridCacheOffheapManager#saveStoreMetadata

3) Stored in partition meta page, see PagePartitionMetaIO#setUpdateCounter
4) On node restart, we init onheap counter with value from disk (for the 
moment of last checkpoint) and update it to latest value during WAL 
logical records replay


2) PME is a rare operation on production cluster, but, seems, we have 
to check consistency in a regular way.
Since we have to finish all operations before the check, should we 
have fake PME for maintenance check in this case?
From my experience, PME happens on prod clusters from time to time 
(several times per week), which can be enough. In case it's needed to 
check consistency more often than regular PMEs occur, we can implement 
command that will trigger fake PME for consistency checking.


Best Regards,
Ivan Rakov

On 29.04.2019 18:53, Anton Vinogradov wrote:

Ivan, thanks for the analysis!

>> With having pre-calculated partition hash value, we can 
automatically detect inconsistent partitions on every PME.

Great idea, seems this covers all broken sync cases.

It will check alive nodes in case the primary failed immediately
and will check rejoining node once it finished a rebalance (PME on 
becoming an owner).
Recovered cluster will be checked on activation PME (or even before 
that?).

Also, warmed cluster will be still warmed after check.

Have I missed some cases leads to broken sync except bugs?

1) But how to keep this hash?
- It should be automatically persisted on each checkpoint (it should 
not require recalculation on restore, snapshots should be covered too) 
(and covered by WAL?).
- It should be always available at RAM for every partition (even for 
cold partitions never updated/readed on this node) to be immediately 
used once all operations done on PME.


Can we have special pages to keep such hashes and never allow their 
eviction?


2) PME is a rare operation on production cluster, but, seems, we have 
to check consistency in a regular way.
Since we have to finish all operations before the check, should we 
have fake PME for maintenance check in this case?


On Mon, Apr 29, 2019 at 4:59 PM Ivan Rakov  wrote:


Hi Anton,

Thanks for sharing your ideas.
I think your approach should work in general. I'll just share my
concerns about possible issues that may come up.

1) Equality of update counters doesn't imply equality of
partitions content under load.
For every update, primary node generates update counter and then
update is delivered to backup node and gets applied with the
corresponding update counter. For example, there are two
transactions (A and B) that update partition X by the following
scenario:
- A updates key1 in partition X on primary node and increments
counter to 10
- B updates key2 in partition X on primary node and increments
counter to 11
- While A is still updating another keys, B is finally committed
- Update of key2 arrives to backup node and sets update counter to 11
Observer will see equal update counters (11), but update of key 1
is still missing in the backup partition.
This is a fundamental problem which is being solved here:
https://issues.apache.org/jira/browse/IGNITE-10078
"Online verify" should operate with new complex update counters
which take such "update holes" into account. Otherwise, online
verify may provide false-positive inconsistency reports.

2) Acquisition and comparison of update counters is fast, but
partition hash calculation is long. We should check that update
counter remains unchanged after every K keys handled.

3)


Another hope is that we'll be able to pause/continue scan, for
example, we'll check 1/3 partitions today, 1/3 tomorrow, and in
three days we'll check the whole cluster.

Totally makes sense.
We may find ourselves into a situation where some "hot" partitions
are still unprocessed, and every next attempt to calculate
partition hash fails due to another concurrent update. We should
be able to track progress of validation (% of calculation time
wasted due to concurrent operations may be a good metric, 100% is
the worst case) and provide option to stop/pause activity.
I think, pause should return an "intermediate results report" with
information about which partitions have been successfully checked.
With such report, we can resume activity later: partitions from
report will be just skipped.

4)


Since "Idle verify" uses

Re: "Idle verify" to "Online verify"

2019-04-29 Thread Ivan Rakov

Hi Anton,

Thanks for sharing your ideas.
I think your approach should work in general. I'll just share my 
concerns about possible issues that may come up.


1) Equality of update counters doesn't imply equality of partitions 
content under load.
For every update, primary node generates update counter and then update 
is delivered to backup node and gets applied with the corresponding 
update counter. For example, there are two transactions (A and B) that 
update partition X by the following scenario:

- A updates key1 in partition X on primary node and increments counter to 10
- B updates key2 in partition X on primary node and increments counter to 11
- While A is still updating another keys, B is finally committed
- Update of key2 arrives to backup node and sets update counter to 11
Observer will see equal update counters (11), but update of key 1 is 
still missing in the backup partition.
This is a fundamental problem which is being solved here: 
https://issues.apache.org/jira/browse/IGNITE-10078
"Online verify" should operate with new complex update counters which 
take such "update holes" into account. Otherwise, online verify may 
provide false-positive inconsistency reports.


2) Acquisition and comparison of update counters is fast, but partition 
hash calculation is long. We should check that update counter remains 
unchanged after every K keys handled.


3)

Another hope is that we'll be able to pause/continue scan, for 
example, we'll check 1/3 partitions today, 1/3 tomorrow, and in three 
days we'll check the whole cluster.

Totally makes sense.
We may find ourselves into a situation where some "hot" partitions are 
still unprocessed, and every next attempt to calculate partition hash 
fails due to another concurrent update. We should be able to track 
progress of validation (% of calculation time wasted due to concurrent 
operations may be a good metric, 100% is the worst case) and provide 
option to stop/pause activity.
I think, pause should return an "intermediate results report" with 
information about which partitions have been successfully checked. With 
such report, we can resume activity later: partitions from report will 
be just skipped.


4)

Since "Idle verify" uses regular pagmem, I assume it replaces hot data 
with persisted.

So, we have to warm up the cluster after each check.
Are there any chances to check without cooling the cluster?
I don't see an easy way to achieve it with our page memory architecture. 
We definitely can't just read pages from disk directly: we need to 
synchronize page access with concurrent update operations and checkpoints.
From my point of view, the correct way to solve this issue is improving 
our page replacement [1] mechanics by making it truly scan-resistant.


P. S. There's another possible way of achieving online verify: instead 
of on-demand hash calculation, we can always keep up-to-date hash value 
for every partition. We'll need to update hash on every 
insert/update/remove operation, but there will be no reordering issues 
as per function that we use for aggregating hash results (+) is 
commutative. With having pre-calculated partition hash value, we can 
automatically detect inconsistent partitions on every PME. What do you 
think?


[1] - 
https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood#IgniteDurableMemory-underthehood-Pagereplacement(rotationwithdisk)
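
Regarding the P.S.: a minimal sketch of such an always-up-to-date hash (assumed bookkeeping, not actual Ignite code). Since plain addition is commutative, concurrent updates applied in different orders on primary and backup still converge to the same value:

import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;

// Illustration only: maintain a per-partition hash incrementally so it can be
// compared across owners on PME without a full partition scan.
public class IncrementalPartitionHash {
    private final AtomicLong partHash = new AtomicLong();

    /** Contribution of a single entry, mirroring the idle_verify formula. */
    private static long entryHash(Object key, byte[] valBytes) {
        return (long)key.hashCode() + Arrays.hashCode(valBytes);
    }

    public void onInsert(Object key, byte[] valBytes) {
        partHash.addAndGet(entryHash(key, valBytes));
    }

    public void onRemove(Object key, byte[] valBytes) {
        partHash.addAndGet(-entryHash(key, valBytes));
    }

    public void onUpdate(Object key, byte[] oldValBytes, byte[] newValBytes) {
        partHash.addAndGet(entryHash(key, newValBytes) - entryHash(key, oldValBytes));
    }

    /** Current aggregated hash of the partition contents. */
    public long value() {
        return partHash.get();
    }
}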


Best Regards,
Ivan Rakov

On 29.04.2019 12:20, Anton Vinogradov wrote:

Igniters and especially Ivan Rakov,

"Idle verify" [1] is a really cool tool, to make sure that cluster is 
consistent.


1) But it requires having operations paused during the cluster check.
On some clusters, this check takes hours (3-4 hours in cases I saw).
I've checked the code of "idle verify" and it seems possible to 
make it "online" with some assumptions.


Idea:
Currently "Idle verify" checks that partitions hashes, generated this way
while (it.hasNextX()) {
    CacheDataRow row = it.nextX();

    partHash += row.key().hashCode();
    partHash += Arrays.hashCode(row.value().valueBytes(grpCtx.cacheObjectContext()));
}
, are the same.

What if we'll generate same pairs updateCounter-partitionHash but will 
compare hashes only in case counters are the same?
So, for example, will ask cluster to generate pairs for 64 partitions, 
then will find that 55 have the same counters (was not updated during 
check) and check them.
The rest (64-55 = 9) partitions will be re-requested and rechecked 
with an additional 55.
This way we'll be able to check cluster is consistent even in case 
operations are in progress (just retrying modified).
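
A minimal sketch of that retry loop (the cluster-request callback is hypothetical, not the actual idle_verify code); hashes are only trusted for partitions whose update counter did not move between two observations:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Illustration only, simplified to one (counter, hash) pair per partition; a real
// check would gather a pair per owner, compare hashes of owners whose counters
// match, and cap the number of retry rounds for hot partitions.
public class OnlineVerifySketch {
    /** (updateCounter, partitionHash) pair reported for a partition. */
    public static final class CounterAndHash {
        final long updateCntr;
        final long partHash;

        CounterAndHash(long updateCntr, long partHash) {
            this.updateCntr = updateCntr;
            this.partHash = partHash;
        }
    }

    /**
     * Re-requests pairs until every partition has been observed twice with an
     * unchanged update counter, i.e. it was not modified between observations.
     *
     * @param request Hypothetical callback asking the cluster for the given partitions.
     * @param parts Partitions to verify.
     */
    public static Map<Integer, CounterAndHash> collectStable(
        Function<Set<Integer>, Map<Integer, CounterAndHash>> request,
        Set<Integer> parts
    ) {
        Map<Integer, CounterAndHash> stable = new HashMap<>();
        Map<Integer, CounterAndHash> prev = request.apply(parts);
        Set<Integer> remaining = new HashSet<>(parts);

        while (!remaining.isEmpty()) {
            Map<Integer, CounterAndHash> cur = request.apply(remaining);

            for (Integer p : new HashSet<>(remaining)) {
                // Counter unchanged between two rounds: the hash is usable.
                if (cur.get(p).updateCntr == prev.get(p).updateCntr) {
                    stable.put(p, cur.get(p));
                    remaining.remove(p);
                }
            }

            prev.putAll(cur);
        }

        return stable;
    }
}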


Risks and assumptions:
Using this strategy we'll check the cluster's consistency ... 
eventually, and the check will take more time even on an idle cluster.
In case operationsPerTimeToGeneratePartitionHashes > partitionsCount 
we'll definitely gain no progr

Re: IgniteConfigVariationsAbstractTest subclasses do not run

2019-04-26 Thread Ivan Rakov

Ivan P.,

Good catch, thanks.

I was wrong, test scenario is correct. The problem was in 
atomicityMode() method - it could have returned null (which was okay for 
config generation, but wasn't expected in the test code).
Please take a look at tx_out_test_fixed.patch (attached to 
IGNITE-11708). To sum it up, both issues should be fixed now.


Best Regards,
Ivan Rakov

On 26.04.2019 14:40, Павлухин Иван wrote:

Ivan R.,

As I can see IgniteCacheConfigVariationsFullApiTest#testGetOutTx does
not expect lock/unlock events due to line:
if (atomicityMode() == ATOMIC)
 return lockEvtCnt.get() == 0;

Could you please elaborate?

пт, 26 апр. 2019 г. в 13:32, Ivan Rakov :

Ivan,

Seems like IgniteCacheReadThroughEvictionSelfTest is broken. Test
scenario assumes that even after expiration entry will be present in
IgniteCache as per it will be loaded from CacheStore. However,
CacheStore is not specified in node config. I've added patch that
enables cache store factory, please check IGNITE-11708 attachments.
Regarding IgniteCacheConfigVariationsFullApiTest#testGetOutTx* tests:
from my point of view, test scenarios make no sense. We perform get()
operation from ATOMIC caches and expect that entries will be locked. I
don't understand why we should lock entries on ATOMIC get, therefore I
suppose to remove part of code where we listen and check
EVT_CACHE_OBJECT_LOCKED/UNLOCKED events.

Best Regards,
Ivan Rakov

On 17.04.2019 22:05, Ivan Rakov wrote:

Hi Ivan,

I've checked your branch. Seems like these tests fail due to real
issue in functionality.
I'll take a look.

Best Regards,
Ivan Rakov

On 17.04.2019 13:54, Ivan Fedotov wrote:

Hi, Igniters!

During work on iep-30[1] I discovered that
IgniteConfigVariationsAbstractTest subclasses - it is about 15_000
tests[2]
- do not work.
You can check it by just running one of the tests with log output, for example
ConfigVariationsTestSuiteBuilderTest#LegacyLifecycleTest#test1 [3].
There is no warning notification in the console. The same situation with
other IgniteConfigVariationsAbstractTest subclasses - tests run, but
they
simply represent empty code.

So, I created a ticket on such issue [4] and it turned out that the
problem
is with ruleChain in IgniteConfigVariationsAbstractTest [5].
The rule that is responsible for running a test statement does not start
indeed [6] under ruleChain runRule. I suggested a solution - move
testsCfg
initialization to
IgniteConfigVariationsAbstractTest#beforeTestsStarted method. After such
changes ruleChain becomes not necessary.

But I faced another problem - multiple failures on TeamCity [7]. From
logs,
it seems that failures are related to what tests check, but not JUnit
error.
I can not track TeamCity history on that fact were tests failed or
not on
the previous JUnit version - the oldest log is dated the start of the
March
when JUnit4
already was implemented (for example, this [8] test).
Moreover, there are not so much failed tests, but because of running
with
multiple configurations
(InterceptorCacheConfigVariationsFullApiTestSuite_0
..._95) it turns out about 400 failed tests. TeamCity results also
confirm
that tests do not work in the master branch - duration time is less than
1ms. Now all tests are green and that is not surprising - under @Test
annotation, nothing happens.

Could some of us confirm or disprove my guess that tests are red
because of
its functionality, but not JUnit implementation?
And if it is true, how should I take such fact into account in this
ticket?

[1]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-30%3A+Migration+to+JUnit+5

[2]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testsuites/InterceptorCacheConfigVariationsFullApiTestSuite.java

[3]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/test/ConfigVariationsTestSuiteBuilderTest.java#L434

[4] https://issues.apache.org/jira/browse/IGNITE-11708
[5]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/IgniteConfigVariationsAbstractTest.java#L62

[6]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/GridAbstractTest.java#L181

[7]
https://mtcga.gridgain.com/pr.html?serverId=apache=IgniteTests24Java8_RunAll=pull/6434/head=Latest

[8]
https://ci.ignite.apache.org/project.html?tab=testDetails=IgniteTests24Java8=-9037806478172035481=8







Re: IgniteConfigVariationsAbstractTest subclasses do not run

2019-04-26 Thread Ivan Rakov

Ivan,

Seems like IgniteCacheReadThroughEvictionSelfTest is broken. The test 
scenario assumes that even after expiration an entry will still be present in 
the IgniteCache, since it will be loaded from the CacheStore. However, 
no CacheStore is specified in the node config. I've added a patch that 
enables the cache store factory, please check the IGNITE-11708 attachments.
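
(For illustration only - a rough sketch of the kind of configuration the test was 
missing; DummyStore and the cache name are placeholders, the actual change is in the 
patch attached to IGNITE-11708:)

import javax.cache.Cache;
import javax.cache.configuration.FactoryBuilder;
import org.apache.ignite.cache.store.CacheStoreAdapter;
import org.apache.ignite.configuration.CacheConfiguration;

class ReadThroughConfigSketch {
    /** Placeholder store; the real change is in the patch attached to IGNITE-11708. */
    public static class DummyStore extends CacheStoreAdapter<Integer, Integer> {
        @Override public Integer load(Integer key) { return key; }
        @Override public void write(Cache.Entry<? extends Integer, ? extends Integer> e) { /* no-op */ }
        @Override public void delete(Object key) { /* no-op */ }
    }

    static CacheConfiguration<Integer, Integer> cacheConfig() {
        CacheConfiguration<Integer, Integer> ccfg = new CacheConfiguration<>("readThroughCache");

        // Without a store factory, expired entries cannot be re-loaded via read-through.
        ccfg.setReadThrough(true);
        ccfg.setCacheStoreFactory(FactoryBuilder.factoryOf(DummyStore.class));

        return ccfg;
    }
}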
Regarding the IgniteCacheConfigVariationsFullApiTest#testGetOutTx* tests: 
from my point of view, the test scenarios make no sense. We perform a get() 
operation on ATOMIC caches and expect that entries will be locked. I 
don't understand why we should lock entries on an ATOMIC get, therefore I 
propose to remove the part of the code where we listen for and check 
EVT_CACHE_OBJECT_LOCKED/UNLOCKED events.


Best Regards,
Ivan Rakov

On 17.04.2019 22:05, Ivan Rakov wrote:

Hi Ivan,

I've checked your branch. Seems like these tests fail due to a real 
issue in the functionality.

I'll take a look.

Best Regards,
Ivan Rakov

On 17.04.2019 13:54, Ivan Fedotov wrote:

Hi, Igniters!

While working on IEP-30 [1] I discovered that the
IgniteConfigVariationsAbstractTest subclasses - about 15_000 tests [2]
- do not work.
You can check it by running one of the tests with log output, for example
ConfigVariationsTestSuiteBuilderTest#LegacyLifecycleTest#test1 [3].
There is no warning notification in the console. The situation is the same with
the other IgniteConfigVariationsAbstractTest subclasses - the tests run, but they
simply represent empty code.

So, I created a ticket for this issue [4], and it turned out that the problem
is with the ruleChain in IgniteConfigVariationsAbstractTest [5].
The rule that is responsible for running a test statement does not actually
start [6] under the ruleChain runRule. I suggested a solution - move the testsCfg
initialization to the
IgniteConfigVariationsAbstractTest#beforeTestsStarted method. After such
a change the ruleChain is no longer necessary.

But I faced another problem - multiple failures on TeamCity [7]. From the logs,
it seems that the failures are related to what the tests check, not to a JUnit error.
I cannot tell from the TeamCity history whether these tests failed on
the previous JUnit version - the oldest log is dated the start of March,
when JUnit 4 was already in place (for example, this [8] test).
Moreover, there are not that many failing tests, but because they run with
multiple configurations (InterceptorCacheConfigVariationsFullApiTestSuite_0
..._95) it adds up to about 400 failed tests. The TeamCity results also confirm
that the tests do not work in the master branch - the duration is less than
1 ms. Now all tests are green, which is not surprising - under the @Test
annotation, nothing happens.

Could someone confirm or disprove my guess that the tests are red because of
their functionality, not because of the JUnit implementation?
And if that is true, how should I take this fact into account in this ticket?


[1]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-30%3A+Migration+to+JUnit+5 


[2]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testsuites/InterceptorCacheConfigVariationsFullApiTestSuite.java 


[3]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/test/ConfigVariationsTestSuiteBuilderTest.java#L434 


[4] https://issues.apache.org/jira/browse/IGNITE-11708
[5]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/IgniteConfigVariationsAbstractTest.java#L62 


[6]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/GridAbstractTest.java#L181 


[7]
https://mtcga.gridgain.com/pr.html?serverId=apache=IgniteTests24Java8_RunAll=pull/6434/head=Latest 


[8]
https://ci.ignite.apache.org/project.html?tab=testDetails=IgniteTests24Java8=-9037806478172035481=8 





[jira] [Created] (IGNITE-11807) Index validation control.sh command may provide false-positive error results

2019-04-25 Thread Ivan Rakov (JIRA)
Ivan Rakov created IGNITE-11807:
---

 Summary: Index validation control.sh command may provide 
false-positive error results
 Key: IGNITE-11807
 URL: https://issues.apache.org/jira/browse/IGNITE-11807
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
 Fix For: 2.8


There are two possible issues in the validate_indexes command:
1. In case index validation is performed under load, there's a chance that 
we'll fetch a link from the B+ tree and won't find this key in the partition cache data 
store because it was concurrently removed.
We may work around this by double-checking partition update counters (before and 
after the index validation procedure).
2. Since index validation is subscribed to checkpoint start (reason: we 
perform CRC validation of file page store pages, which is sensitive to 
concurrent disk page writes), we may bump into the following situation:
- The user fairly stops all load
- A few moments later the user triggers validate_indexes
- A checkpoint starts due to timeout, and pages that were modified before the 
validate_indexes start are being written to disk
- validate_indexes fails
We may work around this by triggering a checkpoint forcibly before the start of the index 
validation activities.
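
A rough sketch of workaround #1 (all types and the snapshot() facade below are 
hypothetical, not the actual validate_indexes code): take a counter snapshot before 
and after validation and discard findings for partitions that were updated in between.
{code:java}
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class ValidateIndexesSketch {
    /** Hypothetical accessor: partId -> partition update counter for the checked cache group. */
    interface PartCounters {
        Map<Integer, Long> snapshot();
    }

    /**
     * Runs the validation closure and drops conflicts reported for partitions
     * whose update counter changed while validation was running.
     */
    static Set<Integer> reliableConflicts(PartCounters counters, Supplier<Set<Integer>> validation) {
        Map<Integer, Long> before = counters.snapshot();

        Set<Integer> rawConflicts = validation.get();

        Map<Integer, Long> after = counters.snapshot();

        // Keep only conflicts from partitions that were not updated concurrently.
        return rawConflicts.stream()
            .filter(p -> before.getOrDefault(p, -1L).equals(after.getOrDefault(p, -2L)))
            .collect(Collectors.toSet());
    }
}
{code}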



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: IgniteConfigVariationsAbstractTest subclasses do not run

2019-04-17 Thread Ivan Rakov

Hi Ivan,

I've checked your branch. Seems like these tests fail due to a real issue 
in the functionality.

I'll take a look.

Best Regards,
Ivan Rakov

On 17.04.2019 13:54, Ivan Fedotov wrote:

Hi, Igniters!

While working on IEP-30 [1] I discovered that the
IgniteConfigVariationsAbstractTest subclasses - about 15_000 tests [2]
- do not work.
You can check it by running one of the tests with log output, for example
ConfigVariationsTestSuiteBuilderTest#LegacyLifecycleTest#test1 [3].
There is no warning notification in the console. The situation is the same with
the other IgniteConfigVariationsAbstractTest subclasses - the tests run, but they
simply represent empty code.

So, I created a ticket for this issue [4], and it turned out that the problem
is with the ruleChain in IgniteConfigVariationsAbstractTest [5].
The rule that is responsible for running a test statement does not actually
start [6] under the ruleChain runRule. I suggested a solution - move the testsCfg
initialization to the
IgniteConfigVariationsAbstractTest#beforeTestsStarted method. After such
a change the ruleChain is no longer necessary.

But I faced another problem - multiple failures on TeamCity [7]. From the logs,
it seems that the failures are related to what the tests check, not to a JUnit error.
I cannot tell from the TeamCity history whether these tests failed on
the previous JUnit version - the oldest log is dated the start of March,
when JUnit 4 was already in place (for example, this [8] test).
Moreover, there are not that many failing tests, but because they run with
multiple configurations (InterceptorCacheConfigVariationsFullApiTestSuite_0
..._95) it adds up to about 400 failed tests. The TeamCity results also confirm
that the tests do not work in the master branch - the duration is less than
1 ms. Now all tests are green, which is not surprising - under the @Test
annotation, nothing happens.

Could someone confirm or disprove my guess that the tests are red because of
their functionality, not because of the JUnit implementation?
And if that is true, how should I take this fact into account in this ticket?

[1]
https://cwiki.apache.org/confluence/display/IGNITE/IEP-30%3A+Migration+to+JUnit+5
[2]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testsuites/InterceptorCacheConfigVariationsFullApiTestSuite.java
[3]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/test/ConfigVariationsTestSuiteBuilderTest.java#L434
[4] https://issues.apache.org/jira/browse/IGNITE-11708
[5]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/IgniteConfigVariationsAbstractTest.java#L62
[6]
https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/GridAbstractTest.java#L181
[7]
https://mtcga.gridgain.com/pr.html?serverId=apache=IgniteTests24Java8_RunAll=pull/6434/head=Latest
[8]
https://ci.ignite.apache.org/project.html?tab=testDetails=IgniteTests24Java8=-9037806478172035481=8



[jira] [Created] (IGNITE-11769) Investigate JVM crash in PDS Direct IO TeamCity suites

2019-04-17 Thread Ivan Rakov (JIRA)
Ivan Rakov created IGNITE-11769:
---

 Summary: Investigate JVM crash in PDS Direct IO TeamCity suites
 Key: IGNITE-11769
 URL: https://issues.apache.org/jira/browse/IGNITE-11769
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
 Fix For: 2.8


Both PDS Direct IO suites periodically fail with a JVM crash.
The issue can be reproduced on a Linux machine by running 
IgnitePdsWithTtlTest#testTtlIsAppliedAfterRestart using the ignite-direct-io 
classpath.
The investigation is complicated because the JVM crash report *is not generated* 
during this crash. After some point, the JVM stays dormant for 2 minutes and then 
the process gets killed by an OS signal
{code:java}
Process finished with exit code 134 (interrupted by signal 6: SIGABRT)
{code}
and the following error messages can be dumped to stderr before the process death
{code:java}
`corrupted double-linked list`
`free(): corrupted unsorted chunks`
{code}
which appear to be libc error messages. Seems like Ignite corrupts virtual 
memory in a sophisticated way which prevents the normal JVM crash flow.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11762) Test testClientStartCloseServersRestart causes hang of the whole Cache 2 suite in master

2019-04-16 Thread Ivan Rakov (JIRA)
Ivan Rakov created IGNITE-11762:
---

 Summary: Test testClientStartCloseServersRestart causes hang of 
the whole Cache 2 suite in master
 Key: IGNITE-11762
 URL: https://issues.apache.org/jira/browse/IGNITE-11762
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
Assignee: Pavel Kovalenko
 Fix For: 2.8


An attempt to restart a server node in the test hangs:

 
{code:java}
[2019-04-16 19:56:45,049][WARN ][restart-1][GridCachePartitionExchangeManager] 
Failed to wait for initial partition map exchange. Possible reasons are:
^-- Transactions in deadlock.
^-- Long running transactions (ignore if this is the case).
^-- Unreleased explicit locks.
{code}
The reason is that the previous PME (late affinity assignment) still hangs due to 
a pending transaction:
{code:java}
[2019-04-16 19:56:23,717][WARN 
][exchange-worker-#1039%cache.IgniteClientCacheStartFailoverTest3%][diagnostic] 
Pending transactions:
[2019-04-16 19:56:23,718][WARN 
][exchange-worker-#1039%cache.IgniteClientCacheStartFailoverTest3%][diagnostic] 
>>> [txVer=AffinityTopologyVersion [topVer=11, minorTopVer=0], exchWait=true, 
tx=GridDhtTxLocal [nearNodeId=8559bfe0-3d4a-4090-a457-6df0eba5, 
nearFutId=1edc7172a61-941f9dde-2b60-4a1f-8213-7d23d738bf33, nearMiniId=1, 
nearFinFutId=null, nearFinMiniId=0, nearXidVer=GridCacheVersion 
[topVer=166913752, order=1555433759036, nodeOrder=6], lb=null, 
super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false, nearNodes=KeySetView 
[], dhtNodes=KeySetView [9ef33532-0e4a-4561-b57e-042afe10], 
explicitLock=false, super=IgniteTxLocalAdapter [completedBase=null, 
sndTransformedVals=false, depEnabled=false, txState=IgniteTxStateImpl 
[activeCacheIds=[-1062368467], recovery=false, mvccEnabled=true, 
mvccCachingCacheIds=[], txMap=HashSet []], super=IgniteTxAdapter 
[xidVer=GridCacheVersion [topVer=166913752, order=1555433759045, nodeOrder=10], 
writeVer=null, implicit=false, loc=true, threadId=1210, 
startTime=1555433762847, nodeId=0088e9b8-f859-4d14-8071-6388e473, 
startVer=GridCacheVersion [topVer=166913752, order=1555433759045, 
nodeOrder=10], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, 
timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion 
[topVer=166913752, order=1555433759045, nodeOrder=10], finalizing=NONE, 
invalidParts=null, state=MARKED_ROLLBACK, timedOut=false, 
topVer=AffinityTopologyVersion [topVer=11, minorTopVer=0], 
mvccSnapshot=MvccSnapshotResponse [futId=292, crdVer=1555433741506, cntr=395, 
opCntr=1, txs=[394], cleanupVer=390, tracking=0], skipCompletedVers=false, 
parentTx=null, duration=20866ms, onePhaseCommit=false], size=0

{code}
However, the load threads don't start any explicit transactions: they either hang 
on put()/get() or on clientCache.close().

Rolling back IGNITE-10799 resolves the issue (however, the test remains flaky with 
a ~10% fail rate due to an unhandled TransactionSerializationException).

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11747) Document --tx control script commands

2019-04-15 Thread Ivan Rakov (JIRA)
Ivan Rakov created IGNITE-11747:
---

 Summary: Document --tx control script commands
 Key: IGNITE-11747
 URL: https://issues.apache.org/jira/browse/IGNITE-11747
 Project: Ignite
  Issue Type: Task
  Components: documentation
Reporter: Ivan Rakov


Along with the consistency check utilities, the ./control.sh script has a --tx command 
which allows displaying info about active transactions and even killing hanging 
transactions directly.

./control.sh provides just a brief description of the options:
{code:java}
List or kill transactions:
control.sh --tx [--xid XID] [--min-duration SECONDS] [--min-size SIZE] [--label 
PATTERN_REGEX] [--servers|--clients] [--nodes 
consistentId1[,consistentId2,,consistentIdN]] [--limit NUMBER] [--order 
DURATION|SIZE|START_TIME] [--kill] [--info] [--yes]
{code}
We should document possible use cases and options of the command, possibly 
somewhere close to [https://apacheignite-tools.readme.io/docs/control-script]
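
For example, the documentation could include invocations along these lines (flag names 
are taken from the usage string above; these are illustrative examples, not verified output):
{code}
# List the 10 longest-running transactions on server nodes
control.sh --tx --servers --order DURATION --limit 10

# Kill transactions running longer than 300 seconds whose label matches "batch.*"
control.sh --tx --min-duration 300 --label batch.* --kill --yes
{code}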



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11735) Safely handle new closures of IGNITE-11392 in mixed cluster environment

2019-04-12 Thread Ivan Rakov (JIRA)
Ivan Rakov created IGNITE-11735:
---

 Summary: Safely handle new closures of IGNITE-11392 in mixed 
cluster environment
 Key: IGNITE-11735
 URL: https://issues.apache.org/jira/browse/IGNITE-11735
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Rakov
Assignee: Denis Chudov
 Fix For: 2.8


Under IGNITE-11392 we have added two new closures 
(FetchActiveTxOwnerTraceClosure and TxOwnerDumpRequestAllowedSettingClosure).
In case we assemble a mixed cluster (some nodes contain the patch, some 
don't), we may bump into a situation when the closures are sent to a node that doesn't 
contain the corresponding classes in its classpath. Normally, the closure would be deployed 
to the "old" node via peer-to-peer class deployment. However, p2p may be disabled 
in the configuration, which will cause a ClassNotFoundException on the "old" node.
We should register IGNITE-11392 in IgniteFeatures (recent example: 
IGNITE-11598) and filter out nodes that don't support the new feature before 
sending compute.
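
A minimal sketch of the filtering step (supportsNewClosures() and the attribute name are 
placeholders standing in for the internal IgniteFeatures check; this is not the actual fix):
{code:java}
import org.apache.ignite.Ignite;
import org.apache.ignite.cluster.ClusterGroup;
import org.apache.ignite.cluster.ClusterNode;
import org.apache.ignite.lang.IgnitePredicate;
import org.apache.ignite.lang.IgniteRunnable;

public class FeatureAwareComputeSketch {
    /** Placeholder: the real check would consult the node's IgniteFeatures attribute. */
    static boolean supportsNewClosures(ClusterNode node) {
        return node.attribute("org.apache.ignite.features.example") != null;
    }

    static void broadcastIfSupported(Ignite ignite, IgniteRunnable closure) {
        // Restrict the compute projection to nodes that advertise the feature.
        ClusterGroup supported = ignite.cluster().forPredicate(new IgnitePredicate<ClusterNode>() {
            @Override public boolean apply(ClusterNode node) {
                return supportsNewClosures(node);
            }
        });

        // Skip (or log a warning) if no node in the cluster supports the feature yet.
        if (!supported.nodes().isEmpty())
            ignite.compute(supported).broadcast(closure);
    }
}
{code}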



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11591) Add info about lock candidates that are ahead in queue to transaction timeout error message

2019-03-21 Thread Ivan Rakov (JIRA)
Ivan Rakov created IGNITE-11591:
---

 Summary: Add info about lock candidates that are ahead in queue to 
transaction timeout error message
 Key: IGNITE-11591
 URL: https://issues.apache.org/jira/browse/IGNITE-11591
 Project: Ignite
  Issue Type: Improvement
Reporter: Ivan Rakov
 Fix For: 2.8


If a transaction times out due to a lock acquisition failure, the corresponding 
error will show up in the server log on the DHT node:
{code:java}
[2019-03-20 
21:13:10,831][ERROR][grid-timeout-worker-#23%transactions.TxRollbackOnTimeoutTest0%][GridDhtColocatedCache]
  Failed to acquire lock for request: GridNearLockRequest 
[topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], miniId=1, 
dhtVers=GridCacheVersion[] [null], subjId=651a30e1-45ac-4b35-86d2-028d1f81d8dc, 
taskNameHash=0, createTtl=-1, accessTtl=-1, flags=6, txLbl=null, filter=null, 
super=GridDistributedLockRequest [nodeId=651a30e1-45ac-4b35-86d2-028d1f81d8dc, 
nearXidVer=GridCacheVersion [topVer=164585585, order=1553105588524, 
nodeOrder=4], threadId=262, 
futId=5967e4c9961-d32ea2a6-1789-47d7-bdbf-aa66e6d8c35b, timeout=890, 
isInTx=true, isInvalidate=false, isRead=false, isolation=REPEATABLE_READ, 
retVals=[false], txSize=2, flags=0, keysCnt=1, super=GridDistributedBaseMessage 
[ver=GridCacheVersion [topVer=164585585, order=1553105588524, nodeOrder=4], 
committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage 
[cacheId=3556498
class org.apache.ignite.internal.transactions.IgniteTxTimeoutCheckedException: 
Failed to acquire lock within provided timeout for transaction [timeout=890, 
tx=GridDhtTxLocal[xid=f219e4c9961--09cf-6071--0001, 
xidVersion=GridCacheVersion [topVer=164585585, order=1553105588527, 
nodeOrder=1], concurrency=PESSIMISTIC, isolation=REPEATABLE_READ, 
state=MARKED_ROLLBACK, invalidate=false, rollbackOnly=true, 
nodeId=c7dccddb-dee1-4499-94b1-03896350, timeout=890, duration=891]]
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:1766)
at 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter$PostLockClosure1.apply(IgniteTxLocalAdapter.java:1714)
at 
org.apache.ignite.internal.util.future.GridEmbeddedFuture$2.applyx(GridEmbeddedFuture.java:86)
at 
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:292)
at 
org.apache.ignite.internal.util.future.GridEmbeddedFuture$AsyncListener1.apply(GridEmbeddedFuture.java:285)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:511)
at 
org.apache.ignite.internal.processors.cache.GridCacheCompoundIdentityFuture.onDone(GridCacheCompoundIdentityFuture.java:56)
at 
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.onComplete(GridDhtLockFuture.java:793)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture.access$900(GridDhtLockFuture.java:89)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture$LockTimeoutObject.onTimeout(GridDhtLockFuture.java:1189)
at 
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:234)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.j
{code}
It would be much more useful if this message also contained information about 
the transaction that actually owns the corresponding lock (or about all 
transactions that are ahead in the queue, if there are several).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-11484) Get rid of ForkJoinPool#commonPool usage for system critical tasks

2019-03-05 Thread Ivan Rakov (JIRA)
Ivan Rakov created IGNITE-11484:
---

 Summary: Get rid of ForkJoinPool#commonPool usage for system 
critical tasks
 Key: IGNITE-11484
 URL: https://issues.apache.org/jira/browse/IGNITE-11484
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
Assignee: Ivan Rakov
 Fix For: 2.8


We use ForkJoinPool#commonPool for sorting checkpoint pages.
This may backfire if the common pool is already utilized in the current JVM: the checkpoint 
may wait for the sorting for a long time, which in turn will cause a drop in 
user load.
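
For illustration, one general way to keep such sorting off the common pool is to run it 
inside a dedicated ForkJoinPool (a sketch of the idea only, not the actual fix):
{code:java}
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;

class CheckpointSortSketch {
    /** Dedicated pool so checkpoint progress does not depend on the JVM-wide common pool. */
    private static final ForkJoinPool CP_SORT_POOL = new ForkJoinPool(
        Math.max(1, Runtime.getRuntime().availableProcessors() / 4));

    static void sortPages(long[] pageIds) throws Exception {
        // Tasks forked from inside a ForkJoinPool worker stay in that pool,
        // so the sort no longer competes with other users of ForkJoinPool.commonPool().
        CP_SORT_POOL.submit(() -> Arrays.parallelSort(pageIds)).get();
    }
}
{code}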



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Please re-commit 3 last changes in the master

2019-03-04 Thread Ivan Rakov

Thanks for keeping track of it, I've re-applied the following commits:

IGNITE-11199 Add extra logging for client-server connections in TCP 
discovery - Fixes #6048. Andrey Kalinin* 04.03.2019 2:11
IGNITE-11322 [USABILITY] Extend Node FAILED message by add consistentId 
if it exist - Fixes #6180. Andrey Kalinin* 04.03.2019 2:03


Best Regards,
Ivan Rakov

On 04.03.2019 13:56, Dmitriy Pavlov wrote:

Thanks to Alexey Plehanov for noticing and Infra Team for fixing the issue:
https://issues.apache.org/jira/browse/INFRA-17950

On Mon, 4 Mar 2019 at 13:53, Dmitriy Pavlov wrote:


Hi Developers,

Because of the sync issue, the following 3 commits were lost.

Please re-apply it to the master.

https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=b26bbb29d5fdd9d4de5187042778ebe3b8c6c42e


https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=6c562a997c0beb3a3cd9dd2976e016759a808f0c


https://gitbox.apache.org/repos/asf?p=ignite.git;a=commit;h=45c4dc98e0eac33cccd2e24acb3e9882f098cad1


Sorry for the inconvenience.

Sincerely,
Dmitriy Pavlov



[jira] [Created] (IGNITE-11465) Multiple client leave/join events may wipe affinity assignment history and cause transactions fail

2019-03-02 Thread Ivan Rakov (JIRA)
Ivan Rakov created IGNITE-11465:
---

 Summary: Multiple client leave/join events may wipe affinity 
assignment history and cause transactions fail
 Key: IGNITE-11465
 URL: https://issues.apache.org/jira/browse/IGNITE-11465
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
Assignee: Ivan Rakov
 Fix For: 2.8


We keep a history of GridAffinityAssignmentCache#MAX_HIST_SIZE affinity 
assignments; however, a flood of client joins/leaves may wipe it out entirely and 
cause a fail/hang of a transaction that was started before the flood:
{code:java}
if (cache == null || cache.topologyVersion().compareTo(topVer) > 0) {
    throw new IllegalStateException("Getting affinity for topology version earlier than affinity is " +
        "calculated [locNode=" + ctx.discovery().localNode() +
        ", grp=" + cacheOrGrpName +
        ", topVer=" + topVer +
        ", head=" + head.get().topologyVersion() +
        ", history=" + affCache.keySet() +
        ']');
}
{code}
The history is limited in order to prevent JVM heap overflow. At the same time, 
only "server event" affinity assignments are heavy: "client event" assignments 
are just shallow copies of "server event" assignments.
I suggest limiting the history by the number of "server event" assignments.
Also, regarding the provided fix, I don't see any need to keep 500 items in the 
history. I changed the history size to 40.
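
A simplified sketch of the proposed trimming rule (types are placeholders, not the actual 
GridAffinityAssignmentCache code): evict history entries only when the number of heavy 
"server event" assignments exceeds the limit, so a flood of cheap client-event entries 
cannot wipe out the history a pending transaction still needs.
{code:java}
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

class AffinityHistorySketch {
    static final int MAX_SERVER_EVT_HIST_SIZE = 40;

    static class HistEntry {
        final boolean serverEvent; // full assignment vs. shallow copy for a client join/leave

        HistEntry(boolean serverEvent) { this.serverEvent = serverEvent; }
    }

    /** Simplified: major topology version -> assignment history entry. */
    private final TreeMap<Long, HistEntry> hist = new TreeMap<>();

    private int serverEvtCnt;

    void add(long topVer, HistEntry entry) {
        hist.put(topVer, entry);

        if (entry.serverEvent)
            serverEvtCnt++;

        // Trim from the oldest end, but count only server-event entries against the limit.
        Iterator<Map.Entry<Long, HistEntry>> it = hist.entrySet().iterator();

        while (serverEvtCnt > MAX_SERVER_EVT_HIST_SIZE && it.hasNext()) {
            if (it.next().getValue().serverEvent)
                serverEvtCnt--;

            it.remove();
        }
    }
}
{code}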



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

