[jira] [Comment Edited] (IGNITE-12074) setFailureDetectionTimeout causes Critical system error detected in log

2019-08-14 Thread Andrey Kuznetsov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907655#comment-16907655
 ] 

Andrey Kuznetsov edited comment on IGNITE-12074 at 8/14/19 9:56 PM:


By default, {{systemWorkerBlockedTimeout}} is equal to 
{{failureDetectionTimeout}}, so the latter affects blocked system worker 
detection. It looks like you need to increase {{systemWorkerBlockedTimeout}}. 
BTW, it's better to ask such questions on the Ignite Users mailing list before 
creating an issue.
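The suggestion above can be sketched as configuration. A minimal Spring XML fragment, assuming Ignite 2.5+ (where {{systemWorkerBlockedTimeout}} is available on {{IgniteConfiguration}}); the timeout values are illustrative only:

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Lowering failureDetectionTimeout also lowers the default
         systemWorkerBlockedTimeout, which triggers SYSTEM_WORKER_BLOCKED. -->
    <property name="failureDetectionTimeout" value="4000"/>
    <!-- Set blocked-worker detection explicitly so a short network timeout
         does not make a busy system worker look blocked. -->
    <property name="systemWorkerBlockedTimeout" value="30000"/>
</bean>
```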


was (Author: andrey-kuznetsov):
By default, {{systemWorkerBlockedTimeout}} is equal to 
{{failureDetectionTimeout}}, so the latter affects blocked system workers 
detection. It looks like you need to increase {{systemWorkerBlockedTimeout}}. 
BTW, the it's better to ask such a question at Ignite Users mailing list before 
creating an issue.

> setFailureDetectionTimeout causes Critical system error detected in log
> ---
>
> Key: IGNITE-12074
> URL: https://issues.apache.org/jira/browse/IGNITE-12074
> Project: Ignite
>  Issue Type: Bug
>Reporter: chin
>Priority: Major
>
> If I do 
> setFailureDetectionTimeout(4000);
> then the log is filled with this
> {noformat}
> 2019-08-13T15:59:17.792 SEVERE 
> org.apache.ignite.internal.processors.failure.FailureProcessor.process
> DataGrid :: Critical system error detected. Will be handled accordingly to 
> configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler 
> [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext 
> [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker 
> [name=partition-exchanger, 
> igniteInstanceName=GwClientDGConnectionService-1565726338232--1-[abc], 
> finished=false, heartbeatTs=1565726352914]]]
> class org.apache.ignite.IgniteException: GridWorker 
> [name=partition-exchanger, 
> igniteInstanceName=GwClientDGConnectionService-1565726338232--1-[abc], 
> finished=false, heartbeatTs=1565726352914]
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
> at 
> org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
> at 
> org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
> at 
> org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at java.lang.Thread.run(Thread.java:748)
> 2019-08-13T15:59:17.792 WARNING 
> org.apache.ignite.internal.util.IgniteUtils.dumpThreads
> DataGrid :: No deadlocked threads detected.{noformat}
> The nodes are all on my local machine so there's no network latency or 
> anything.
> Despite the logs, there doesn't seem to be any issue with the app running.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (IGNITE-12074) setFailureDetectionTimeout causes Critical system error detected in log

2019-08-14 Thread Andrey Kuznetsov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907655#comment-16907655
 ] 

Andrey Kuznetsov commented on IGNITE-12074:
---

By default, {{systemWorkerBlockedTimeout}} is equal to 
{{failureDetectionTimeout}}, so the latter affects blocked system worker 
detection. It looks like you need to increase {{systemWorkerBlockedTimeout}}. 
BTW, it's better to ask such questions on the Ignite Users mailing list before 
creating an issue.



[jira] [Commented] (IGNITE-12074) setFailureDetectionTimeout causes Critical system error detected in log

2019-08-14 Thread chin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907588#comment-16907588
 ] 

chin commented on IGNITE-12074:
---

My question is:

With the default config, there's no error/warning in the log (as expected).

But just by adding setFailureDetectionTimeout(4000);, I see the mentioned 
error/warning in the log.

So why does that innocent-looking setFailureDetectionTimeout(4000); trigger 
the error/warning? What am I missing here?



[jira] [Commented] (IGNITE-12074) setFailureDetectionTimeout causes Critical system error detected in log

2019-08-14 Thread Andrey Kuznetsov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907584#comment-16907584
 ] 

Andrey Kuznetsov commented on IGNITE-12074:
---

[~chinhodado], the failure handler behavior you observe is described in the 
[documentation|https://apacheignite.readme.io/docs/critical-failures-handling].

Could you please refine the intent of this issue?



[jira] [Commented] (IGNITE-10619) Add support files transmission between nodes over connection via CommunicationSpi

2019-08-14 Thread Maxim Muzafarov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907501#comment-16907501
 ] 

Maxim Muzafarov commented on IGNITE-10619:
--

Javadoc fix has been prepared and the suite is green.

[https://github.com/apache/ignite/pull/6776]

[1] 
[https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Javadoc_IgniteTests24Java8=pull%2F6776%2Fhead=buildTypeStatusDiv]

> Add support files transmission between nodes over connection via 
> CommunicationSpi
> -
>
> Key: IGNITE-10619
> URL: https://issues.apache.org/jira/browse/IGNITE-10619
> Project: Ignite
>  Issue Type: Sub-task
>  Components: persistence
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Major
>  Labels: iep-28
> Fix For: 2.8
>
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> Partition preloader must support cache partition file relocation from one 
> cluster node to another (the zero-copy algorithm [1] is assumed to be used by 
> default). To achieve this, the file transfer machinery must be implemented in 
> Apache Ignite on top of the Communication SPI.
> _CommunicationSpi_
> Ignite's Communication SPI must support:
> * establishing channel connections to a remote node on an arbitrary topic 
> (GridTopic) with a predefined processing policy;
> * listening for incoming channel creation events and registering connection 
> handlers on a particular node;
> * passing an arbitrary set of channel parameters on connection handshake;
> _FileTransmitProcessor_
> The file transmission manager must support:
> * different approaches to handling incoming data – buffered and direct 
> (the zero-copy approach of FileChannel#transferTo);
> * transferring data in chunks of predefined size, saving intermediate 
> results;
> * re-establishing the connection if an error occurs and continuing the file 
> upload/download;
> * limiting connection bandwidth (upload and download) at runtime;
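The chunked zero-copy transfer described above can be sketched with plain NIO. This is an illustrative stand-in, not Ignite's actual FileTransmitProcessor; class and method names here are invented for the example:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class ChunkedTransfer {
    /** Copies src to dst in fixed-size chunks via the zero-copy
     *  FileChannel#transferTo; returns the total bytes transferred. */
    static long transfer(Path src, Path dst, long chunkSize) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                     StandardOpenOption.WRITE)) {
            long pos = 0;
            long size = in.size();
            while (pos < size) {
                // transferTo may move fewer bytes than requested; loop until done.
                // Tracking 'pos' is what allows resuming from an intermediate result.
                pos += in.transferTo(pos, Math.min(chunkSize, size - pos), out);
            }
            return pos;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("chunk-src", ".bin");
        Path dst = Files.createTempFile("chunk-dst", ".bin");
        byte[] data = new byte[10_000];
        Arrays.fill(data, (byte) 7);
        Files.write(src, data);
        long n = transfer(src, dst, 1024);   // transferred in 1 KiB chunks
        System.out.println(n == 10_000 && Arrays.equals(data, Files.readAllBytes(dst)));
    }
}
```

Because the loop records the position after each chunk, an interrupted transfer could in principle be resumed from the last saved offset, which is the "saving intermediate results" requirement above.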





[jira] [Created] (IGNITE-12074) setFailureDetectionTimeout causes Critical system error detected in log

2019-08-14 Thread chin (JIRA)
chin created IGNITE-12074:
-

 Summary: setFailureDetectionTimeout causes Critical system error 
detected in log
 Key: IGNITE-12074
 URL: https://issues.apache.org/jira/browse/IGNITE-12074
 Project: Ignite
  Issue Type: Bug
Reporter: chin


If I do 

setFailureDetectionTimeout(4000);

then the log is filled with this
{noformat}
2019-08-13T15:59:17.792 SEVERE 
org.apache.ignite.internal.processors.failure.FailureProcessor.process
DataGrid :: Critical system error detected. Will be handled accordingly to 
configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler 
[ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext 
[type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker 
[name=partition-exchanger, 
igniteInstanceName=GwClientDGConnectionService-1565726338232--1-[abc], 
finished=false, heartbeatTs=1565726352914]]]
class org.apache.ignite.IgniteException: GridWorker [name=partition-exchanger, 
igniteInstanceName=GwClientDGConnectionService-1565726338232--1-[abc], 
finished=false, heartbeatTs=1565726352914]
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
at 
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
at 
org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
2019-08-13T15:59:17.792 WARNING 
org.apache.ignite.internal.util.IgniteUtils.dumpThreads
DataGrid :: No deadlocked threads detected.{noformat}

The nodes are all on my local machine so there's no network latency or anything.

Despite the logs, there doesn't seem to be any issue with the app running.





[jira] [Created] (IGNITE-12073) The doc should mention IGNITE_UPDATE_NOTIFIER has no effect if you're not the first node that started up

2019-08-14 Thread chin (JIRA)
chin created IGNITE-12073:
-

 Summary: The doc should mention IGNITE_UPDATE_NOTIFIER has no 
effect if you're not the first node that started up
 Key: IGNITE-12073
 URL: https://issues.apache.org/jira/browse/IGNITE-12073
 Project: Ignite
  Issue Type: Improvement
Reporter: chin


It drove me crazy.

I wanted to disable the auto update check.

I found this page:

[https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/IgniteSystemProperties.html#IGNITE_UPDATE_NOTIFIER]

I spent a few hours trying to set the system property in different ways but 
couldn't get the update notification to go away.

Then I found IGNITE-2350.

That info should be mentioned clearly in the docs.





[jira] [Assigned] (IGNITE-11075) Index rebuild procedure over cache partition file

2019-08-14 Thread Sergey Kalashnikov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Kalashnikov reassigned IGNITE-11075:
---

Assignee: Sergey Kalashnikov

> Index rebuild procedure over cache partition file
> -
>
> Key: IGNITE-11075
> URL: https://issues.apache.org/jira/browse/IGNITE-11075
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Sergey Kalashnikov
>Priority: Major
>  Labels: iep-28
>
> The node can own a partition only when the partition data is rebalanced and 
> cache indexes are ready. For the message-based cluster rebalancing approach, 
> indexes are rebuilt simultaneously with cache data loading. For the 
> file-based rebalancing approach, the index rebuild procedure must finish 
> before the partition state is set to OWNING. 
> We need to rebuild local SQL indexes (the {{index.bin}} file) once the 
> partition file has been received. Crash-recovery guarantees must be provided, 
> since the index rebuild is performed while the node is in the topology.





[jira] [Commented] (IGNITE-10808) Discovery message queue may build up with TcpDiscoveryMetricsUpdateMessage

2019-08-14 Thread Dmitriy Govorukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907346#comment-16907346
 ] 

Dmitriy Govorukhin commented on IGNITE-10808:
-

[~sergey-chugunov] Could you please help with the review?

> Discovery message queue may build up with TcpDiscoveryMetricsUpdateMessage
> --
>
> Key: IGNITE-10808
> URL: https://issues.apache.org/jira/browse/IGNITE-10808
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7
>Reporter: Stanislav Lukyanov
>Assignee: Denis Mekhanikov
>Priority: Major
>  Labels: discovery
> Fix For: 2.8
>
> Attachments: IgniteMetricsOverflowTest.java
>
>
> A node receives a new metrics update message every `metricsUpdateFrequency` 
> milliseconds, and the message is put at the top of the queue (because it 
> is a high-priority message).
> If processing one message takes more than `metricsUpdateFrequency`, then 
> multiple `TcpDiscoveryMetricsUpdateMessage`s will be in the queue. A long 
> enough delay (e.g. caused by a network glitch or GC) may lead to the queue 
> accumulating tens of metrics update messages that are essentially useless to 
> process. Finally, if processing a message on average takes a little more 
> than `metricsUpdateFrequency` (even for a relatively short period of time, 
> say, a minute due to network issues), then the message worker will end up 
> processing only the metrics updates and the cluster will essentially hang.
> A reproducer is attached. In the test, the queue first builds up and is then 
> torn down very slowly, causing "Failed to wait for PME" messages.
> We need to change ServerImpl's SocketReader so that it does not put another 
> metrics update message at the top of the queue if one is already there (or 
> so that it replaces the one at the top with the new one).
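The proposed fix amounts to a replace-at-head policy. A minimal sketch with plain collections; the message classes here are hypothetical stand-ins for Ignite's discovery messages, not the real `ServerImpl` queue:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MetricsQueueSketch {
    // Hypothetical stand-ins for Ignite's discovery message types.
    static class DiscoveryMsg { }
    static class MetricsUpdateMsg extends DiscoveryMsg { }

    /** Adds a high-priority metrics update at the head of the queue. If a
     *  metrics update is already at the head, it is replaced rather than
     *  stacked, so the queue can never fill up with stale updates. */
    static void offerMetricsUpdate(Deque<DiscoveryMsg> queue, MetricsUpdateMsg msg) {
        if (queue.peekFirst() instanceof MetricsUpdateMsg)
            queue.pollFirst();   // drop the stale update: its metrics are obsolete
        queue.addFirst(msg);     // the newest update wins
    }

    public static void main(String[] args) {
        Deque<DiscoveryMsg> queue = new ArrayDeque<>();
        queue.addLast(new DiscoveryMsg());   // some ordinary discovery message
        offerMetricsUpdate(queue, new MetricsUpdateMsg());
        offerMetricsUpdate(queue, new MetricsUpdateMsg());
        offerMetricsUpdate(queue, new MetricsUpdateMsg());
        System.out.println(queue.size());    // stays at 2: never builds up
    }
}
```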





[jira] [Updated] (IGNITE-12060) Incorrect row size calculation, lead to tree corruption

2019-08-14 Thread Dmitriy Pavlov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-12060:

Fix Version/s: (was: 2.8)
   2.7.6

> Incorrect row size calculation, lead to tree corruption
> ---
>
> Key: IGNITE-12060
> URL: https://issues.apache.org/jira/browse/IGNITE-12060
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Critical
> Fix For: 2.7.6
>
>
> We do not correctly calculate the old row size and the new row size when 
> checking for an in-place update. One of them may include the cacheId while 
> the other does not; the size depends on whether the cache belongs to a 
> shared group.
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.CacheDataStoreImpl#canUpdateOldRow
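To illustrate the mismatch with made-up numbers (the real row layout lives in Ignite's data pages; the sizes and method names below are hypothetical): a row in a shared cache group carries an extra cacheId that a row in a dedicated group does not, so comparing sizes computed under different assumptions misjudges in-place updatability:

```java
public class RowSizeSketch {
    static final int CACHE_ID_SIZE = 4;   // illustrative, not Ignite's actual layout

    /** Payload size of a row; a row in a shared cache group stores a cacheId. */
    static int rowSize(int keyLen, int valLen, boolean sharedGroup) {
        int size = keyLen + valLen;
        return sharedGroup ? size + CACHE_ID_SIZE : size;
    }

    public static void main(String[] args) {
        // Same key/value payload, but one side counted the cacheId and the
        // other did not: the sizes disagree by exactly CACHE_ID_SIZE bytes.
        int oldSize = rowSize(16, 64, true);
        int newSize = rowSize(16, 64, false);
        // An in-place check like "newSize <= oldSize" then passes for a row
        // that is really too big once written, which can corrupt the tree.
        System.out.println(oldSize - newSize);   // 4: the phantom cacheId bytes
    }
}
```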





[jira] [Commented] (IGNITE-12060) Incorrect row size calculation, lead to tree corruption

2019-08-14 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907301#comment-16907301
 ] 

Dmitriy Pavlov commented on IGNITE-12060:
-

Cherry-picked to 2.7.6 
https://github.com/apache/ignite/commit/610f06e32bd045cfabaf5ae4813783a5616b0889



[jira] [Commented] (IGNITE-10451) .NET: Persistence does not work with custom affinity function

2019-08-14 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907299#comment-16907299
 ] 

Dmitriy Pavlov commented on IGNITE-10451:
-

Cherry-picked to 2.7.6 
https://github.com/apache/ignite/pull/6775/commits/93d0f89cf59fa02b9e3dda2b463835d6608667a4

> .NET: Persistence does not work with custom affinity function
> -
>
> Key: IGNITE-10451
> URL: https://issues.apache.org/jira/browse/IGNITE-10451
> Project: Ignite
>  Issue Type: Bug
>  Components: platforms
>Reporter: Pavel Tupitsyn
>Assignee: Pavel Tupitsyn
>Priority: Major
>  Labels: .NET
> Fix For: 2.8
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> To reproduce: assign custom affinity function in 
> {{PersistenceTest.TestCacheDataSurvivesNodeRestart}}.
> As a result, node restart fails with the following exception:
> {code}
> Apache.Ignite.Core.Common.IgniteException : An error occurred during cache 
> configuration loading from file 
> [file=C:\Users\tps0\AppData\Local\Temp\Ignite_ihxso0zq.tw0\Store\node00-263cfb5e-ec70-4378-8cbb-62b6fcc8043b\cache-persistentCache\cache_data.dat]
>   > Apache.Ignite.Core.Common.JavaException : class 
> org.apache.ignite.IgniteException: An error occurred during cache 
> configuration loading from file 
> [file=C:\Users\tps0\AppData\Local\Temp\Ignite_ihxso0zq.tw0\Store\node00-263cfb5e-ec70-4378-8cbb-62b6fcc8043b\cache-persistentCache\cache_data.dat]
>   at 
> org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1027)
>   at 
> org.apache.ignite.internal.processors.platform.PlatformAbstractBootstrap.start(PlatformAbstractBootstrap.java:48)
>   at 
> org.apache.ignite.internal.processors.platform.PlatformIgnition.start(PlatformIgnition.java:74)
> Caused by: class org.apache.ignite.IgniteCheckedException: An error occurred 
> during cache configuration loading from file 
> [file=C:\Users\tps0\AppData\Local\Temp\Ignite_ihxso0zq.tw0\Store\node00-263cfb5e-ec70-4378-8cbb-62b6fcc8043b\cache-persistentCache\cache_data.dat]
>   at 
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.readCacheData(FilePageStoreManager.java:902)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.readCacheConfigurations(FilePageStoreManager.java:844)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.addCacheOnJoinFromConfig(GridCacheProcessor.java:891)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.restoreCacheConfigurations(GridCacheProcessor.java:756)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.access$1300(GridCacheProcessor.java:204)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor$CacheRecoveryLifecycle.onReadyForRead(GridCacheProcessor.java:5456)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetastorageReadyForRead(GridCacheDatabaseSharedManager.java:412)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:724)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead(GridCacheDatabaseSharedManager.java:4473)
>   at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1047)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2040)
>   at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1732)
>   at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1158)
>   at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:656)
>   at 
> org.apache.ignite.internal.processors.platform.PlatformAbstractBootstrap.start(PlatformAbstractBootstrap.java:43)
>   ... 1 more
> Caused by: class org.apache.ignite.IgniteCheckedException: Failed to 
> deserialize object with given class loader: 
> sun.misc.Launcher$AppClassLoader@18b4aac2
>   at 
> org.apache.ignite.marshaller.jdk.JdkMarshaller.unmarshal0(JdkMarshaller.java:147)
>   at 
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:93)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.readCacheData(FilePageStoreManager.java:898)
>   ... 15 more
> Caused by: java.lang.IllegalArgumentException: Ignite instance name thread 
> local must be set or this method should be accessed under 
> org.apache.ignite.thread.IgniteThread
>   at 
> 

[jira] [Updated] (IGNITE-10451) .NET: Persistence does not work with custom affinity function

2019-08-14 Thread Dmitriy Pavlov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-10451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-10451:

Fix Version/s: (was: 2.8)
   2.7.6


[jira] [Updated] (IGNITE-12071) Test failures after IGNITE-9562 fix in IGFS suite

2019-08-14 Thread Dmitriy Pavlov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-12071:

Summary: Test failures after IGNITE-9562 fix in IGFS suite  (was: Test 
failures after IGNITE-9562 fix)

> Test failures after IGNITE-9562 fix in IGFS suite
> -
>
> Key: IGNITE-12071
> URL: https://issues.apache.org/jira/browse/IGNITE-12071
> Project: Ignite
>  Issue Type: Test
>Reporter: Dmitriy Pavlov
>Assignee: Eduard Shangareev
>Priority: Blocker
> Fix For: 2.7.6
>
>
> https://lists.apache.org/thread.html/50375927a1375189c0aeec7dcaabc43ba83b7acee94524a3483d0c1b@%3Cdev.ignite.apache.org%3E
> Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
> planned for 2.7.6, this is a blocker for the release.
> *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFilePrimary 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-8890685422557348790=%3Cdefault%3E=testDetails
>  *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFileDualExclusion 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=3724804704021179739=%3Cdefault%3E=testDetails
>  Changes that may have led to the failure were made by 
>  - eduard shangareev  
> https://ci.ignite.apache.org/viewModification.html?modId=889258



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (IGNITE-12071) Test failures after IGNITE-9562 fix

2019-08-14 Thread Dmitriy Pavlov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-12071:

Description: 
# 
https://lists.apache.org/thread.html/94424a86283ba720a9ebcff37adc4782d271a07bc6470e148b57a715@%3Cdev.ignite.apache.org%3E

Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
planned for 2.7.6, this is a blocker for the release.


 *New test failure in master-nightly 
DiskPageCompressionConfigValidationTest.testIncorrectStaticCacheConfiguration 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-2692660105095122533=%3Cdefault%3E=testDetails

 *New test failure in master-nightly 
DiskPageCompressionConfigValidationTest.testIncorrectDynamicCacheStartRequest 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=1915110918646717850=%3Cdefault%3E=testDetails
 Changes that may have led to the failure were made by 
 - eduard shangareev  
https://ci.ignite.apache.org/viewModification.html?modId=889258


  was:
https://lists.apache.org/thread.html/94424a86283ba720a9ebcff37adc4782d271a07bc6470e148b57a715@%3Cdev.ignite.apache.org%3E

Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
planned to the 2.7.6 it is a blocker for the release 


 *New test failure in master-nightly 
DiskPageCompressionConfigValidationTest.testIncorrectStaticCacheConfiguration 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-2692660105095122533=%3Cdefault%3E=testDetails

 *New test failure in master-nightly 
DiskPageCompressionConfigValidationTest.testIncorrectDynamicCacheStartRequest 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=1915110918646717850=%3Cdefault%3E=testDetails
 Changes may lead to failure were done by 
 - eduard shangareev  
https://ci.ignite.apache.org/viewModification.html?modId=889258



> Test failures after IGNITE-9562 fix
> ---
>
> Key: IGNITE-12071
> URL: https://issues.apache.org/jira/browse/IGNITE-12071
> Project: Ignite
>  Issue Type: Test
>Reporter: Dmitriy Pavlov
>Assignee: Eduard Shangareev
>Priority: Blocker
> Fix For: 2.7.6
>
>
> # 
> https://lists.apache.org/thread.html/94424a86283ba720a9ebcff37adc4782d271a07bc6470e148b57a715@%3Cdev.ignite.apache.org%3E
> Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
> planned for 2.7.6, this is a blocker for the release.
>  *New test failure in master-nightly 
> DiskPageCompressionConfigValidationTest.testIncorrectStaticCacheConfiguration 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-2692660105095122533=%3Cdefault%3E=testDetails
>  *New test failure in master-nightly 
> DiskPageCompressionConfigValidationTest.testIncorrectDynamicCacheStartRequest 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=1915110918646717850=%3Cdefault%3E=testDetails
>  Changes that may have led to the failure were made by 
>- eduard shangareev  
> https://ci.ignite.apache.org/viewModification.html?modId=889258





[jira] [Updated] (IGNITE-12071) Test failures after IGNITE-9562 fix

2019-08-14 Thread Dmitriy Pavlov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-12071:

Description: 
https://lists.apache.org/thread.html/50375927a1375189c0aeec7dcaabc43ba83b7acee94524a3483d0c1b@%3Cdev.ignite.apache.org%3E

Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
planned for 2.7.6, this is a blocker for the release.

*New test failure in master-nightly 
IgfsCachePerBlockLruEvictionPolicySelfTest.testFilePrimary 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-8890685422557348790=%3Cdefault%3E=testDetails

 *New test failure in master-nightly 
IgfsCachePerBlockLruEvictionPolicySelfTest.testFileDualExclusion 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=3724804704021179739=%3Cdefault%3E=testDetails
 Changes that may have led to the failure were made by 
 - eduard shangareev  
https://ci.ignite.apache.org/viewModification.html?modId=889258


  was:
# 
https://lists.apache.org/thread.html/94424a86283ba720a9ebcff37adc4782d271a07bc6470e148b57a715@%3Cdev.ignite.apache.org%3E

Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
planned to the 2.7.6 it is a blocker for the release 


 *New test failure in master-nightly 
DiskPageCompressionConfigValidationTest.testIncorrectStaticCacheConfiguration 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-2692660105095122533=%3Cdefault%3E=testDetails

 *New test failure in master-nightly 
DiskPageCompressionConfigValidationTest.testIncorrectDynamicCacheStartRequest 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=1915110918646717850=%3Cdefault%3E=testDetails
 Changes may lead to failure were done by 
 - eduard shangareev  
https://ci.ignite.apache.org/viewModification.html?modId=889258



> Test failures after IGNITE-9562 fix
> ---
>
> Key: IGNITE-12071
> URL: https://issues.apache.org/jira/browse/IGNITE-12071
> Project: Ignite
>  Issue Type: Test
>Reporter: Dmitriy Pavlov
>Assignee: Eduard Shangareev
>Priority: Blocker
> Fix For: 2.7.6
>
>
> https://lists.apache.org/thread.html/50375927a1375189c0aeec7dcaabc43ba83b7acee94524a3483d0c1b@%3Cdev.ignite.apache.org%3E
> Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
> planned for 2.7.6, this is a blocker for the release.
> *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFilePrimary 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-8890685422557348790=%3Cdefault%3E=testDetails
>  *New test failure in master-nightly 
> IgfsCachePerBlockLruEvictionPolicySelfTest.testFileDualExclusion 
> https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=3724804704021179739=%3Cdefault%3E=testDetails
>  Changes that may have led to the failure were made by 
>  - eduard shangareev  
> https://ci.ignite.apache.org/viewModification.html?modId=889258





[jira] [Commented] (IGNITE-9562) Destroyed cache that resurrected on an old offline node breaks PME

2019-08-14 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907266#comment-16907266
 ] 

Dmitriy Pavlov commented on IGNITE-9562:


Could you please cherry-pick the commit to the Ignite 2.7.6 branch once all tests are 
fixed in master? 
https://github.com/apache/ignite/commit/27e9f705c1f65baae20b7dc3c03e988217dbe3f6

> Destroyed cache that resurrected on an old offline node breaks PME
> --
>
> Key: IGNITE-9562
> URL: https://issues.apache.org/jira/browse/IGNITE-9562
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.5
>Reporter: Pavel Kovalenko
>Assignee: Eduard Shangareev
>Priority: Critical
> Fix For: 2.8
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Given:
> 2 nodes, persistence enabled.
> 1) Stop 1 node
> 2) Destroy cache through client
> 3) Start stopped node
> When the stopped node joins the cluster, it starts all caches that it had seen 
> before stopping.
> If such a cache was destroyed cluster-wide, it breaks the crash 
> recovery process or PME.
> Root cause - we don't start/collect caches from the stopped node on another 
> part of a cluster.
> In case of PARTITIONED cache mode that scenario breaks crash recovery:
> {noformat}
> java.lang.AssertionError: AffinityTopologyVersion [topVer=-1, minorTopVer=0]
>   at 
> org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:696)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.updateLocal(GridDhtPartitionTopologyImpl.java:2449)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtPartitionTopologyImpl.afterStateRestored(GridDhtPartitionTopologyImpl.java:679)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restorePartitionStates(GridCacheDatabaseSharedManager.java:2445)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyLastUpdates(GridCacheDatabaseSharedManager.java:2321)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreState(GridCacheDatabaseSharedManager.java:1568)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.beforeExchange(GridCacheDatabaseSharedManager.java:1308)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1255)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:766)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2577)
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2457)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
> In case of REPLICATED cache mode that scenario breaks PME coordinator process:
> {noformat}
> [2018-09-12 
> 18:50:36,407][ERROR][sys-#148%distributed.CacheStopAndRessurectOnOldNodeTest0%][GridCacheIoManager]
>  Failed to process message [senderId=4b6fd0d4-b756-4a9f-90ca-f0ee2511, 
> messageType=class 
> o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionsSingleMessage]
> java.lang.AssertionError: 3080586
>   at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.clientTopology(GridCachePartitionExchangeManager.java:815)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.updatePartitionSingleMap(GridDhtPartitionsExchangeFuture.java:3621)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.processSingleMessage(GridDhtPartitionsExchangeFuture.java:2439)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.access$100(GridDhtPartitionsExchangeFuture.java:137)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2261)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$2.apply(GridDhtPartitionsExchangeFuture.java:2249)
>   at 
> 

[jira] [Created] (IGNITE-12072) Starting node with extra cache in cache group causes an assertion error

2019-08-14 Thread Eduard Shangareev (JIRA)
Eduard Shangareev created IGNITE-12072:
--

 Summary: Starting node with extra cache in cache group causes an 
assertion error
 Key: IGNITE-12072
 URL: https://issues.apache.org/jira/browse/IGNITE-12072
 Project: Ignite
  Issue Type: Bug
Reporter: Eduard Shangareev


Reproducer: 
IgniteCacheGroupsWithRestartsTest#testNodeRestartWithNewStaticallyConfiguredCache

{code}
java.lang.AssertionError: AffinityTopologyVersion [topVer=-1, minorTopVer=0]

at 
org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:770)
at 
org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:747)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.updateLocal(GridDhtPartitionTopologyImpl.java:2571)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.afterStateRestored(GridDhtPartitionTopologyImpl.java:714)
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$beforeExchange$38edadb$1(GridCacheDatabaseSharedManager.java:1415)
at 
org.apache.ignite.internal.util.IgniteUtils.lambda$null$1(IgniteUtils.java:11037)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

{code}





[jira] [Updated] (IGNITE-12059) DiskPageCompressionConfigValidationTest.testIncorrectStaticCacheConfiguration fails

2019-08-14 Thread Dmitriy Pavlov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-12059:

Fix Version/s: 2.7.6

> DiskPageCompressionConfigValidationTest.testIncorrectStaticCacheConfiguration 
> fails
> ---
>
> Key: IGNITE-12059
> URL: https://issues.apache.org/jira/browse/IGNITE-12059
> Project: Ignite
>  Issue Type: Bug
>Reporter: Eduard Shangareev
>Assignee: Eduard Shangareev
>Priority: Major
> Fix For: 2.8, 2.7.6
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> DiskPageCompressionConfigValidationTest.testIncorrectStaticCacheConfiguration 
> fails because validation was removed in IGNITE-9562.
> Need to restore this validation.





[jira] [Created] (IGNITE-12071) Test failures after IGNITE-9562 fix

2019-08-14 Thread Dmitriy Pavlov (JIRA)
Dmitriy Pavlov created IGNITE-12071:
---

 Summary: Test failures after IGNITE-9562 fix
 Key: IGNITE-12071
 URL: https://issues.apache.org/jira/browse/IGNITE-12071
 Project: Ignite
  Issue Type: Test
Reporter: Dmitriy Pavlov
Assignee: Eduard Shangareev
 Fix For: 2.7.6


https://lists.apache.org/thread.html/94424a86283ba720a9ebcff37adc4782d271a07bc6470e148b57a715@%3Cdev.ignite.apache.org%3E

Unfortunately, since https://issues.apache.org/jira/browse/IGNITE-9562 is 
planned for 2.7.6, this is a blocker for the release.


 *New test failure in master-nightly 
DiskPageCompressionConfigValidationTest.testIncorrectStaticCacheConfiguration 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-2692660105095122533=%3Cdefault%3E=testDetails

 *New test failure in master-nightly 
DiskPageCompressionConfigValidationTest.testIncorrectDynamicCacheStartRequest 
https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=1915110918646717850=%3Cdefault%3E=testDetails
 Changes that may have led to the failure were made by 
 - eduard shangareev  
https://ci.ignite.apache.org/viewModification.html?modId=889258






[jira] [Created] (IGNITE-12070) Document the new ability to track system/user time of transactions

2019-08-14 Thread Denis Chudov (JIRA)
Denis Chudov created IGNITE-12070:
-

 Summary: Document the new ability to track system/user time of 
transactions
 Key: IGNITE-12070
 URL: https://issues.apache.org/jira/browse/IGNITE-12070
 Project: Ignite
  Issue Type: Task
  Components: documentation
Reporter: Denis Chudov


There is now the ability to track the system/user time of transactions. System 
time is the time spent on system activities, i.e. acquiring locks, preparing, 
committing, etc. User time is the time spent on user activities, when the client 
node runs some code while holding a transaction.

We can log info about transactions that exceed some threshold execution 
timeout, or about some percentage of all transactions. A log record for a 
long-running transaction looks like the following:
{code:java}
[2019-08-09 13:39:49,130][WARN ][sys-stripe-1-#101%client%][root] Long 
transaction time dump [startTime=13:39:47.970, totalTime=1160, systemTime=157, 
userTime=1003, cacheOperationsTime=141, prepareTime=15, commitTime=0, 
tx=GridNearTxLocal [...]]
{code}
In the case of sampling all transactions:


{code:java}
[2019-08-09 13:39:54,079][INFO ][sys-stripe-2-#102%client%][root] Transaction 
time dump [startTime=13:39:54.063, totalTime=15, systemTime=6, userTime=9, 
cacheOperationsTime=2, prepareTime=3, commitTime=0, tx=GridNearTxLocal [...]]
{code}
Also, some transactions can be skipped so as not to overflow the log; the 
information about this log throttling looks like this:
{code:java}
[2019-08-09 13:39:55,109][INFO ][sys-stripe-0-#100%client%][root] Transaction 
time dumps skipped because of log throttling: 2
{code}
There are JMX parameters and JVM options to control this behavior:
1)
JVM option: IGNITE_LONG_TRANSACTION_TIME_DUMP_THRESHOLD
JMX parameter: TransactionsMXBean.longTransactionTimeDumpThreshold
Threshold timeout in milliseconds for long transactions: if a transaction 
exceeds it, it is dumped to the log with information about how much time it 
spent in system time and user time. Default value is 0. No info about the 
system/user time of long transactions is dumped to the log if this parameter is 
not set.
2) 
JVM option: IGNITE_TRANSACTION_TIME_DUMP_SAMPLES_COEFFICIENT
JMX parameter: TransactionsMXBean.transactionTimeDumpSamplesCoefficient
The coefficient for samples of completed transactions that will be dumped to 
the log. Must be a float value between 0.0 and 1.0 inclusive. Default value is 
0.0.
3) 
JVM option: IGNITE_TRANSACTION_TIME_DUMP_SAMPLES_PER_SECOND_LIMIT
JMX parameter: TransactionsMXBean.transactionTimeDumpSamplesPerSecondLimit
The limit on samples of completed transactions that will be dumped to the log 
per second, if IGNITE_TRANSACTION_TIME_DUMP_SAMPLES_COEFFICIENT is above 0.0. 
Must be an integer value greater than 0. Default value is 5.
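
Putting the three options together, enabling these dumps at node startup might 
look like the following minimal sketch (the launcher path and the chosen values 
are illustrative assumptions; the IGNITE_* names above are assumed to be passed 
as JVM system properties via -D, as is usual for Ignite system properties):

```shell
# Illustrative values: dump any transaction longer than 500 ms and
# additionally sample 10% of all completed transactions, capped at
# 5 dumped records per second.
JVM_OPTS="$JVM_OPTS -DIGNITE_LONG_TRANSACTION_TIME_DUMP_THRESHOLD=500"
JVM_OPTS="$JVM_OPTS -DIGNITE_TRANSACTION_TIME_DUMP_SAMPLES_COEFFICIENT=0.1"
JVM_OPTS="$JVM_OPTS -DIGNITE_TRANSACTION_TIME_DUMP_SAMPLES_PER_SECOND_LIMIT=5"
export JVM_OPTS

# Standard Ignite launcher; path and config file are illustrative.
./bin/ignite.sh config/default-config.xml
```

The same settings can also be changed at runtime through the TransactionsMXBean 
attributes listed above.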

Information about the current system and user time of a transaction was added 
to the existing long-running transaction warning:
{code:java}
[2019-08-09 14:10:31,835][WARN ][grid-timeout-worker-#122%client%][root] First 
10 long running transactions [total=1]
[2019-08-09 14:10:31,835][WARN ][grid-timeout-worker-#122%client%][root] >>> 
Transaction [startTime=14:10:31.170, curTime=14:10:31.750, systemTime=32, 
userTime=548, tx=GridNearTxLocal [...]]
{code}

The following metrics were also added to monitor system and user time for a 
single node:
diagnostic.transactions.totalNodeSystemTime - Total transactions system time on 
node.
diagnostic.transactions.totalNodeUserTime - Total transactions user time on 
node.
diagnostic.transactions.nodeSystemTimeHistogram - Transactions system times on 
node represented as histogram.
diagnostic.transactions.nodeUserTimeHistogram - Transactions user times on node 
represented as histogram.





[jira] [Commented] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Ivan Pavlukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907230#comment-16907230
 ] 

Ivan Pavlukhin commented on IGNITE-12069:
-

[~xtern], thank you for the clarification.

> Create cache shared preloader
> -
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: iep-28
>
> {{CacheSharedPreloader}} must do the following:
>  # build the map of partitions and corresponding supplier nodes from which 
> partitions will be loaded [1];
>  # switch cache data storage to {{no-op}} and back to original (HWM must be 
> fixed here for the needs of historical rebalance) under the checkpoint and 
> keep the partition update counter for each partition [1];
>  # asynchronously run index eviction for the list of collected partitions 
> (API must be provided by IGNITE-11075) [2];
>  # send a request message to each node, one by one, with the list of 
> partitions to load [2];
>  # wait for files to be received (listening on the transmission handler) [2];
>  # asynchronously run index rebuild over the received partitions (API must be 
> provided by IGNITE-11075) [2];
>  # run historical rebalance from LWM to HWM collected above (LWM can be read 
> from the received file meta page) [1];
> The points marked with the label {{[1]}} must be done prior to {{[2]}}.
>  
> NOTE. The following things need to be checked:
>  # Rebalancing of MVCC cache groups;
>  # How LWM and HWM will be set for the historical rebalance;





[jira] [Updated] (IGNITE-11074) Implement catch-up temporary WAL

2019-08-14 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-11074:
-
Description: 
This is an addition to the process of rebalancing caches via partition files, 
for the cases where historical rebalance is used by design (see the IEP-28 
Confluence page).

While the demander node is in the partition file transmission state, it must 
save all cache entries corresponding to the moving partition into a new 
temporary WAL storage. These entries will be applied later, one by one, to the 
received cache partition file. All asynchronous operations will be appended to 
the end of the temporary WAL storage during storage reads until it becomes 
fully read. A file-based FIFO approach is assumed to be used by this process.

The new write-ahead-log manager for writing temporary records must support:
 * Unlimited number of WAL-files to store temporary data records;
 * Iterating over stored data records while an asynchronous writer thread 
inserts new records;
 * WAL-per-partition approach needs to be used;
 * Write operations to temporary WAL storage must have higher priority over 
reading operations;

  was:
While the demander node is in the partition file transmission state it must 
save all cache entries corresponding to the moving partition into a new 
temporary WAL storage. These entries will be applied later one by one on the 
received cache partition file. All asynchronous operations will be enrolled to 
the end of temporary WAL storage during storage reads until it becomes fully 
read. The file-based FIFO approach assumes to be used by this process.

The new write-ahead-log manager for writing temporary records must support to:
 * Unlimited number of wal-files to store temporary data records;
 * Iterating over stored data records during an asynchronous writer thread 
inserts new records;
 * WAL-per-partiton approach needs to be used;
 * Write operations to temporary WAL storage must have higher priority over 
reading operations;


> Implement catch-up temporary WAL
> 
>
> Key: IGNITE-11074
> URL: https://issues.apache.org/jira/browse/IGNITE-11074
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Maxim Muzafarov
>Priority: Major
>  Labels: iep-28
>
> This is an addition to the process of rebalancing caches via partition files, 
> for the cases where historical rebalance is used by design (see the IEP-28 
> Confluence page).
> While the demander node is in the partition file transmission state, it must 
> save all cache entries corresponding to the moving partition into a new 
> temporary WAL storage. These entries will be applied later, one by one, to the 
> received cache partition file. All asynchronous operations will be appended 
> to the end of the temporary WAL storage during storage reads until it becomes 
> fully read. A file-based FIFO approach is assumed to be used by this process.
> The new write-ahead-log manager for writing temporary records must support:
>  * Unlimited number of WAL-files to store temporary data records;
>  * Iterating over stored data records while an asynchronous writer thread 
> inserts new records;
>  * WAL-per-partition approach needs to be used;
>  * Write operations to temporary WAL storage must have higher priority over 
> reading operations;





[jira] [Updated] (IGNITE-11074) Implement catch-up temporary WAL

2019-08-14 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-11074:
-
Issue Type: Improvement  (was: Sub-task)
Parent: (was: IGNITE-8020)

> Implement catch-up temporary WAL
> 
>
> Key: IGNITE-11074
> URL: https://issues.apache.org/jira/browse/IGNITE-11074
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Maxim Muzafarov
>Priority: Major
>  Labels: iep-28
>
> While the demander node is in the partition file transmission state, it must 
> save all cache entries corresponding to the moving partition into a new 
> temporary WAL storage. These entries will be applied later, one by one, to the 
> received cache partition file. All asynchronous operations will be appended 
> to the end of the temporary WAL storage during storage reads until it becomes 
> fully read. A file-based FIFO approach is assumed to be used by this process.
> The new write-ahead-log manager for writing temporary records must support:
>  * Unlimited number of WAL-files to store temporary data records;
>  * Iterating over stored data records while an asynchronous writer thread 
> inserts new records;
>  * A WAL-per-partition approach needs to be used;
>  * Write operations to temporary WAL storage must have higher priority over 
> reading operations;





[jira] [Commented] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Pavel Pereslegin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907220#comment-16907220
 ] 

Pavel Pereslegin commented on IGNITE-12069:
---

Hello, [~Pavlukhin].

Firstly, the naming is not yet final and may be inaccurate.

"Shared" is because, currently, one instance of preloader manages the 
rebalancing of a single cache group.
And due to some time limitations (partition snapshot creation on supplier 
should be done on checkpoint), the "p2p" preloader must manage the rebalancing 
process of all participating cache groups.

_> What entities do share it?_
All cache groups "share" a single preloader.

> Create cache shared preloader
> -
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: iep-28
>
> {{CacheSharedPreloader}} must do the following:
>  # build the map of partitions and corresponding supplier nodes from which 
> partitions will be loaded [1];
>  # switch cache data storage to {{no-op}} and back to original (HWM must be 
> fixed here for the needs of historical rebalance) under the checkpoint and 
> keep the partition update counter for each partition [1];
>  # asynchronously run index eviction for the list of collected partitions 
> (API must be provided by IGNITE-11075) [2];
>  # send a request message to each node, one by one, with the list of 
> partitions to load [2];
>  # wait for files to be received (listening on the transmission handler) [2];
>  # asynchronously run index rebuild over the received partitions (API must be 
> provided by IGNITE-11075) [2];
>  # run historical rebalance from LWM to HWM collected above (LWM can be read 
> from the received file meta page) [1];
> The points marked with the label {{[1]}} must be done prior to {{[2]}}.
>  
> NOTE. The following things need to be checked:
>  # Rebalancing of MVCC cache groups;
>  # How LWM and HWM will be set for the historical rebalance;





[jira] [Updated] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-12069:
-
Description: 
{{CacheSharedPreloader}} must do the following:
 # build the map of partitions and corresponding supplier nodes from which 
partitions will be loaded [1];
 # switch cache data storage to {{no-op}} and back to original (HWM must be 
fixed here for the needs of historical rebalance) under the checkpoint and keep 
the partition update counter for each partition [1];
 # asynchronously run index eviction for the list of collected partitions (API 
must be provided by IGNITE-11075) [2];
 # send a request message to each node, one by one, with the list of partitions 
to load [2];
 # wait for files to be received (listening on the transmission handler) [2];
 # asynchronously run index rebuild over the received partitions (API must be 
provided by IGNITE-11075) [2];
 # run historical rebalance from LWM to HWM collected above (LWM can be read 
from the received file meta page) [1];

The points marked with the label {{[1]}} must be done prior to {{[2]}}.

 

NOTE. The following things need to be checked:
 # Rebalancing of MVCC cache groups;
 # How LWM and HWM will be set for the historical rebalance;

  was:
{{CacheSharedPreloader}} must do the following:
 # build the map of partitions and corresponding supplier nodes from which 
partitions will be loaded [1];
 # switch cache data storage to {{no-op}} and back to original (HWM must be 
fixed here for the needs of historical rebalance) under the checkpoint and keep 
the partition update counter for each partition [1];
 # run async the eviction indexes for the list of collected partitions (API 
must be provided by IGNITE-11075) [2];
 # send a request message to each node one by one with the list of partitions 
to load [2];
 # wait for files received (listening for the transmission handler) [2];
 # run rebuild indexes async over the receiving partitions (API must be 
provided by IGNITE-11075) [2];
 # run historical rebalance from LWM to HWM collected above (LWM can be read 
from the received file meta page) [1];

The points marked with the label {{[1]}} must be done prior to {{[2]}}.

 

NOTE. Check the following things:
 # Rebalancing of MVCC cache groups;
 # How LWM and HWM will be set for the historical rebalance;


> Create cache shared preloader
> -
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: iep-28
>
> {{CacheSharedPreloader}} must do the following:
>  # build the map of partitions and corresponding supplier nodes from which 
> partitions will be loaded [1];
>  # switch cache data storage to {{no-op}} and back to original (HWM must be 
> fixed here for the needs of historical rebalance) under the checkpoint and 
> keep the partition update counter for each partition [1];
>  # asynchronously run index eviction for the list of collected partitions 
> (API must be provided by IGNITE-11075) [2];
>  # send a request message to each node, one by one, with the list of 
> partitions to load [2];
>  # wait for files to be received (listening on the transmission handler) [2];
>  # asynchronously run index rebuild over the received partitions (API must be 
> provided by IGNITE-11075) [2];
>  # run historical rebalance from LWM to HWM collected above (LWM can be read 
> from the received file meta page) [1];
> The points marked with the label {{[1]}} must be done prior to {{[2]}}.
>  
> NOTE. The following things need to be checked:
>  # Rebalancing of MVCC cache groups;
>  # How LWM and HWM will be set for the historical rebalance;



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-12069:
-
Description: 
{{CacheSharedPreloader}} must do the following:
 # build the map of partitions and the corresponding supplier nodes from which 
partitions will be loaded [1];
 # switch the cache data storage to {{no-op}} and back to the original (the HWM 
must be fixed here for the needs of historical rebalance) under the checkpoint, 
and keep the partition update counter for each partition [1];
 # asynchronously run index eviction for the list of collected partitions (the 
API must be provided by IGNITE-11075) [2];
 # send a request message to each node, one by one, with the list of partitions 
to load [2];
 # wait for the files to be received (listening via the transmission handler) [2];
 # asynchronously rebuild indexes over the received partitions (the API must be 
provided by IGNITE-11075) [2];
 # run historical rebalance from the LWM to the HWM collected above (the LWM can 
be read from the received file's meta page) [1];

The points marked with the label {{[1]}} must be done prior to {{[2]}}.

 

NOTE. Check the following things:
 # Rebalancing of MVCC cache groups;
 # How LWM and HWM will be set for the historical rebalance;

  was:
{{CacheSharedPreloader}} must do the following:
 # build the map of partitions and corresponding supplier nodes from which 
partitions will be loaded [1];
 # switching cache data storage to {{no-op}} and back to original (HWM must be 
fixed here for the needs of historical rebalance) under the checkpoint and keep 
the partition update counter for each partition [1];
 # run async the eviction indexes for the list of collected partitions (API 
must be provided by IGNITE-11075) [2];
 # send a request message to each node one by one with the list of partitions 
to load [2];
 # listening for the transmission handler to receive files [2];
 # run rebuild indexes async over the receiving partitions (API must be 
provided by IGNITE-11075) [2];
 # run historical rebalance from LWM to HWM collected above (LWM can be read 
from the received file meta page) [1];

The points marked with the label {{[1]}} must be done prior to {{[2]}}.

 

NOTE. Check the following things:
 # Rebalancing of MVCC cache groups;
 # How LWM and HWM will be set for the historical rebalance;


> Create cache shared preloader
> -
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: iep-28
>
> {{CacheSharedPreloader}} must do the following:
>  # build the map of partitions and corresponding supplier nodes from which 
> partitions will be loaded [1];
>  # switch cache data storage to {{no-op}} and back to original (HWM must be 
> fixed here for the needs of historical rebalance) under the checkpoint and 
> keep the partition update counter for each partition [1];
>  # run async the eviction indexes for the list of collected partitions (API 
> must be provided by IGNITE-11075) [2];
>  # send a request message to each node one by one with the list of partitions 
> to load [2];
>  # wait for files received (listening for the transmission handler) [2];
>  # run rebuild indexes async over the receiving partitions (API must be 
> provided by IGNITE-11075) [2];
>  # run historical rebalance from LWM to HWM collected above (LWM can be read 
> from the received file meta page) [1];
> The points marked with the label {{[1]}} must be done prior to {{[2]}}.
>  
> NOTE. Check the following things:
>  # Rebalancing of MVCC cache groups;
>  # How LWM and HWM will be set for the historical rebalance;



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (IGNITE-11953) BTree corruption caused by byte array values

2019-08-14 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907198#comment-16907198
 ] 

Dmitriy Pavlov commented on IGNITE-11953:
-

[~DmitriyGovorukhin], could you please refer to the commit made for this ticket?

I've tried to search using its name IGNITE-11953 but can't see anything: 
https://github.com/apache/ignite/search?q=IGNITE-11953&unscoped_q=IGNITE-11953

> BTree corruption caused by byte array values
> 
>
> Key: IGNITE-11953
> URL: https://issues.apache.org/jira/browse/IGNITE-11953
> Project: Ignite
>  Issue Type: Bug
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.7.6
>
>
> In some cases for caches with cache group, we can get BTree corruption 
> exception.
> {code}
> 09:53:58,890][SEVERE][sys-stripe-10-#11][] Critical system error detected. 
> Will be handled accordingly to configured handler [hnd=CustomFailureHandler 
> [ignoreCriticalErrors=false, disabled=false][StopNodeOrHaltFailureHandler 
> [tryStop=false, timeout=0]], failureCtx=FailureContext [type=CRITICAL_ERROR, 
> err=class o.a.i.i.transactions.IgniteTxHeuristicCheckedException: Committing 
> a transaction has produced runtime exception]]class 
> org.apache.ignite.internal.transactions.IgniteTxHeuristicCheckedException: 
> Committing a transaction has produced runtime exception
>   at 
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.heuristicException(IgniteTxAdapter.java:800)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:922)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:799)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:608)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:478)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:535)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finishDhtLocal(IgniteTxHandler.java:1055)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:931)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processNearTxFinishRequest(IgniteTxHandler.java:887)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$200(IgniteTxHandler.java:117)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:209)
>   at 
> org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$3.apply(IgniteTxHandler.java:207)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1129)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:594)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:393)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:319)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:109)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1568)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1196)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1092)
>   at 
> org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:504)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: class 
> org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException:
>  Runtime failure on search row: SearchRow [key=KeyCacheObjectImpl [part=427, 
> val=Grkg1DUF3yQE6tC9Se50mi5w.T, hasValBytes=true], hash=1872857770, 
> cacheId=-420893003]
>   at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1811)
>   at 
> 

[jira] [Commented] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Ivan Pavlukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907196#comment-16907196
 ] 

Ivan Pavlukhin commented on IGNITE-12069:
-

[~Mmuzaf], [~xtern], just for my understanding, could you please elaborate on 
why it is a _shared_ preloader? Which entities share it?

> Create cache shared preloader
> -
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: iep-28
>
> {{CacheSharedPreloader}} must do the following:
>  # build the map of partitions and corresponding supplier nodes from which 
> partitions will be loaded [1];
>  # switching cache data storage to {{no-op}} and back to original (HWM must be 
> fixed here for the needs of historical rebalance) under the checkpoint and 
> keep the partition update counter for each partition [1];
>  # run async the eviction indexes for the list of collected partitions (API 
> must be provided by IGNITE-11075) [2];
>  # send a request message to each node one by one with the list of partitions 
> to load [2];
>  # listening for the transmission handler to receive files [2];
>  # run rebuild indexes async over the receiving partitions (API must be 
> provided by IGNITE-11075) [2];
>  # run historical rebalance from LWM to HWM collected above (LWM can be read 
> from the received file meta page) [1];
> The points marked with the label {{[1]}} must be done prior to {{[2]}}.
>  
> NOTE. Check the following things:
>  # Rebalancing of MVCC cache groups;
>  # How LWM and HWM will be set for the historical rebalance;



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (IGNITE-11767) GridDhtPartitionsFullMessage retains huge maps on heap in exchange history

2019-08-14 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907194#comment-16907194
 ] 

Dmitriy Pavlov commented on IGNITE-11767:
-

[~ilyak], could you please cherry-pick this commit 
https://github.com/apache/ignite/commit/478277e5e3fe1a535ea905f8beab42926453825a

to the 2.7.6 branch and assign fixVersion=2.7.6 once the commit is there? 
Unfortunately, the change can't be merged automatically.

> GridDhtPartitionsFullMessage retains huge maps on heap in exchange history
> --
>
> Key: IGNITE-11767
> URL: https://issues.apache.org/jira/browse/IGNITE-11767
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.7
>Reporter: Ilya Kasnacheev
>Assignee: Ilya Kasnacheev
>Priority: Blocker
> Fix For: 2.8
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> ExchangeHistory keeps a FinishState for every topology version.
> FinishState contains msg, which contains at least two huge maps:
> partCntrs2 and partsSizesBytes.
> We should probably strip msg, removing those two data structures, before 
> putting it into the exchFuts linked list to be stowed away.
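A hedged sketch of the proposed stripping, under the assumption that only the two heavy maps need to be dropped before the message is retained in history; the class and field names below are simplified stand-ins, not the real GridDhtPartitionsFullMessage API:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of "stripping" a full message before caching it in exchange
 *  history: the heavy per-partition maps are dropped, the rest is kept. */
public class FullMessageStripper {
    public static class FullMessage {
        Map<Integer, Long> partCntrs = new HashMap<>();       // huge in practice
        Map<Integer, Long> partsSizesBytes = new HashMap<>(); // huge in practice
        long topVer;                                          // lightweight field
    }

    /** Returns a copy that is safe to retain on heap: heavy maps removed. */
    public static FullMessage strip(FullMessage msg) {
        FullMessage copy = new FullMessage();
        copy.topVer = msg.topVer; // keep only the lightweight fields
        return copy;
    }
}
```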



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (IGNITE-11074) Implement catch-up temporary WAL

2019-08-14 Thread Pavel Pereslegin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin reassigned IGNITE-11074:
-

Assignee: (was: Pavel Pereslegin)

> Implement catch-up temporary WAL
> 
>
> Key: IGNITE-11074
> URL: https://issues.apache.org/jira/browse/IGNITE-11074
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Priority: Major
>  Labels: iep-28
>
> While the demander node is in the partition file transmission state, it must 
> save all cache entries corresponding to the moving partition into a new 
> temporary WAL storage. These entries will later be applied one by one to the 
> received cache partition file. All asynchronous operations will be appended 
> to the end of the temporary WAL storage while it is being read, until it has 
> been fully read. A file-based FIFO approach is assumed for this process.
> The new write-ahead-log manager for writing temporary records must support:
>  * an unlimited number of WAL files to store temporary data records;
>  * iterating over stored data records while an asynchronous writer thread 
> inserts new records;
>  * a WAL-per-partition approach;
>  * write operations to the temporary WAL storage having priority over read 
> operations;
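The per-partition FIFO idea can be sketched with a small in-memory model (the real storage is file-based, and all names here are hypothetical):

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Illustrative in-memory model of a per-partition FIFO catch-up log.
 *  Entries for a moving partition are appended while its file is in transit
 *  and drained in order to replay them onto the received partition file. */
public class CatchUpLog {
    private final Map<Integer, Queue<String>> perPartition = new HashMap<>();

    /** Append an entry for a moving partition (one queue per partition). */
    public void append(int partId, String entry) {
        perPartition.computeIfAbsent(partId, p -> new ArrayDeque<>()).add(entry);
    }

    /** Drain entries in FIFO order; null when the partition's log is empty. */
    public String poll(int partId) {
        Queue<String> q = perPartition.get(partId);
        return q == null ? null : q.poll();
    }
}
```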



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (IGNITE-11339) FilePageStore: keep CRC invariant on write retry

2019-08-14 Thread Dmitry Lazurkin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Lazurkin resolved IGNITE-11339.
--
   Resolution: Fixed
Fix Version/s: 2.7.5

Indirect fix in commit 469464fc80caf286b0498ed981a5bff449c2ef13

> FilePageStore: keep CRC invariant on write retry
> 
>
> Key: IGNITE-11339
> URL: https://issues.apache.org/jira/browse/IGNITE-11339
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Affects Versions: 2.7
>Reporter: Dmitry Lazurkin
>Priority: Major
> Fix For: 2.7.5
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> _FilePageStore#write_ doesn't keep CRC invariant on write retry if 
> _calculateCrc_ is false.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (IGNITE-12061) Silently fail while try to recreate already existing index with differ inline_size.

2019-08-14 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907126#comment-16907126
 ] 

Ignite TC Bot commented on IGNITE-12061:


{panel:title=Branch: [pull/6770/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4495665&buildTypeId=IgniteTests24Java8_RunAll]

> Silently fail while try to recreate already existing index with differ 
> inline_size.
> ---
>
> Key: IGNITE-12061
> URL: https://issues.apache.org/jira/browse/IGNITE-12061
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.5, 2.7, 2.7.5
>Reporter: Stanilovsky Evgeny
>Assignee: Stanilovsky Evgeny
>Priority: Major
> Fix For: 2.7.6
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> An INLINE_SIZE differing from the previous value is not set correctly:
> 1. create index idx0(c1, c2)
> 2. drop idx0
> 3. create index idx0(c1, c2) inline_size 100;
> The inline_size remains the same, in this case the default of 10.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (IGNITE-11073) Take consistent cache partitions snapshot

2019-08-14 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-11073:
-
Due Date: 16/Sep/19

> Take consistent cache partitions snapshot
> -
>
> Key: IGNITE-11073
> URL: https://issues.apache.org/jira/browse/IGNITE-11073
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Major
>  Labels: iep-28
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Checkpointer*
> When the supplier node receives the cache partition file demand request, it 
> will send the file over the CommunicationSpi. The cache partition file can be 
> concurrently updated by the checkpoint thread during its transmission. To 
> guarantee file consistency, the checkpointer must use the copy-on-write [3] 
> technique and save a copy of each updated chunk into a temporary file.
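The copy-on-write idea can be sketched as follows: while a partition file is being transmitted, the first checkpoint write to a chunk saves the original bytes so the receiver sees a consistent snapshot. All names are illustrative, not the actual Ignite API:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of copy-on-write for a file in transit. */
public class CowFileCopy {
    private final Map<Integer, String> chunks = new HashMap<>(); // live file content
    private final Map<Integer, String> saved = new HashMap<>();  // temp copy-on-write file
    private boolean transferring;

    public void beginTransfer() { transferring = true; }

    /** A checkpoint write; preserves the pre-transfer copy on first update. */
    public void write(int chunkId, String data) {
        if (transferring && chunks.containsKey(chunkId))
            saved.putIfAbsent(chunkId, chunks.get(chunkId));
        chunks.put(chunkId, data);
    }

    /** The chunk as the receiver should see it: the saved copy wins. */
    public String transmittedView(int chunkId) {
        return saved.getOrDefault(chunkId, chunks.get(chunkId));
    }
}
```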



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Maxim Muzafarov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907125#comment-16907125
 ] 

Maxim Muzafarov commented on IGNITE-12069:
--

[~xtern],

As discussed privately, I've created this issue for the problem we discussed 
and assigned it to you.

> Create cache shared preloader
> -
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: iep-28
>
> {{CacheSharedPreloader}} must do the following:
>  # build the map of partitions and corresponding supplier nodes from which 
> partitions will be loaded [1];
>  # switching cache data storage to {{no-op}} and back to original (HWM must be 
> fixed here for the needs of historical rebalance) under the checkpoint and 
> keep the partition update counter for each partition [1];
>  # run async the eviction indexes for the list of collected partitions (API 
> must be provided by IGNITE-11075) [2];
>  # send a request message to each node one by one with the list of partitions 
> to load [2];
>  # listening for the transmission handler to receive files [2];
>  # run rebuild indexes async over the receiving partitions (API must be 
> provided by IGNITE-11075) [2];
>  # run historical rebalance from LWM to HWM collected above (LWM can be read 
> from the received file meta page) [1];
> The points marked with the label {{[1]}} must be done prior to {{[2]}}.
>  
> NOTE. Check the following things:
>  # Rebalancing of MVCC cache groups;
>  # How LWM and HWM will be set for the historical rebalance;



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-12069:
-
Description: 
{{CacheSharedPreloader}} must do the following:
 # build the map of partitions and corresponding supplier nodes from which 
partitions will be loaded [1];
 # switching cache data storage to {{no-op}} and back to original (HWM must be 
fixed here for the needs of historical rebalance) under the checkpoint and keep 
the partition update counter for each partition [1];
 # run async the eviction indexes for the list of collected partitions (API 
must be provided by IGNITE-11075) [2];
 # send a request message to each node one by one with the list of partitions 
to load [2];
 # listening for the transmission handler to receive files [2];
 # run rebuild indexes async over the receiving partitions (API must be 
provided by IGNITE-11075) [2];
 # run historical rebalance from LWM to HWM collected above (LWM can be read 
from the received file meta page) [1];

The points marked with the label {{[1]}} must be done prior to {{[2]}}.

 

NOTE. Check the following things:
 # Rebalancing of MVCC cache groups;
 # How LWM and HWM will be set for the historical rebalance;

  was:
{{CacheSharedPreloader}} must do the following:
 # [1] build the map of partitions and corresponding supplier nodes from which 
partitions will be loaded;
 # [1] switching cache data storage to {{no-op}} and back to original (HWM must 
be fixed here for the needs of historical rebalance) under the checkpoint and 
keep the partition update counter for each partition;
 # [2] run async the eviction indexes for the list of collected partitions (API 
must be provided by IGNITE-11075);
 # [2] Send a request message to each node one by one with the list of 
partitions to load;
 # [2] Listening for the transmission handler to receive files;
 # [2] Run rebuild indexes async over the receiving partitions (API must be 
provided by IGNITE-11075);
 # [1] Run historical rebalance from LWM to HWM collected above (LWM can be 
read from received file meta page)

The points marked with the label {{[1]}} must be done prior to {{[2]}}.

 

NOTE. Check the following things:
 # Rebalancing of MVCC cache groups;
 # How LWM and HWM will be set for the historical rebalance;


> Create cache shared preloader
> -
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: iep-28
>
> {{CacheSharedPreloader}} must do the following:
>  # build the map of partitions and corresponding supplier nodes from which 
> partitions will be loaded [1];
>  # switching cache data storage to {{no-op}} and back to original (HWM must be 
> fixed here for the needs of historical rebalance) under the checkpoint and 
> keep the partition update counter for each partition [1];
>  # run async the eviction indexes for the list of collected partitions (API 
> must be provided by IGNITE-11075) [2];
>  # send a request message to each node one by one with the list of partitions 
> to load [2];
>  # listening for the transmission handler to receive files [2];
>  # run rebuild indexes async over the receiving partitions (API must be 
> provided by IGNITE-11075) [2];
>  # run historical rebalance from LWM to HWM collected above (LWM can be read 
> from the received file meta page) [1];
> The points marked with the label {{[1]}} must be done prior to {{[2]}}.
>  
> NOTE. Check the following things:
>  # Rebalancing of MVCC cache groups;
>  # How LWM and HWM will be set for the historical rebalance;



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-12069:
-
Ignite Flags:   (was: Docs Required)

> Create cache shared preloader
> -
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: iep-28
>
> {{CacheSharedPreloader}} must do the following:
>  # [1] build the map of partitions and corresponding supplier nodes from 
> which partitions will be loaded;
>  # [1] switching cache data storage to {{no-op}} and back to original (HWM 
> must be fixed here for the needs of historical rebalance) under the 
> checkpoint and keep the partition update counter for each partition;
>  # [2] run async the eviction indexes for the list of collected partitions 
> (API must be provided by IGNITE-11075);
>  # [2] Send a request message to each node one by one with the list of 
> partitions to load;
>  # [2] Listening for the transmission handler to receive files;
>  # [2] Run rebuild indexes async over the receiving partitions (API must be 
> provided by IGNITE-11075);
>  # [1] Run historical rebalance from LWM to HWM collected above (LWM can be 
> read from received file meta page)
> The points marked with the label {{[1]}} must be done prior to {{[2]}}.
>  
> NOTE. Check the following things:
>  # Rebalancing of MVCC cache groups;
>  # How LWM and HWM will be set for the historical rebalance;



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Maxim Muzafarov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-12069:
-
Description: 
{{CacheSharedPreloader}} must do the following:
 # [1] build the map of partitions and corresponding supplier nodes from which 
partitions will be loaded;
 # [1] switching cache data storage to {{no-op}} and back to original (HWM must 
be fixed here for the needs of historical rebalance) under the checkpoint and 
keep the partition update counter for each partition;
 # [2] run async the eviction indexes for the list of collected partitions (API 
must be provided by IGNITE-11075);
 # [2] Send a request message to each node one by one with the list of 
partitions to load;
 # [2] Listening for the transmission handler to receive files;
 # [2] Run rebuild indexes async over the receiving partitions (API must be 
provided by IGNITE-11075);
 # [1] Run historical rebalance from LWM to HWM collected above (LWM can be 
read from received file meta page)

The points marked with the label {{[1]}} must be done prior to {{[2]}}.

 

NOTE. Check the following things:
 # Rebalancing of MVCC cache groups;
 # How LWM and HWM will be set for the historical rebalance;

  was:
{{CacheSharedPreloader}} must do the following:
# build the map of partitions and corresponding supplier nodes from which 
partitions will be loaded;
# switching cache data storage to {{no-op}} and back to original (HWM must be 
fixed here for the needs of historical rebalance) under the checkpoint and keep 
the partition update counter for each partition;
# run async the eviction indexes for the list of collected partitions (API must 
be provided by IGNITE-11075);
# Send a request message to each node one by one with the list of partitions to 
load;
# Listening for the transmission handler to receive files;
# Run rebuild indexes async over the receiving partitions (API must be provided 
by IGNITE-11075);
# Run historical rebalance from LWM to HWM collected above (LWM can be read 
from received file meta page)


NOTE. Check the following things:
# Rebalancing of MVCC cache groups;
# How LWM and HWM will be set for the historical rebalance;


> Create cache shared preloader
> -
>
> Key: IGNITE-12069
> URL: https://issues.apache.org/jira/browse/IGNITE-12069
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Maxim Muzafarov
>Assignee: Pavel Pereslegin
>Priority: Major
>  Labels: iep-28
>
> {{CacheSharedPreloader}} must do the following:
>  # [1] build the map of partitions and corresponding supplier nodes from 
> which partitions will be loaded;
>  # [1] switching cache data storage to {{no-op}} and back to original (HWM 
> must be fixed here for the needs of historical rebalance) under the 
> checkpoint and keep the partition update counter for each partition;
>  # [2] run async the eviction indexes for the list of collected partitions 
> (API must be provided by IGNITE-11075);
>  # [2] Send a request message to each node one by one with the list of 
> partitions to load;
>  # [2] Listening for the transmission handler to receive files;
>  # [2] Run rebuild indexes async over the receiving partitions (API must be 
> provided by IGNITE-11075);
>  # [1] Run historical rebalance from LWM to HWM collected above (LWM can be 
> read from received file meta page)
> The points marked with the label {{[1]}} must be done prior to {{[2]}}.
>  
> NOTE. Check the following things:
>  # Rebalancing of MVCC cache groups;
>  # How LWM and HWM will be set for the historical rebalance;



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (IGNITE-12069) Create cache shared preloader

2019-08-14 Thread Maxim Muzafarov (JIRA)
Maxim Muzafarov created IGNITE-12069:


 Summary: Create cache shared preloader
 Key: IGNITE-12069
 URL: https://issues.apache.org/jira/browse/IGNITE-12069
 Project: Ignite
  Issue Type: Sub-task
Reporter: Maxim Muzafarov
Assignee: Pavel Pereslegin


{{CacheSharedPreloader}} must do the following:
# build the map of partitions and corresponding supplier nodes from which 
partitions will be loaded;
# switching cache data storage to {{no-op}} and back to original (HWM must be 
fixed here for the needs of historical rebalance) under the checkpoint and keep 
the partition update counter for each partition;
# run async the eviction indexes for the list of collected partitions (API must 
be provided by IGNITE-11075);
# Send a request message to each node one by one with the list of partitions to 
load;
# Listening for the transmission handler to receive files;
# Run rebuild indexes async over the receiving partitions (API must be provided 
by IGNITE-11075);
# Run historical rebalance from LWM to HWM collected above (LWM can be read 
from received file meta page)


NOTE. Check the following things:
# Rebalancing of MVCC cache groups;
# How LWM and HWM will be set for the historical rebalance;



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (IGNITE-12068) puzzling select result

2019-08-14 Thread Ivan Pavlukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907014#comment-16907014
 ] 

Ivan Pavlukhin commented on IGNITE-12068:
-

[~JerryKwan], yep, you are right. And the workaround for integers does not 
look good either. Currently I can only think of changing the table structure =(

> puzzling select result
> --
>
> Key: IGNITE-12068
> URL: https://issues.apache.org/jira/browse/IGNITE-12068
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.7.5
> Environment: System version: CentOS Linux release 7.6.1810 (Core)
> Apache Ignite version: apache-ignite-2.7.5-1.noarch
>Reporter: JerryKwan
>Priority: Critical
>
> A select using only the first column of the primary key returns one record, 
> but it should return more records.
> The following is how to reproduce this problem
> 1, create a table using
> CREATE TABLE IF NOT EXISTS Person(
>  id int,
>  city_id int,
>  name varchar,
>  age int, 
>  company varchar,
>  PRIMARY KEY (id, city_id)
> );
> 2, insert some records
> INSERT INTO Person (id, name, city_id) VALUES (1, 'John Doe', 3);
> INSERT INTO Person (id, name, city_id) VALUES (1, 'John Dean', 4);
> INSERT INTO Person (id, name, city_id) VALUES (2, 'Alex', 4);
> 3, query using 'select * from Person' shows all of the records, as expected
> [http://www.passimage.in/i/03da31c8f23cf64580d5.png]
> 4, query using 'select * from Person where id=1' gets only one record, NOT 
> expected
> [http://www.passimage.in/i/f5491491a70c5d796823.png]
> 5, query using 'select * from Person where city_id=4' gets two records, as 
> expected
> [http://www.passimage.in/i/ff0ee4f5e882983d779d.png]
> Why does 'select * from Person where id=1' get only one record, and how can 
> this be fixed? Are there any special operations/configurations to do?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (IGNITE-12063) Add ability to track system/user time held in transaction

2019-08-14 Thread Denis Chudov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906992#comment-16906992
 ] 

Denis Chudov commented on IGNITE-12063:
---

[~ivan.glukos], could you review my changes please?

> Add ability to track system/user time held in transaction
> -
>
> Key: IGNITE-12063
> URL: https://issues.apache.org/jira/browse/IGNITE-12063
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Denis Chudov
>Assignee: Denis Chudov
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should dump user/system times spent in a transaction to the log on 
> commit/rollback if the transaction duration exceeds a threshold. I want to see 
> in the log on the tx coordinator node:
> # Transaction duration
> # System time:
> #* How long did we wait to acquire locks on keys?
> #* How long did we spend preparing the transaction?
> #* How long did we spend committing the transaction?
> # User time (transaction time - total system time) 
> # Transaction status (commit/rollback)
> The threshold could be set by a system property and overridden via JMX. We 
> shouldn't dump times if the property is not set.
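The gating rule above (dump only when a threshold property is set and exceeded) can be sketched as follows. This is an illustrative sketch only; the class and property names are hypothetical, not the actual Ignite API:

```java
// Hypothetical sketch of the proposed gating logic. The property name below is
// illustrative, not a real Ignite system property.
public class TxTimeDumpGate {
    static final String PROP = "TX_TIME_DUMP_THRESHOLD_MS";

    // Dump timings only when the threshold property is set and the
    // transaction ran longer than the threshold.
    static boolean shouldDump(long txDurationMs, String thresholdProp) {
        if (thresholdProp == null)
            return false; // property not set: never dump

        long threshold = Long.parseLong(thresholdProp);

        return txDurationMs > threshold;
    }

    public static void main(String[] args) {
        System.out.println(shouldDump(1500, "1000")); // true: over threshold
        System.out.println(shouldDump(500, "1000"));  // false: under threshold
        System.out.println(shouldDump(1500, null));   // false: property unset
    }
}
```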





[jira] [Commented] (IGNITE-12063) Add ability to track system/user time held in transaction

2019-08-14 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906986#comment-16906986
 ] 

Ignite TC Bot commented on IGNITE-12063:


{panel:title=Branch: [pull/6772/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4496387&buildTypeId=IgniteTests24Java8_RunAll]

> Add ability to track system/user time held in transaction
> -





[jira] [Commented] (IGNITE-7791) Ignite Client Nodes: failed test IgniteClientReconnectCacheTest.testReconnectCacheDestroyedAndCreated()

2019-08-14 Thread Igor Kamyshnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906975#comment-16906975
 ] 

Igor Kamyshnikov commented on IGNITE-7791:
--

Added for history indexing purposes:

This bug is seen in Ignite 2.5 in the following scenario:
1) run Ignite server nodes
2) join with a client node
3) kill all server nodes
4) start the server nodes again
5) wait for the client to reconnect
6) the client logs contain a NullPointerException (Failed to reinitialize local 
partitions (preloading will be stopped))
7) cache get/put operations executed in a non-tx context hang forever:
{noformat}
Client topology version mismatch, need remap lock request 
[reqTopVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], 
locTopVer=AffinityTopologyVersion [topVer=5, minorTopVer=1], 
req=GridNearLockRequest [topVer=AffinityTopologyVersion [topVer=5, 
minorTopVer=0] ...
{noformat}
{noformat}
Starting (re)map for mappings [mappings=[GridNearLockMapping 
[node=TcpDiscoveryNode [id=02315613-a512-45af-8983-b06b64ee3dd2, 
addrs=[0:0:0:0:0:0:0:1%lo, 10.0.2.15, 10.0.3.23, 127.0.0.1], 
sockAddrs=[cas-3/10.0.3.23:47500, /10.0.2.15:47500, /0:0:0:0:0:0:0:1%lo:47500, 
/127.0.0.1:47500], discPort=47500, order=1, intOrder=1, 
lastExchangeTime=1565685652269, loc=false, ver=2.5.0#20190705-sha1:7d4d79b0, 
isClient=false],... futs=[false]
{noformat}
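Rather than letting an operation hang after a reconnect, a client can bound its retries. A minimal, self-contained sketch of that client-side pattern (this is not an Ignite API; the helper and exception handling are purely illustrative):

```java
import java.util.function.Supplier;

// Hypothetical client-side guard: retry an operation a bounded number of
// times after transient failures (e.g. remap/reconnect errors), instead of
// blocking indefinitely as in step 7 above.
public class RetryOnReconnect {
    static <T> T withRetries(Supplier<T> op, int maxAttempts) {
        RuntimeException last = null;

        for (int i = 0; i < maxAttempts; i++) {
            try {
                return op.get();
            }
            catch (RuntimeException e) {
                last = e; // transient failure: remember it and retry
            }
        }

        throw last; // all attempts exhausted
    }

    public static void main(String[] args) {
        // Simulated operation that fails twice, then succeeds.
        int[] calls = {0};

        String res = withRetries(() -> {
            if (++calls[0] < 3)
                throw new RuntimeException("remap required");
            return "ok";
        }, 5);

        System.out.println(res); // prints "ok"
    }
}
```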

NPE:
{noformat}
 INFO  2019-08-13 08:26:20.282 (1565684780282) [exchange-worker-#47] [time] 
> Started exchange init [topVer=AffinityTopologyVersion [topVer=4, 
minorTopVer=1], crd=false, evt=DISCOVERY_CUSTOM_EVT, 
evtNode=941c65d5-3ae5-41d5-98fa-8d300edafc94, customEvt=CacheAffinityCh
angeMessage [id=346351a8c61-b59cbf8f-ca75-46fc-9ebe-a76d424a60af, 
topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], exchId=null, 
partsMsg=null, exchangeNeeded=true], allowMerge=false]
 ERROR 2019-08-13 08:26:20.366 (1565684780366) [exchange-worker-#47] 
[GridDhtPartitionsExchangeFuture] > Failed to reinitialize local partitions 
(preloading will be stopped): GridDhtPartitionExchangeId 
[topVer=AffinityTopologyVersion [topVer=4, minorTopVer=1], discoE
vt=DiscoveryCustomEvent [customMsg=CacheAffinityChangeMessage 
[id=346351a8c61-b59cbf8f-ca75-46fc-9ebe-a76d424a60af, 
topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], exchId=null, 
partsMsg=null, exchangeNeeded=true], affTopVer=AffinityTopologyVersion 
[topVer=4, mi
norTopVer=1], super=DiscoveryEvent [evtNode=TcpDiscoveryNode 
[id=941c65d5-3ae5-41d5-98fa-8d300edafc94, addrs=[0:0:0:0:0:0:0:1%lo, 10.0.2.15, 
10.0.3.22, 127.0.0.1], sockAddrs=[cas-2/10.0.3.22:47500, /10.0.2.15:47500, 
/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500], discPort=
47500, order=1, intOrder=1, lastExchangeTime=1565684762464, loc=false, 
ver=2.5.0#20190705-sha1:7d4d79b0, isClient=false], topVer=4, nodeId8=e64e8e54, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1565684780279]], nodeId=941c65d5, 
evt=DISCOVERY_CUSTOM_EVT]
java.lang.NullPointerException
at 
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$8.applyx(CacheAffinitySharedManager.java:993)
at 
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager$8.applyx(CacheAffinitySharedManager.java:983)
at 
org.apache.ignite.internal.util.lang.IgniteInClosureX.apply(IgniteInClosureX.java:38)
at 
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.forAllCacheGroups(CacheAffinitySharedManager.java:1118)
at 
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.onChangeAffinityMessage(CacheAffinitySharedManager.java:983)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAffinityChangeRequest(GridDhtPartitionsExchangeFuture.java:1011)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:656)
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2419)
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2299)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:748)
 INFO  2019-08-13 08:26:20.368 (1565684780368) [exchange-worker-#47] 
[GridDhtPartitionsExchangeFuture] > Finish exchange future 
[startVer=AffinityTopologyVersion [topVer=4, minorTopVer=1], resVer=null, 
err=java.lang.NullPointerException]
 ERROR 2019-08-13 08:26:20.457 (1565684780457) [exchange-worker-#47] 
[GridCachePartitionExchangeManager] > Failed to wait for completion of 
partition map exchange (preloading will not start): 
GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryCustomEvent 
[customMsg=null, 

[jira] [Commented] (IGNITE-12068) puzzling select result

2019-08-14 Thread JerryKwan (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906952#comment-16906952
 ] 

JerryKwan commented on IGNITE-12068:


Hi [~Pavlukhin], thanks for looking into this.
If the primary key is a single column, only one result row can exist, but if the 
primary key is composite, it depends.
If the key column is an integer, we can use the filtering condition WHERE id 
>= 1 and id < 2 as a temporary workaround, but what should we do if it is a 
string?

> puzzling select result
> --





[jira] [Updated] (IGNITE-12068) puzzling select result

2019-08-14 Thread Ivan Pavlukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Pavlukhin updated IGNITE-12068:

Priority: Critical  (was: Major)

> puzzling select result
> --





[jira] [Commented] (IGNITE-12068) puzzling select result

2019-08-14 Thread Ivan Pavlukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906943#comment-16906943
 ] 

Ivan Pavlukhin commented on IGNITE-12068:
-

[~JerryKwan], you hit (to our shame) a bug related to an optimization for a 
primary key equality lookup. It erroneously assumes that there can be only 
one result row. As a workaround I was able to get the proper results using the 
filtering condition {{WHERE id >= 1 and id < 2}}. It is really unfortunate; we 
hope to fix it shortly.
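For an integer key, the range-predicate workaround can be generated mechanically. A minimal sketch, assuming an integer column; the helper name is hypothetical:

```java
// Hypothetical helper: builds a half-open range predicate equivalent to
// `col = value` for an integer column, sidestepping the single-row
// primary-key lookup optimization described above.
public class PkRangeWorkaround {
    static String rangePredicate(String column, int value) {
        // `col >= v AND col < v + 1` matches exactly the rows where col = v.
        return column + " >= " + value + " AND " + column + " < " + (value + 1);
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM Person WHERE " + rangePredicate("id", 1);
        System.out.println(sql);
        // prints: SELECT * FROM Person WHERE id >= 1 AND id < 2
    }
}
```

Note this trick has no direct analogue for string keys, which is why changing the table structure may be the only option until the bug is fixed.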

> puzzling select result
> --


