[jira] [Commented] (IGNITE-12259) Create new module for support spring-5.2.X and spring-data-2.2.X

2019-10-17 Thread Surkov Aleksandr (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954277#comment-16954277
 ] 

Surkov Aleksandr commented on IGNITE-12259:
---

Please review my changes.
The following was done:
 # Created the module ignite-spring-data_2.2
 # Copied the code from module ignite-spring-data_2.0 to ignite-spring-data_2.2
 # Changed the Spring Data version in the file README.txt
 # Added the following properties to the file parent\pom.xml:

{code:java}
2.2.0.RELEASE  
5.2.0.RELEASE {code}
 # Changed the Spring version used in the pom file
 # In the file 
modules\spring-data-2.2\src\test\java\org\apache\ignite\springdata\IgniteSpringDataQueriesSelfTest.java
changed the creation of PageRequest objects from new to the of method, and new 
Sort to Sort.by
 # In the class _org.apache.ignite.springdata.misc.PersonRepository_ changed 
the return type of the method _List removeByFirstName(String firstName)_ to 
_long removeByFirstName(String firstName)_. Corrected the test 
_org.apache.ignite.springdata.IgniteSpringDataCrudSelfTest#testRemoveExpression_
 # Changed the Spring version in the file README.txt and in the class 
org.apache.ignite.testsuites.IgniteSpringData2TestSuite

> Create new module for support spring-5.2.X and spring-data-2.2.X
> 
>
> Key: IGNITE-12259
> URL: https://issues.apache.org/jira/browse/IGNITE-12259
> Project: Ignite
>  Issue Type: Wish
>Reporter: Surkov Aleksandr
>Assignee: Surkov Aleksandr
>Priority: Minor
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current Spring version is 
> [5.2.0.RELEASE|https://mvnrepository.com/artifact/org.springframework/spring-context/5.2.0.RELEASE],
>  and the Spring Data version is 
> [2.2.0.RELEASE.|https://mvnrepository.com/artifact/org.springframework.data/spring-data-commons/2.2.0.RELEASE]
> It would be nice to add a module to support these versions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-11087) GridJobCheckpointCleanupSelfTest.testCheckpointCleanup is flaky

2019-10-17 Thread Nikolai Kulagin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954060#comment-16954060
 ] 

Nikolai Kulagin commented on IGNITE-11087:
--

Because the test task is very short, CheckpointRequestListener, which handles 
the message about saving a checkpoint, and the #onSessionEnd method, which runs 
after the task has finished, execute concurrently. At the same moment the task 
node adds the sessionId to closedSess and the listener finds the sessionId 
there. The task node removes the key set for this session from keyMap and 
removes the checkpoint for each key.
{code:java}
closedSess.add(ses.getId());

// If on task node.
if (ses.getJobId() == null) {
    Set keys = keyMap.remove(ses.getId());

    if (keys != null) {
        for (String key : keys)
            getSpi(ses.getCheckpointSpi()).removeCheckpoint(key);
    }
}{code}
The listener also removes the entry from keyMap and removes the checkpoint, 
even if the entry was no longer in the map.
{code:java}
if (closedSess.contains(sesId)) {
    keyMap.remove(sesId, keys);

    getSpi(req.getCheckpointSpi()).removeCheckpoint(req.getKey());
}{code}
To fix the bug, the listener needs to check that keyMap still contains the 
entry before removing it, and delete the checkpoint only if the entry was found.
{code:java}
if (closedSess.contains(sesId)) {
    if (keyMap.remove(sesId, keys))
        getSpi(req.getCheckpointSpi()).removeCheckpoint(req.getKey());
}
{code}
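A minimal sketch of why the conditional remove helps, assuming keyMap behaves like a java.util.concurrent.ConcurrentMap. The class CheckpointCleanupSketch and its removed list are hypothetical stand-ins for the checkpoint manager and for the SPI's removeCheckpoint call; this is not the Ignite code. ConcurrentMap.remove(key, value) returns true only for the caller that actually removed the mapping, so whichever side loses the race skips the checkpoint removal:

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for the checkpoint manager; not Ignite code. */
class CheckpointCleanupSketch {
    /** Session id -> checkpoint keys saved for that session. */
    final ConcurrentMap<String, Set<String>> keyMap = new ConcurrentHashMap<>();

    /** Ids of sessions that have already ended. */
    final Set<String> closedSess = ConcurrentHashMap.newKeySet();

    /** Records removed keys; stands in for the SPI's removeCheckpoint(key). */
    final List<String> removed = new CopyOnWriteArrayList<>();

    /** Task-node path: mark the session closed and remove all of its checkpoints. */
    void onSessionEnd(String sesId) {
        closedSess.add(sesId);

        Set<String> keys = keyMap.remove(sesId);

        if (keys != null)
            removed.addAll(keys);
    }

    /** Listener path with the proposed check: remove only if this call removed the mapping. */
    void onCheckpointRequest(String sesId, Set<String> keys, String key) {
        if (closedSess.contains(sesId)) {
            if (keyMap.remove(sesId, keys))
                removed.add(key);
        }
    }
}
```

In either order of the two calls, the checkpoint key is recorded as removed exactly once in this sketch, which is the behaviour the test expects.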
After this fix, a new bug appears.

Between creating the new key set and adding the checkpoint key in the listener,
{code:java}
    Set old = keyMap.putIfAbsent(sesId, (CheckpointSet)(keys = new CheckpointSet(ses)));

    if (old != null)
        keys = old;
}
// <-- here
keys.add(req.getKey());
{code}
the task node adds the session to closedSess and removes the empty key set for 
the session, but finds no keys in it (because the listener has not added the 
key yet), so it does not remove the checkpoint.
{code:java}
Set keys = keyMap.remove(ses.getId());

if (keys != null) {
    for (String key : keys){code}
After adding the key, the listener does not find the key set in keyMap, so it 
does not remove the checkpoint either.
{code:java}
if (closedSess.contains(sesId)) {
    if (keyMap.remove(sesId, keys)){code}
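The interleaving described above can be replayed deterministically on a single thread. This is only a sketch under the assumption that keyMap is a ConcurrentMap and closedSess is a concurrent set; LeakedCheckpointReplay and the literal ids are hypothetical, and a plain concurrent set stands in for CheckpointSet:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Replays the second race step by step. Hypothetical names; not Ignite code. */
class LeakedCheckpointReplay {
    /** @return number of checkpoint removals performed (0 means the checkpoint is left behind). */
    static int replay() {
        ConcurrentMap<String, Set<String>> keyMap = new ConcurrentHashMap<>();
        Set<String> closedSess = ConcurrentHashMap.newKeySet();
        int removals = 0;
        String sesId = "ses-1";

        // Listener: registers an empty key set, but has not added the key yet.
        Set<String> keys = ConcurrentHashMap.newKeySet();
        Set<String> old = keyMap.putIfAbsent(sesId, keys);

        if (old != null)
            keys = old;

        // Task node: the session ends at this point and the still-empty key set is removed.
        closedSess.add(sesId);

        Set<String> removedKeys = keyMap.remove(sesId);

        if (removedKeys != null)
            removals += removedKeys.size(); // The set is empty, so nothing is removed.

        // Listener resumes: adds the key, then finds the mapping already gone.
        keys.add("cp-1");

        if (closedSess.contains(sesId) && keyMap.remove(sesId, keys))
            removals++;

        return removals;
    }
}
```

In this replay neither side removes the checkpoint, which matches the "#removeCheckpoint isn't called" failure.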
 

> GridJobCheckpointCleanupSelfTest.testCheckpointCleanup is flaky
> ---
>
> Key: IGNITE-11087
> URL: https://issues.apache.org/jira/browse/IGNITE-11087
> Project: Ignite
>  Issue Type: Bug
>Reporter: Nikolai Kulagin
>Assignee: Nikolai Kulagin
>Priority: Minor
>  Labels: MakeTeamcityGreenAgain
> Attachments: #removeCheckpoint is called once more.txt, 
> #removeCheckpoint isn't called.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The method that removes a checkpoint is sometimes not called, or is called one 
> extra time. The test has a very low failure rate: 1 per 366 runs on 
> [TeamCity|https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8=-7655052229521669617=testDetails_IgniteTests24Java8=%3Cdefault%3E]
>  and 1 per 412 on TC Bot. On a local machine it fails approximately once per 
> 100 runs. Logs are in the attachments.
> The test has been flaky for a long time. Before the IP Finder was replaced in 
> IGNITE-10555, the test was slower, which made the failure rate even lower.
>  
> {code:java}
> [2019-01-25 14:49:03,050][ERROR][main][root] Test failed.
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
> at junit.framework.Assert.fail(Assert.java:57)
> at junit.framework.Assert.failNotEquals(Assert.java:329)
> at junit.framework.Assert.assertEquals(Assert.java:78)
> at junit.framework.Assert.assertEquals(Assert.java:234)
> at junit.framework.Assert.assertEquals(Assert.java:241)
> at 
> org.apache.ignite.internal.GridJobCheckpointCleanupSelfTest.testCheckpointCleanup(GridJobCheckpointCleanupSelfTest.java:88)
> at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2088)
> at java.lang.Thread.run(Thread.java:748){code}
>  
> [^#removeCheckpoint isn't called.txt]
> ^_^
>  
> {code:java}
> [2019-01-25 14:50:03,282][ERROR][main][root] Test failed.
> junit.framework.AssertionFailedError: expected:<-1> but was:<0>
>  at junit.framework.Assert.fail(Assert.java:57)
>  at 

[jira] [Created] (IGNITE-12302) Test ZookeeperDiscoveryTopologyChangeAndReconnectTest.testDuplicatedNodeId is broken.

2019-10-17 Thread Amelchev Nikita (Jira)
Amelchev Nikita created IGNITE-12302:


 Summary: Test 
ZookeeperDiscoveryTopologyChangeAndReconnectTest.testDuplicatedNodeId is broken.
 Key: IGNITE-12302
 URL: https://issues.apache.org/jira/browse/IGNITE-12302
 Project: Ignite
  Issue Type: Bug
Reporter: Amelchev Nikita
Assignee: Amelchev Nikita
 Fix For: 2.8


The test checks that a new node cannot be started with a duplicate node ID. 
The newly added SystemView creates a table on node startup and fails with the error:

{noformat}
java.lang.AssertionError: Unexpected exception
at 
org.apache.ignite.testframework.GridTestUtils.fail(GridTestUtils.java:622)
at 
org.apache.ignite.testframework.GridTestUtils.assertThrowsAnyCause(GridTestUtils.java:465)
at 
org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoveryTopologyChangeAndReconnectTest.testDuplicatedNodeId(ZookeeperDiscoveryTopologyChangeAndReconnectTest.java:582)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.apache.ignite.testframework.junits.GridAbstractTest$7.run(GridAbstractTest.java:2090)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to register 
system view.
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1401)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2038)
at 
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1703)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1117)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:615)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:983)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:924)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:912)
at 
org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:878)
at 
org.apache.ignite.spi.discovery.zk.internal.ZookeeperDiscoveryTopologyChangeAndReconnectTest.lambda$testDuplicatedNodeId$0(ZookeeperDiscoveryTopologyChangeAndReconnectTest.java:583)
at 
org.apache.ignite.testframework.GridTestUtils.assertThrowsAnyCause(GridTestUtils.java:449)
... 11 more
Caused by: class org.apache.ignite.IgniteException: Failed to register system 
view.
at 
org.apache.ignite.internal.processors.query.h2.SchemaManager.createSystemView(SchemaManager.java:238)
at 
org.apache.ignite.internal.processors.query.h2.SchemaManager.createSystemViews(SchemaManager.java:247)
at 
org.apache.ignite.internal.processors.query.h2.SchemaManager.start(SchemaManager.java:195)
at 
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.start(IgniteH2Indexing.java:2083)
[2019-10-17 21:50:42,017][INFO ][main][root] >>> Stopping test: 
ZookeeperDiscoveryTopologyChangeAndReconnectTest#testDuplicatedNodeId in 261 ms 
<<<
at 
org.apache.ignite.internal.processors.query.GridQueryProcessor.start(GridQueryProcessor.java:250)
at 
org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1977)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1213)
... 21 more
Caused by: org.h2.jdbc.JdbcSQLException: Table "NODE_ATTRIBUTES" already exists; SQL statement:
CREATE TABLE NODE_ATTRIBUTES(NODE_ID UUID, NAME VARCHAR, VALUE VARCHAR) ENGINE 
"org.apache.ignite.internal.processors.query.h2.sys.SqlSystemTableEngine" 
[42101-197]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:357)
at org.h2.message.DbException.get(DbException.java:179)
at org.h2.message.DbException.get(DbException.java:155)
at org.h2.command.ddl.CreateTable.update(CreateTable.java:86)
at org.h2.command.CommandContainer.update(CommandContainer.java:102)
at org.h2.command.Command.executeUpdate(Command.java:261)
at 

[jira] [Commented] (IGNITE-6804) Print a warning if HashMap is passed into bulk update operations

2019-10-17 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953880#comment-16953880
 ] 

Ignite TC Bot commented on IGNITE-6804:
---

{panel:title=Branch: [pull/6976/head] Base: [master] : Possible Blockers 
(3)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Start Nodes{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=4697296]]
* IgniteStartStopRestartTestSuite: 
IgniteProjectionStartStopRestartSelfTest.testStartOneNode - Test has low fail 
rate in base branch 0,0% and is not flaky

{color:#d04437}Platform .NET (Inspections)*{color} [[tests 0 Failure on metric 
|https://ci.ignite.apache.org/viewLog.html?buildId=4698855]]

{color:#d04437}~Build Apache Ignite~{color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=4705081]]

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4705196buildTypeId=IgniteTests24Java8_RunAll]

> Print a warning if HashMap is passed into bulk update operations
> 
>
> Key: IGNITE-6804
> URL: https://issues.apache.org/jira/browse/IGNITE-6804
> Project: Ignite
>  Issue Type: Improvement
>  Components: cache
>Reporter: Denis A. Magda
>Assignee: Ilya Kasnacheev
>Priority: Critical
>  Labels: usability
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Ignite newcomers tend to stumble on deadlocks simply because the keys are 
> passed in an unordered HashMap. Propose to do the following:
> * update bulk operations Java docs.
> * print out a warning if not SortedMap (e.g. HashMap, 
> Weak/Identity/Concurrent/Linked HashMap etc) is passed into
> a bulk method (instead of SortedMap) and contains more than 1 element. 
> However, we should make sure that we only print that warning once and not 
> every time the API is called.
> * do not produce warning for explicit optimistic transactions
> More details are here:
> http://apache-ignite-developers.2346864.n4.nabble.com/Re-Ignite-2-0-0-GridUnsafe-unmonitor-td23706.html
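
The proposal above could be sketched roughly as follows. The class UnorderedMapWarning, its method checkBulkArgument, and the warning text are hypothetical and not part of the Ignite API. The sketch only illustrates the three conditions listed (not a SortedMap, more than one element, no explicit optimistic transaction) and the warn-once requirement, here handled with an AtomicBoolean:

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Hypothetical helper; not part of the Ignite API. */
class UnorderedMapWarning {
    /** Set once the warning has been printed, so it is never repeated. */
    private final AtomicBoolean warned = new AtomicBoolean();

    /** Destination for the warning text, e.g. a logger's warning method. */
    private final Consumer<String> warnSink;

    UnorderedMapWarning(Consumer<String> warnSink) {
        this.warnSink = warnSink;
    }

    /**
     * Checks the argument of a bulk operation.
     *
     * @return true if this call emitted the warning.
     */
    boolean checkBulkArgument(Map<?, ?> entries, boolean explicitOptimisticTx) {
        // No warning for optimistic transactions, sorted maps, or maps with at most one entry.
        if (explicitOptimisticTx || entries instanceof SortedMap || entries.size() <= 1)
            return false;

        // Only the first caller to flip the flag prints the warning.
        if (!warned.compareAndSet(false, true))
            return false;

        warnSink.accept("Unordered map " + entries.getClass().getName()
            + " passed to a bulk operation; consider a SortedMap to avoid deadlocks.");

        return true;
    }
}
```

A caller might create one instance per cache and pass it a method reference to its logger's warning method.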





[jira] [Commented] (IGNITE-6804) Print a warning if HashMap is passed into bulk update operations

2019-10-17 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953852#comment-16953852
 ] 

Ignite TC Bot commented on IGNITE-6804:
---

{panel:title=Branch: [pull/6976/head] Base: [master] : Possible Blockers 
(3)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Start Nodes{color} [[tests 
1|https://ci.ignite.apache.org/viewLog.html?buildId=4697296]]
* IgniteStartStopRestartTestSuite: 
IgniteProjectionStartStopRestartSelfTest.testStartOneNode - Test has low fail 
rate in base branch 0,0% and is not flaky

{color:#d04437}Platform .NET (Inspections)*{color} [[tests 0 Failure on metric 
|https://ci.ignite.apache.org/viewLog.html?buildId=4698855]]

{color:#d04437}~Build Apache Ignite~{color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=4704733]]

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4704848buildTypeId=IgniteTests24Java8_RunAll]

> Print a warning if HashMap is passed into bulk update operations
> 
>
> Key: IGNITE-6804
> URL: https://issues.apache.org/jira/browse/IGNITE-6804
> Project: Ignite
>  Issue Type: Improvement
>  Components: cache
>Reporter: Denis A. Magda
>Assignee: Ilya Kasnacheev
>Priority: Critical
>  Labels: usability
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Ignite newcomers tend to stumble on deadlocks simply because the keys are 
> passed in an unordered HashMap. Propose to do the following:
> * update bulk operations Java docs.
> * print out a warning if not SortedMap (e.g. HashMap, 
> Weak/Identity/Concurrent/Linked HashMap etc) is passed into
> a bulk method (instead of SortedMap) and contains more than 1 element. 
> However, we should make sure that we only print that warning once and not 
> every time the API is called.
> * do not produce warning for explicit optimistic transactions
> More details are here:
> http://apache-ignite-developers.2346864.n4.nabble.com/Re-Ignite-2-0-0-GridUnsafe-unmonitor-td23706.html





[jira] [Comment Edited] (IGNITE-1606) NPE during node stop due to nullified logger in TcpCommunicationSpi

2019-10-17 Thread Dmitriy Sorokin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953832#comment-16953832
 ] 

Dmitriy Sorokin edited comment on IGNITE-1606 at 10/17/19 3:11 PM:
---

The problem of accessing a nullified log field is present in the current master 
branch (2.8-SNAPSHOT), not only in TcpCommunicationSpi but also in 
TcpDiscoveryMulticastIpFinder; see the stack traces below:
{code:java}
[2019-09-24 15:31:19,018][ERROR][sys-stripe-0-#325%worker-8%][root] Critical 
system error detected. Will be handled accordingly to configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, 
err=java.lang.NullPointerException]]
java.lang.NullPointerException
 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2821)
 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2805)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2031)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2128)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1257)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1296)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.sendDeferredUpdateResponse(GridDhtAtomicCache.java:3619)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$3300(GridDhtAtomicCache.java:142)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout.run(GridDhtAtomicCache.java:3865)
 at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:550)
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
 at java.lang.Thread.run(Thread.java:748)
{code}
 
{code:java}
[2019-09-24 15:31:11,411][ERROR][tcp-client-disco-reconnector-#64%worker-4%][TcpDiscoverySpi]
 Runtime error caught during grid runnable execution: IgniteSpiThread 
[name=tcp-client-disco-reconnector-#64%worker-4%]
java.lang.NullPointerException
 at 
org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder.requestAddresses(TcpDiscoveryMulticastIpFinder.java:637)
 at 
org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder.getRegisteredAddresses(TcpDiscoveryMulticastIpFinder.java:392)
 at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.registeredAddresses(TcpDiscoverySpi.java:1944)
 at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.resolvedAddresses(TcpDiscoverySpi.java:1892)
 at 
org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:562)
 at 
org.apache.ignite.spi.discovery.tcp.ClientImpl.access$1100(ClientImpl.java:141)
 at 
org.apache.ignite.spi.discovery.tcp.ClientImpl$Reconnector.body(ClientImpl.java:1523)
{code}
 

 

I see two different solutions:

1) Replace the constructs
{code:java}
if (log.isDebugEnabled())
    log.debug(...);{code}
and
{code:java}
if (log.isTraceEnabled())
    log.trace(...);{code}
with wrappers similar to U.error(...) and U.warn(...), in which the log 
reference is checked for null before it is accessed.

2) Prevent log references annotated by IgniteLogger from being nullified in the 
GridResourceProcessor.cleanup() method.

The first solution seems simpler to me than the second, so I propose using it 
to resolve this issue.

Thoughts?
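
For illustration, the first option might look like the sketch below. SketchLogger and NullSafeLog are hypothetical names used only here; the real IgniteLogger interface and the U utility class are not reproduced, and the actual wrapper signatures may differ:

```java
import java.util.function.Supplier;

/** Minimal stand-in for a logger interface; hypothetical, not IgniteLogger. */
interface SketchLogger {
    boolean isDebugEnabled();

    void debug(String msg);
}

/** Hypothetical null-safe wrapper in the spirit of U.error(...) and U.warn(...). */
final class NullSafeLog {
    private NullSafeLog() {
        // Static helpers only.
    }

    /**
     * Logs at debug level only if the logger reference is non-null and debug is enabled.
     *
     * @return true if the message was logged.
     */
    static boolean debug(SketchLogger log, Supplier<String> msg) {
        if (log == null || !log.isDebugEnabled())
            return false;

        log.debug(msg.get());

        return true;
    }
}
```

Because the logger is passed as an argument, the field is read once at the call site, so nullifying the field after that point cannot cause an NPE inside the wrapper. Taking a Supplier keeps the message from being built when debug logging is disabled or the logger is gone.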

 

 

 


was (Author: cyberdemon):
The issue with access to nullified log field is present at the current master 
branch (2.8-SNAPSHOT) not only in TcpCommunicationSpi but also in 
TcpDiscoveryMulticastIpFinder, see stacktraces below:
{code:java}
[2019-09-24 15:31:19,018][ERROR][sys-stripe-0-#325%worker-8%][root] Critical 
system error detected. Will be handled accordingly to configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler 

[jira] [Commented] (IGNITE-1606) NPE during node stop due to nullified logger in TcpCommunicationSpi

2019-10-17 Thread Dmitriy Sorokin (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953832#comment-16953832
 ] 

Dmitriy Sorokin commented on IGNITE-1606:
-

The problem of accessing a nullified log field is present in the current master 
branch (2.8-SNAPSHOT), not only in TcpCommunicationSpi but also in 
TcpDiscoveryMulticastIpFinder; see the stack traces below:
{code:java}
[2019-09-24 15:31:19,018][ERROR][sys-stripe-0-#325%worker-8%][root] Critical 
system error detected. Will be handled accordingly to configured handler 
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, 
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet 
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], 
failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, 
err=java.lang.NullPointerException]]
java.lang.NullPointerException
 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2821)
 at 
org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2805)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2031)
 at 
org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2128)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1257)
 at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1296)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.sendDeferredUpdateResponse(GridDhtAtomicCache.java:3619)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$3300(GridDhtAtomicCache.java:142)
 at 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout.run(GridDhtAtomicCache.java:3865)
 at 
org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:550)
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
 at java.lang.Thread.run(Thread.java:748)
{code}
 
{code:java}
[2019-09-24 15:31:11,411][ERROR][tcp-client-disco-reconnector-#64%worker-4%][TcpDiscoverySpi]
 Runtime error caught during grid runnable execution: IgniteSpiThread 
[name=tcp-client-disco-reconnector-#64%worker-4%]
java.lang.NullPointerException
 at 
org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder.requestAddresses(TcpDiscoveryMulticastIpFinder.java:637)
 at 
org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder.getRegisteredAddresses(TcpDiscoveryMulticastIpFinder.java:392)
 at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.registeredAddresses(TcpDiscoverySpi.java:1944)
 at 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.resolvedAddresses(TcpDiscoverySpi.java:1892)
 at 
org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:562)
 at 
org.apache.ignite.spi.discovery.tcp.ClientImpl.access$1100(ClientImpl.java:141)
 at 
org.apache.ignite.spi.discovery.tcp.ClientImpl$Reconnector.body(ClientImpl.java:1523)
{code}
 

 

I see two different solutions:

1) Replace the constructs
{code:java}
if (log.isDebugEnabled())
    log.debug(...);{code}
and
{code:java}
if (log.isTraceEnabled())
    log.trace(...);{code}
with wrappers similar to U.error(...) and U.warn(...), in which the log 
reference is checked for null before it is accessed.

2) Prevent log references annotated by IgniteLogger from being nullified in the 
GridResourceProcessor.cleanup() method.

The first solution seems simpler to me than the second, so I propose using it 
to resolve this issue.

Thoughts?

 

 

 

> NPE during node stop due to nullified logger in TcpCommunicationSpi
> ---
>
> Key: IGNITE-1606
> URL: https://issues.apache.org/jira/browse/IGNITE-1606
> Project: Ignite
>  Issue Type: Bug
>  Components: general
>Reporter: Valentin Kulichenko
>Assignee: Dmitriy Sorokin
>Priority: Major
>
> Probably we should check other components as well. Not sure why we need to 
> nullify 

[jira] [Comment Edited] (IGNITE-12186) TDE - Phase-2. Master key rotation.

2019-10-17 Thread Amelchev Nikita (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953775#comment-16953775
 ] 

Amelchev Nikita edited comment on IGNITE-12186 at 10/17/19 2:04 PM:


The issue is ready for review.  See the 
[design|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652381]
 and [description of the PR|https://github.com/apache/ignite/pull/6937] for 
details.


was (Author: nsamelchev):
The issue is ready for review.  See the design and description of the PR for 
details.

> TDE - Phase-2. Master key rotation.
> ---
>
> Key: IGNITE-12186
> URL: https://issues.apache.org/jira/browse/IGNITE-12186
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Amelchev Nikita
>Assignee: Amelchev Nikita
>Priority: Major
>  Labels: IEP-18
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The master key rotation process needs to be implemented. Master key (MK) 
> rotation is required if the key is compromised or at the end of the crypto 
> period (the key validity period). 
> [Design 
> (cwiki).|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652381]





[jira] [Commented] (IGNITE-12186) TDE - Phase-2. Master key rotation.

2019-10-17 Thread Amelchev Nikita (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953775#comment-16953775
 ] 

Amelchev Nikita commented on IGNITE-12186:
--

The issue is ready for review.  See the design and description of the PR for 
details.

> TDE - Phase-2. Master key rotation.
> ---
>
> Key: IGNITE-12186
> URL: https://issues.apache.org/jira/browse/IGNITE-12186
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Amelchev Nikita
>Assignee: Amelchev Nikita
>Priority: Major
>  Labels: IEP-18
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The master key rotation process needs to be implemented. Master key (MK) 
> rotation is required if the key is compromised or at the end of the crypto 
> period (the key validity period). 
> [Design 
> (cwiki).|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652381]





[jira] [Created] (IGNITE-12301) Free-lists system view

2019-10-17 Thread Aleksey Plekhanov (Jira)
Aleksey Plekhanov created IGNITE-12301:
--

 Summary: Free-lists system view
 Key: IGNITE-12301
 URL: https://issues.apache.org/jira/browse/IGNITE-12301
 Project: Ignite
  Issue Type: Sub-task
Reporter: Aleksey Plekhanov
Assignee: Aleksey Plekhanov


Implement a system view for free-lists monitoring.





[jira] [Commented] (IGNITE-12186) TDE - Phase-2. Master key rotation.

2019-10-17 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953757#comment-16953757
 ] 

Ignite TC Bot commented on IGNITE-12186:


{panel:title=Branch: [pull/6937/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=4687563buildTypeId=IgniteTests24Java8_RunAll]

> TDE - Phase-2. Master key rotation.
> ---
>
> Key: IGNITE-12186
> URL: https://issues.apache.org/jira/browse/IGNITE-12186
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Amelchev Nikita
>Assignee: Amelchev Nikita
>Priority: Major
>  Labels: IEP-18
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The master key rotation process needs to be implemented. Master key (MK) 
> rotation is required if the key is compromised or at the end of the crypto 
> period (the key validity period). 
> [Design 
> (cwiki).|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652381]





[jira] [Updated] (IGNITE-12300) ComputeJob#cancel executes with wrong SecurityContext

2019-10-17 Thread Denis Garus (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Garus updated IGNITE-12300:
-
Description: 
ComputeJob#cancel executes with the security context of a current node rather 
than a security context of a node that initiates ComputeJob.

 

Reproducer:

[https://github.com/apache/ignite/pull/6984/files]

  was:
ComputeJob#cancel executes with the context of a current node rather than 
should be executed with context of a node that initiates ComputeJob.

 

Reproducer:

[https://github.com/apache/ignite/pull/6984/files]


> ComputeJob#cancel executes with wrong SecurityContext
> -
>
> Key: IGNITE-12300
> URL: https://issues.apache.org/jira/browse/IGNITE-12300
> Project: Ignite
>  Issue Type: Bug
>Reporter: Denis Garus
>Priority: Major
> Attachments: ComputeJobCancelReproducerTest.java
>
>
> ComputeJob#cancel executes with the security context of a current node rather 
> than a security context of a node that initiates ComputeJob.
>  
> Reproducer:
> [https://github.com/apache/ignite/pull/6984/files]





[jira] [Updated] (IGNITE-12300) ComputeJob#cancel executes with wrong SecurityContext

2019-10-17 Thread Denis Garus (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Garus updated IGNITE-12300:
-
Attachment: ComputeJobCancelReproducerTest.java

> ComputeJob#cancel executes with wrong SecurityContext
> -
>
> Key: IGNITE-12300
> URL: https://issues.apache.org/jira/browse/IGNITE-12300
> Project: Ignite
>  Issue Type: Bug
>Reporter: Denis Garus
>Priority: Major
> Attachments: ComputeJobCancelReproducerTest.java
>
>
> ComputeJob#cancel executes with the context of a current node rather than 
> should be executed with context of a node that initiates ComputeJob.
>  
> Reproducer:
> [https://github.com/apache/ignite/pull/6984/files]





[jira] [Updated] (IGNITE-12300) ComputeJob#cancel executes with wrong SecurityContext

2019-10-17 Thread Denis Garus (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denis Garus updated IGNITE-12300:
-
Description: 
ComputeJob#cancel executes with the context of the current node rather than 
with the context of the node that initiated the ComputeJob.

 

Reproducer:

[https://github.com/apache/ignite/pull/6984/files]

  was:ComputeJob#cancel executes with context of current node rather then 
should be executed with context of node that initiate ComputeJob.


> ComputeJob#cancel executes with wrong SecurityContext
> -
>
> Key: IGNITE-12300
> URL: https://issues.apache.org/jira/browse/IGNITE-12300
> Project: Ignite
>  Issue Type: Bug
>Reporter: Denis Garus
>Priority: Major
>
> ComputeJob#cancel executes with the context of the current node rather than 
> with the context of the node that initiated the ComputeJob.
>  
> Reproducer:
> [https://github.com/apache/ignite/pull/6984/files]





[jira] [Created] (IGNITE-12300) ComputeJob#cancel executes with wrong SecurityContext

2019-10-17 Thread Denis Garus (Jira)
Denis Garus created IGNITE-12300:


 Summary: ComputeJob#cancel executes with wrong SecurityContext
 Key: IGNITE-12300
 URL: https://issues.apache.org/jira/browse/IGNITE-12300
 Project: Ignite
  Issue Type: Bug
Reporter: Denis Garus


ComputeJob#cancel executes with the context of the current node rather than 
with the context of the node that initiated the ComputeJob.
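
The difference between the reported and the expected behaviour can be 
illustrated with a small, self-contained model. This is only a sketch: the 
thread-local context, the Job class and the helper methods are hypothetical 
stand-ins and are not Ignite's security API.

```java
import java.util.function.Supplier;

public class CancelContextSketch {
    /** Hypothetical stand-in for a per-thread security context (not Ignite's API). */
    static final ThreadLocal<String> CURRENT_CTX = ThreadLocal.withInitial(() -> "local-node");

    /** Runs the action under the given context and restores the previous one afterwards. */
    static <T> T withContext(String ctx, Supplier<T> action) {
        String prev = CURRENT_CTX.get();
        CURRENT_CTX.set(ctx);
        try {
            return action.get();
        } finally {
            CURRENT_CTX.set(prev);
        }
    }

    /** A toy job that remembers which node initiated it. */
    static class Job {
        final String initiatorCtx;

        Job(String initiatorCtx) { this.initiatorCtx = initiatorCtx; }

        /** Returns the context that is visible inside cancel(). */
        String cancel() { return CURRENT_CTX.get(); }
    }

    /** Behaviour described in the report: cancel sees the executing node's context. */
    static String cancelAsReported(Job job) {
        return job.cancel();
    }

    /** Expected behaviour: cancel sees the context of the node that initiated the job. */
    static String cancelAsExpected(Job job) {
        return withContext(job.initiatorCtx, job::cancel);
    }
}
```

In this model the fix amounts to wrapping the cancel call in the initiator's 
context, the same way the job's execution would be wrapped.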





[jira] [Commented] (IGNITE-10959) Memory leaks in continuous query handlers

2019-10-17 Thread Grey Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-10959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953706#comment-16953706
 ] 

Grey Guo commented on IGNITE-10959:
---

This is critical: it triggers a major GC every day in our scenario. If there 
is no quick fix, please recommend a workaround.

> Memory leaks in continuous query handlers
> -
>
> Key: IGNITE-10959
> URL: https://issues.apache.org/jira/browse/IGNITE-10959
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7
>Reporter: Denis Mekhanikov
>Priority: Major
> Fix For: 2.9
>
> Attachments: CacheContinuousQueryMemoryUsageTest.java
>
>
> Continuous query handlers don't clear internal data structures after cache 
> events are processed.
> A test that reproduces the problem is attached.





[jira] [Updated] (IGNITE-11704) Write tombstones during rebalance to get rid of deferred delete buffer

2019-10-17 Thread Maxim Muzafarov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxim Muzafarov updated IGNITE-11704:
-
Ignite Flags: Release Notes Required

> Write tombstones during rebalance to get rid of deferred delete buffer
> --
>
> Key: IGNITE-11704
> URL: https://issues.apache.org/jira/browse/IGNITE-11704
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Pavel Kovalenko
>Priority: Major
>  Labels: rebalance
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently Ignite relies on a deferred delete buffer to handle 
> write-remove conflicts during rebalance. Given the limited size of the buffer, 
> this approach is fundamentally flawed, especially when persistence is 
> enabled.
> I suggest extending the data storage logic so that it can store key 
> tombstones, which keep the version of deleted entries. The tombstones will be 
> stored while rebalance is in progress and should be cleaned up when rebalance 
> is completed.
> Later this approach may be used to implement fast partition rebalance based 
> on Merkle trees (in this case, tombstones should be written on an incomplete 
> baseline).





[jira] [Created] (IGNITE-12299) Store tombstone links into separate BPlus tree to avoid partition full-scan during tombstones remove

2019-10-17 Thread Pavel Kovalenko (Jira)
Pavel Kovalenko created IGNITE-12299:


 Summary: Store tombstone links into separate BPlus tree to avoid 
partition full-scan during tombstones remove
 Key: IGNITE-12299
 URL: https://issues.apache.org/jira/browse/IGNITE-12299
 Project: Ignite
  Issue Type: Improvement
  Components: cache
Affects Versions: 2.8
Reporter: Pavel Kovalenko
 Fix For: 2.9


Currently, we cannot quickly identify which keys in a partition are tombstones. 
To collect tombstones we need to do a full scan of the BPlus tree, which can 
slow down node performance when rebalance finishes and tombstone cleanup is 
needed. We can introduce a separate BPlus tree (like the one for TTL) inside 
the partition to store links to tombstone keys. When tombstone cleanup is 
needed, we can scan quickly for tombstones using only the subset of keys 
stored in this tree.
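
A minimal sketch of the idea, assuming a sorted map and a sorted set as 
stand-ins for the two BPlus trees. The class and method names are 
hypothetical and do not correspond to Ignite internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

public class TombstoneIndexSketch {
    /** Marker value standing in for a tombstone (a deleted entry that is still stored). */
    static final String TOMBSTONE = "<tombstone>";

    /** Main partition data; a TreeMap stands in for the primary BPlus tree. */
    private final TreeMap<Integer, String> data = new TreeMap<>();

    /** Secondary index that holds only the keys that are currently tombstones. */
    private final TreeSet<Integer> tombstoneKeys = new TreeSet<>();

    void put(int key, String val) {
        data.put(key, val);
        tombstoneKeys.remove(key); // a live value replaces any tombstone
    }

    void remove(int key) {
        data.put(key, TOMBSTONE);
        tombstoneKeys.add(key);
    }

    /** Removes all tombstones by visiting only the secondary index, not the whole partition. */
    List<Integer> cleanupTombstones() {
        List<Integer> removed = new ArrayList<>(tombstoneKeys);
        for (Integer key : removed)
            data.remove(key);
        tombstoneKeys.clear();
        return removed;
    }

    int size() { return data.size(); }
}
```

The cost of cleanup then depends on the number of tombstones rather than on 
the partition size, at the price of maintaining the second structure on 
every put and remove.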





[jira] [Created] (IGNITE-12298) Write tombstones on incomplete baseline to get rid of partition cleanup

2019-10-17 Thread Pavel Kovalenko (Jira)
Pavel Kovalenko created IGNITE-12298:


 Summary: Write tombstones on incomplete baseline to get rid of 
partition cleanup
 Key: IGNITE-12298
 URL: https://issues.apache.org/jira/browse/IGNITE-12298
 Project: Ignite
  Issue Type: Improvement
  Components: cache
Affects Versions: 2.8
Reporter: Pavel Kovalenko
 Fix For: 2.9


After tombstone objects are introduced 
(https://issues.apache.org/jira/browse/IGNITE-11704), 
we can write tombstones on OWNING nodes if the baseline is incomplete (some of 
the backup nodes have left). When the baseline is complete again and the old 
nodes return, we can avoid partition cleanup on those nodes before rebalance. 
We can transfer the whole OWNING partition state, including tombstones, which 
will clear the data that was removed while the node was offline.
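
The following toy model sketches why transferring tombstones makes a 
preliminary partition cleanup unnecessary. It assumes that every entry 
carries a monotonically increasing version and that the newer version wins; 
all class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class BaselineTombstoneSketch {
    /** A versioned entry; a null value marks a tombstone. */
    static final class Versioned {
        final String val;
        final long ver;

        Versioned(String val, long ver) {
            this.val = val;
            this.ver = ver;
        }

        boolean isTombstone() { return val == null; }
    }

    /** Toy partition copy: key -> versioned entry, tombstones included. */
    static final class PartitionCopy {
        final Map<String, Versioned> entries = new HashMap<>();

        void put(String key, String val, long ver) {
            entries.put(key, new Versioned(val, ver));
        }

        /** Keeps a tombstone instead of physically deleting the entry. */
        void remove(String key, long ver) {
            entries.put(key, new Versioned(null, ver));
        }

        /** Applies the owner's state; newer versions win, so tombstones erase stale data. */
        void applyFrom(PartitionCopy owner) {
            for (Map.Entry<String, Versioned> e : owner.entries.entrySet()) {
                Versioned mine = entries.get(e.getKey());
                if (mine == null || mine.ver < e.getValue().ver)
                    entries.put(e.getKey(), e.getValue());
            }
        }

        /** Returns the live value, or null if the key is absent or deleted. */
        String get(String key) {
            Versioned v = entries.get(key);
            return v == null || v.isTombstone() ? null : v.val;
        }
    }
}
```

A returning node that calls applyFrom on its stale copy ends up with the 
owner's view: keys removed while it was offline are overwritten by 
tombstones, so the copy does not have to be cleared first.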





[jira] [Updated] (IGNITE-11704) Write tombstones during rebalance to get rid of deferred delete buffer

2019-10-17 Thread Pavel Kovalenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Kovalenko updated IGNITE-11704:
-
Ignite Flags:   (was: Docs Required)

> Write tombstones during rebalance to get rid of deferred delete buffer
> --
>
> Key: IGNITE-11704
> URL: https://issues.apache.org/jira/browse/IGNITE-11704
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Pavel Kovalenko
>Priority: Major
>  Labels: rebalance
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently Ignite relies on a deferred delete buffer to handle 
> write-remove conflicts during rebalance. Given the limited size of the buffer, 
> this approach is fundamentally flawed, especially when persistence is 
> enabled.
> I suggest extending the data storage logic so that it can store key 
> tombstones, which keep the version of deleted entries. The tombstones will be 
> stored while rebalance is in progress and should be cleaned up when rebalance 
> is completed.
> Later this approach may be used to implement fast partition rebalance based 
> on Merkle trees (in this case, tombstones should be written on an incomplete 
> baseline).





[jira] [Commented] (IGNITE-11704) Write tombstones during rebalance to get rid of deferred delete buffer

2019-10-17 Thread Alexei Scherbakov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953592#comment-16953592
 ] 

Alexei Scherbakov commented on IGNITE-11704:


[~jokser]

Looks good.

> Write tombstones during rebalance to get rid of deferred delete buffer
> --
>
> Key: IGNITE-11704
> URL: https://issues.apache.org/jira/browse/IGNITE-11704
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Pavel Kovalenko
>Priority: Major
>  Labels: rebalance
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently Ignite relies on a deferred delete buffer to handle 
> write-remove conflicts during rebalance. Given the limited size of the buffer, 
> this approach is fundamentally flawed, especially when persistence is 
> enabled.
> I suggest extending the data storage logic so that it can store key 
> tombstones, which keep the version of deleted entries. The tombstones will be 
> stored while rebalance is in progress and should be cleaned up when rebalance 
> is completed.
> Later this approach may be used to implement fast partition rebalance based 
> on Merkle trees (in this case, tombstones should be written on an incomplete 
> baseline).





[jira] [Commented] (IGNITE-11704) Write tombstones during rebalance to get rid of deferred delete buffer

2019-10-17 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953552#comment-16953552
 ] 

Ignite TC Bot commented on IGNITE-11704:


{panel:title=Branch: [pull/6931/head] Base: [master] : No blockers found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
[TeamCity *-- Run :: All* Results|https://ci.ignite.apache.org/viewLog.html?buildId=4701586&buildTypeId=IgniteTests24Java8_RunAll]

> Write tombstones during rebalance to get rid of deferred delete buffer
> --
>
> Key: IGNITE-11704
> URL: https://issues.apache.org/jira/browse/IGNITE-11704
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Pavel Kovalenko
>Priority: Major
>  Labels: rebalance
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently Ignite relies on a deferred delete buffer to handle 
> write-remove conflicts during rebalance. Given the limited size of the buffer, 
> this approach is fundamentally flawed, especially when persistence is 
> enabled.
> I suggest extending the data storage logic so that it can store key 
> tombstones, which keep the version of deleted entries. The tombstones will be 
> stored while rebalance is in progress and should be cleaned up when rebalance 
> is completed.
> Later this approach may be used to implement fast partition rebalance based 
> on Merkle trees (in this case, tombstones should be written on an incomplete 
> baseline).


