[jira] [Updated] (IGNITE-11780) Split command handler on hierarchy of commands

2019-05-07 Thread Dmitriy Pavlov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-11780:

Ignite Flags:   (was: Docs Required)

> Split command handler on hierarchy of commands
> --
>
> Key: IGNITE-11780
> URL: https://issues.apache.org/jira/browse/IGNITE-11780
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Eduard Shangareev
>Assignee: Eduard Shangareev
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Now it is just a big ball of mud. 
> Splitting it into per-command classes would define an API and make adding new commands much easier.
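As an illustration only (the class and command names below are hypothetical, not Ignite's actual classes), splitting a monolithic handler into a hierarchy of commands can look like this: each command implements a common interface, and the handler merely dispatches, so adding a new command means adding one class.

```java
import java.util.Map;

// Hypothetical sketch of a command hierarchy: one interface, one class
// per command, and a thin dispatcher instead of a monolithic handler.
interface Command<T> {
    T execute();

    String name();
}

class ActivateCommand implements Command<String> {
    @Override public String execute() { return "cluster activated"; }

    @Override public String name() { return "--activate"; }
}

class StateCommand implements Command<String> {
    @Override public String execute() { return "cluster is active"; }

    @Override public String name() { return "--state"; }
}

class CommandHandler {
    private final Map<String, Command<String>> commands = Map.of(
        "--activate", new ActivateCommand(),
        "--state", new StateCommand());

    // Dispatch to the matching command; unknown arguments fail fast.
    String handle(String arg) {
        Command<String> cmd = commands.get(arg);

        if (cmd == null)
            throw new IllegalArgumentException("Unknown command: " + arg);

        return cmd.execute();
    }
}
```

With this shape, the API of each command is defined by the interface, and the dispatcher never grows as commands are added.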



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11780) Split command handler on hierarchy of commands

2019-05-07 Thread Dmitriy Pavlov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-11780:

Fix Version/s: 2.8

> Split command handler on hierarchy of commands
> --
>
> Key: IGNITE-11780
> URL: https://issues.apache.org/jira/browse/IGNITE-11780
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Eduard Shangareev
>Assignee: Eduard Shangareev
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Now it is just a big ball of mud. 
> Splitting it into per-command classes would define an API and make adding new commands much easier.





[jira] [Created] (IGNITE-11840) Annotation-based configuration creates not only nested fields, but the enclosing object as well

2019-05-07 Thread Alex Savitsky (JIRA)
Alex Savitsky created IGNITE-11840:
--

 Summary: Annotation-based configuration creates not only nested 
fields, but the enclosing object as well
 Key: IGNITE-11840
 URL: https://issues.apache.org/jira/browse/IGNITE-11840
 Project: Ignite
  Issue Type: Bug
  Components: general
Affects Versions: 2.7
Reporter: Alex Savitsky


Using the annotation-based example from the 
[documentation|https://apacheignite-sql.readme.io/docs/schema-and-indexes#annotation-based-configuration],
 I'm creating a cache of Person objects, each with a nested Address inside. The 
expectation is that the created SQL table would have 4 columns: "id", "name", 
"street", and "zip". However, it also creates a column named "address" of type 
OTHER, and there is no apparent way to turn off the creation of that column 
without also losing the "street" and "zip" columns. The OTHER type breaks most 
of the SQL tools, as they don't know how to deal with it.
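For reference, a minimal sketch of the setup described in the report. A stand-in annotation is declared locally so the snippet compiles without the Ignite dependency; the real annotation is org.apache.ignite.cache.query.annotations.QuerySqlField.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-in for Ignite's @QuerySqlField, declared here only so the
// sketch is self-contained.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface QuerySqlField {
    boolean index() default false;
}

class Address {
    @QuerySqlField String street;

    @QuerySqlField(index = true) int zip;
}

class Person {
    @QuerySqlField(index = true) long id;

    @QuerySqlField String name;

    // Annotating the nested object makes "street" and "zip" visible as
    // SQL columns, but, per this report, Ignite also creates an extra
    // "address" column of SQL type OTHER for the field itself.
    @QuerySqlField Address address;
}
```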





[jira] [Commented] (IGNITE-11780) Split command handler on hierarchy of commands

2019-05-07 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834963#comment-16834963
 ] 

Dmitriy Pavlov commented on IGNITE-11780:
-

[~EdShangGG], thank you for doing such an impressive refactoring and solving 
the issue with the overly complex class. 

I've left several proposals, mostly related to code style, formatting, and 
Javadoc. Could you please address them?

[~akalashnikov], thank you for the review!

> Split command handler on hierarchy of commands
> --
>
> Key: IGNITE-11780
> URL: https://issues.apache.org/jira/browse/IGNITE-11780
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Eduard Shangareev
>Assignee: Eduard Shangareev
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Now it is just a big ball of mud. 
> Splitting it into per-command classes would define an API and make adding new commands much easier.





[jira] [Commented] (IGNITE-11780) Split command handler on hierarchy of commands

2019-05-07 Thread Anton Kalashnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834942#comment-16834942
 ] 

Anton Kalashnikov commented on IGNITE-11780:


Now it looks good to me.

> Split command handler on hierarchy of commands
> --
>
> Key: IGNITE-11780
> URL: https://issues.apache.org/jira/browse/IGNITE-11780
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Eduard Shangareev
>Assignee: Eduard Shangareev
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Now it is just a big ball of mud. 
> Splitting it into per-command classes would define an API and make adding new commands much easier.





[jira] [Commented] (IGNITE-11780) Split command handler on hierarchy of commands

2019-05-07 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834928#comment-16834928
 ] 

Ignite TC Bot commented on IGNITE-11780:


{panel:title=-- Run :: All: Possible 
Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Platform .NET (Core Linux){color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=3768951]]

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=3768979&buildTypeId=IgniteTests24Java8_RunAll]

> Split command handler on hierarchy of commands
> --
>
> Key: IGNITE-11780
> URL: https://issues.apache.org/jira/browse/IGNITE-11780
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Eduard Shangareev
>Assignee: Eduard Shangareev
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Now it is just a big ball of mud. 
> Splitting it into per-command classes would define an API and make adding new commands much easier.





[jira] [Updated] (IGNITE-11839) SQL: table join order changes may lead to incorrect result

2019-05-07 Thread Roman Kondakov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Kondakov updated IGNITE-11839:

Description: 
Under some circumstances, table join order changes may lead to an incorrect 
result, for example, if one of the joined tables is {{REPLICATED}} and another 
has {{queryParallelism > 1}}.

This problem can be reproduced in the test 
{{IgniteSqlSegmentedIndexSelfTest#testSegmentedPartitionedWithReplicated}} if we 
swap the tables {{Person}} and {{Organization}} in the method 
{{IgniteSqlSegmentedIndexSelfTest#checkLocalQueryWithSegmentedIndex}} and set 
the {{enforceJoinOrder}} flag to {{true}}:


{code:java}
String select0 = "select o.name n1, p.name n2 from  \"org\".Organization o, 
\"pers\".Person p  where p.orgId = o._key";

List<List<?>> res = c1.query(new 
SqlFieldsQuery(select0).setLocal(true).setEnforceJoinOrder(true)).getAll();
{code}

Result is:

{noformat}
java.lang.AssertionError: 
Expected :956
Actual   :8



at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.ignite.testframework.junits.JUnitAssertAware.assertEquals(JUnitAssertAware.java:89)
at 
org.apache.ignite.internal.processors.query.IgniteSqlSegmentedIndexSelfTest.checkLocalQueryWithSegmentedIndex(IgniteSqlSegmentedIndexSelfTest.java:280)
at 
org.apache.ignite.internal.processors.query.IgniteSqlSegmentedIndexSelfTest.testSegmentedPartitionedWithReplicated(IgniteSqlSegmentedIndexSelfTest.java:222)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:2145)
at java.lang.Thread.run(Thread.java:748)

{noformat}


  was:
Under some circumstances, table join order changes may lead to an incorrect 
result, for example, if one of the joined tables is {{REPLICATED}} and another 
has {{queryParallelism > 1}}.

This problem can be reproduced in the test 
{{IgniteSqlSegmentedIndexSelfTest#testSegmentedPartitionedWithReplicated}} if we 
swap the tables {{Person}} and {{Organization}} in the method 
{{IgniteSqlSegmentedIndexSelfTest#checkLocalQueryWithSegmentedIndex}} and set 
the {{enforceJoinOrder}} flag to {{true}}:


{code:java}
String select0 = "select o.name n1, p.name n2 from  \"org\".Organization o, 
\"pers\".Person p  where p.orgId = o._key";

List<List<?>> res = c1.query(new 
SqlFieldsQuery(select0).setLocal(true).setEnforceJoinOrder(true)).getAll();
{code}



> SQL: table join order changes may lead to incorrect result
> --
>
> Key: IGNITE-11839
> URL: https://issues.apache.org/jira/browse/IGNITE-11839
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.7
>Reporter: Roman Kondakov
>Priority: Major
> Fix For: 2.8
>
>
> Under some circumstances, table join order changes may lead to an incorrect 
> result, for example, if one of the joined tables is {{REPLICATED}} and 
> another has {{queryParallelism > 1}}.
> This problem can be reproduced in the test 
> {{IgniteSqlSegmentedIndexSelfTest#testSegmentedPartitionedWithReplicated}} if 
> we swap the tables {{Person}} and {{Organization}} in the method 
> {{IgniteSqlSegmentedIndexSelfTest#checkLocalQueryWithSegmentedIndex}} and set 
> the {{enforceJoinOrder}} flag to {{true}}:
> {code:java}
> String select0 = "select o.name n1, p.name n2 from  \"org\".Organization o, 
> \"pers\".Person p  where p.orgId = o._key";
> List<List<?>> res = c1.query(new 
> SqlFieldsQuery(select0).setLocal(true).setEnforceJoinOrder(true)).getAll();
> {code}
> Result is:
> {noformat}
> java.lang.AssertionError: 
> Expected :956
> Actual   :8
> 
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)

[jira] [Updated] (IGNITE-11839) SQL: table join order changes may lead to incorrect result

2019-05-07 Thread Roman Kondakov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Kondakov updated IGNITE-11839:

Description: 
Under some circumstances, table join order changes may lead to an incorrect 
result, for example, if one of the joined tables is {{REPLICATED}} and another 
has {{queryParallelism > 1}}.

This problem can be reproduced in the test 
{{IgniteSqlSegmentedIndexSelfTest#testSegmentedPartitionedWithReplicated}} if we 
swap the tables {{Person}} and {{Organization}} in the method 
{{IgniteSqlSegmentedIndexSelfTest#checkLocalQueryWithSegmentedIndex}} and set 
the {{enforceJoinOrder}} flag to {{true}}:


{code:java}
String select0 = "select o.name n1, p.name n2 from  \"org\".Organization o, 
\"pers\".Person p  where p.orgId = o._key";

List<List<?>> res = c1.query(new 
SqlFieldsQuery(select0).setLocal(true).setEnforceJoinOrder(true)).getAll();
{code}


  was:
Under some circumstances, table join order changes may lead to an incorrect 
result, for example, if one of the joined tables is {{REPLICATED}} and another 
has {{queryParallelism > 1}}.

This problem can be reproduced in the test 
{{IgniteSqlSegmentedIndexSelfTest#testSegmentedPartitionedWithReplicated}} if we 
swap the tables {{Person}} and {{Organization}} in the method 
{{IgniteSqlSegmentedIndexSelfTest#checkLocalQueryWithSegmentedIndex}} and set 
the {{enforceJoinOrder}} flag to {{true}}.


> SQL: table join order changes may lead to incorrect result
> --
>
> Key: IGNITE-11839
> URL: https://issues.apache.org/jira/browse/IGNITE-11839
> Project: Ignite
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.7
>Reporter: Roman Kondakov
>Priority: Major
> Fix For: 2.8
>
>
> Under some circumstances, table join order changes may lead to an incorrect 
> result, for example, if one of the joined tables is {{REPLICATED}} and 
> another has {{queryParallelism > 1}}.
> This problem can be reproduced in the test 
> {{IgniteSqlSegmentedIndexSelfTest#testSegmentedPartitionedWithReplicated}} if 
> we swap the tables {{Person}} and {{Organization}} in the method 
> {{IgniteSqlSegmentedIndexSelfTest#checkLocalQueryWithSegmentedIndex}} and set 
> the {{enforceJoinOrder}} flag to {{true}}:
> {code:java}
> String select0 = "select o.name n1, p.name n2 from  \"org\".Organization o, 
> \"pers\".Person p  where p.orgId = o._key";
> List<List<?>> res = c1.query(new 
> SqlFieldsQuery(select0).setLocal(true).setEnforceJoinOrder(true)).getAll();
> {code}





[jira] [Created] (IGNITE-11839) SQL: table join order changes may lead to incorrect result

2019-05-07 Thread Roman Kondakov (JIRA)
Roman Kondakov created IGNITE-11839:
---

 Summary: SQL: table join order changes may lead to incorrect result
 Key: IGNITE-11839
 URL: https://issues.apache.org/jira/browse/IGNITE-11839
 Project: Ignite
  Issue Type: Bug
  Components: sql
Affects Versions: 2.7
Reporter: Roman Kondakov
 Fix For: 2.8


Under some circumstances, table join order changes may lead to an incorrect 
result, for example, if one of the joined tables is {{REPLICATED}} and another 
has {{queryParallelism > 1}}.

This problem can be reproduced in the test 
{{IgniteSqlSegmentedIndexSelfTest#testSegmentedPartitionedWithReplicated}} if we 
swap the tables {{Person}} and {{Organization}} in the method 
{{IgniteSqlSegmentedIndexSelfTest#checkLocalQueryWithSegmentedIndex}} and set 
the {{enforceJoinOrder}} flag to {{true}}.





[jira] [Commented] (IGNITE-11780) Split command handler on hierarchy of commands

2019-05-07 Thread Anton Kalashnikov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834894#comment-16834894
 ] 

Anton Kalashnikov commented on IGNITE-11780:


[~EdShangGG], I have a couple of minor notes (e.g., transform the Command class 
into an interface, rename the short method in CommandLogger), but in general 
the changes look good to me.

> Split command handler on hierarchy of commands
> --
>
> Key: IGNITE-11780
> URL: https://issues.apache.org/jira/browse/IGNITE-11780
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Eduard Shangareev
>Assignee: Eduard Shangareev
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Now it is just a big ball of mud. 
> Splitting it into per-command classes would define an API and make adding new commands much easier.





[jira] [Updated] (IGNITE-11838) Improve usability of UriDeploymentSpi documentation

2019-05-07 Thread Dmitry Sherstobitov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Sherstobitov updated IGNITE-11838:
-
Description: 
I was trying to run the UriDeploymentSpi feature and actually failed at it. 
I've only managed to stop it using the actual Java code.

Here are some issues I've found in the documentation:
1. It is not clear what a GAR file is or how a user can create one (manually? 
using some utility?).
2. A local disk folder containing only compiled Java classes doesn't work for 
me (and, according to the Java code, it shouldn't work).
3. A local disk folder with the structure of an unpacked GAR file DOES work, 
but META-INF/ is actually an optional folder (as is xyz.class, see the previous 
item). The only thing a user needs is to put a lib/ folder in the deployment 
URI and put the .jar file there.
4. It is not clear what the ignite.xml descriptor file is or how a user can 
create it.
5. I don't like the Windows paths in the examples (I think Linux paths are more 
common in the case of Ignite; we could add a note with Windows path examples).
6. For a Linux path, the user should write something like 
file:///tmp/path/deployment (3 slashes instead of 2).
7. https://apacheignite.readme.io/docs/service-grid-28#section-service-updates-redeployment - the link to the URI here looks strange and doesn't work.
8. On the previous page: the example temporaryDirectoryPath value is optional, 
so we could remove it.

  was:
I was trying to run UriDeploymentSPi feature and actually failed in it. I've 
only managed to stop it sung Java code.

Here are some issues I've found in the documentation:
1. It is not clear what a GAR file is or how a user can create one (manually? 
using some utility?).
2. A local disk folder containing only compiled Java classes doesn't work for 
me (and, according to the Java code, it shouldn't work).
3. A local disk folder with the structure of an unpacked GAR file DOES work, 
but META-INF/ is actually an optional folder (as is xyz.class, see the previous 
item). The only thing a user needs is to put a lib/ folder in the deployment 
URI and put the .jar file there.
4. It is not clear what the ignite.xml descriptor file is or how a user can 
create it.
5. I don't like the Windows paths in the examples (I think Linux paths are more 
common in the case of Ignite; we could add a note with Windows path examples).
6. For a Linux path, the user should write something like 
file:///tmp/path/deployment (3 slashes instead of 2).
7. https://apacheignite.readme.io/docs/service-grid-28#section-service-updates-redeployment - the link to the URI here looks strange and doesn't work.
8. On the previous page: the example temporaryDirectoryPath value is optional, 
so we could remove it.


> Improve usability of UriDeploymentSpi documentation 
> 
>
> Key: IGNITE-11838
> URL: https://issues.apache.org/jira/browse/IGNITE-11838
> Project: Ignite
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.7
>Reporter: Dmitry Sherstobitov
>Priority: Critical
>
> I was trying to run the UriDeploymentSpi feature and actually failed at it. 
> I've only managed to stop it using the actual Java code.
> Here are some issues I've found in the documentation:
> 1. It is not clear what a GAR file is or how a user can create one (manually? 
> using some utility?).
> 2. A local disk folder containing only compiled Java classes doesn't work 
> for me (and, according to the Java code, it shouldn't work).
> 3. A local disk folder with the structure of an unpacked GAR file DOES work, 
> but META-INF/ is actually an optional folder (as is xyz.class, see the 
> previous item). The only thing a user needs is to put a lib/ folder in the 
> deployment URI and put the .jar file there.
> 4. It is not clear what the ignite.xml descriptor file is or how a user can 
> create it.
> 5. I don't like the Windows paths in the examples (I think Linux paths are 
> more common in the case of Ignite; we could add a note with Windows path 
> examples).
> 6. For a Linux path, the user should write something like 
> file:///tmp/path/deployment (3 slashes instead of 2).
> 7. https://apacheignite.readme.io/docs/service-grid-28#section-service-updates-redeployment - the link to the URI here looks strange and doesn't work.
> 8. On the previous page: the example temporaryDirectoryPath value is 
> optional, so we could remove it.





[jira] [Created] (IGNITE-11838) Improve usability of UriDeploymentSpi documentation

2019-05-07 Thread Dmitry Sherstobitov (JIRA)
Dmitry Sherstobitov created IGNITE-11838:


 Summary: Improve usability of UriDeploymentSpi documentation 
 Key: IGNITE-11838
 URL: https://issues.apache.org/jira/browse/IGNITE-11838
 Project: Ignite
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.7
Reporter: Dmitry Sherstobitov


I was trying to run the UriDeploymentSpi feature and actually failed at it. 
I've only managed to stop it using Java code.

Here are some issues I've found in the documentation:
1. It is not clear what a GAR file is or how a user can create one (manually? 
using some utility?).
2. A local disk folder containing only compiled Java classes doesn't work for 
me (and, according to the Java code, it shouldn't work).
3. A local disk folder with the structure of an unpacked GAR file DOES work, 
but META-INF/ is actually an optional folder (as is xyz.class, see the previous 
item). The only thing a user needs is to put a lib/ folder in the deployment 
URI and put the .jar file there.
4. It is not clear what the ignite.xml descriptor file is or how a user can 
create it.
5. I don't like the Windows paths in the examples (I think Linux paths are more 
common in the case of Ignite; we could add a note with Windows path examples).
6. For a Linux path, the user should write something like 
file:///tmp/path/deployment (3 slashes instead of 2).
7. https://apacheignite.readme.io/docs/service-grid-28#section-service-updates-redeployment - the link to the URI here looks strange and doesn't work.
8. On the previous page: the example temporaryDirectoryPath value is optional, 
so we could remove it.





[jira] [Created] (IGNITE-11837) Thin client fails to connect to the cluster if one node is down

2019-05-07 Thread Nikola Arnaudov (JIRA)
Nikola Arnaudov created IGNITE-11837:


 Summary: Thin client fails to connect to the cluster if one node 
is down
 Key: IGNITE-11837
 URL: https://issues.apache.org/jira/browse/IGNITE-11837
 Project: Ignite
  Issue Type: Bug
  Components: thin client
Affects Versions: 2.7
Reporter: Nikola Arnaudov


According to the Javadoc in org.apache.ignite.Ignition:

{code:java}
/**
 * Initializes new instance of {@link IgniteClient}.
 * <p>
 * Server connection will be lazily initialized when first required.
 *
 * @param cfg Thin client configuration.
 * @return Successfully opened thin client connection.
 */
public static IgniteClient startClient(ClientConfiguration cfg)
{code}

 

but that seems wrong, as I get an exception:

Exception in thread "main" org.apache.ignite.client.ClientConnectionException: 
Ignite cluster is unavailable
 at org.apache.ignite.internal.client.thin.TcpClientChannel.<init>(TcpClientChannel.java:114)
 at org.apache.ignite.internal.client.thin.TcpIgniteClient.lambda$new$0(TcpIgniteClient.java:79)
 at org.apache.ignite.internal.client.thin.ReliableChannel.<init>(ReliableChannel.java:84)
 at org.apache.ignite.internal.client.thin.TcpIgniteClient.<init>(TcpIgniteClient.java:86)
 at org.apache.ignite.internal.client.thin.TcpIgniteClient.start(TcpIgniteClient.java:205)
 at org.apache.ignite.Ignition.startClient(Ignition.java:586)
Caused by: java.net.ConnectException: Connection refused: connect
 at java.net.DualStackPlainSocketImpl.connect0(Native Method)
 at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
 at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
 at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
 at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
 at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
 at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
 at java.net.Socket.connect(Socket.java:589)
 at java.net.Socket.connect(Socket.java:538)
 at java.net.Socket.<init>(Socket.java:434)
 at java.net.Socket.<init>(Socket.java:211)
 at org.apache.ignite.internal.client.thin.TcpClientChannel.createSocket(TcpClientChannel.java:216)
 at org.apache.ignite.internal.client.thin.TcpClientChannel.<init>(TcpClientChannel.java:108)




[jira] [Updated] (IGNITE-11749) Implement automatic pages history dump on CorruptedTreeException

2019-05-07 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-11749:

Description: 
Currently, the only way to debug possible bugs in the checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. 
This is not practical for several reasons. First, it requires manual actions 
which depend on the content of the exception. Second, it is not always possible 
to obtain the WAL files (they may contain sensitive data).

We need to add a mechanism that dumps all information required for a primary 
analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the change history of both pages and the 
checkpoint records in the analysis interval. Possibly, we should also include 
the FreeList pages that contained the aforementioned pages.
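The mechanism described above can be pictured with a small, purely illustrative sketch (none of these names exist in Ignite): keep a bounded per-page history of WAL-visible events, and dump the entries for the pages named in the exception.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a bounded per-page event history that could be
// dumped when a CorruptedTreeException names the affected page ids.
class PageHistoryTracker {
    private static final int MAX_EVENTS_PER_PAGE = 100;

    private final Map<Long, Deque<String>> history = new HashMap<>();

    // Record one WAL-visible event (page snapshot, checkpoint, etc.).
    void record(long pageId, String event) {
        Deque<String> events = history.computeIfAbsent(pageId, id -> new ArrayDeque<>());

        if (events.size() == MAX_EVENTS_PER_PAGE)
            events.removeFirst(); // Keep only the most recent events.

        events.addLast(event);
    }

    // Collect the recorded history for the pages involved in a corruption.
    List<String> dump(long... pageIds) {
        List<String> out = new ArrayList<>();

        for (long id : pageIds)
            for (String e : history.getOrDefault(id, new ArrayDeque<>()))
                out.add(String.format("page=0x%x :: %s", id, e));

        return out;
    }
}
```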

Example of output:
{noformat}
[2019-05-07 11:57:57,350][INFO 
][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-2100569601], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=103, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=c6ba7793-113b-4b54-8530-45e1708ca44c, 
end=false, cpMark=FileWALPointer [idx=0, fileOff=29, len=29], super=WALRecord 
[size=1963, chainSize=0, pos=FileWALPointer [idx=0, fileOff=39686, len=1963], 
type=CHECKPOINT_RECORD]]
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-1368047378], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=55961, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=145e599e-66fc-45f5-bde4-b0c392125968, 
end=false, cpMark=null, super=WALRecord [size=21409, chainSize=0, 
pos=FileWALPointer [idx=0, fileOff=13101788, len=21409], 
type=CHECKPOINT_RECORD]]
{noformat}

  was:
Currently, the only way to debug possible bugs in the checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. 
This is not practical for several reasons. First, it requires manual actions 
which depend on the content of the exception. Second, it is not always possible 
to obtain the WAL files (they may contain sensitive data).

We need to add a mechanism that dumps all information required for a primary 
analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the change history of both pages and the 
checkpoint records in the analysis interval. Possibly, we should also include 
the FreeList pages that contained the aforementioned pages.

 

control.sh command:
{noformat}
control.sh --diagnostic page_history print_to_log print_to_file [page_ids 
<list_of_page_ids>] [dump_path <path_to_dump_folder>] [--yes]

--diagnostic - command for dumping some diagnostic info

page_history - subcommand for dumping only page_history. Required.

page_ids <list_of_page_ids> - list of page ids for dumping

print_to_log, print_to_file - place for dumping (file, log, or both). At least 
one of them is required.

dump_path <path_to_dump_folder> - custom path to the dump folder (absolute or 
relative to work_dir).
{noformat}
Example of output:
{noformat}
[2019-05-07 11:57:57,350][INFO 
][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-2100569601], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0

[jira] [Updated] (IGNITE-11818) Support JMX/control.sh for debug page info

2019-05-07 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-11818:

Description: 
Support JMX/control.sh for debug page info

JMX
{code}
public interface DiagnosticMXBean {
    @MXBeanDescription("Dump page history to custom path.")
    void dumpPageHistory(boolean dumpToFile, boolean dumpToLog, String filePath, long... pageIds);

    @MXBeanDescription("Dump page history.")
    void dumpPageHistory(boolean dumpToFile, boolean dumpToLog, long... pageIds);
}
{code}


control.sh command:
{noformat}
control.sh --diagnostic page_history print_to_log print_to_file [page_ids 
<list_of_page_ids>] [dump_path <path_to_dump_folder>] [--yes]

--diagnostic - command for dumping some diagnostic info

page_history - subcommand for dumping only page_history. Required.

page_ids <list_of_page_ids> - list of page ids for dumping

print_to_log, print_to_file - place for dumping (file, log, or both). At least 
one of them is required.

dump_path <path_to_dump_folder> - custom path to the dump folder (absolute or 
relative to work_dir).
{noformat}

  was:
Support JMX/control.sh for debug page info

JMX
{code}
public interface DiagnosticMXBean {
    @MXBeanDescription("Dump page history to custom path.")
    void dumpPageHistory(boolean dumpToFile, boolean dumpToLog, String filePath, long... pageIds);

    @MXBeanDescription("Dump page history.")
    void dumpPageHistory(boolean dumpToFile, boolean dumpToLog, long... pageIds);
}
{code}


> Support JMX/control.sh for debug page info
> --
>
> Key: IGNITE-11818
> URL: https://issues.apache.org/jira/browse/IGNITE-11818
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Support JMX/control.sh for debug page info
> JMX
> {code}
> public interface DiagnosticMXBean {
>     @MXBeanDescription("Dump page history to custom path.")
>     void dumpPageHistory(boolean dumpToFile, boolean dumpToLog, String filePath, long... pageIds);
>
>     @MXBeanDescription("Dump page history.")
>     void dumpPageHistory(boolean dumpToFile, boolean dumpToLog, long... pageIds);
> }
> {code}
> control.sh command:
> {noformat}
> control.sh --diagnostic page_history print_to_log print_to_file [page_ids 
> <list_of_page_ids>] [dump_path <path_to_dump_folder>] [--yes]
> --diagnostic - command for dumping some diagnostic info
> page_history - subcommand for dumping only page_history. Required.
> page_ids <list_of_page_ids> - list of page ids for dumping
> print_to_log, print_to_file - place for dumping (file, log, or both). At 
> least one of them is required.
> dump_path <path_to_dump_folder> - custom path to the dump folder (absolute 
> or relative to work_dir).
> {noformat}





[jira] [Commented] (IGNITE-11824) Integrate diagnostic PageLockTracker to DataStructure

2019-05-07 Thread Ignite TC Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16834741#comment-16834741
 ] 

Ignite TC Bot commented on IGNITE-11824:


{panel:title=-- Run :: All: Possible 
Blockers|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}Platform .NET (Core Linux){color} [[tests 0 Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=3776179]]

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=3769095&buildTypeId=IgniteTests24Java8_RunAll]

> Integrate diagnostic PageLockTracker to DataStructure 
> --
>
> Key: IGNITE-11824
> URL: https://issues.apache.org/jira/browse/IGNITE-11824
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Dmitriy Govorukhin
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After [IGNITE-11786] is completed, we will have a structure for tracking page 
> locks per thread. As the next step, we need to integrate it into the diagnostic 
> API and implement a component that creates this structure per thread.
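The per-thread wiring described above can be sketched as follows, under the assumption (invented here) that the tracker is handed out through a `ThreadLocal` while a shared registry keeps a reference to every thread's structure so a later dump can walk all of them:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerThreadTrackerDemo {
    // Stand-in for the real page-lock tracking structure from IGNITE-11786.
    static class PageLockTracker {
        final String owner = Thread.currentThread().getName();
    }

    // Shared registry: lets a diagnostic dump iterate every thread's tracker.
    static final Map<Thread, PageLockTracker> REGISTRY = new ConcurrentHashMap<>();

    // Each thread lazily gets its own tracker and registers it on first use.
    static final ThreadLocal<PageLockTracker> TRACKER = ThreadLocal.withInitial(() -> {
        PageLockTracker t = new PageLockTracker();
        REGISTRY.put(Thread.currentThread(), t);
        return t;
    });

    /** Touches the tracker from two threads and reports how many were created. */
    static int demo() throws InterruptedException {
        Thread worker = new Thread(() -> TRACKER.get(), "worker-1");
        worker.start();
        worker.join();

        TRACKER.get();  // main thread's tracker

        return REGISTRY.size();  // one tracker per thread that touched it
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());  // 2
    }
}
```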



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-10078) Node failure during concurrent partition updates may cause partition desync between primary and backup.

2019-05-07 Thread Anton Vinogradov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-10078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834707#comment-16834707
 ] 

Anton Vinogradov commented on IGNITE-10078:
---

[~ascherbakov]
Could you please join the "Idle verify" to "Online verify" discussion on the dev 
list?


> Node failure during concurrent partition updates may cause partition desync 
> between primary and backup.
> ---
>
> Key: IGNITE-10078
> URL: https://issues.apache.org/jira/browse/IGNITE-10078
> Project: Ignite
>  Issue Type: Bug
>Reporter: Alexei Scherbakov
>Assignee: Alexei Scherbakov
>Priority: Major
> Fix For: 2.8
>
>
> This is possible if some updates are not written to WAL before a node failure. 
> They will not be applied by rebalancing, because the partition counters are the 
> same in the following scenario:
> 1. Start grid with 3 nodes, 2 backups.
> 2. Preload some data to partition P.
> 3. Start two concurrent transactions, each writing a single key to the same 
> partition P; the keys are different:
> {noformat}
> try(Transaction tx = client.transactions().txStart(PESSIMISTIC, 
> REPEATABLE_READ, 0, 1)) {
>   client.cache(DEFAULT_CACHE_NAME).put(k, v);
>   tx.commit();
> }
> {noformat}
> 4. Order the updates on the backup so that the update with the max partition 
> counter is written to WAL, while the update with the lesser partition counter 
> fails, due to triggering of the failure handler (FH), before it is added to WAL.
> 5. Return the failed node to the grid and observe that no rebalancing happens 
> due to identical partition counters.
> Possible solution: detect gaps in update counters on recovery and, if gaps are 
> detected, force rebalancing from a node without gaps.
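The proposed gap detection can be sketched as a simple scan of the applied update counters (the counter values below are invented for illustration):

```java
import java.util.Arrays;

public class CounterGapDemo {
    /** Returns the first missing counter, or -1 if the sequence is contiguous. */
    static long firstGap(long[] counters) {
        long[] sorted = counters.clone();
        Arrays.sort(sorted);

        for (int i = 1; i < sorted.length; i++)
            if (sorted[i] != sorted[i - 1] + 1)
                return sorted[i - 1] + 1;  // a gap: this update never reached WAL

        return -1;
    }

    public static void main(String[] args) {
        // Update 12 was lost before reaching WAL, while 13 (the max counter) was written.
        System.out.println(firstGap(new long[] {10, 11, 13}));  // prints 12
    }
}
```

A node reporting a gap would then be treated as behind, regardless of its max counter, and rebalanced from a node without gaps.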



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11749) Implement automatic pages history dump on CorruptedTreeException

2019-05-07 Thread Anton Kalashnikov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Kalashnikov updated IGNITE-11749:
---
Description: 
Currently, the only way to debug possible bugs in checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. This 
is not practical for several reasons. First, it requires manual actions that 
depend on the content of the exception. Second, it is not always possible to 
obtain WAL files (they may contain sensitive data).

We need to add a mechanism that will dump all the information required for 
primary analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the history of changes of both pages and the 
checkpoint records over the analysis interval. Possibly, we should also include 
the FreeList pages to which the aforementioned pages belonged.

 

control.sh command:
{noformat}
control.sh --diagnostic page_history print_to_log print_to_file [page_ids 
] [dump_path ] [--yes]

 

--diagnostic - command for dumping diagnostic info

page_history - subcommand for dumping only the page history. Required.

page_ids {list_of_page_ids} - list of page ids to dump

print_to_log, print_to_file - destination for the dump (file, log, or both). At 
least one of them is required.

dump_path  - custom path to the folder (absolute or relative 
to work_dir).

{noformat}
Example of output:
{noformat}
[2019-05-07 11:57:57,350][INFO 
][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-2100569601], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=103, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=c6ba7793-113b-4b54-8530-45e1708ca44c, 
end=false, cpMark=FileWALPointer [idx=0, fileOff=29, len=29], super=WALRecord 
[size=1963, chainSize=0, pos=FileWALPointer [idx=0, fileOff=39686, len=1963], 
type=CHECKPOINT_RECORD]]
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-1368047378], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=55961, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=145e599e-66fc-45f5-bde4-b0c392125968, 
end=false, cpMark=null, super=WALRecord [size=21409, chainSize=0, 
pos=FileWALPointer [idx=0, fileOff=13101788, len=21409], 
type=CHECKPOINT_RECORD]]
{noformat}
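The "dump the history of both pages" step above amounts to filtering the WAL down to records that touch the pages of interest, keeping checkpoint records as interval markers. A self-contained sketch with a stand-in record type (Ignite's real WALRecord hierarchy is far richer):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PageHistoryDemo {
    // Stand-in for Ignite's WALRecord hierarchy, for illustration only.
    static class WalRecord {
        final String type;
        final long pageId;
        WalRecord(String type, long pageId) { this.type = type; this.pageId = pageId; }
    }

    /** Keeps records of the given pages plus all checkpoint records (interval markers). */
    static List<WalRecord> pageHistory(List<WalRecord> wal, long... pageIds) {
        List<WalRecord> res = new ArrayList<>();

        for (WalRecord r : wal) {
            if ("CHECKPOINT_RECORD".equals(r.type)
                || Arrays.stream(pageIds).anyMatch(id -> id == r.pageId))
                res.add(r);
        }

        return res;
    }

    public static void main(String[] args) {
        List<WalRecord> wal = Arrays.asList(
            new WalRecord("PAGE_RECORD", 0xabcdL),
            new WalRecord("PAGE_RECORD", 0x1111L),   // unrelated page, filtered out
            new WalRecord("CHECKPOINT_RECORD", -1L),
            new WalRecord("PAGE_RECORD", 0xdcbaL));

        System.out.println(pageHistory(wal, 0xabcdL, 0xdcbaL).size());  // prints 3
    }
}
```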

  was:
Currently, the only way to debug possible bugs in checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. This 
is not practical for several reasons. First, it requires manual actions that 
depend on the content of the exception. Second, it is not always possible to 
obtain WAL files (they may contain sensitive data).

We need to add a mechanism that will dump all the information required for 
primary analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the history of changes of both pages and the 
checkpoint records over the analysis interval. Possibly, we should also include 
the FreeList pages to which the aforementioned pages belonged.

 

control.sh command:
{noformat}
control.sh --diagnostic page_history [page_ids pageId1,pageId2] print_to_log 
print_to_file [--yes]

 

--diagnostic - command for dumping diagnostic info

page_history - subcommand for dumping only the page history. Required.

page_ids {list_of_page_ids} - list of page ids to dump

print_to_log, print_to_file - destination for the dump (file, log, or both). At 
least one of them is required.

{noformat}
Example of output:
{noformat}
[2019-05-07 11:57:57,350][INFO 
][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, 

[jira] [Updated] (IGNITE-11749) Implement automatic pages history dump on CorruptedTreeException

2019-05-07 Thread Anton Kalashnikov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Kalashnikov updated IGNITE-11749:
---
Description: 
Currently, the only way to debug possible bugs in checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. This 
is not practical for several reasons. First, it requires manual actions that 
depend on the content of the exception. Second, it is not always possible to 
obtain WAL files (they may contain sensitive data).

We need to add a mechanism that will dump all the information required for 
primary analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the history of changes of both pages and the 
checkpoint records over the analysis interval. Possibly, we should also include 
the FreeList pages to which the aforementioned pages belonged.

 

control.sh command:
{noformat}
control.sh --diagnostic page_history [page_ids pageId1,pageId2] print_to_log 
print_to_file [--yes]

 

--diagnostic - command for dumping diagnostic info

page_history - subcommand for dumping only the page history. Required.

page_ids {list_of_page_ids} - list of page ids to dump

print_to_log, print_to_file - destination for the dump (file, log, or both). At 
least one of them is required.

{noformat}
Example of output:
{noformat}
[2019-05-07 11:57:57,350][INFO 
][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-2100569601], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=103, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=c6ba7793-113b-4b54-8530-45e1708ca44c, 
end=false, cpMark=FileWALPointer [idx=0, fileOff=29, len=29], super=WALRecord 
[size=1963, chainSize=0, pos=FileWALPointer [idx=0, fileOff=39686, len=1963], 
type=CHECKPOINT_RECORD]]
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-1368047378], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=55961, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=145e599e-66fc-45f5-bde4-b0c392125968, 
end=false, cpMark=null, super=WALRecord [size=21409, chainSize=0, 
pos=FileWALPointer [idx=0, fileOff=13101788, len=21409], 
type=CHECKPOINT_RECORD]]
{noformat}

  was:
Currently, the only way to debug possible bugs in checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. This 
is not practical for several reasons. First, it requires manual actions that 
depend on the content of the exception. Second, it is not always possible to 
obtain WAL files (they may contain sensitive data).

We need to add a mechanism that will dump all the information required for 
primary analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the history of changes of both pages and the 
checkpoint records over the analysis interval. Possibly, we should also include 
the FreeList pages to which the aforementioned pages belonged.

 

control.sh command:

{noformat}

--diagnostic page_history page_ids 234324,3455 print_to_log print_to_file

 

--diagnostic - command for dumping diagnostic info

page_history - subcommand for dumping only the page history. Required.

page_ids {list_of_page_ids} - list of page ids to dump

print_to_log, print_to_file - destination for the dump (file, log, or both). At 
least one of them is required.

{noformat}

Example of output:
{noformat}
[2019-05-07 11:57:57,350][INFO 
][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-2100569601], 
page = [
Header [
type=11 (PageMetaIO),

[jira] [Updated] (IGNITE-11749) Implement automatic pages history dump on CorruptedTreeException

2019-05-07 Thread Anton Kalashnikov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Kalashnikov updated IGNITE-11749:
---
Description: 
Currently, the only way to debug possible bugs in checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. This 
is not practical for several reasons. First, it requires manual actions that 
depend on the content of the exception. Second, it is not always possible to 
obtain WAL files (they may contain sensitive data).

We need to add a mechanism that will dump all the information required for 
primary analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the history of changes of both pages and the 
checkpoint records over the analysis interval. Possibly, we should also include 
the FreeList pages to which the aforementioned pages belonged.

 

control.sh command:

{noformat}

--diagnostic page_history page_ids 234324,3455 print_to_log print_to_file

 

--diagnostic - command for dumping diagnostic info

page_history - subcommand for dumping only the page history. Required.

page_ids {list_of_page_ids} - list of page ids to dump

print_to_log, print_to_file - destination for the dump (file, log, or both). At 
least one of them is required.

{noformat}

Example of output:
{noformat}
[2019-05-07 11:57:57,350][INFO 
][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-2100569601], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=103, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=c6ba7793-113b-4b54-8530-45e1708ca44c, 
end=false, cpMark=FileWALPointer [idx=0, fileOff=29, len=29], super=WALRecord 
[size=1963, chainSize=0, pos=FileWALPointer [idx=0, fileOff=39686, len=1963], 
type=CHECKPOINT_RECORD]]
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-1368047378], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=55961, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=145e599e-66fc-45f5-bde4-b0c392125968, 
end=false, cpMark=null, super=WALRecord [size=21409, chainSize=0, 
pos=FileWALPointer [idx=0, fileOff=13101788, len=21409], 
type=CHECKPOINT_RECORD]]
{noformat}

  was:
Currently, the only way to debug possible bugs in checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. This 
is not practical for several reasons. First, it requires manual actions that 
depend on the content of the exception. Second, it is not always possible to 
obtain WAL files (they may contain sensitive data).

We need to add a mechanism that will dump all the information required for 
primary analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the history of changes of both pages and the 
checkpoint records over the analysis interval. Possibly, we should also include 
the FreeList pages to which the aforementioned pages belonged.

Example of output:
{noformat}
[2019-05-07 11:57:57,350][INFO 
][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-2100569601], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=103, 

[jira] [Updated] (IGNITE-11749) Implement automatic pages history dump on CorruptedTreeException

2019-05-07 Thread Anton Kalashnikov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Kalashnikov updated IGNITE-11749:
---
Description: 
Currently, the only way to debug possible bugs in checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. This 
is not practical for several reasons. First, it requires manual actions that 
depend on the content of the exception. Second, it is not always possible to 
obtain WAL files (they may contain sensitive data).

We need to add a mechanism that will dump all the information required for 
primary analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the history of changes of both pages and the 
checkpoint records over the analysis interval. Possibly, we should also include 
the FreeList pages to which the aforementioned pages belonged.

Example of output:
{noformat}
[2019-05-07 11:57:57,350][INFO 
][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-2100569601], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=103, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=c6ba7793-113b-4b54-8530-45e1708ca44c, 
end=false, cpMark=FileWALPointer [idx=0, fileOff=29, len=29], super=WALRecord 
[size=1963, chainSize=0, pos=FileWALPointer [idx=0, fileOff=39686, len=1963], 
type=CHECKPOINT_RECORD]]
Next WAL record :: PageSnapshot [fullPageId = FullPageId 
[pageId=0002, effectivePageId=, grpId=-1368047378], 
page = [
Header [
type=11 (PageMetaIO),
ver=1,
crc=0,
pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
],
PageMeta[
treeRoot=844420635164675,
lastSuccessfulFullSnapshotId=0,
lastSuccessfulSnapshotId=0,
nextSnapshotTag=1,
lastSuccessfulSnapshotTag=0,
lastAllocatedPageCount=0,
candidatePageCount=0
]],
super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
fileOff=55961, len=4129], type=PAGE_RECORD]]]
Next WAL record :: CheckpointRecord [cpId=145e599e-66fc-45f5-bde4-b0c392125968, 
end=false, cpMark=null, super=WALRecord [size=21409, chainSize=0, 
pos=FileWALPointer [idx=0, fileOff=13101788, len=21409], 
type=CHECKPOINT_RECORD]]
{noformat}

  was:
Currently, the only way to debug possible bugs in checkpointer/recovery 
mechanics is to manually parse WAL files after the corruption has happened. This 
is not practical for several reasons. First, it requires manual actions that 
depend on the content of the exception. Second, it is not always possible to 
obtain WAL files (they may contain sensitive data).

We need to add a mechanism that will dump all the information required for 
primary analysis of the corruption to the exception handler. For example, if an 
exception happened when materializing a link {{0xabcd}} written on an index 
page {{0xdcba}}, we need to dump the history of changes of both pages and the 
checkpoint records over the analysis interval. Possibly, we should also include 
the FreeList pages to which the aforementioned pages belonged.


> Implement automatic pages history dump on CorruptedTreeException
> 
>
> Key: IGNITE-11749
> URL: https://issues.apache.org/jira/browse/IGNITE-11749
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Priority: Major
>
> Currently, the only way to debug possible bugs in checkpointer/recovery 
> mechanics is to manually parse WAL files after the corruption has happened. This 
> is not practical for several reasons. First, it requires manual actions that 
> depend on the content of the exception. Second, it is not always possible to 
> obtain WAL files (they may contain sensitive data).
> We need to add a mechanism that will dump all the information required for 
> primary analysis of the corruption to the exception handler. For example, if 
> an exception happened when materializing a link {{0xabcd}} written on an 
> index page {{0xdcba}}, we need to dump the history of changes of both pages 
> and the checkpoint records over the analysis interval. Possibly, we should 
> also include the FreeList pages to which the aforementioned pages belonged.

[jira] [Commented] (IGNITE-11626) InitNewCoordinatorFuture should be reported in diagnostic output

2019-05-07 Thread Ivan Rakov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834515#comment-16834515
 ] 

Ivan Rakov commented on IGNITE-11626:
-

[~mstepachev], thanks, merged to master.

> InitNewCoordinatorFuture should be reported in diagnostic output
> 
>
> Key: IGNITE-11626
> URL: https://issues.apache.org/jira/browse/IGNITE-11626
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Goncharuk
>Assignee: Stepachev Maksim
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently {{InitNewCoordinatorFuture}} is not printed in PME diagnostic 
> output. This future also does not implement diagnostic aware interface and 
> remote information is not collected for this future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11835) Support JMX/control.sh API for page lock dump

2019-05-07 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-11835:

Description: 
Support JMX/control.sh API for page lock dump

JMX
{code}
public interface PageLockMXBean  {
void enableTracking();

void disableTracking();

boolean isTrackingEnabled();

String dumpLocks();

void dumpLocksToLog();

String dumpLocksToFile();

String dumpLocksToFile(String path);
}
{code}

HeapArrayLockStack and OffHeapLockStack output:
org.apache.ignite.internal.processors.cache.persistence.diagnostic.PageLockStackTest#testThreeReadPageLock_3
{code}
1. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
-> try read lock, pageId=1 [pageIdxHex=0001, partId=1, 
pageIdx=1, flags=]

2. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

3. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
-> try read lock, pageId=11 [pageIdxHex=000b, partId=11, 
pageIdx=11, flags=]
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

4. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
1 pageId=11 [pageIdxHex=000b, partId=11, pageIdx=11, 
flags=]
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

5. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
-> try read lock, pageId=111 [pageIdxHex=006f, partId=111, 
pageIdx=111, flags=]
1 pageId=11 [pageIdxHex=000b, partId=11, pageIdx=11, 
flags=]
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

6. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
2 pageId=111 [pageIdxHex=006f, partId=111, pageIdx=111, 
flags=]
1 pageId=11 [pageIdxHex=000b, partId=11, pageIdx=11, 
flags=]
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

7. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
2 pageId=111 [pageIdxHex=006f, partId=111, pageIdx=111, 
flags=]
1 -
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

8. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]
{code}
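The stack dumps above can be produced by a very small data structure: a fixed-size array where acquiring a read lock pushes the page id and releasing clears the slot, which is why step 7 shows a "-" hole in the middle. A minimal sketch (not the actual Ignite implementation; using 0 as the free-slot marker is an assumption made here):

```java
public class LockStackDemo {
    private final long[] stack = new long[128];  // capacity is an arbitrary choice here
    private int top;

    void onReadLock(long pageId) {
        if (top < stack.length)
            stack[top++] = pageId;
    }

    void onReadUnlock(long pageId) {
        // Clear the topmost slot holding this page; the stack may keep holes in the middle.
        for (int i = top - 1; i >= 0; i--) {
            if (stack[i] == pageId) {
                stack[i] = 0;  // freed slot, rendered as "-" in the dump

                while (top > 0 && stack[top - 1] == 0)
                    top--;  // shrink past freed slots at the tail

                return;
            }
        }
    }

    int depth() { return top; }

    public static void main(String[] args) {
        LockStackDemo s = new LockStackDemo();

        s.onReadLock(1); s.onReadLock(11); s.onReadLock(111);  // steps 2, 4, 6
        s.onReadUnlock(11);                                    // step 7: hole at slot 1

        System.out.println(s.depth());  // 3 (slot 1 is a hole, so top stays at 3)

        s.onReadUnlock(111);            // step 8: tail shrinks past the hole

        System.out.println(s.depth());  // 1
    }
}
```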

HeapArrayLockLog and OffHeapLockLog
org.apache.ignite.internal.processors.cache.persistence.diagnostic.PageLockLogTest#testThreeReadPageLock_3
{code}
1. Step
main
locked pages = []
-> Try read lock nextOpPageId=1, nextOpStructureId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]

2. Step
main
locked pages = [1(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]

3. Step
main
locked pages = [1(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
-> Try read lock nextOpPageId=11, nextOpStructureId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, flags=]

4. Step
main
locked pages = [1(r=1|w=0),11(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
L=2 -> Read lock nextOpPageId=11, nextOpCacheId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, flags=]

5. Step
main
locked pages = [1(r=1|w=0),11(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
L=2 -> Read lock nextOpPageId=11, nextOpCacheId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, flags=]
-> Try read lock nextOpPageId=111, nextOpStructureId=123 
[pageIdxHex=006f, partId=111, pageIdx=111, flags=]

6. Step
main
locked pages = [1(r=1|w=0),11(r=1|w=0),111(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
L=2 -> Read lock nextOpPageId=11, nextOpCacheId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, flags=]
L=3 -> Read lock nextOpPageId=111, nextOpCacheId=123 
[pageIdxHex=006f, partId=111, pageIdx=111, flags=]

7. Step
main
locked pages = [1(r=1|w=0),111(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
L=2 -> Read lock nextOpPageId=11, nextOpCacheId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, 

[jira] [Updated] (IGNITE-11835) Support JMX/control.sh API for page lock dump

2019-05-07 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-11835:

Description: 
Support JMX/control.sh API for page lock dump

JMX
{code}
public interface PageLockMXBean  {
void enableTracking();

void disableTracking();

boolean isTrackingEnabled();

String dumpLocks();

void dumpLocksToLog();

String dumpLocksToFile();

String dumpLocksToFile(String path);
}
{code}

HeapArrayLockStack and OffHeapLockStack output:

{code}
org.apache.ignite.internal.processors.cache.persistence.diagnostic.PageLockStackTest

1. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
-> try read lock, pageId=1 [pageIdxHex=0001, partId=1, 
pageIdx=1, flags=]

2. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

3. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
-> try read lock, pageId=11 [pageIdxHex=000b, partId=11, 
pageIdx=11, flags=]
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

4. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
1 pageId=11 [pageIdxHex=000b, partId=11, pageIdx=11, 
flags=]
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

5. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
-> try read lock, pageId=111 [pageIdxHex=006f, partId=111, 
pageIdx=111, flags=]
1 pageId=11 [pageIdxHex=000b, partId=11, pageIdx=11, 
flags=]
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

6. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
2 pageId=111 [pageIdxHex=006f, partId=111, pageIdx=111, 
flags=]
1 pageId=11 [pageIdxHex=000b, partId=11, pageIdx=11, 
flags=]
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

7. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
2 pageId=111 [pageIdxHex=006f, partId=111, pageIdx=111, 
flags=]
1 -
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]

8. Step
main (time=1557216932196, 2019-05-07 11:15:32.196) locked pages stack:
0 pageId=1 [pageIdxHex=0001, partId=1, pageIdx=1, 
flags=]
{code}

HeapArrayLockLog and OffHeapLockLog

{code}
org.apache.ignite.internal.processors.cache.persistence.diagnostic.PageLockLogTest

1. Step
main
locked pages = []
-> Try read lock nextOpPageId=1, nextOpStructureId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]

2. Step
main
locked pages = [1(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]

3. Step
main
locked pages = [1(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
-> Try read lock nextOpPageId=11, nextOpStructureId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, flags=]

4. Step
main
locked pages = [1(r=1|w=0),11(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
L=2 -> Read lock nextOpPageId=11, nextOpCacheId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, flags=]

5. Step
main
locked pages = [1(r=1|w=0),11(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
L=2 -> Read lock nextOpPageId=11, nextOpCacheId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, flags=]
-> Try read lock nextOpPageId=111, nextOpStructureId=123 
[pageIdxHex=006f, partId=111, pageIdx=111, flags=]

6. Step
main
locked pages = [1(r=1|w=0),11(r=1|w=0),111(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
L=2 -> Read lock nextOpPageId=11, nextOpCacheId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, flags=]
L=3 -> Read lock nextOpPageId=111, nextOpCacheId=123 
[pageIdxHex=006f, partId=111, pageIdx=111, flags=]

7. Step
main
locked pages = [1(r=1|w=0),111(r=1|w=0)]
L=1 -> Read lock nextOpPageId=1, nextOpCacheId=123 
[pageIdxHex=0001, partId=1, pageIdx=1, flags=]
L=2 -> Read lock nextOpPageId=11, nextOpCacheId=123 
[pageIdxHex=000b, partId=11, pageIdx=11, flags=]
L=3 -> Read lock 

[jira] [Updated] (IGNITE-11816) Diagnostic processor for dump page history info

2019-05-07 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-11816:

Description: Diagnostic processor for dump page history info  (was: Debug 
processor for dump page history info)

> Diagnostic processor for dump page history info
> ---
>
> Key: IGNITE-11816
> URL: https://issues.apache.org/jira/browse/IGNITE-11816
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Diagnostic processor for dump page history info



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11816) Diagnostic processor for dump page history info

2019-05-07 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-11816:

Summary: Diagnostic processor for dump page history info  (was: Debug 
processor for dump page history info)

> Diagnostic processor for dump page history info
> ---
>
> Key: IGNITE-11816
> URL: https://issues.apache.org/jira/browse/IGNITE-11816
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Debug processor for dump page history info



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-11816) Debug processor for dump page history info

2019-05-07 Thread Dmitriy Govorukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-11816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin updated IGNITE-11816:

Fix Version/s: 2.8

> Debug processor for dump page history info
> --
>
> Key: IGNITE-11816
> URL: https://issues.apache.org/jira/browse/IGNITE-11816
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Anton Kalashnikov
>Assignee: Anton Kalashnikov
>Priority: Major
> Fix For: 2.8
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Debug processor for dump page history info



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)