[jira] [Commented] (IGNITE-11393) Create IgniteLinkTaglet.toString() implementation for Java9+

2020-07-02 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150728#comment-17150728
 ] 

Ignite TC Bot commented on IGNITE-11393:


{panel:title=Branch: [pull/7983/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/7983/head] Base: [master] : New Tests 
(8)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}Service Grid{color} [tests 4]
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=c67a0639-b5f1-4c78-925e-8366aa0cc174, topVer=0, 
nodeId8=9102236b, msg=, type=NODE_JOINED, tstamp=1593674423276], 
val2=AffinityTopologyVersion [topVer=3424044072389115096, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=c67a0639-b5f1-4c78-925e-8366aa0cc174, topVer=0, 
nodeId8=9102236b, msg=, type=NODE_JOINED, tstamp=1593674423276], 
val2=AffinityTopologyVersion [topVer=3424044072389115096, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=bbd207e0371-c697d234-6fad-493d-afea-4beeb930a44d, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=b9bf8878-9fb3-4f05-8cea-4c7c2edfa88e, topVer=0, nodeId8=b9bf8878, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593674423276]], 
val2=AffinityTopologyVersion [topVer=-2163644835463648817, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=bbd207e0371-c697d234-6fad-493d-afea-4beeb930a44d, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=b9bf8878-9fb3-4f05-8cea-4c7c2edfa88e, topVer=0, nodeId8=b9bf8878, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593674423276]], 
val2=AffinityTopologyVersion [topVer=-2163644835463648817, minorTopVer=0]]] - 
PASSED{color}

{color:#8b}Service Grid (legacy mode){color} [tests 4]
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=c7ed05a2-5ee9-4202-a8ab-9521a3509083, topVer=0, 
nodeId8=3abde5a8, msg=, type=NODE_JOINED, tstamp=1593674229938], 
val2=AffinityTopologyVersion [topVer=3445520685455818384, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=c7ed05a2-5ee9-4202-a8ab-9521a3509083, topVer=0, 
nodeId8=3abde5a8, msg=, type=NODE_JOINED, tstamp=1593674229938], 
val2=AffinityTopologyVersion [topVer=3445520685455818384, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=6b8e26e0371-2f2c9fa4-9ae9-4e17-9a57-3e98f106e958, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=b11a8a03-377a-4758-aa62-b584b75b5284, topVer=0, nodeId8=b11a8a03, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593674229938]], 
val2=AffinityTopologyVersion [topVer=5336888790201139606, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=6b8e26e0371-2f2c9fa4-9ae9-4e17-9a57-3e98f106e958, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=b11a8a03-377a-4758-aa62-b584b75b5284, topVer=0, nodeId8=b11a8a03, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593674229938]], 
val2=AffinityTopologyVersion [topVer=5336888790201139606, minorTopVer=0]]] - 
PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=5433386&buildTypeId=IgniteTests24Java8_RunAll]

> Create IgniteLinkTaglet.toString() implementation for Java9+
> 
>
> Key: IGNITE-11393
> URL: https://issues.apache.org/jira/browse/IGNITE-11393
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Dmitry Pavlov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> A new implementation was added according to the new Java API for Javadoc.
> But the main method 

[jira] [Comment Edited] (IGNITE-12510) In-memory page eviction may fail in case very large entries are stored in the cache

2020-07-02 Thread Dmitriy Shirchenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150642#comment-17150642
 ] 

Dmitriy Shirchenko edited comment on IGNITE-12510 at 7/3/20, 12:11 AM:
---

Can someone comment on how large an entry has to be to cause eviction to fail? We 
see this error on binary blobs of 400 MB, so we wonder if we are hitting this issue.

 

Edit: I can answer my own question: the threshold is roughly 5000 * page_size. We 
can work around this by increasing our page size if we want to support blobs that 
big, but I'm still hoping for a better solution. 
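To make that back-of-envelope estimate concrete, here is a small sketch. The 4 KiB page size is an assumption (the page size is configurable); the 5000 limit is the SAMPLE_SPIN_LIMIT constant quoted in the issue description.

```java
/** Rough arithmetic sketch; the 4 KiB page size is an assumption, not a fixed value. */
public class PageMath {
    static final long PAGE_SIZE = 4 * 1024;      // assumed default page size, in bytes
    static final long SAMPLE_SPIN_LIMIT = 5000;  // attempt limit quoted in the issue

    /** Largest entry before head pages become scarce: limit * page size (~20 MiB). */
    static long maxComfortableEntry() {
        return SAMPLE_SPIN_LIMIT * PAGE_SIZE;
    }

    /** Number of data pages needed to store an entry of the given size. */
    static long pagesFor(long entryBytes) {
        return (entryBytes + PAGE_SIZE - 1) / PAGE_SIZE;
    }
}
```

Under these assumptions a 400 MB blob spans roughly 102,400 pages, far above the 5000-probe limit, which matches the error being observed.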


was (Author: shirchen):
Can someone comment on how large of entries will cause eviction to fail? We see 
this error on binary blobs of 400MB so wonder if we are hitting this issue.

> In-memory page eviction may fail in case very large entries are stored in the 
> cache
> ---
>
> Key: IGNITE-12510
> URL: https://issues.apache.org/jira/browse/IGNITE-12510
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7.6
>Reporter: Ivan Rakov
>Priority: Major
>  Labels: newbie
>
> In-memory page eviction (both DataPageEvictionMode#RANDOM_LRU and 
> DataPageEvictionMode#RANDOM_2_LRU) has a limited number of attempts to choose a 
> candidate page for data removal:
> {code:java}
> if (sampleSpinCnt > SAMPLE_SPIN_LIMIT) { // 5000
> LT.warn(log, "Too many attempts to choose data page: " + 
> SAMPLE_SPIN_LIMIT);
> return;
> }
> {code}
> Large data entries are stored in several data pages that are sequentially 
> linked to each other. Only "head" pages are suitable candidates for 
> eviction, because the whole entry is reachable only from its "head" page (the 
> list of pages is singly linked; there are no reverse links from tail to head).
> The problem is that if we put very large entries into an evictable cache (e.g. 
> each entry needs more than 5000 pages to be stored), there are too few head 
> pages, and the "Too many attempts to choose data page" error is likely to show up.
> We should perform something like a full scan if we fail to find a head page 
> within SAMPLE_SPIN_LIMIT attempts, instead of just failing the node with an error.
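The fallback proposed in the last paragraph of the description could be sketched as follows. All names here are illustrative stand-ins, not actual Ignite internals; `isHeadPage` in particular is a stub for the real page-metadata check.

```java
import java.util.Random;

/** Illustrative sketch of the proposed fallback; not real Ignite eviction code. */
public class EvictionSketch {
    static final int SAMPLE_SPIN_LIMIT = 5000;   // attempt limit quoted in the issue

    /** Stand-in for the real head-page check (a real impl reads page metadata). */
    static boolean isHeadPage(long pageId) {
        return pageId % 1000 == 0;
    }

    /** Random sampling as today; on failure, fall back to a full scan instead of giving up. */
    static long findEvictionCandidate(long[] pages) {
        Random rnd = new Random();
        for (int i = 0; i < SAMPLE_SPIN_LIMIT; i++) {
            long pageId = pages[rnd.nextInt(pages.length)];
            if (isHeadPage(pageId))
                return pageId;
        }
        for (long pageId : pages)   // proposed fallback: linear scan for a head page
            if (isHeadPage(pageId))
                return pageId;
        return -1;                  // no head page exists at all
    }
}
```

The full scan makes eviction slower in the degenerate case but avoids failing the node when head pages are merely rare rather than absent.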



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-12510) In-memory page eviction may fail in case very large entries are stored in the cache

2020-07-02 Thread Dmitriy Shirchenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150642#comment-17150642
 ] 

Dmitriy Shirchenko commented on IGNITE-12510:
-

Can someone comment on how large an entry has to be to cause eviction to fail? We 
see this error on binary blobs of 400 MB, so we wonder if we are hitting this issue.

> In-memory page eviction may fail in case very large entries are stored in the 
> cache
> ---
>
> Key: IGNITE-12510
> URL: https://issues.apache.org/jira/browse/IGNITE-12510
> Project: Ignite
>  Issue Type: Bug
>Affects Versions: 2.7.6
>Reporter: Ivan Rakov
>Priority: Major
>  Labels: newbie
>
> In-memory page eviction (both DataPageEvictionMode#RANDOM_LRU and 
> DataPageEvictionMode#RANDOM_2_LRU) has a limited number of attempts to choose a 
> candidate page for data removal:
> {code:java}
> if (sampleSpinCnt > SAMPLE_SPIN_LIMIT) { // 5000
> LT.warn(log, "Too many attempts to choose data page: " + 
> SAMPLE_SPIN_LIMIT);
> return;
> }
> {code}
> Large data entries are stored in several data pages that are sequentially 
> linked to each other. Only "head" pages are suitable candidates for 
> eviction, because the whole entry is reachable only from its "head" page (the 
> list of pages is singly linked; there are no reverse links from tail to head).
> The problem is that if we put very large entries into an evictable cache (e.g. 
> each entry needs more than 5000 pages to be stored), there are too few head 
> pages, and the "Too many attempts to choose data page" error is likely to show up.
> We should perform something like a full scan if we fail to find a head page 
> within SAMPLE_SPIN_LIMIT attempts, instead of just failing the node with an error.





[jira] [Assigned] (IGNITE-13209) JavaIgniteCatalogExample doesn't work with a standalone Spark cluster

2020-07-02 Thread Valentin Kulichenko (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentin Kulichenko reassigned IGNITE-13209:


Assignee: Valentin Kulichenko

> JavaIgniteCatalogExample doesn't work with a standalone Spark cluster
> -
>
> Key: IGNITE-13209
> URL: https://issues.apache.org/jira/browse/IGNITE-13209
> Project: Ignite
>  Issue Type: Bug
>  Components: spark
>Affects Versions: 2.8.1
>Reporter: Valentin Kulichenko
>Assignee: Valentin Kulichenko
>Priority: Major
>
> To reproduce the issue:
>  # Start Spark master and slave as described here: 
> [http://spark.apache.org/docs/latest/spark-standalone.html]
>  # Change the master URL in the {{JavaIgniteCatalogExample}} from "local" to 
> the one just started.
>  # Run the example.
> Updated code that creates the {{IgniteSparkSession}}:
> {code:java}
> String libs = 
> "/Users/vkulichenko/GridGain/releases/apache-ignite-2.8.1-bin/libs";
> 
> IgniteSparkSession igniteSession = IgniteSparkSession.builder()
> .appName("Spark Ignite catalog example")
> .master("spark://Valentin-Kulichenko-MacBook-Pro-1772.local:7077")
> .config("spark.executor.instances", "2")
> .config("spark.executor.extraClassPath", libs + "/*" + ":" + libs + 
> "/ignite-spark/*:" + libs + "/ignite-spring/*")
> .igniteConfig(CONFIG)
> .getOrCreate();
> {code}
> Execution fails with this exception:
> {noformat}
> [2020-07-02 15:50:27,627][ERROR][task-result-getter-3][TaskSetManager] Task 0 
> in stage 0.0 failed 4 times; aborting job
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: 
> Lost task 0.3 in stage 0.0 (TID 3, 10.0.0.11, executor 0): class 
> org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided 
> name doesn't exist. Did you call Ignition.start(..) to start an Ignite 
> instance? [name=testing]
>   at org.apache.ignite.internal.IgnitionEx.grid(IgnitionEx.java:1351)
>   at org.apache.ignite.Ignition.ignite(Ignition.java:528)
>   at org.apache.ignite.spark.impl.package$.ignite(package.scala:65)
>   at 
> org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:238)
>   at 
> org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:235)
>   at org.apache.ignite.spark.Once.apply(IgniteContext.scala:222)
>   at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:144)
>   at 
> org.apache.ignite.spark.impl.IgniteSQLDataFrameRDD.compute(IgniteSQLDataFrameRDD.scala:65)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:109)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}





[jira] [Commented] (IGNITE-13209) JavaIgniteCatalogExample doesn't work with a standalone Spark cluster

2020-07-02 Thread Valentin Kulichenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150638#comment-17150638
 ] 

Valentin Kulichenko commented on IGNITE-13209:
--

The issue was reported multiple times in the past, but apparently was never 
reproduced:
 * https://issues.apache.org/jira/browse/IGNITE-12397
 * https://issues.apache.org/jira/browse/IGNITE-12637
 * 
http://apache-ignite-users.70518.x6.nabble.com/IgniteSparkSession-exception-Ignite-instance-with-provided-name-doesn-t-exist-Did-you-call-Ignition–td24263.html
 * 
https://stackoverflow.com/questions/60083307/newly-created-spark-executor-running-in-kubernetes-doesnt-know-ignite-configura

> JavaIgniteCatalogExample doesn't work with a standalone Spark cluster
> -
>
> Key: IGNITE-13209
> URL: https://issues.apache.org/jira/browse/IGNITE-13209
> Project: Ignite
>  Issue Type: Bug
>  Components: spark
>Affects Versions: 2.8.1
>Reporter: Valentin Kulichenko
>Priority: Major
>
> To reproduce the issue:
>  # Start Spark master and slave as described here: 
> [http://spark.apache.org/docs/latest/spark-standalone.html]
>  # Change the master URL in the {{JavaIgniteCatalogExample}} from "local" to 
> the one just started.
>  # Run the example.
> Updated code that creates the {{IgniteSparkSession}}:
> {code:java}
> String libs = 
> "/Users/vkulichenko/GridGain/releases/apache-ignite-2.8.1-bin/libs";
> 
> IgniteSparkSession igniteSession = IgniteSparkSession.builder()
> .appName("Spark Ignite catalog example")
> .master("spark://Valentin-Kulichenko-MacBook-Pro-1772.local:7077")
> .config("spark.executor.instances", "2")
> .config("spark.executor.extraClassPath", libs + "/*" + ":" + libs + 
> "/ignite-spark/*:" + libs + "/ignite-spring/*")
> .igniteConfig(CONFIG)
> .getOrCreate();
> {code}
> Execution fails with this exception:
> {noformat}
> [2020-07-02 15:50:27,627][ERROR][task-result-getter-3][TaskSetManager] Task 0 
> in stage 0.0 failed 4 times; aborting job
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: 
> Lost task 0.3 in stage 0.0 (TID 3, 10.0.0.11, executor 0): class 
> org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided 
> name doesn't exist. Did you call Ignition.start(..) to start an Ignite 
> instance? [name=testing]
>   at org.apache.ignite.internal.IgnitionEx.grid(IgnitionEx.java:1351)
>   at org.apache.ignite.Ignition.ignite(Ignition.java:528)
>   at org.apache.ignite.spark.impl.package$.ignite(package.scala:65)
>   at 
> org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:238)
>   at 
> org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:235)
>   at org.apache.ignite.spark.Once.apply(IgniteContext.scala:222)
>   at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:144)
>   at 
> org.apache.ignite.spark.impl.IgniteSQLDataFrameRDD.compute(IgniteSQLDataFrameRDD.scala:65)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:109)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}





[jira] [Created] (IGNITE-13209) JavaIgniteCatalogExample doesn't work with a standalone Spark cluster

2020-07-02 Thread Valentin Kulichenko (Jira)
Valentin Kulichenko created IGNITE-13209:


 Summary: JavaIgniteCatalogExample doesn't work with a standalone 
Spark cluster
 Key: IGNITE-13209
 URL: https://issues.apache.org/jira/browse/IGNITE-13209
 Project: Ignite
  Issue Type: Bug
  Components: spark
Affects Versions: 2.8.1
Reporter: Valentin Kulichenko


To reproduce the issue:
 # Start Spark master and slave as described here: 
[http://spark.apache.org/docs/latest/spark-standalone.html]
 # Change the master URL in the {{JavaIgniteCatalogExample}} from "local" to 
the one just started.
 # Run the example.

Updated code that creates the {{IgniteSparkSession}}:
{code:java}
String libs = 
"/Users/vkulichenko/GridGain/releases/apache-ignite-2.8.1-bin/libs";

IgniteSparkSession igniteSession = IgniteSparkSession.builder()
.appName("Spark Ignite catalog example")
.master("spark://Valentin-Kulichenko-MacBook-Pro-1772.local:7077")
.config("spark.executor.instances", "2")
.config("spark.executor.extraClassPath", libs + "/*" + ":" + libs + 
"/ignite-spark/*:" + libs + "/ignite-spring/*")
.igniteConfig(CONFIG)
.getOrCreate();
{code}
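For clarity, the extraClassPath value assembled in the snippet above is a plain colon-separated list of wildcard entries; the helper below rebuilds the same string (the libs path used in the check is hypothetical).

```java
public class ClasspathDemo {
    /** Builds the same executor classpath string as the example above. */
    static String executorClasspath(String libs) {
        return libs + "/*" + ":" + libs + "/ignite-spark/*:" + libs + "/ignite-spring/*";
    }
}
```

For `libs = "/opt/ignite/libs"` this yields `/opt/ignite/libs/*:/opt/ignite/libs/ignite-spark/*:/opt/ignite/libs/ignite-spring/*`, i.e. the three wildcard entries the example passes to every executor.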
Execution fails with this exception:
{noformat}
[2020-07-02 15:50:27,627][ERROR][task-result-getter-3][TaskSetManager] Task 0 
in stage 0.0 failed 4 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to 
stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost 
task 0.3 in stage 0.0 (TID 3, 10.0.0.11, executor 0): class 
org.apache.ignite.IgniteIllegalStateException: Ignite instance with provided 
name doesn't exist. Did you call Ignition.start(..) to start an Ignite 
instance? [name=testing]
at org.apache.ignite.internal.IgnitionEx.grid(IgnitionEx.java:1351)
at org.apache.ignite.Ignition.ignite(Ignition.java:528)
at org.apache.ignite.spark.impl.package$.ignite(package.scala:65)
at 
org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:238)
at 
org.apache.ignite.spark.impl.IgniteRelationProvider$$anonfun$configProvider$1$2.apply(IgniteRelationProvider.scala:235)
at org.apache.ignite.spark.Once.apply(IgniteContext.scala:222)
at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:144)
at 
org.apache.ignite.spark.impl.IgniteSQLDataFrameRDD.compute(IgniteSQLDataFrameRDD.scala:65)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}





[jira] [Created] (IGNITE-13208) Refactoring of IgniteSpiOperationTimeoutHelper

2020-07-02 Thread Vladimir Steshin (Jira)
Vladimir Steshin created IGNITE-13208:
-

 Summary: Refactoring of IgniteSpiOperationTimeoutHelper
 Key: IGNITE-13208
 URL: https://issues.apache.org/jira/browse/IGNITE-13208
 Project: Ignite
  Issue Type: Task
Reporter: Vladimir Steshin
Assignee: Vladimir Steshin


IgniteSpiOperationTimeoutHelper has many timeout fields. It looks like it could be 
simplified.





[jira] [Commented] (IGNITE-10251) Get rid of the code left from times when lateAffinity=false was supported

2020-07-02 Thread Alexey Goncharuk (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150446#comment-17150446
 ] 

Alexey Goncharuk commented on IGNITE-10251:
---

[~DmitriyGovorukhin] do you mind if I take over this task?

> Get rid of the code left from times when lateAffinity=false was supported
> -
>
> Key: IGNITE-10251
> URL: https://issues.apache.org/jira/browse/IGNITE-10251
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Alexey Scherbakov
>Assignee: Dmitriy Govorukhin
>Priority: Major
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This code can hide errors and lead to inefficient processing in some 
> scenarios.
> Some examples:
>  * *Forced key preloading*
> {code:java}
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtForceKeysFuture
>  and related stuff{code}
> which is called if a key is mapped to a moving partition
>  * *Unnecessary dht lock processing*
> {code:java}
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLockFuture#map(java.lang.Iterable)
> {code}
> needVal is always false if lateAff=true.
> Also 
> {{org.apache.ignite.configuration.IgniteConfiguration#setLateAffinityAssignment}}
>  must be removed in 3.0.





[jira] [Updated] (IGNITE-13206) Represent in the documentation the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13206:
--
Description: 
Current TcpDiscoverySpi can take longer to detect the failure of a node that has 
several IP addresses. This happens because most of the timeouts, such as 
failureDetectionTimeout, sockTimeout, and ackTimeout, work per address. The actual 
failure detection delay is failureDetectionTimeout * addressesNumber (1), and the 
node addresses are tried one by one. This effect on failure detection should be 
noted in the documentation.

*1: addressesNumber - the number of addresses of the next node in the ring.

The suggestion is to describe this behavior in 
https://apacheignite.readme.io/docs/tcpip-discovery. The text might be:

"You should assign multiple addresses to a node only if they represent real 
physical connections that can add reliability. Providing several addresses can 
prolong failure detection of the current node. The timeouts and 
settings on network operations (_failureDetectionTimeout, sockTimeout, 
ackTimeout, maxAckTimeout, reconCnt_) work per connection/address; the 
exception is _connRecoveryTimeout_. Node addresses are tried one by one.
 Example: if you use _failureDetectionTimeout_ and have set 3 IP addresses 
for this node, the previous node in the ring can take up to 
'failureDetectionTimeout * 3' to detect the failure of the current node."



  was:
Current TcpDiscoverySpi can prolong detection of node failure which has several 
IP addresses. This happens because most of the timeouts like 
failureDetectionTimeout, sockTimeout, ackTimeout work per address. Actual 
failure detection delay is: failureDetectionTimeout*addressesNumber (1). And 
the node addresses are sorted out consistently. This affection on failure 
detection should be noted in the documentation.

*1: addressesNumber - addresses number of next node in the ring.

The suggestion is to represent this behavior in 
https://apacheignite.readme.io/docs/tcpip-discovery. The text might be:

You should assing multiple addresses to a node only if they represent some real 
physical connections which can give more reliability. Providing several 
addresses can prolong failure detection of current node. The timeouts and 
settings on network operations (_failureDetectionTimeout(), sockTimeout, 
ackTimeout, maxAckTimeout, reconCnt_) work per connection/address. The 
exception is _connRecoveryTimeout_. And node addresses are sorted out 
consistently.
 Example: if you use _failureDetectionTimeout _and have set 3 ip addresses 
for this node, previous node iт  the ring can take up to 
'failureDetectionTimeout * 3' to detect failure of current node.




> Represent in the documentation the effect of several node addresses on failure 
> detection.
> ---
>
> Key: IGNITE-13206
> URL: https://issues.apache.org/jira/browse/IGNITE-13206
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>  Labels: iep-45
>
> Current TcpDiscoverySpi can take longer to detect the failure of a node that has 
> several IP addresses. This happens because most of the timeouts, such as 
> failureDetectionTimeout, sockTimeout, and ackTimeout, work per address. The actual 
> failure detection delay is failureDetectionTimeout * addressesNumber (1), and 
> the node addresses are tried one by one. This effect on failure 
> detection should be noted in the documentation.
> *1: addressesNumber - the number of addresses of the next node in the ring.
> The suggestion is to describe this behavior in 
> https://apacheignite.readme.io/docs/tcpip-discovery. The text might be:
> "You should assign multiple addresses to a node only if they represent real 
> physical connections that can add reliability. Providing several 
> addresses can prolong failure detection of the current node. The timeouts and 
> settings on network operations (_failureDetectionTimeout, sockTimeout, 
> ackTimeout, maxAckTimeout, reconCnt_) work per connection/address; the 
> exception is _connRecoveryTimeout_. Node addresses are tried one by one.
>  Example: if you use _failureDetectionTimeout_ and have set 3 IP 
> addresses for this node, the previous node in the ring can take up to 
> 'failureDetectionTimeout * 3' to detect the failure of the current node."
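The per-address timeout behavior described above can be illustrated with a minimal configuration sketch. This uses the standard public Ignite API (IgniteConfiguration, TcpDiscoverySpi); the 10-second value is illustrative, not a recommendation.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;

// Sketch: with failureDetectionTimeout = 10 s and a next node exposing 3 addresses,
// worst-case detection of that node's failure is roughly 10 s * 3 = 30 s,
// since each address is tried one by one before the node is declared failed.
public class DiscoveryTimeoutExample {
    static IgniteConfiguration configure() {
        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setFailureDetectionTimeout(10_000);      // applied per address
        cfg.setDiscoverySpi(new TcpDiscoverySpi());  // ring-based discovery
        return cfg;
    }
}
```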





[jira] [Updated] (IGNITE-13206) Represent in the documentation the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13206:
--
Description: 
Current TcpDiscoverySpi can take longer to detect the failure of a node that has 
several IP addresses. This happens because most of the timeouts, such as 
failureDetectionTimeout, sockTimeout, and ackTimeout, work per address. The actual 
failure detection delay is failureDetectionTimeout * addressesNumber (1), and the 
node addresses are tried one by one. This effect on failure detection should be 
noted in the documentation.

*1: addressesNumber - the number of addresses of the next node in the ring.

The suggestion is to describe this behavior in 
https://apacheignite.readme.io/docs/tcpip-discovery. The text might be:

You should assign multiple addresses to a node only if they represent real 
physical connections that can add reliability. Providing several addresses can 
prolong failure detection of the current node. The timeouts and 
settings on network operations (_failureDetectionTimeout, sockTimeout, 
ackTimeout, maxAckTimeout, reconCnt_) work per connection/address; the 
exception is _connRecoveryTimeout_. Node addresses are tried one by one.
 Example: if you use _failureDetectionTimeout_ and have set 3 IP addresses 
for this node, the previous node in the ring can take up to 
'failureDetectionTimeout * 3' to detect the failure of the current node.



  was:
Current TcpDiscoverySpi can prolong detection of node failure which has several 
IP addresses. This happens because most of the timeouts like 
failureDetectionTimeout, sockTimeout, ackTimeout work per address. Actual 
failure detection delay is: failureDetectionTimeout*addressesNumber (1). And 
the node addresses are sorted out consistently. This affection on failure 
detection should be noted in the documentation.

*1: addressesNumber - addresses number of next node in the ring.


> Represent in the documentation the effect of several node addresses on failure 
> detection.
> ---
>
> Key: IGNITE-13206
> URL: https://issues.apache.org/jira/browse/IGNITE-13206
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>  Labels: iep-45
>
> Current TcpDiscoverySpi can take longer to detect the failure of a node that has 
> several IP addresses. This happens because most of the timeouts, such as 
> failureDetectionTimeout, sockTimeout, and ackTimeout, work per address. The actual 
> failure detection delay is failureDetectionTimeout * addressesNumber (1), and 
> the node addresses are tried one by one. This effect on failure 
> detection should be noted in the documentation.
> *1: addressesNumber - the number of addresses of the next node in the ring.
> The suggestion is to describe this behavior in 
> https://apacheignite.readme.io/docs/tcpip-discovery. The text might be:
> You should assign multiple addresses to a node only if they represent real 
> physical connections that can add reliability. Providing several 
> addresses can prolong failure detection of the current node. The timeouts and 
> settings on network operations (_failureDetectionTimeout, sockTimeout, 
> ackTimeout, maxAckTimeout, reconCnt_) work per connection/address; the 
> exception is _connRecoveryTimeout_. Node addresses are tried one by one.
>  Example: if you use _failureDetectionTimeout_ and have set 3 IP 
> addresses for this node, the previous node in the ring can take up to 
> 'failureDetectionTimeout * 3' to detect the failure of the current node.





[jira] [Created] (IGNITE-13207) Checkpointer code refactoring: Splitting GridCacheDatabaseSharedManager and Checkpointer

2020-07-02 Thread Anton Kalashnikov (Jira)
Anton Kalashnikov created IGNITE-13207:
--

 Summary: Checkpointer code refactoring: Splitting 
GridCacheDatabaseSharedManager and Checkpointer
 Key: IGNITE-13207
 URL: https://issues.apache.org/jira/browse/IGNITE-13207
 Project: Ignite
  Issue Type: Sub-task
Reporter: Anton Kalashnikov
Assignee: Anton Kalashnikov






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13123) Move control.sh to a separate module

2020-07-02 Thread Kirill Tkalenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150313#comment-17150313
 ] 

Kirill Tkalenko commented on IGNITE-13123:
--

[~agoncharuk] Please take another look.

> Move control.sh to a separate module
> 
>
> Key: IGNITE-13123
> URL: https://issues.apache.org/jira/browse/IGNITE-13123
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Move [1] and its associated classes to a separate "ignite-control-utility" 
> module.
> [1] - org.apache.ignite.internal.commandline.CommandHandler



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13123) Move control.sh to a separate module

2020-07-02 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150311#comment-17150311
 ] 

Ignite TC Bot commented on IGNITE-13123:


{panel:title=Branch: [pull/7910/head] Base: [master] : Possible Blockers 
(1)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1}
{color:#d04437}SPI{color} [[tests 0 TIMEOUT , Exit Code 
|https://ci.ignite.apache.org/viewLog.html?buildId=5433820]]

{panel}
{panel:title=Branch: [pull/7910/head] Base: [master] : New Tests 
(8)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}Service Grid{color} [tests 4]
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=a5558439-47f5-407d-b5c5-190dbc84c745, topVer=0, 
nodeId8=ffe82789, msg=, type=NODE_JOINED, tstamp=1593687948747], 
val2=AffinityTopologyVersion [topVer=-609332755295076476, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=a5558439-47f5-407d-b5c5-190dbc84c745, topVer=0, 
nodeId8=ffe82789, msg=, type=NODE_JOINED, tstamp=1593687948747], 
val2=AffinityTopologyVersion [topVer=-609332755295076476, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=fcd343f0371-da5aba94-eb00-46c7-98e2-a12cf230c16c, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=bc134439-3a2a-4be8-968a-dedff0a58df2, topVer=0, nodeId8=bc134439, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593687948747]], 
val2=AffinityTopologyVersion [topVer=8798508403763738627, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=fcd343f0371-da5aba94-eb00-46c7-98e2-a12cf230c16c, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=bc134439-3a2a-4be8-968a-dedff0a58df2, topVer=0, nodeId8=bc134439, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593687948747]], 
val2=AffinityTopologyVersion [topVer=8798508403763738627, minorTopVer=0]]] - 
PASSED{color}

{color:#8b}Service Grid (legacy mode){color} [tests 4]
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=cc188e3e-966e-486f-a964-09f44481211a, topVer=0, 
nodeId8=ee08a3b7, msg=, type=NODE_JOINED, tstamp=1593687826247], 
val2=AffinityTopologyVersion [topVer=5907908285539696028, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=cc188e3e-966e-486f-a964-09f44481211a, topVer=0, 
nodeId8=ee08a3b7, msg=, type=NODE_JOINED, tstamp=1593687826247], 
val2=AffinityTopologyVersion [topVer=5907908285539696028, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=b4f523f0371-4b0890ac-8083-4789-b9e4-eba0b5e650e8, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=402ad641-2c18-416d-98fc-dbfd23358c06, topVer=0, nodeId8=402ad641, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593687826247]], 
val2=AffinityTopologyVersion [topVer=-7368274692200575106, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=b4f523f0371-4b0890ac-8083-4789-b9e4-eba0b5e650e8, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=402ad641-2c18-416d-98fc-dbfd23358c06, topVer=0, nodeId8=402ad641, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593687826247]], 
val2=AffinityTopologyVersion [topVer=-7368274692200575106, minorTopVer=0]]] - 
PASSED{color}

{panel}
[TeamCity *-- Run :: All* 
Results|https://ci.ignite.apache.org/viewLog.html?buildId=5433905&buildTypeId=IgniteTests24Java8_RunAll]

> Move control.sh to a separate module
> 
>
> Key: IGNITE-13123
> URL: https://issues.apache.org/jira/browse/IGNITE-13123
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Move [1] and its 

[jira] [Updated] (IGNITE-13151) Checkpointer code refactoring: extracting classes from GridCacheDatabaseSharedManager

2020-07-02 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13151:
---
Labels: IEP-47  (was: )

> Checkpointer code refactoring: extracting classes from 
> GridCacheDatabaseSharedManager
> -
>
> Key: IGNITE-13151
> URL: https://issues.apache.org/jira/browse/IGNITE-13151
> Project: Ignite
>  Issue Type: Sub-task
>  Components: persistence
>Reporter: Sergey Chugunov
>Assignee: Anton Kalashnikov
>Priority: Major
>  Labels: IEP-47
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The checkpointer is at the center of the Ignite persistence subsystem; the more 
> people from the community understand it, the more stable and efficient it 
> becomes.
> However, for now the checkpointer code sits inside the 
> GridCacheDatabaseSharedManager class and is entangled with this higher-level 
> and more general component.
> To take a step toward a more modular checkpointer, we need to do two things:
>  # Move the checkpointer code out of the database manager into a separate class. 
> (That's what this ticket is about.)
>  # Create a well-defined checkpointer API that will allow us to create new 
> implementations of the checkpointer in the future. One example is the new 
> checkpointer implementation needed for the defragmentation feature. 
> (Should be done in a separate ticket.)
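As a rough illustration of what a well-defined checkpointer API could look like once extracted, here is a hedged sketch. All names (`Checkpointer`, `scheduleCheckpoint`, `CountingCheckpointer`) are illustrative assumptions, not the ticket's actual design:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: a minimal checkpointer contract that could be
// extracted from GridCacheDatabaseSharedManager. Names are hypothetical.
interface Checkpointer {
    /** Schedules a checkpoint; the future completes when the checkpoint finishes. */
    CompletableFuture<Void> scheduleCheckpoint(String reason);
}

/** Trivial implementation used only to show the shape of the contract. */
class CountingCheckpointer implements Checkpointer {
    final AtomicInteger checkpoints = new AtomicInteger();

    @Override public CompletableFuture<Void> scheduleCheckpoint(String reason) {
        checkpoints.incrementAndGet(); // a real implementation would flush dirty pages here
        return CompletableFuture.completedFuture(null);
    }
}
```

A separate implementation (for example, one tailored to defragmentation) could then be swapped in behind the same interface.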



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13190) Core defragmentation functions

2020-07-02 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13190:
---
Labels: IEP-47  (was: )

> Core defragmentation functions
> --
>
> Key: IGNITE-13190
> URL: https://issues.apache.org/jira/browse/IGNITE-13190
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Sergey Chugunov
>Priority: Major
>  Labels: IEP-47
>
> The following set of functions covering the defragmentation happy case is needed:
>  * Initialization of the defragmentation manager when a node is started in 
> maintenance mode.
>  * Information about partition files is gathered by the defragmentation manager.
>  * For each partition file, a corresponding file for the defragmented partition is 
> created and initialized.
>  * Keys are transferred from the old partitions to the new partitions.
>  * The checkpointer is aware of the new partition files and flushes defragmented 
> memory to the new partition files.
>  
> Neither fault-tolerance code nor index defragmentation mappings are needed in this 
> task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13189) Maintenance mode switch and defragmentation process initialization

2020-07-02 Thread Ivan Bessonov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-13189:
---
Labels: IEP-47  (was: )

> Maintenance mode switch and defragmentation process initialization
> --
>
> Key: IGNITE-13189
> URL: https://issues.apache.org/jira/browse/IGNITE-13189
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Sergey Chugunov
>Assignee: Sergey Chugunov
>Priority: Major
>  Labels: IEP-47
>
> As described in IEP-47, defragmentation is performed when a node enters a 
> special mode called maintenance mode.
> Discussion on the dev list clarified the algorithm for entering maintenance mode:
>  # A special key is written to the local metastorage.
>  # The node is restarted.
>  # The node observes the key on startup and enters maintenance mode.
> The node should be fully functional in that mode but should not join the rest of 
> the cluster or participate in any regular activity like handling cache 
> operations.
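The three steps above can be sketched roughly as follows. This is a hedged illustration: the key name and the map-backed "metastorage" are hypothetical stand-ins, while the real implementation would use Ignite's local metastorage:

```java
import java.util.HashMap;
import java.util.Map;

// Rough sketch of the maintenance-mode switch described above.
// The key name and the map-backed "metastorage" are illustrative stand-ins.
public class MaintenanceSwitch {
    static final String DEFRAG_KEY = "maintenance.defragmentation"; // hypothetical key

    /** Step 1: request maintenance by writing a special key; the node is then restarted. */
    static void requestDefragmentation(Map<String, String> localMetastorage) {
        localMetastorage.put(DEFRAG_KEY, "requested");
    }

    /** Step 3: on startup, the presence of the key switches the node to maintenance mode. */
    static boolean shouldEnterMaintenanceMode(Map<String, String> localMetastorage) {
        return localMetastorage.containsKey(DEFRAG_KEY);
    }
}
```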



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13206) Represent in the documentation the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13206:
--
Ignite Flags: Docs Required  (was: Docs Required,Release Notes Required)

> Represent in the documentation the effect of several node addresses on failure 
> detection.
> ---
>
> Key: IGNITE-13206
> URL: https://issues.apache.org/jira/browse/IGNITE-13206
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>  Labels: iep-45
>
> The current TcpDiscoverySpi can prolong detection of the failure of a node that 
> has several IP addresses. This happens because most of the timeouts, such as 
> failureDetectionTimeout, sockTimeout, ackTimeout, work per address. The actual 
> failure detection delay is failureDetectionTimeout*addressesNumber (1), and 
> the node addresses are tried consecutively. This effect on failure 
> detection should be noted in the documentation.
> *1: addressesNumber - the number of addresses of the next node in the ring.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13206) Represent in the documentation the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13206:
--
Description: 
The current TcpDiscoverySpi can prolong detection of the failure of a node that 
has several IP addresses. This happens because most of the timeouts, such as 
failureDetectionTimeout, sockTimeout, ackTimeout, work per address. The actual 
failure detection delay is failureDetectionTimeout*addressesNumber (1), and 
the node addresses are tried consecutively. This effect on failure 
detection should be noted in the documentation.

*1: addressesNumber - the number of addresses of the next node in the ring.

  was:
The current TcpDiscoverySpi can prolong detection of the failure of a node that 
has several IP addresses. This happens because most of the timeouts, such as 
failureDetectionTimeout, sockTimeout, ackTimeout, work per address. The actual 
failure detection delay is failureDetectionTimeout*addressesNumber (1), and 
the node addresses are tried serially. This effect on failure detection 
should be noted in the documentation.

*1: addressesNumber - the number of addresses of the next node in the ring.


> Represent in the documentation the effect of several node addresses on failure 
> detection.
> ---
>
> Key: IGNITE-13206
> URL: https://issues.apache.org/jira/browse/IGNITE-13206
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>  Labels: iep-45
>
> The current TcpDiscoverySpi can prolong detection of the failure of a node that 
> has several IP addresses. This happens because most of the timeouts, such as 
> failureDetectionTimeout, sockTimeout, ackTimeout, work per address. The actual 
> failure detection delay is failureDetectionTimeout*addressesNumber (1), and 
> the node addresses are tried consecutively. This effect on failure 
> detection should be noted in the documentation.
> *1: addressesNumber - the number of addresses of the next node in the ring.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13205) Represent in logs, javadoc the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13205:
--
Description: 
The current TcpDiscoverySpi can prolong detection of the failure of a node that 
has several IP addresses. This happens because most of the timeouts, such as 
_failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
the node addresses are tried consecutively. This effect on failure 
detection should be noted in logs and javadocs.

*1:  addressesNumber - the number of addresses of the next node in the ring.

  was:
The current TcpDiscoverySpi can prolong detection of the failure of a node that 
has several IP addresses. This happens because most of the timeouts, such as 
_failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
the node addresses are tried serially. This effect on failure detection 
should be noted in logs and javadocs.

*1:  addressesNumber - the number of addresses of the next node in the ring.


> Represent in logs, javadoc the effect of several node addresses on failure 
> detection.
> 
>
> Key: IGNITE-13205
> URL: https://issues.apache.org/jira/browse/IGNITE-13205
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>  Labels: iep-45
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current TcpDiscoverySpi can prolong detection of the failure of a node that 
> has several IP addresses. This happens because most of the timeouts, such as 
> _failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
> failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
> the node addresses are tried consecutively. This effect on failure 
> detection should be noted in logs and javadocs.
> *1:  addressesNumber - the number of addresses of the next node in the ring.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13205) Represent in logs, javadoc the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13205:
--
Priority: Minor  (was: Major)

> Represent in logs, javadoc the effect of several node addresses on failure 
> detection.
> 
>
> Key: IGNITE-13205
> URL: https://issues.apache.org/jira/browse/IGNITE-13205
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>  Labels: iep-45
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current TcpDiscoverySpi can prolong detection of the failure of a node that 
> has several IP addresses. This happens because most of the timeouts, such as 
> _failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
> failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
> the node addresses are tried serially. This effect on failure 
> detection should be noted in logs and javadocs.
> *1:  addressesNumber - the number of addresses of the next node in the ring.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13206) Represent in the documentation the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13206:
--
Labels: iep-45  (was: )

> Represent in the documentation the effect of several node addresses on failure 
> detection.
> ---
>
> Key: IGNITE-13206
> URL: https://issues.apache.org/jira/browse/IGNITE-13206
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>  Labels: iep-45
>
> The current TcpDiscoverySpi can prolong detection of the failure of a node that 
> has several IP addresses. This happens because most of the timeouts, such as 
> failureDetectionTimeout, sockTimeout, ackTimeout, work per address. The actual 
> failure detection delay is failureDetectionTimeout*addressesNumber (1), and 
> the node addresses are tried serially. This effect on failure 
> detection should be noted in the documentation.
> *1: addressesNumber - the number of addresses of the next node in the ring.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13206) Represent in the documentation the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13206:
--
Description: 
The current TcpDiscoverySpi can prolong detection of the failure of a node that 
has several IP addresses. This happens because most of the timeouts, such as 
failureDetectionTimeout, sockTimeout, ackTimeout, work per address. The actual 
failure detection delay is failureDetectionTimeout*addressesNumber (1), and 
the node addresses are tried serially. This effect on failure detection 
should be noted in the documentation.

*1: addressesNumber - the number of addresses of the next node in the ring.
Summary: Represent in the documentation the effect of several node 
addresses on failure detection.  (was: Represent in the doc the effect of 
several node addresses on failure detection.)

> Represent in the documentation the effect of several node addresses on failure 
> detection.
> ---
>
> Key: IGNITE-13206
> URL: https://issues.apache.org/jira/browse/IGNITE-13206
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Minor
>
> The current TcpDiscoverySpi can prolong detection of the failure of a node that 
> has several IP addresses. This happens because most of the timeouts, such as 
> failureDetectionTimeout, sockTimeout, ackTimeout, work per address. The actual 
> failure detection delay is failureDetectionTimeout*addressesNumber (1), and 
> the node addresses are tried serially. This effect on failure 
> detection should be noted in the documentation.
> *1: addressesNumber - the number of addresses of the next node in the ring.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13206) Represent in the doc the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)
Vladimir Steshin created IGNITE-13206:
-

 Summary: Represent in the doc the effect of several node addresses 
on failure detection.
 Key: IGNITE-13206
 URL: https://issues.apache.org/jira/browse/IGNITE-13206
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladimir Steshin
Assignee: Vladimir Steshin






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13193) Implement fallback to full partition rebalancing in case historical supplier failed to read all necessary data updates from WAL

2020-07-02 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150228#comment-17150228
 ] 

Vladislav Pyatkov commented on IGNITE-13193:


[~slava.koptilin] I left three comments in the PR.

Please take a look at them.

> Implement fallback to full partition rebalancing in case historical supplier 
> failed to read all necessary data updates from WAL
> ---
>
> Key: IGNITE-13193
> URL: https://issues.apache.org/jira/browse/IGNITE-13193
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vyacheslav Koptilin
>Assignee: Vyacheslav Koptilin
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Historical rebalance may fail for several reasons:
> 1) The WAL on the supplier node is corrupted - in the current implementation the 
> supplier triggers a failure handler.
> 2) After iterating over the WAL, the demander node did not receive all the 
> updates needed to make a MOVING partition up-to-date (the resulting update 
> counter did not converge with the expected update counter of the OWNING 
> partition) - in the current implementation the demander silently ignores the 
> lack of updates.
> Such behavior negatively affects the stability of the cluster: an 
> inappropriate state of the historical WAL is not a reason to fail the supplier 
> node.
> The more proper way to handle this scenario is:
>  - Either try to rebalance the partition historically from another supplier,
>  - Or use a full partition rebalance for the problem partition.
> Once the supplier fails to provide data from part of the WAL, its 
> corresponding sequence of checkpoints should be marked as inapplicable for 
> historical rebalance in order to prevent further errors.
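The fallback decision described above can be sketched as follows. This is a hedged illustration of the proposed logic only, not code from the patch; the counters and supplier list are hypothetical:

```java
import java.util.List;

// Illustrative sketch of the proposed fallback: if historical rebalance from
// WAL did not bring the partition's update counter up to the expected value,
// try another historical supplier or fall back to a full partition rebalance.
public class RebalanceFallback {
    enum Mode { HISTORICAL, FULL }

    static Mode nextAttempt(long resultingCounter, long expectedCounter,
                            List<String> remainingHistoricalSuppliers) {
        if (resultingCounter >= expectedCounter)
            throw new IllegalStateException("Partition is already up-to-date");

        // Prefer another historical supplier; otherwise do a full partition rebalance.
        return remainingHistoricalSuppliers.isEmpty() ? Mode.FULL : Mode.HISTORICAL;
    }
}
```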



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13016) Fix backward checking of failed node.

2020-07-02 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-13016:
--
Fix Version/s: 2.9

> Fix backward checking of failed node.
> -
>
> Key: IGNITE-13016
> URL: https://issues.apache.org/jira/browse/IGNITE-13016
> Project: Ignite
>  Issue Type: Sub-task
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Fix For: 2.9
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Backward node connection checking looks weird. What might be improved:
> 1) Address checking could be done in parallel, not serially
> {code:java}
> for (InetSocketAddress addr : nodeAddrs) {
> // Connection refused may be got if node doesn't listen
> // (or blocked by firewall, but anyway assume it is dead).
> if (!isConnectionRefused(addr)) {
> liveAddr = addr;
> break;
> }
> }
> {code}
> 2) Any IOException should be considered a failed connection, not only 
> connection refused:
> {code:java}
> catch (ConnectException e) {
> return true;
> }
> catch (IOException e) {
> return false;
> }
> {code}
> 3) The timeout on connection checking should not be constant or hardcoded:
> {code:java}
> sock.connect(addr, 100);
> {code}
> 4) The decision to check the connection should rely on the configured exchange 
> timeout, not on the ping interval
> {code:java}
> // We got message from previous in less than double connection check interval.
> boolean ok = rcvdTime + U.millisToNanos(connCheckInterval) * 2 >= now;
> {code}
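Point 1 could be addressed along these lines. This is a hedged sketch of the idea only, not the actual patch: the connection probe is abstracted as a predicate so the logic is runnable without sockets:

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.*;
import java.util.function.Predicate;

// Sketch of checking a node's addresses in parallel instead of one by one.
// The real connection check is abstracted as a predicate; this is not the
// actual TcpDiscoverySpi code.
public class ParallelAddrCheck {
    static <A> Optional<A> firstLiveAddress(List<A> addrs, Predicate<A> isAlive)
        throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, addrs.size()));

        try {
            CompletionService<Optional<A>> cs = new ExecutorCompletionService<>(pool);

            // Probe all addresses concurrently.
            for (A addr : addrs)
                cs.submit(() -> isAlive.test(addr) ? Optional.of(addr) : Optional.<A>empty());

            // Return as soon as any probe reports a live address.
            for (int i = 0; i < addrs.size(); i++) {
                Optional<A> res = cs.take().get();

                if (res.isPresent())
                    return res;
            }

            return Optional.empty();
        }
        finally {
            pool.shutdownNow();
        }
    }
}
```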



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13134) Fix connection recovery timeout.

2020-07-02 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-13134:
--
Fix Version/s: 2.9

> Fix connection recovery timeout.
> 
>
> Key: IGNITE-13134
> URL: https://issues.apache.org/jira/browse/IGNITE-13134
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Fix For: 2.9
>
> Attachments: IGNITE-130134-patch.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a node experiences connection issues, it must establish a new connection or 
> fail within failureDetectionTimeout + connectionRecoveryTimeout.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13134) Fix connection recovery timeout.

2020-07-02 Thread Anton Vinogradov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Vinogradov updated IGNITE-13134:
--
Summary: Fix connection recovery timeout.  (was: Fix connection recovery 
timout.)

> Fix connection recovery timeout.
> 
>
> Key: IGNITE-13134
> URL: https://issues.apache.org/jira/browse/IGNITE-13134
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8.1
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
> Attachments: IGNITE-130134-patch.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a node experiences connection issues, it must establish a new connection or 
> fail within failureDetectionTimeout + connectionRecoveryTimeout.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13205) Represent in logs, javadoc the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13205:
--
Labels: iep-45  (was: )

> Represent in logs, javadoc the effect of several node addresses on failure 
> detection.
> 
>
> Key: IGNITE-13205
> URL: https://issues.apache.org/jira/browse/IGNITE-13205
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>  Labels: iep-45
>
> The current TcpDiscoverySpi can prolong detection of the failure of a node that 
> has several IP addresses. This happens because most of the timeouts, such as 
> _failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
> failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
> the node addresses are tried serially. This effect on failure 
> detection should be noted in logs and javadocs.
> *1:  addressesNumber - the number of addresses of the next node in the ring.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13205) Represent in logs, javadoc the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13205:
--
Description: 
The current TcpDiscoverySpi can prolong detection of the failure of a node that 
has several IP addresses. This happens because most of the timeouts, such as 
_failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
the node addresses are tried serially. This effect on failure detection 
should be noted in logs and javadocs.

*1:  addressesNumber - the number of addresses of the next node in the ring.

  was:
The current TcpDiscoverySpi can prolong detection of the failure of a node that 
has several IP addresses. This happens because most of the timeouts, such as 
_failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
the node addresses are tried serially. This effect on failure detection 
should be noted in logs and javadocs.

*1:  addressesNumber - the number of addresses of the next node.


> Represent in logs, javadoc the effect of several node addresses on failure 
> detection.
> 
>
> Key: IGNITE-13205
> URL: https://issues.apache.org/jira/browse/IGNITE-13205
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>
> The current TcpDiscoverySpi can prolong detection of the failure of a node that 
> has several IP addresses. This happens because most of the timeouts, such as 
> _failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
> failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
> the node addresses are tried serially. This effect on failure 
> detection should be noted in logs and javadocs.
> *1:  addressesNumber - the number of addresses of the next node in the ring.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13205) Represent in logs, javadoc the effect of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13205:
--
Description: 
The current TcpDiscoverySpi can prolong detection of the failure of a node that 
has several IP addresses. This happens because most of the timeouts, such as 
_failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
the node addresses are tried serially. This effect on failure detection 
should be noted in logs and javadocs.

*1:  addressesNumber - the number of addresses of the next node.

  was:The current TcpDiscoverySpi can prolong detection of the failure of a node 
that has several IP addresses. This happens because most of the timeouts, such as 
failureDetectionTimeout, sockTimeout, ackTimeout, work per address. The node 
addresses are tried serially. This effect on failure detection should 
be noted in logs and javadocs.


> Represent in logs, javadoc the effect of several node addresses on failure 
> detection.
> 
>
> Key: IGNITE-13205
> URL: https://issues.apache.org/jira/browse/IGNITE-13205
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>
> The current TcpDiscoverySpi can prolong detection of the failure of a node that 
> has several IP addresses. This happens because most of the timeouts, such as 
> _failureDetectionTimeout, sockTimeout, ackTimeout_, work per address. The actual 
> failure detection delay is _failureDetectionTimeout*addressesNumber_ (1), and 
> the node addresses are tried serially. This effect on failure 
> detection should be noted in logs and javadocs.
> *1:  addressesNumber - the number of addresses of the next node.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (IGNITE-13123) Move control.sh to a separate module

2020-07-02 Thread Kirill Tkalenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150147#comment-17150147
 ] 

Kirill Tkalenko edited comment on IGNITE-13123 at 7/2/20, 10:26 AM:


https://ci.ignite.apache.org/viewLog.html?buildId=5433549;
https://ci.ignite.apache.org/viewLog.html?buildId=5433671;


was (Author: ktkale...@gridgain.com):
https://ci.ignite.apache.org/viewLog.html?buildId=5433549;


> Move control.sh to a separate module
> 
>
> Key: IGNITE-13123
> URL: https://issues.apache.org/jira/browse/IGNITE-13123
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Move [1] and its associated classes to a separate "ignite-control-utility" 
> module.
> [1] - org.apache.ignite.internal.commandline.CommandHandler



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13205) Represent in logs, javadoc affection of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13205:
--
Description: Current TcpDiscoverySpi can prolong detection of failure of a node 
that has several IP addresses. This happens because most of the timeouts, such as 
failureDetectionTimeout, sockTimeout, ackTimeout, work per address, and the node 
addresses are tried serially. This effect on failure detection should 
be noted in logs and Javadocs.  (was: Current TcpDiscoverySpi can prolong 
detection of failure of a node that has several IP addresses. This happens because 
most of the timeouts, such as failureDetectionTimeout, sockTimeout, ackTimeout, 
work per address, and the node addresses are tried serially. This 
effect on failure detection should be noted in logs and Javadocs.)

> Represent in logs, javadoc affection of several node addresses on failure 
> detection.
> 
>
> Key: IGNITE-13205
> URL: https://issues.apache.org/jira/browse/IGNITE-13205
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>
> Current TcpDiscoverySpi can prolong detection of failure of a node that has 
> several IP addresses. This happens because most of the timeouts, such as 
> failureDetectionTimeout, sockTimeout, ackTimeout, work per address, and the 
> node addresses are tried serially. This effect on failure detection 
> should be noted in logs and Javadocs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13205) Represent in logs, javadoc affection of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Steshin updated IGNITE-13205:
--
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Represent in logs, javadoc affection of several node addresses on failure 
> detection.
> 
>
> Key: IGNITE-13205
> URL: https://issues.apache.org/jira/browse/IGNITE-13205
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladimir Steshin
>Assignee: Vladimir Steshin
>Priority: Major
>
> Current TcpDiscoverySpi can prolong detection of failure of a node that has 
> several IP addresses. This happens because most of the timeouts, such as 
> failureDetectionTimeout, sockTimeout, ackTimeout, work per address, and the 
> node addresses are tried serially. This effect on failure detection 
> should be noted in logs and Javadocs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13205) Represent in logs, javadoc affection of several node addresses on failure detection.

2020-07-02 Thread Vladimir Steshin (Jira)
Vladimir Steshin created IGNITE-13205:
-

 Summary: Represent in logs, javadoc affection of several node 
addresses on failure detection.
 Key: IGNITE-13205
 URL: https://issues.apache.org/jira/browse/IGNITE-13205
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladimir Steshin
Assignee: Vladimir Steshin


Current TcpDiscoverySpi can prolong detection of failure of a node that has several 
IP addresses. This happens because most of the timeouts, such as 
failureDetectionTimeout, sockTimeout, ackTimeout, work per address, and the 
node addresses are tried serially. This effect on failure detection 
should be noted in logs and Javadocs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13123) Move control.sh to a separate module

2020-07-02 Thread Kirill Tkalenko (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150147#comment-17150147
 ] 

Kirill Tkalenko commented on IGNITE-13123:
--

https://ci.ignite.apache.org/viewLog.html?buildId=5433549;


> Move control.sh to a separate module
> 
>
> Key: IGNITE-13123
> URL: https://issues.apache.org/jira/browse/IGNITE-13123
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Kirill Tkalenko
>Assignee: Kirill Tkalenko
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Move [1] and its associated classes to a separate "ignite-control-utility" 
> module.
> [1] - org.apache.ignite.internal.commandline.CommandHandler



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-12845) GridNioServer can infinitely lose some events

2020-07-02 Thread Alexey Goncharuk (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150146#comment-17150146
 ] 

Alexey Goncharuk commented on IGNITE-12845:
---

Looks good to me, thanks!

> GridNioServer can infinitely lose some events 
> --
>
> Key: IGNITE-12845
> URL: https://issues.apache.org/jira/browse/IGNITE-12845
> Project: Ignite
>  Issue Type: Bug
>Reporter: Aleksey Plekhanov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With the optimization enabled ({{IGNITE_NO_SELECTOR_OPTS = false}}, the default), 
> {{GridNioServer}} can lose some events for a channel (depending on the JDK 
> version and OS). This can lead to connected applications hanging. Reproducer: 
> {code:java}
> public void testConcurrentLoad() throws Exception {
> startGrid(0);
> try (IgniteClient client = Ignition.startClient(new 
> ClientConfiguration().setAddresses("127.0.0.1:10800"))) {
> ClientCache cache = 
> client.getOrCreateCache(DEFAULT_CACHE_NAME);
> GridTestUtils.runMultiThreaded(
> () -> {
> for (int i = 0; i < 1000; i++)
> cache.put(i, i);
> }, 5, "run-async");
> }
> }
> {code}
> This reproducer eventually hangs on macOS (tested with JDK 8, 11, 12, 13, 
> 14), hangs on Windows with some JDK versions (tested with JDK 11, 14), but 
> passes on Windows with JDK 8, on Linux systems, or when the system property 
> {{IGNITE_NO_SELECTOR_OPTS = true}} is set.
> The root cause: the optimized {{SelectedSelectionKeySet}} always returns 
> {{false}} from its {{contains()}} method. The {{contains()}} method is used by 
> the {{sun.nio.ch.SelectorImpl.processReadyEvents()}} method:
> {code:java}
> if (selectedKeys.contains(ski)) {
> if (ski.translateAndUpdateReadyOps(rOps)) {
> return 1;
> }
> } else {
> ski.translateAndSetReadyOps(rOps);
> if ((ski.nioReadyOps() & ski.nioInterestOps()) != 0) {
> selectedKeys.add(ski);
> return 1;
> }
> }
> {code}
> So, with a fair implementation, if a selection key is already contained in the 
> selected-keys set, its ready-operations flags are updated; but with 
> {{SelectedSelectionKeySet}} the ready-operations flags are always overridden, 
> and a new selector key is added even if it's already contained in the set. 
> Some {{SelectorImpl}} implementations can pass several events for one 
> selector key to the {{processReadyEvents}} method (for example, the macOS 
> implementation {{KQueueSelectorImpl}} works this way). In this case, 
> duplicated selector keys are added to {{selectedKeys}}, and all events 
> except the last are lost.
> Two bad things happen in {{GridNioServer}} for the reasons described above:
>  # Some event flags are lost and the worker doesn't process the corresponding 
> action (for the attached reproducer, the "channel is ready for reading" event is 
> lost and the workers never read the channel after some point in time).
>  # Selector keys are duplicated with the same event flags (for the attached 
> reproducer it's the "channel is ready for writing" event; this duplication leads 
> to wrong processing of the {{GridSelectorNioSessionImpl#procWrite}} flag, which 
> will be {{false}} in some cases while the selector key's 
> {{interestedOps}} still contains the {{OP_WRITE}} operation, and this operation 
> is never excluded). 
> Possible solutions:
>  * A fair implementation of the {{SelectedSelectionKeySet.contains}} method 
> (this solves all the problems but can be resource-consuming).
>  * Always set {{GridSelectorNioSessionImpl#procWrite}} to {{true}} when 
> adding {{OP_WRITE}} to {{interestedOps}} (for example, in the 
> {{AbstractNioClientWorker.registerWrite()}} method). In this case, some 
> "channel is ready for reading" events (but not data) can still be lost, but 
> not infinitely, and the data will eventually be read. If events are reordered 
> (first "channel is ready for writing", then "channel is ready for 
> reading"), a write to the channel is only processed after all data has 
> been read.
>  * Exclude {{OP_WRITE}} from {{interestedOps}} even if 
> {{GridSelectorNioSessionImpl#procWrite}} is {{false}} when there are no write 
> requests in the queue (see the {{GridNioServer.stopPollingForWrite()}} method). 
> This solution has the same shortcomings as the previous one. 
>  * A hybrid approach: use a probabilistic implementation of the {{contains}} 
> method (a Bloom filter, or just check the last element) and use one of the two 
> previous solutions as a workaround for cases where {{contains}} incorrectly 
> returns {{false}}. 
>  
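
The "just check the last element" variant of the hybrid approach can be 
illustrated with a minimal, self-contained sketch (the class and method names 
here are hypothetical, not Ignite's or the JDK's actual implementation): an 
append-only set whose {{contains()}} only checks the most recently added key, 
trading occasional false negatives for O(1) cost.

```java
import java.util.Arrays;

/** Sketch of an append-only "selected key set" whose contains() only
 *  checks the last added element (the probabilistic hybrid idea). */
public class LastElementKeySet<K> {
    private Object[] keys = new Object[16];
    private int size;

    /** Adds a key, growing the backing array as needed. */
    public void add(K key) {
        if (size == keys.length)
            keys = Arrays.copyOf(keys, size * 2);
        keys[size++] = key;
    }

    /** Cheap, approximate membership test: only the most recently added
     *  key is checked by identity, so false negatives are possible. */
    public boolean contains(Object key) {
        return size > 0 && keys[size - 1] == key;
    }

    public static void main(String[] args) {
        LastElementKeySet<String> set = new LastElementKeySet<>();
        String a = "keyA", b = "keyB";
        set.add(a);
        set.add(b);
        System.out.println(set.contains(b)); // true: b was added last
        System.out.println(set.contains(a)); // false negative: a is present but not last
    }
}
```

A false negative here only means a key gets re-added with overridden flags, 
which is why the sketch must be paired with one of the two workarounds above.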



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13191) Public-facing API for "waiting for backups on shutdown"

2020-07-02 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150099#comment-17150099
 ] 

Vladislav Pyatkov commented on IGNITE-13191:


[~ivan.glukos] Please look at this patch.

> Public-facing API for "waiting for backups on shutdown"
> ---
>
> Key: IGNITE-13191
> URL: https://issues.apache.org/jira/browse/IGNITE-13191
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should introduce a "should wait for backups on shutdown" flag in Ignition 
> and/or IgniteConfiguration.
> Maybe we should do the same for the "cancel compute tasks" flag.
> Also, make sure that we can shut down a node explicitly, overriding this flag 
> but without JVM termination.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (IGNITE-13191) Public-facing API for "waiting for backups on shutdown"

2020-07-02 Thread Vladislav Pyatkov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladislav Pyatkov reopened IGNITE-13191:


Mistakenly resolved

> Public-facing API for "waiting for backups on shutdown"
> ---
>
> Key: IGNITE-13191
> URL: https://issues.apache.org/jira/browse/IGNITE-13191
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We should introduce a "should wait for backups on shutdown" flag in Ignition 
> and/or IgniteConfiguration.
> Maybe we should do the same for the "cancel compute tasks" flag.
> Also, make sure that we can shut down a node explicitly, overriding this flag 
> but without JVM termination.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13204) Java Thin client Kubernetes discovery

2020-07-02 Thread Alexandr (Jira)
Alexandr created IGNITE-13204:
-

 Summary: Java Thin client Kubernetes discovery
 Key: IGNITE-13204
 URL: https://issues.apache.org/jira/browse/IGNITE-13204
 Project: Ignite
  Issue Type: New Feature
  Components: platforms
Reporter: Alexandr


Thin clients should be able to discover servers from within a Kubernetes pod 
through the k8s API, without specifying any IP addresses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13202) Javadoc HTML can't be generated correctly with maven-javadoc-plugin on JDK 11+

2020-07-02 Thread Aleksey Plekhanov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-13202:
---
Description: 
The Javadoc utility has some bugs that prevent building Ignite Javadocs 
correctly.

Building Javadoc under JDK 11+ throws the error "The code being documented uses 
modules but the packages defined in
 [https://docs.oracle.com/javase/8/docs/api] are in the unnamed module". To 
work around this error, the "source=1.8" argument can be specified, but there is 
another bug related to the combined use of the "source" and "subpackages" 
arguments: [https://bugs.openjdk.java.net/browse/JDK-8193030]. Javadoc can still 
be built with the {{detectJavaApiLink}} maven-javadoc-plugin option disabled, 
but in this case there will be no references to the Java API from the Ignite 
Javadoc.

Also, there is a bug with the "-exclude" argument in JDK 11+: it doesn't exclude 
subpackages of the packages specified for exclusion, so the generated output 
contains a lot of Javadocs for internal packages.

Javadoc generation command: {{mvn initialize -Pjavadoc}}

  was:
The Javadoc utility has some bugs that prevent building Ignite Javadocs 
correctly.

Building Javadoc under JDK 11+ throws the error "The code being documented uses 
modules but the packages defined in
 [https://docs.oracle.com/javase/8/docs/api] are in the unnamed module". To 
work around this error, the "source=1.8" argument can be specified, but there is 
another bug related to the combined use of the "source" and "subpackages" 
arguments: [https://bugs.openjdk.java.net/browse/JDK-8193030]. Javadoc can still 
be built with the {{detectJavaApiLink}} maven-javadoc-plugin option disabled, 
but in this case there will be no references to the Java API from the Ignite 
Javadoc.

Also, there is a bug with the "-exclude" argument in JDK 11+: it doesn't exclude 
subpackages of the packages specified for exclusion, so the generated output 
contains a lot of Javadocs for internal packages.


> Javadoc HTML can't be generated correctly with maven-javadoc-plugin on JDK 11+
> --
>
> Key: IGNITE-13202
> URL: https://issues.apache.org/jira/browse/IGNITE-13202
> Project: Ignite
>  Issue Type: Bug
>Reporter: Aleksey Plekhanov
>Priority: Major
>
> The Javadoc utility has some bugs that prevent building Ignite Javadocs 
> correctly.
> Building Javadoc under JDK 11+ throws the error "The code being documented 
> uses modules but the packages defined in
>  [https://docs.oracle.com/javase/8/docs/api] are in the unnamed module". To 
> work around this error, the "source=1.8" argument can be specified, but there 
> is another bug related to the combined use of the "source" and "subpackages" 
> arguments: [https://bugs.openjdk.java.net/browse/JDK-8193030]. Javadoc can 
> still be built with the {{detectJavaApiLink}} maven-javadoc-plugin option 
> disabled, but in this case there will be no references to the Java API from 
> the Ignite Javadoc.  
> Also, there is a bug with the "-exclude" argument in JDK 11+: it doesn't 
> exclude subpackages of the packages specified for exclusion, so the generated 
> output contains a lot of Javadocs for internal packages. 
> Javadoc generation command: {{mvn initialize -Pjavadoc}}
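
The {{detectJavaApiLink}} workaround mentioned above can be sketched as a 
maven-javadoc-plugin configuration fragment (an illustration only; Ignite's 
actual pom.xml javadoc profile may be organized differently):

```xml
<!-- Sketch: disable JDK API link detection so Javadoc builds on JDK 11+.
     The trade-off is that links into the Java SE API are not generated. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-javadoc-plugin</artifactId>
  <configuration>
    <detectJavaApiLink>false</detectJavaApiLink>
    <!-- Alternative workaround, subject to JDK-8193030 when combined
         with subpackages: -->
    <!-- <source>1.8</source> -->
  </configuration>
</plugin>
```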



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13202) Javadoc HTML can't be generated correctly with maven-javadoc-plugin on JDK 11+

2020-07-02 Thread Aleksey Plekhanov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-13202:
---
Description: 
The Javadoc utility has some bugs that prevent building Ignite Javadocs 
correctly.

Building Javadoc under JDK 11+ throws the error "The code being documented uses 
modules but the packages defined in
 [https://docs.oracle.com/javase/8/docs/api] are in the unnamed module". To 
work around this error, the "source=1.8" argument can be specified, but there is 
another bug related to the combined use of the "source" and "subpackages" 
arguments: [https://bugs.openjdk.java.net/browse/JDK-8193030]. Javadoc can still 
be built with the {{detectJavaApiLink}} maven-javadoc-plugin option disabled, 
but in this case there will be no references to the Java API from the Ignite 
Javadoc.

Also, there is a bug with the "-exclude" argument in JDK 11+: it doesn't exclude 
subpackages of the packages specified for exclusion, so the generated output 
contains a lot of Javadocs for internal packages.

  was:
The Javadoc utility has some bugs that prevent building Ignite Javadocs 
correctly ({{mvn initialize -Pjavadoc}}).

Building Javadoc under JDK 11+ throws the error "The code being documented uses 
modules but the packages defined in
 [https://docs.oracle.com/javase/8/docs/api] are in the unnamed module". To 
work around this error, the "source=1.8" argument can be specified, but there is 
another bug related to the combined use of the "source" and "subpackages" 
arguments: [https://bugs.openjdk.java.net/browse/JDK-8193030]. Javadoc can still 
be built with the {{detectJavaApiLink}} maven-javadoc-plugin option disabled, 
but in this case there will be no references to the Java API from the Ignite 
Javadoc.

Also, there is a bug with the "-exclude" argument in JDK 11+: it doesn't exclude 
subpackages of the packages specified for exclusion, so the generated output 
contains a lot of Javadocs for internal packages.


> Javadoc HTML can't be generated correctly with maven-javadoc-plugin on JDK 11+
> --
>
> Key: IGNITE-13202
> URL: https://issues.apache.org/jira/browse/IGNITE-13202
> Project: Ignite
>  Issue Type: Bug
>Reporter: Aleksey Plekhanov
>Priority: Major
>
> The Javadoc utility has some bugs that prevent building Ignite Javadocs 
> correctly.
> Building Javadoc under JDK 11+ throws the error "The code being documented 
> uses modules but the packages defined in
>  [https://docs.oracle.com/javase/8/docs/api] are in the unnamed module". To 
> work around this error, the "source=1.8" argument can be specified, but there 
> is another bug related to the combined use of the "source" and "subpackages" 
> arguments: [https://bugs.openjdk.java.net/browse/JDK-8193030]. Javadoc can 
> still be built with the {{detectJavaApiLink}} maven-javadoc-plugin option 
> disabled, but in this case there will be no references to the Java API from 
> the Ignite Javadoc.  
> Also, there is a bug with the "-exclude" argument in JDK 11+: it doesn't 
> exclude subpackages of the packages specified for exclusion, so the generated 
> output contains a lot of Javadocs for internal packages. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13202) Javadoc HTML can't be generated correctly with maven-javadoc-plugin on JDK 11+

2020-07-02 Thread Aleksey Plekhanov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Plekhanov updated IGNITE-13202:
---
Description: 
The Javadoc utility has some bugs that prevent building Ignite Javadocs 
correctly ({{mvn initialize -Pjavadoc}}).

Building Javadoc under JDK 11+ throws the error "The code being documented uses 
modules but the packages defined in
 [https://docs.oracle.com/javase/8/docs/api] are in the unnamed module". To 
work around this error, the "source=1.8" argument can be specified, but there is 
another bug related to the combined use of the "source" and "subpackages" 
arguments: [https://bugs.openjdk.java.net/browse/JDK-8193030]. Javadoc can still 
be built with the {{detectJavaApiLink}} maven-javadoc-plugin option disabled, 
but in this case there will be no references to the Java API from the Ignite 
Javadoc.

Also, there is a bug with the "-exclude" argument in JDK 11+: it doesn't exclude 
subpackages of the packages specified for exclusion, so the generated output 
contains a lot of Javadocs for internal packages.

  was:
Javadoc utility has some bugs which don't allow to build Ignite Javadocs 
correctly.

Building javadoc under JDK 11+ throws an error "The code being documented uses 
modules but the packages defined in
https://docs.oracle.com/javase/8/docs/api are in the unnamed module". To 
workaround this error argument "source=1.8" can be specified, but there is 
another bug related to "source" and "subpackages" argument usages: 
[https://bugs.openjdk.java.net/browse/JDK-8193030.] We still can build javadoc 
with disabled {{detectJavaApiLink}} maven-javadoc-plugin option, but in this 
case there will be no references to Java API from Ignite Javadoc.  

Also, there is a bug with "-exclude" argument in JDK 11+, it doesn't exclude 
subpackages of specified to exclude packages, so in generated output there is a 
lot of javadocs for internal packages. 


> Javadoc HTML can't be generated correctly with maven-javadoc-plugin on JDK 11+
> --
>
> Key: IGNITE-13202
> URL: https://issues.apache.org/jira/browse/IGNITE-13202
> Project: Ignite
>  Issue Type: Bug
>Reporter: Aleksey Plekhanov
>Priority: Major
>
> The Javadoc utility has some bugs that prevent building Ignite Javadocs 
> correctly ({{mvn initialize -Pjavadoc}}).
> Building Javadoc under JDK 11+ throws the error "The code being documented 
> uses modules but the packages defined in
>  [https://docs.oracle.com/javase/8/docs/api] are in the unnamed module". To 
> work around this error, the "source=1.8" argument can be specified, but there 
> is another bug related to the combined use of the "source" and "subpackages" 
> arguments: [https://bugs.openjdk.java.net/browse/JDK-8193030]. Javadoc can 
> still be built with the {{detectJavaApiLink}} maven-javadoc-plugin option 
> disabled, but in this case there will be no references to the Java API from 
> the Ignite Javadoc.  
> Also, there is a bug with the "-exclude" argument in JDK 11+: it doesn't 
> exclude subpackages of the packages specified for exclusion, so the generated 
> output contains a lot of Javadocs for internal packages. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IGNITE-13203) Improve OptimizedObjectOutputStream serialisation performance

2020-07-02 Thread Jira
Manuel Núñez created IGNITE-13203:
-

 Summary: Improve OptimizedObjectOutputStream serialisation 
performance
 Key: IGNITE-13203
 URL: https://issues.apache.org/jira/browse/IGNITE-13203
 Project: Ignite
  Issue Type: Improvement
  Components: binary
Affects Versions: 2.8.1, 2.7.6, 2.7.5, 2.8, 2.7, 2.6, 2.5, 2.4
Reporter: Manuel Núñez
 Fix For: 2.9


1. OptimizedObjectOutputStream -> writeObject0: strings bigger than 4K reduce 
serialisation performance by about 2x compared to the JDK marshaller; proposal: 
use the JDK marshaller for strings bigger than 4K.
{code:java}
private void writeObject0(Object obj) throws IOException {
    curObj = null;
    curFields = null;
    curPut = null;

    if (obj == null)
        writeByte(NULL);
    else {
        boolean jdkStringWrite = false;

        // Fix: for strings bigger than 4K, use the JDK marshaller to improve performance.
        if (obj instanceof String)
            jdkStringWrite = ((String)obj).length() > 4096;

        if ((jdkStringWrite || obj instanceof Throwable) && !(obj instanceof Externalizable) || U.isEnum(obj.getClass())) {
            // Avoid problems with differing Enum objects or Enum implementation class deadlocks.
            writeByte(JDK);

            ...
{code}


2. OptimizedObjectOutputStream -> GridHandleTable: lookup performance can be 
improved by caching hashes, especially for objects with high complexity and 
cyclic references.

{code:java}
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *  http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.ignite.internal.util;

import java.util.Arrays;

/**
 * Lightweight identity hash table which maps objects to integer handles,
 * assigned in ascending order.
 *
 */
public class GridHandleTable {
/** Number of mappings in table/next available handle. */
private int size;

/** Size threshold determining when to expand hash spine. */
private int threshold;

/** Factor for computing size threshold. */
private final float loadFactor;

/** Maps hash value -> candidate handle value. */
private int[] spine;

/** Maps handle value -> next candidate handle value. */
private int[] next;

/** Maps handle value -> associated object. */
private Object[] objs;

/** Maps handle object hash -> associated object hash */
private int[] objHashes;

/** */
private int[] spineEmpty;

/** */
private int[] nextEmpty;


/**
 * Creates new HandleTable with given capacity and load factor.
 *
 * @param initCap Initial capacity.
 * @param loadFactor Load factor.
 */
public GridHandleTable(int initCap, float loadFactor) {
this.loadFactor = loadFactor;

spine = new int[initCap];
next = new int[initCap];

objs = new Object[initCap];
objHashes = new int[initCap];

spineEmpty = new int[initCap];
nextEmpty = new int[initCap];

Arrays.fill(spineEmpty, -1);
Arrays.fill(nextEmpty, -1);

threshold = (int)(initCap * loadFactor);

clear();
}

/**
 * Looks up and returns handle associated with given object, or -1 if
 * no mapping found.
 *
 * @param obj Object.
 * @return Handle.
 */
public int lookup(Object obj) {

int objHash = hash(obj);

int idx = objHash % spine.length;

if (size > 0) {
for (int i = spine[idx]; i >= 0; i = next[i])
if (objs[i] == obj)
return i;
}

if (size >= next.length)
growEntries();

if (size >= threshold) {
growSpine();

idx = objHash % spine.length;
}

insert(obj, size, idx, objHash);

size++;

return -1;
}

/**
 * Resets table to its initial (empty) state.
 */
public void clear() {
System.arraycopy(spineEmpty, 0, spine, 0, spineEmpty.length);
System.arraycopy(nextEmpty, 0, next, 0, nextEmpty.length);

Arrays.fill(objs, null);

Arrays.fill(objHashes, 0);

size = 0;
}


[jira] [Commented] (IGNITE-12935) Disadvantages in log of historical rebalance

2020-07-02 Thread Vladislav Pyatkov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150064#comment-17150064
 ] 

Vladislav Pyatkov commented on IGNITE-12935:


[~ascherbakov] Could you please look at again?

> Disadvantages in log of historical rebalance
> 
>
> Key: IGNITE-12935
> URL: https://issues.apache.org/jira/browse/IGNITE-12935
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Vladislav Pyatkov
>Assignee: Vladislav Pyatkov
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> # Mention in the log only the partitions for which there are no nodes that are 
> suitable as a historical supplier.
>  For these partitions, print the minimal counter (from which we should perform 
> historical rebalancing) with the corresponding node, and the maximum reserved 
> counter (from which the cluster can perform historical rebalancing) with the 
> corresponding node.
>  This will let us know:
>  ## Whether history was reserved at all
>  ## How much reserved history we lack to perform a historical rebalancing
>  ## I see resulting output like this:
> {noformat}
>  Historical rebalancing wasn't scheduled for some partitions:
>  History wasn't reserved for: [list of partitions and groups]
>  History was reserved, but minimum present counter is less than maximum 
> reserved: [[grp=GRP, part=ID, minCntr=cntr, minNodeId=ID, maxReserved=cntr, 
> maxReservedNodeId=ID], ...]{noformat}
>  ## We can also aggregate the previous message by (minNodeId) to easily find 
> the exact node (or nodes) that were the reason for a full rebalance.
>  # Log the results of {{reserveHistoryForExchange()}}. They can be compactly 
> represented as mappings: {{(grpId -> checkpoint (id, timestamp))}}. For every 
> group, also log a message about why the previous checkpoint wasn't successfully 
> reserved.
>  There can be three reasons:
>  ## Previous checkpoint simply isn't present in the history (the oldest is 
> reserved)
>  ## WAL reservation failure (call below returned false)
> {code:java}
> chpEntry = entry(cpTs);
> boolean reserved = cctx.wal().reserve(chpEntry.checkpointMark());// If 
> checkpoint WAL history can't be reserved, stop searching. 
> if (!reserved) 
>   break;{code}
> ## Checkpoint was marked as inapplicable for historical rebalancing
> {code:java}
> for (Integer grpId : new HashSet<>(groupsAndPartitions.keySet()))
>    if (!isCheckpointApplicableForGroup(grpId, chpEntry))
>      groupsAndPartitions.remove(grpId);{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-11393) Create IgniteLinkTaglet.toString() implementation for Java9+

2020-07-02 Thread Aleksey Plekhanov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-11393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150036#comment-17150036
 ] 

Aleksey Plekhanov commented on IGNITE-11393:


[~amashenkov] or [~dpavlov], can you please review the patch in PR 7983?

I've additionally fixed some maven build issues, and now the release can be built 
by maven on JDK 11+:
{noformat}
mvn clean install -Pall-java,all-scala,licenses -DskipTests
mvn initialize -Pjavadoc
mvn initialize -Prelease{noformat}
Checked on JDK 8, JDK 11, JDK 14.

But there are still some issues with Javadoc on JDK 11+ (see IGNITE-13202).

> Create IgniteLinkTaglet.toString() implementation for Java9+
> 
>
> Key: IGNITE-11393
> URL: https://issues.apache.org/jira/browse/IGNITE-11393
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Dmitry Pavlov
>Assignee: Aleksey Plekhanov
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> A new implementation was added according to the new Java API for Javadoc.
> But the main method was kept empty; toString() needs to be implemented to 
> process the IgniteLink annotation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-13185) API to change Cluster Tag and notify about change of Cluster Tag

2020-07-02 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-13185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17150026#comment-17150026
 ] 

Ignite TC Bot commented on IGNITE-13185:


{panel:title=Branch: [pull/7964/head] Base: [master] : No blockers 
found!|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}{panel}
{panel:title=Branch: [pull/7964/head] Base: [master] : New Tests 
(14)|borderStyle=dashed|borderColor=#ccc|titleBGColor=#D6F7C1}
{color:#8b}ZooKeeper (Discovery) 2{color} [tests 1]
* {color:#013220}ZookeeperDiscoverySpiTestSuite2: 
GridCommandHandlerTest.testClusterChangeTag - PASSED{color}

{color:#8b}Basic 3{color} [tests 5]
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
IgniteClusterIdTagTest.testTagChangedEvent - PASSED{color}
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
IgniteClusterIdTagTest.testChangeTagExceptions - PASSED{color}
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
IgniteClusterIdTagTest.testTagChangedEventMultinodeWithRemoteFilter - 
PASSED{color}
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
GridCommandHandlerWithSSLTest.testClusterChangeTag - PASSED{color}
* {color:#013220}IgniteBasicWithPersistenceTestSuite: 
GridCommandHandlerTest.testClusterChangeTag - PASSED{color}

{color:#8b}Service Grid{color} [tests 4]
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=dd35e050371-f933b01a-9734-4d56-b9b6-519cdc0ac07d, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=c41a7e5c-6f76-4c3c-81ca-214b68c544f2, topVer=0, nodeId8=c41a7e5c, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593517691865]], 
val2=AffinityTopologyVersion [topVer=-8832215235125097063, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=dd35e050371-f933b01a-9734-4d56-b9b6-519cdc0ac07d, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=c41a7e5c-6f76-4c3c-81ca-214b68c544f2, topVer=0, nodeId8=c41a7e5c, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593517691865]], 
val2=AffinityTopologyVersion [topVer=-8832215235125097063, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=00d39de4-648f-4138-ab37-962935a1bb5a, topVer=0, 
nodeId8=a41e8cd2, msg=, type=NODE_JOINED, tstamp=1593517691865], 
val2=AffinityTopologyVersion [topVer=-2234418847054097440, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=00d39de4-648f-4138-ab37-962935a1bb5a, topVer=0, 
nodeId8=a41e8cd2, msg=, type=NODE_JOINED, tstamp=1593517691865], 
val2=AffinityTopologyVersion [topVer=-2234418847054097440, minorTopVer=0]]] - 
PASSED{color}

{color:#8b}Service Grid (legacy mode){color} [tests 4]
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=c1773332-2e0d-439f-8e9c-526115490c21, topVer=0, 
nodeId8=6442d543, msg=, type=NODE_JOINED, tstamp=1593526018992], 
val2=AffinityTopologyVersion [topVer=6853082273767427558, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryEvent [evtNode=c1773332-2e0d-439f-8e9c-526115490c21, topVer=0, 
nodeId8=6442d543, msg=, type=NODE_JOINED, tstamp=1593526018992], 
val2=AffinityTopologyVersion [topVer=6853082273767427558, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.topologyVersion[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=4b36d850371-e81ca58f-3ca7-42cc-8275-58fef1b3ce90, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=51c9e7cc-c4c8-421d-bfc1-7e5f05220a0e, topVer=0, nodeId8=51c9e7cc, 
msg=null, type=DISCOVERY_CUSTOM_EVT, tstamp=1593526018992]], 
val2=AffinityTopologyVersion [topVer=3428967372149071837, minorTopVer=0]]] - 
PASSED{color}
* {color:#013220}IgniteServiceGridTestSuite: 
ServiceDeploymentProcessIdSelfTest.requestId[Test event=IgniteBiTuple 
[val1=DiscoveryCustomEvent [customMsg=ServiceChangeBatchRequest 
[id=4b36d850371-e81ca58f-3ca7-42cc-8275-58fef1b3ce90, reqs=SingletonList 
[ServiceUndeploymentRequest []]], affTopVer=null, super=DiscoveryEvent 
[evtNode=51c9e7cc-c4c8-421d-bfc1-7e5f05220a0e, topVer=0, nodeId8=51c9e7cc, 
msg=null, type=DISCOVERY_CUSTOM_EVT, 

[jira] [Created] (IGNITE-13202) Javadoc HTML can't be generated correctly with maven-javadoc-plugin on JDK 11+

2020-07-02 Thread Aleksey Plekhanov (Jira)
Aleksey Plekhanov created IGNITE-13202:
--

 Summary: Javadoc HTML can't be generated correctly with 
maven-javadoc-plugin on JDK 11+
 Key: IGNITE-13202
 URL: https://issues.apache.org/jira/browse/IGNITE-13202
 Project: Ignite
  Issue Type: Bug
Reporter: Aleksey Plekhanov


The Javadoc utility has some bugs which prevent Ignite Javadocs from being 
built correctly.

Building Javadoc under JDK 11+ throws the error "The code being documented uses 
modules but the packages defined in
https://docs.oracle.com/javase/8/docs/api are in the unnamed module". To 
work around this error the argument "source=1.8" can be specified, but there is 
another bug related to using the "source" and "subpackages" arguments together: 
[https://bugs.openjdk.java.net/browse/JDK-8193030]. We can still build Javadoc 
with the {{detectJavaApiLink}} maven-javadoc-plugin option disabled, but in this 
case there will be no references to the Java API from the Ignite Javadoc.

Also, there is a bug with the "-exclude" argument in JDK 11+: it doesn't exclude 
subpackages of the packages specified for exclusion, so the generated output 
contains a lot of Javadocs for internal packages.
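The two workarounds mentioned above map onto maven-javadoc-plugin configuration roughly as follows. This is an illustrative fragment, not Ignite's actual pom.xml; the option names are real plugin parameters, but the values shown are assumptions:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-javadoc-plugin</artifactId>
  <configuration>
    <!-- Workaround 1: avoids the "unnamed module" error on JDK 11+,
         but triggers JDK-8193030 when combined with <subpackages>. -->
    <source>1.8</source>
    <!-- Workaround 2: skip auto-linking to the JDK API docs entirely,
         at the cost of losing Java API references in the output. -->
    <detectJavaApiLink>false</detectJavaApiLink>
  </configuration>
</plugin>
```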



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-12897) Add .NET api to enabling SQL indexing for existing cache.

2020-07-02 Thread Ivan Daschinskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Daschinskiy updated IGNITE-12897:
--
Ignite Flags:   (was: Docs Required,Release Notes Required)

> Add .NET api to enabling SQL indexing for existing cache.
> -
>
> Key: IGNITE-12897
> URL: https://issues.apache.org/jira/browse/IGNITE-12897
> Project: Ignite
>  Issue Type: Improvement
>  Components: platforms
>Reporter: Ivan Daschinskiy
>Assignee: Ivan Daschinskiy
>Priority: Minor
>  Labels: .NET
>
> Add .NET api to enabling SQL indexing for existing cache.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IGNITE-12897) Add .NET api to enabling SQL indexing for existing cache.

2020-07-02 Thread Ivan Daschinskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Daschinskiy resolved IGNITE-12897.
---
Release Note: Because no public Java API is exposed and enabling SQL is only 
available through SQL DDL (CREATE TABLE), this improvement won't be implemented.
  Resolution: Won't Do

> Add .NET api to enabling SQL indexing for existing cache.
> -
>
> Key: IGNITE-12897
> URL: https://issues.apache.org/jira/browse/IGNITE-12897
> Project: Ignite
>  Issue Type: Improvement
>  Components: platforms
>Reporter: Ivan Daschinskiy
>Assignee: Ivan Daschinskiy
>Priority: Minor
>  Labels: .NET
>
> Add .NET api to enabling SQL indexing for existing cache.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-12897) Add .NET api to enabling SQL indexing for existing cache.

2020-07-02 Thread Ivan Daschinskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Daschinskiy updated IGNITE-12897:
--
Issue Type: Improvement  (was: Bug)

> Add .NET api to enabling SQL indexing for existing cache.
> -
>
> Key: IGNITE-12897
> URL: https://issues.apache.org/jira/browse/IGNITE-12897
> Project: Ignite
>  Issue Type: Improvement
>  Components: platforms
>Reporter: Ivan Daschinskiy
>Assignee: Ivan Daschinskiy
>Priority: Minor
>  Labels: .NET
>
> Add .NET api to enabling SQL indexing for existing cache.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (IGNITE-8120) Improve test coverage of rebalance failing

2020-07-02 Thread Ivan Daschinskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Daschinskiy reassigned IGNITE-8120:


Assignee: (was: Ivan Daschinskiy)

> Improve test coverage of rebalance failing
> --
>
> Key: IGNITE-8120
> URL: https://issues.apache.org/jira/browse/IGNITE-8120
> Project: Ignite
>  Issue Type: Test
>  Components: general
>Affects Versions: 2.4
>Reporter: Ivan Daschinskiy
>Priority: Minor
>  Labels: test
> Fix For: 2.10
>
>
> Need to cover the situation when some archived WAL segments, which are not 
> reserved by IgniteWriteAheadLogManager, are deleted during rebalancing or 
> were deleted before it. However, rebalancing from WAL is currently broken. When 
> the fix [IGNITE-8116|https://issues.apache.org/jira/browse/IGNITE-8116] is 
> available, this test will be implemented.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (IGNITE-12901) SQL: Uncorrelated subquery should run only once.

2020-07-02 Thread Ivan Daschinskiy (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-12901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Daschinskiy reassigned IGNITE-12901:
-

Assignee: (was: Ivan Daschinskiy)

> SQL: Uncorrelated subquery should run only once.
> 
>
> Key: IGNITE-12901
> URL: https://issues.apache.org/jira/browse/IGNITE-12901
> Project: Ignite
>  Issue Type: Improvement
>Affects Versions: 2.8
>Reporter: Alexey Kukushkin
>Priority: Minor
>  Labels: sbcf
> Attachments: ignite-12901-subquery.patch
>
>
> Currently, uncorrelated subqueries (where the subquery does not depend on the 
> outer query) are executed on each nested-loop iteration in the 
> org.h2.command.dml.Select#isConditionMet method.
> We could avoid this, for example, by caching the subquery results.
> h2. Reproducer
> {code:java}
> public class SubQueryTest extends AbstractIndexingCommonTest {
>     /** Keys count in the RIGHT table. */
>     private static final int RIGHT_CNT = 10;
>
>     /** Keys count in the LEFT table. */
>     private static final int LEFT_CNT = 50;
>
>     /** {@inheritDoc} */
>     @SuppressWarnings("unchecked")
>     @Override protected void beforeTest() throws Exception {
>         super.beforeTest();
>
>         startGrids(1);
>
>         IgniteCache<Long, BinaryObject> cacheA = grid(0).createCache(new CacheConfiguration<Long, BinaryObject>()
>             .setName("A")
>             .setSqlSchema("TEST")
>             .setQueryEntities(Collections.singleton(new QueryEntity(Long.class.getTypeName(), "A_VAL")
>                 .setTableName("A")
>                 .addQueryField("ID", Long.class.getName(), null)
>                 .addQueryField("JID", Long.class.getName(), null)
>                 .addQueryField("VAL", Long.class.getName(), null)
>                 .setKeyFieldName("ID")
>             )));
>
>         IgniteCache<Long, BinaryObject> cacheB = grid(0).createCache(new CacheConfiguration<Long, BinaryObject>()
>             .setCacheMode(CacheMode.REPLICATED)
>             .setName("B")
>             .setSqlSchema("TEST")
>             .setQueryEntities(Collections.singleton(new QueryEntity(Long.class.getName(), "B_VAL")
>                 .setTableName("B")
>                 .addQueryField("ID", Long.class.getName(), null)
>                 .addQueryField("A_JID", Long.class.getName(), null)
>                 .addQueryField("VAL0", String.class.getName(), null)
>                 .setKeyFieldName("ID")
>             )));
>
>         Map<Long, BinaryObject> batch = new HashMap<>();
>
>         for (long i = 0; i < LEFT_CNT; ++i) {
>             batch.put(i, grid(0).binary().builder("A_VAL")
>                 .setField("JID", i % RIGHT_CNT)
>                 .setField("VAL", i)
>                 .build());
>
>             if (batch.size() > 1000) {
>                 cacheA.putAll(batch);
>
>                 batch.clear();
>             }
>         }
>
>         if (batch.size() > 0) {
>             cacheA.putAll(batch);
>
>             batch.clear();
>         }
>
>         for (long i = 0; i < RIGHT_CNT; ++i)
>             cacheB.put(i, grid(0).binary().builder("B_VAL")
>                 .setField("A_JID", i)
>                 .setField("VAL0", String.format("val%03d", i))
>                 .build());
>     }
>
>     /** {@inheritDoc} */
>     @Override protected void afterTest() throws Exception {
>         stopAllGrids();
>
>         super.afterTest();
>     }
>
>     /**
>      * Test local query execution.
>      */
>     @Test
>     public void test() {
>         sql(true, "SELECT * FROM A WHERE A.JID IN (SELECT A_JID FROM B)").getAll();
>     }
>
>     /**
>      * @param enforceJoinOrder Enforce join order mode.
>      * @param sql SQL query.
>      * @param args Query parameters.
>      * @return Results cursor.
>      */
>     private FieldsQueryCursor<List<?>> sql(boolean enforceJoinOrder, String sql, Object... args) {
>         return grid(0).context().query().querySqlFields(new SqlFieldsQuery(sql)
>             .setSchema("TEST")
>             .setLazy(true)
>             .setEnforceJoinOrder(enforceJoinOrder)
>             .setArgs(args), false);
>     }
> }
> {code}
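The results-caching idea suggested in the issue description can be sketched roughly as follows. This is a minimal illustration of memoizing an uncorrelated subquery's result so the nested loop evaluates it only once, not H2's actual internals; the class and method names are assumptions:

```java
import java.util.List;
import java.util.function.Supplier;

/** Memoizes an uncorrelated subquery's result set across nested-loop iterations. */
class CachedSubquery<T> {
    private final Supplier<List<T>> exec; // runs the actual subquery
    private List<T> cached;               // null until the first evaluation

    CachedSubquery(Supplier<List<T>> exec) {
        this.exec = exec;
    }

    /** Returns the cached result, executing the subquery only on the first call. */
    List<T> result() {
        if (cached == null)
            cached = exec.get();

        return cached;
    }
}
```

Because the subquery does not reference the outer row, its result cannot change between iterations, so a condition like `A.JID IN (subquery)` can safely probe the cached list on every outer row instead of re-running the subquery.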



--
This message was sent by Atlassian Jira
(v8.3.4#803005)