Re: org.apache.ignite.IgniteCheckedException: Unknown page IO type: 0

2018-07-03 Thread NO
The problem of nodes rejoining the cluster has been solved. The documentation 
about control.sh is not perfect. I only modified the network-related 
parameters, because client access to the cluster often times out. The 
TcpCommunicationSpi.setDualSocketConnection(boolean) parameter is not found in 
the code. Is the documentation labeling wrong?

For larger values, are there good recommendations for timeouts in the KV 
access mode?






------------------ Original Message ------------------
From: "Denis Mekhanikov";
Date: Tue, Jul 3, 2018, 4:03
To: "user";

Subject: Re: org.apache.ignite.IgniteCheckedException: Unknown page IO type: 0



If you want to bring your cluster into a valid state again, you can remove the 
WAL and db directories and restart the nodes. But you will lose all data in 
this case, obviously.



What configuration properties did you change?


Denis


Tue, Jul 3, 2018 at 4:23, 剑剑 <727418...@qq.com>:


The node did not fail; the problem appeared after I modified the configuration 
and restarted. Now, how do I take the nodes offline correctly and then rejoin 
them to the cluster as new nodes? All nodes in the cluster use the same 
configuration file.

Sent from my iPhone


On Jul 3, 2018, at 00:16, Denis Mekhanikov wrote:


Looks like your persistence files are corrupted. You configured LOG_ONLY WAL 
mode. It doesn't guarantee survival of OS crashes and power failures.
How did you restart your node?


Denis

Mon, Jul 2, 2018 at 16:40, NO <727418...@qq.com>:


When I restart the node, I get the following error.
The problem persists after restarting the machine.



==

[2018-07-02T21:25:52,932][INFO 
][exchange-worker-#190][GridCacheDatabaseSharedManager] Read checkpoint status 
[startMarker=/data3/apache-ignite-persistence/node00-8c6172fa-0543-4b8d-937e-75ac27ba21ff/cp/1530535766680-f62c2aa7-4a26-45ad-b311-5b5e9ddc3f0e-START.bin,
 
endMarker=/data3/apache-ignite-persistence/node00-8c6172fa-0543-4b8d-937e-75ac27ba21ff/cp/1530535612596-2ccb2f7a-9578-44a7-ad29-ff5d6e990ae4-END.bin]
[2018-07-02T21:25:52,933][INFO 
][exchange-worker-#190][GridCacheDatabaseSharedManager] Checking memory state 
[lastValidPos=FileWALPointer [idx=845169, fileOff=32892207, len=7995], 
lastMarked=FileWALPointer [idx=845199, fileOff=43729777, len=7995], 
lastCheckpointId=f62c2aa7-4a26-45ad-b311-5b5e9ddc3f0e]
[2018-07-02T21:25:52,933][WARN 
][exchange-worker-#190][GridCacheDatabaseSharedManager] Ignite node stopped in 
the middle of checkpoint. Will restore memory state and finish checkpoint on 
node start.
[2018-07-02T21:25:52,949][INFO 
][grid-nio-worker-tcp-comm-0-#153][TcpCommunicationSpi] Accepted incoming 
communication connection [locAddr=/10.16.133.187:47100, 
rmtAddr=/10.16.133.186:22315]
[2018-07-02T21:25:53,131][INFO 
][grid-nio-worker-tcp-comm-1-#154][TcpCommunicationSpi] Accepted incoming 
communication connection [locAddr=/10.16.133.187:47100, 
rmtAddr=/10.16.133.185:32502]
[2018-07-02T21:25:56,112][ERROR][exchange-worker-#190][GridDhtPartitionsExchangeFuture]
 Failed to reinitialize local partitions (preloading will be stopped): 
GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=4917, 
minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode 
[id=3c06c945-de21-4b7f-8830-344306327643, addrs=[10.16.133.187, 127.0.0.1], 
sockAddrs=[/127.0.0.1:47500, /10.16.133.187:47500], discPort=47500, order=4917, 
intOrder=2496, lastExchangeTime=1530537954950, loc=true, 
ver=2.4.0#20180305-sha1:aa342270, isClient=false], topVer=4917, 
nodeId8=3c06c945, msg=null, type=NODE_JOINED, tstamp=1530537952291], 
nodeId=3c06c945, evt=NODE_JOINED]
org.apache.ignite.IgniteCheckedException: Unknown page IO type: 0
at 
org.apache.ignite.internal.processors.cache.persistence.tree.io.PageIO.getBPlusIO(PageIO.java:567)
 ~[ignite-core-2.4.0.jar:2.4.0]
at 
org.apache.ignite.internal.processors.cache.persistence.tree.io.PageIO.getPageIO(PageIO.java:478)
 ~[ignite-core-2.4.0.jar:2.4.0]
at 
org.apache.ignite.internal.processors.cache.persistence.tree.io.PageIO.getPageIO(PageIO.java:438)
 ~[ignite-core-2.4.0.jar:2.4.0]
at 
org.apache.ignite.internal.pagemem.wal.record.delta.DataPageInsertFragmentRecord.applyDelta(DataPageInsertFragmentRecord.java:58)
 ~[ignite-core-2.4.0.jar:2.4.0]
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1967)
 ~[ignite-core-2.4.0.jar:2.4.0]
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1827)
 ~[ignite-core-2.4.0.jar:2.4.0]
at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointAndRestoreMemory(GridCacheDatabaseSharedManager.java:725)
 ~[ignite-core-2.4.0.jar:2.4.0]
at 

Re: Partition eviction failed, this can cause grid hang. (Caused by: java.lang.IllegalStateException: Failed to get page IO instance (page content is corrupted))

2018-07-03 Thread akurbanov
Hi Siva,

Could you share the full Ignite logs and configurations for all nodes in your
case, so that we can find the root cause of this issue and reproduce it?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Is it possible to configure Apache Ignite QueryCursor to be autocloseable in the xml configuration file?

2018-07-03 Thread tizh
Hi Igor, 

Thank you very much for the response. I have installed
apache-ignite-fabric-2.6.0.20180703-bin.zip and started a cluster, from the
list of nightly builds here:

https://ci.ignite.apache.org/viewLog.html?buildId=lastSuccessful=Releases_NightlyRelease_RunApacheIgniteNightlyRelease=artifacts=1

Then I ran my program again and got the same error: 


Thanks again, 
Best regards, 
Ti



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Metrics for IgniteDataStreamer

2018-07-03 Thread vbm
Hi

As part of our POC, we want to compare ingestion into Ignite using
Kafka Connect and the Ignite Data Streamer.

For this comparison, which metrics can we monitor to measure
performance?


Regards,
Vishwas



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: SqlFieldsQuery Cannot create inner bean 'org.apache.ignite.cache.query.SqlFieldsQuery

2018-07-03 Thread Ilya Kasnacheev
Hello!

I'm not aware of such parameters.

Regards,

-- 
Ilya Kasnacheev

2018-07-03 19:18 GMT+03:00 ApacheUser :

> Thanks Ilya,
> Appreciate your help,
>
> Is there any parameter in the config file to control the number of rows or
> the amount of resources a client connection can use, and to disconnect the
> client if it exceeds them?
>
> thanks
> Bhaskar
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: SqlFieldsQuery Cannot create inner bean 'org.apache.ignite.cache.query.SqlFieldsQuery

2018-07-03 Thread ApacheUser
Thanks Ilya,
Appreciate your help, 

Is there any parameter in the config file to control the number of rows or
the amount of resources a client connection can use, and to disconnect the
client if it exceeds them?

thanks
Bhaskar



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Deadlock during cache loading

2018-07-03 Thread breischl
Also, I probably should have mentioned this earlier, but we're not using the
WAL or any disk persistence. So everything should be in memory, and generally
on-heap. I think that makes it less likely that we were blocked on the raw
throughput of some hardware or virtual hardware.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Why pageEvictionMode isn't work

2018-07-03 Thread Michaelikus
The same thing happens when I try to write a single record.

private static void putGet(IgniteCache<Integer, String> cache) throws
    IgniteException {
    Random rndGen = new Random();
    final int keyCnt = 1;
    int bulkNum;

    for (int i = 0; i < keyCnt; i++) {
        bulkNum = rndGen.nextInt(1000);
        cache.put(i, "bulk-" + i + ".Value=" + (bulkNum * keyCnt + i));
        System.out.println(">>> Bulk #" + bulkNum + " [" + i + "] - stored
            in cache.");
    }
    System.out.println(">>> Stored values in cache.");
}


Exception in thread "main"
org.apache.ignite.cache.CachePartialUpdateException: Failed to update keys
(retry update if possible).: [339]
at
org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1278)
at
org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.cacheException(IgniteCacheProxyImpl.java:1673)
at
org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1029)
at
org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:886)
at mlg.test.Main.putGet(Main.java:171)
at mlg.test.Main.main(Main.java:125)
Caused by: class
org.apache.ignite.internal.processors.cache.CachePartialUpdateCheckedException:
Failed to update keys (retry update if possible).: [339]
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.onPrimaryError(GridNearAtomicAbstractUpdateFuture.java:397)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.onPrimaryResponse(GridNearAtomicSingleUpdateFuture.java:253)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateResponse(GridDhtAtomicCache.java:3073)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$500(GridDhtAtomicCache.java:130)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:285)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:280)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
at
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
at
org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
at
org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
at
org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
at
org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
at
org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
at java.lang.Thread.run(Thread.java:748)
Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to
update keys on primary node.
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.UpdateErrors.addFailedKeys(UpdateErrors.java:124)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicUpdateResponse.addFailedKeys(GridNearAtomicUpdateResponse.java:342)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1785)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1628)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3055)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:130)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:266)
at
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:261)
... 12 more

Re: Why pageEvictionMode isn't work

2018-07-03 Thread Denis Mekhanikov
Hi!

Looks like page eviction works incorrectly when you try to insert a lot of
data at once in a batch.
I created a JIRA ticket for this issue:
https://issues.apache.org/jira/browse/IGNITE-8917
Thank you for the report!

When I replaced putAll with singular puts, Ignite stopped failing with
IgniteOOM.
You can use this approach as a workaround.
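
A rough sketch of that workaround is below. The BiConsumer indirection is my addition so the loop can be exercised without a running cluster; with a real cache the sink would simply be cache::put:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

public class PutWorkaround {
    /** Stores each entry of the batch with an individual put instead of one bulk putAll. */
    static <K, V> void putIndividually(Map<K, V> batch, BiConsumer<K, V> put) {
        for (Map.Entry<K, V> e : batch.entrySet())
            put.accept(e.getKey(), e.getValue());
    }

    public static void main(String[] args) {
        Map<Integer, String> batch = new HashMap<>();
        for (int i = 0; i < 10; i++)
            batch.put(i, "value-" + i);

        // With a real IgniteCache this would be: putIndividually(batch, cache::put);
        Map<Integer, String> sink = new HashMap<>();
        putIndividually(batch, sink::put);
        System.out.println("Stored " + sink.size() + " entries one by one.");
    }
}
```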

Denis

Tue, Jul 3, 2018 at 12:11, Michaelikus :

> Hi everybody!
>
> There is a 2.4 cluster consisting of 4 nodes, with the following data region
> and cache:
>
> *DATA REGION:*
>
> <property name="dataStorageConfiguration">
>     <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
>         <property name="dataRegionConfigurations">
>             <list>
>                 <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
>                     <property name="name" value="NotPersistRegion"/>
>                     <property name="maxSize" value="#{1L*1024*1024*1024}"/>
>                     <property name="pageEvictionMode" value="RANDOM_2_LRU"/>
>                 </bean>
>             </list>
>         </property>
>     </bean>
> </property>
>
>
>
> *CACHE:*
>
> <property name="cacheConfiguration">
>     <list>
>         <bean class="org.apache.ignite.configuration.CacheConfiguration">
>             <property name="name" value="InstagramUserId2UserName"/>
>             <property name="partitionLossPolicy" value="READ_WRITE_SAFE"/>
>             <property name="dataRegionName" value="NotPersistRegion"/>
>         </bean>
>     </list>
> </property>
>
>
>
> *Java class that writes to the cache:*
>
> private static void putAllGetAll(IgniteCache<Integer, String> cache)
>     throws IgniteException {
>     System.out.println();
>     System.out.println(">>> Starting putAll-getAll example.");
>
>     final int keyCnt = 10;
>
>     // Create batch.
>     for (int bulkNum = 0; bulkNum < 1000; bulkNum++) {
>         Map<Integer, String> batch = new HashMap<>();
>
>         for (int i = 0; i < keyCnt; i++) {
>             batch.put(bulkNum * keyCnt + i, "bulk-" + bulkNum +
>                 ".Value=" + (bulkNum * keyCnt + i));
>         }
>         System.out.println(">>> Bulk #" + bulkNum + " - prepared.");
>
>         // Bulk-store entries in cache.
>         cache.putAll(batch);
>         System.out.println(">>> Bulk #" + bulkNum + " - stored in
>             cache.");
>         batch.clear();
>     }
> }
>
> In parallel, I ran a task that reads random keys from the same cache:
>
> Random rndGen = new Random();
> int keyCnt = 9000;
> while (true) {
>     int rndKey = rndGen.nextInt(keyCnt);
>     String randomKeyFromCache = cache.get(rndKey);
>     System.out.println("Got [key=" + rndKey + ", val=" + randomKeyFromCache
>         + ']');
> }
>
>
> *And error:*
>
> Exception in thread "main"
> org.apache.ignite.cache.CachePartialUpdateException: Failed to update keys
> (retry update if possible).: [23700331, 23700330, 23700332, 23700322,
> 23700345, 23700351, 23700337, 23700338, 23700342, 23700300]
> at
>
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.onPrimaryError(GridNearAtomicAbstractUpdateFuture.java:397)
> at
>
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicUpdateFuture.onPrimaryResponse(GridNearAtomicUpdateFuture.java:416)
> at
>
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateResponse(GridDhtAtomicCache.java:3073)
> at
>
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$500(GridDhtAtomicCache.java:130)
> at
>
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:285)
> at
>
> org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:280)
> at
>
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
> at
>
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
> at
>
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
> at
>
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
> at
>
> org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
> at
>
> org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
> at
>
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
> at
>
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
> at
>
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
> at
>
> 

Re: Re-post: java.io.IOException: Too many open files

2018-07-03 Thread ilya.kasnacheev
Hello!

I have tried to reproduce your case, but I don't observe any growth in the
number of open file descriptors on the Ignite side.

I think the problem here is on the Go side. Please make sure to always close
a connection if you open it. If your program terminates, this is not so
critical, but if you create connections in a loop it can become a problem.

Also, socket:[2322160] is not necessarily a UNIX socket; it is most often a
TCP socket as well.

I also recommend changing '500 microseconds' to '500 milliseconds', because
there's not much you can expect to happen in half a millisecond, or
one two-thousandth of a second. Especially over a network.

Regards,



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Getting an exception when listing partitions of IgniteDataFrame

2018-07-03 Thread Ramakrishna Nalam
Hi,

Seeking help from someone on the group who has seen this error before, or
who can help me figure out what the issue is.

I'm trying to get Ignite's Spark integration working with Spark 2.2.1.
The Ignite version I'm using is 2.5.0. I'm running Ignite co-located with
the Spark workers.

I have been able to successfully load data into Ignite using the snippet:

val igniteDf = spark.read
  .format(IgniteDataFrameSettings.FORMAT_IGNITE)
  .option(IgniteDataFrameSettings.OPTION_TABLE, "tmp_table")
  .option(IgniteDataFrameSettings.OPTION_CONFIG_FILE, "/home/hadoop/ignite-config.xml")
  .load

the command:

igniteDf.select("_c1").show()

works as expected. Other simple commands / functions like count(), max()
etc. also work well.

But when I try:

igniteDf.rdd.partitions

There is an exception thrown, with the following stacktrace:

java.lang.IllegalArgumentException: requirement failed:
partitions(0).partition == 4, but it should equal 0
  at scala.Predef$.require(Predef.scala:224)
  at
org.apache.spark.rdd.RDD$$anonfun$partitions$2$$anonfun$apply$3.apply(RDD.scala:254)
  at
org.apache.spark.rdd.RDD$$anonfun$partitions$2$$anonfun$apply$3.apply(RDD.scala:253)
  at
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
  at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
  at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
  at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
  at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
  ... 58 elided

It's a bit surprising, because it looks like the code that is invoked is
the getPartitions function of IgniteRDD, which should just return
Array(IgnitePartition(0), IgnitePartition(1), ... and so on).

I have compiled the ignite-spark module from GitHub using the latest commit
and used that as well, but the same error persists.

Will be happy to provide any additional information that will help zero in
on the issue!


Thanks,
Rama.


Re: SqlFieldsQuery Cannot create inner bean 'org.apache.ignite.cache.query.SqlFieldsQuery

2018-07-03 Thread Ilya Kasnacheev
Hello!

Unfortunately, I'm not aware of any practices that will make Ignite (or
indeed any DB) foolproof. I know that people use human DBAs for that
purpose, who control which statements are allowed to run in production
and which aren't.

Regards,

-- 
Ilya Kasnacheev

2018-07-02 21:01 GMT+03:00 ApacheUser :

> Thanks Ilya,
> Could you share any guidelines to control GROUP BY? For example, dedicated
> client nodes for connectivity from Tableau and SQL?
> Thanks
> Bhaskar
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Partition eviction failed, this can cause grid hang. (Caused by: java.lang.IllegalStateException: Failed to get page IO instance (page content is corrupted))

2018-07-03 Thread siva
Hi,

We are also facing the same issue; we recently upgraded from 2.3 to 2.5.

We are running 3 server nodes and 1 client node (using both native
persistence and a cache store); all 3 server nodes are in the baseline
topology.

Baseline topology:
node00, node01, node02

Some time after the nodes start, *node00* disconnects and then the entire
cluster hangs, throwing an exception.

After *node00* disconnects, nothing works; even when I try to remove
*node00* from the baseline topology, it throws an exception like "failed to
connect to cluster".





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Is it possible to configure Apache Ignite QueryCursor to be autocloseable in the xml configuration file?

2018-07-03 Thread Igor Sapego
Are you on Linux or Windows?

Can you try a nightly build [1] to check if the issue persists?

[1] -
https://ci.ignite.apache.org/project.html?projectId=Releases_NightlyRelease

Best Regards,
Igor


On Mon, Jul 2, 2018 at 8:02 PM tizh  wrote:

> Hi Igor,
> I am using version 2.5.0. Thanks
> Ti
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Thin client doesn't support Expiry Policies.

2018-07-03 Thread Igor Sapego
I'm not sure I understand you, but an expiry policy is a cache
configuration attribute and does not relate to the client.
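
For reference, on the server side an expiry policy is attached through the cache configuration. A minimal Spring XML sketch follows; the cache name and the five-minute duration are arbitrary examples:

```xml
<property name="cacheConfiguration">
    <bean class="org.apache.ignite.configuration.CacheConfiguration">
        <property name="name" value="myExpiringCache"/>
        <!-- Entries expire five minutes after creation. -->
        <property name="expiryPolicyFactory">
            <bean class="javax.cache.expiry.CreatedExpiryPolicy" factory-method="factoryOf">
                <constructor-arg>
                    <bean class="javax.cache.expiry.Duration">
                        <constructor-arg value="MINUTES"/>
                        <constructor-arg value="5"/>
                    </bean>
                </constructor-arg>
            </bean>
        </property>
    </bean>
</property>
```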

What do you need exactly? Do you want to be able to create
caches dynamically with expiry policy using thin client?

Best Regards,
Igor


On Mon, Jul 2, 2018 at 9:13 PM ysc751206 
wrote:

> Hi,
>
> We are currently using Ignite 2.1/C# in our application and are considering
> Ignite 2.5/C# because of the thin client feature. But we notice that the
> thin client doesn't support caching values with an expiry policy. We use
> that feature heavily in our application.
>
> May I ask if you are going to support that in the near future or if there
> is
> any workaround about this?
>
> Thanks
>
> Edison
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Affinity calls in stream receiver

2018-07-03 Thread Denis Mekhanikov
David,

So, the problem is that the same class is loaded multiple times and it
wastes the metaspace, right?
Could you share a reproducer?

Denis

Tue, Jul 3, 2018 at 0:58, David Harvey :

> We have a custom stream receiver that makes affinity calls. This all
> functions properly, but we see a very large number of the following
> messages for the same two classes. We also just tripped a 2GB limit on
> Metaspace size, which we had come close to in the past.
>
> [18:41:50,365][INFO][pub-#6954%GridGainTrial%][GridDeploymentPerVersionStore]
> Class was deployed in SHARED or CONTINUOUS mode: class
> com.IgniteCallable
>
> So these affinity calls need to load classes that were loaded from client
> nodes, which may be related to why this is happening, but my primary suspect
> is the fact that both classes are nested. (I had previously hit an issue
> where setting the peer-class-loading "userVersion" would cause Ignite to
> throw exceptions when the client node attempted to activate the cluster.
> In that case, the Ignite call into the cluster was also using a nested
> class.)
>
> We will try flattening these classes to see if the problem goes away.
>
>


Re: org.apache.ignite.IgniteCheckedException: Unknown page IO type: 0

2018-07-03 Thread Denis Mekhanikov
If you want to bring your cluster into a valid state again, you can remove
the WAL and db directories and restart nodes.
But you will lose all data in this case, obviously.
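
For reference, a minimal sketch of wiping those directories from Java is below. The paths are placeholders based on the log in this thread (adjust them to your actual work-directory layout), and all nodes must be stopped before deleting anything:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

public class WipePersistence {
    /** Recursively deletes a directory tree, children first; a no-op if it doesn't exist. */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir))
            return;
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order so files are deleted before their parent directories.
            paths.sorted(Comparator.reverseOrder())
                 .forEach(p -> {
                     try {
                         Files.delete(p);
                     }
                     catch (IOException e) {
                         throw new UncheckedIOException(e);
                     }
                 });
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder paths: substitute your configured persistence and WAL locations.
        deleteRecursively(Paths.get("/data3/apache-ignite-persistence"));
        deleteRecursively(Paths.get("/data3/apache-ignite-wal"));
        System.out.println("Persistence directories removed; restart the nodes to re-form the cluster.");
    }
}
```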

What configuration properties did you change?

Denis

Tue, Jul 3, 2018 at 4:23, 剑剑 <727418...@qq.com>:

> The node did not fail; the problem appeared after I modified the
> configuration and restarted. Now, how do I take the nodes offline correctly
> and then rejoin them to the cluster as new nodes? All nodes in the cluster
> use the same configuration file.
>
> Sent from my iPhone
>
> On Jul 3, 2018, at 00:16, Denis Mekhanikov wrote:
>
> Looks like your persistence files are corrupted.
> You configured *LOG_ONLY* WAL mode. It doesn't guarantee survival of OS
> crashes and power failures.
> How did you restart your node?
>
> Denis
>
> Mon, Jul 2, 2018 at 16:40, NO <727418...@qq.com>:
>
>> When I restart the node, I get the following error.
>> The problem persists after restarting the machine.
>>
>> ==
>> [2018-07-02T21:25:52,932][INFO
>> ][exchange-worker-#190][GridCacheDatabaseSharedManager] Read checkpoint
>> status
>> [startMarker=/data3/apache-ignite-persistence/node00-8c6172fa-0543-4b8d-937e-75ac27ba21ff/cp/1530535766680-f62c2aa7-4a26-45ad-b311-5b5e9ddc3f0e-START.bin,
>> endMarker=/data3/apache-ignite-persistence/node00-8c6172fa-0543-4b8d-937e-75ac27ba21ff/cp/1530535612596-2ccb2f7a-9578-44a7-ad29-ff5d6e990ae4-END.bin]
>> [2018-07-02T21:25:52,933][INFO
>> ][exchange-worker-#190][GridCacheDatabaseSharedManager] Checking memory
>> state [lastValidPos=FileWALPointer [idx=845169, fileOff=32892207,
>> len=7995], lastMarked=FileWALPointer [idx=845199, fileOff=43729777,
>> len=7995], lastCheckpointId=f62c2aa7-4a26-45ad-b311-5b5e9ddc3f0e]
>> [2018-07-02T21:25:52,933][WARN
>> ][exchange-worker-#190][GridCacheDatabaseSharedManager] Ignite node stopped
>> in the middle of checkpoint. Will restore memory state and finish
>> checkpoint on node start.
>> [2018-07-02T21:25:52,949][INFO
>> ][grid-nio-worker-tcp-comm-0-#153][TcpCommunicationSpi] Accepted incoming
>> communication connection [locAddr=/10.16.133.187:47100, rmtAddr=/
>> 10.16.133.186:22315]
>> [2018-07-02T21:25:53,131][INFO
>> ][grid-nio-worker-tcp-comm-1-#154][TcpCommunicationSpi] Accepted incoming
>> communication connection [locAddr=/10.16.133.187:47100, rmtAddr=/
>> 10.16.133.185:32502]
>> [2018-07-02T21:25:56,112][ERROR][exchange-worker-#190][GridDhtPartitionsExchangeFuture]
>> Failed to reinitialize local partitions (preloading will be stopped):
>> GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=4917,
>> minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode
>> [id=3c06c945-de21-4b7f-8830-344306327643, addrs=[10.16.133.187, 127.0.0.1],
>> sockAddrs=[/127.0.0.1:47500, /10.16.133.187:47500], discPort=47500,
>> order=4917, intOrder=2496, lastExchangeTime=1530537954950, loc=true,
>> ver=2.4.0#20180305-sha1:aa342270, isClient=false], topVer=4917,
>> nodeId8=3c06c945, msg=null, type=NODE_JOINED, tstamp=1530537952291],
>> nodeId=3c06c945, evt=NODE_JOINED]
>> org.apache.ignite.IgniteCheckedException: Unknown page IO type: 0
>> at
>> org.apache.ignite.internal.processors.cache.persistence.tree.io.PageIO.getBPlusIO(PageIO.java:567)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.tree.io.PageIO.getPageIO(PageIO.java:478)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.tree.io.PageIO.getPageIO(PageIO.java:438)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.pagemem.wal.record.delta.DataPageInsertFragmentRecord.applyDelta(DataPageInsertFragmentRecord.java:58)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1967)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.restoreMemory(GridCacheDatabaseSharedManager.java:1827)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readCheckpointAndRestoreMemory(GridCacheDatabaseSharedManager.java:725)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initCachesOnLocalJoin(GridDhtPartitionsExchangeFuture.java:741)
>> ~[ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:626)
>> [ignite-core-2.4.0.jar:2.4.0]
>> at
>> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2337)
>> [ignite-core-2.4.0.jar:2.4.0]
>> at
>> 

Re: How long Ignite retries upon NODE_FAILED events

2018-07-03 Thread Evgenii Zhuravlev
Can you share the logs?

2018-07-02 20:54 GMT+03:00 HEWA WIDANA GAMAGE, SUBASH <
subash.hewawidanagam...@fmr.com>:

> OK, I did the following PoC real quick:
>
> 1. Two nodes started and joined; topology snapshot servers=2.
>
> 2. On one node, I blocked the Ignite ports (47500, 47100, etc.).
>
> 3. Then, after failureDetectionTimeout, each node logged NODE_FAILED and a
> topology snapshot with servers=1.
>
> 4. Then, after 10-15 seconds, I unblocked those ports.
>
> 5. Then, after a few seconds, both nodes logged a node-joined event and a
> topology snapshot with servers=2.
>
> So it's the same node ID, because the JVM is still up and running. And it
> looks like the cluster doesn't forget it.
>
> Can this "10-15 seconds" be any length of time? Even if the node comes back
> in 1-2 hours, can it rejoin?
>
>
>
>
>
>
>
>
>
> *From:* Evgenii Zhuravlev [mailto:e.zhuravlev...@gmail.com]
> *Sent:* Monday, July 02, 2018 1:25 PM
> *To:* user@ignite.apache.org
> *Subject:* Re: How long Ignite retries upon NODE_FAILED events
>
>
>
> If the cluster has already decided that the node failed, the node will be
> stopped after it tries to reconnect to the cluster with the same ID.
>
>
>
> 2018-07-02 18:37 GMT+03:00 HEWA WIDANA GAMAGE, SUBASH <
> subash.hewawidanagam...@fmr.com>:
>
> Yes, failureDetectionTimeout determines how long Ignite waits before marking
> a node failed. But my question is: after such a node failure has happened,
> what happens when the failed node becomes reachable on the network again (in
> less than failureDetectionTimeout)?
>
>
>
> *From:* Evgenii Zhuravlev [mailto:e.zhuravlev...@gmail.com]
> *Sent:* Monday, July 02, 2018 11:05 AM
> *To:* user@ignite.apache.org
> *Subject:* Re: How long Ignite retries upon NODE_FAILED events
>
>
>
> Hi,
>
>
>
> By default, Ignite uses a failure-detection mechanism that can be configured
> using failureDetectionTimeout:
> https://apacheignite.readme.io/v2.5/docs/tcpip-discovery#section-failure-detection-timeout
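
In Spring XML, that timeout is a single property on IgniteConfiguration; the 30-second value below is just an illustration (the default is 10 seconds):

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Time (in ms) after which an unreachable node is considered failed. -->
    <property name="failureDetectionTimeout" value="30000"/>
</bean>
```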
>
>
>
> Evgenii
>
>
>
> 2018-07-02 16:40 GMT+03:00 HEWA WIDANA GAMAGE, SUBASH <
> subash.hewawidanagam...@fmr.com>:
>
> Hi team,
>
>
>
> For example, let's say one of the nodes is not down (the JVM is up), but the
> network is not reachable from/to it. Then the rest of the nodes will see
> NODE_FAILED and start working as normal with a reduced cluster size. If the
> network from/to that failed node becomes normal again after X minutes, then:
>
> - Will the other nodes discover it, or will that node be able to figure it
> out?
>
> - How long can X be at most? Is there a max retry count or timeout? (I've
> seen the joinTimeout param in discovery, but that seems only applicable at
> startup, i.e., how long a starting node should wait to join the others.)
>
>
>
>
>