Re: Communication between Ignite Clusters

2020-10-14 Thread steve.hostettler
Thanks a lot!



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Communication between Ignite Clusters

2020-10-14 Thread steve.hostettler
Thanks a lot for your answer.

So you would use Kafka, for instance, but then how would you push the data? Using a
data streamer to balance the data onto the right nodes of the second cluster?
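As a sketch of the pattern being asked about, this is roughly what a Kafka-to-streamer bridge could look like. All names here (the "entities" cache, the topic, the key/value types) are illustrative, not from the thread, and the snippet assumes an already-subscribed Kafka consumer and a handle on the second cluster:

```java
import java.time.Duration;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ClusterBridge {
    // Drain records from a Kafka topic and stream them into a cache on the
    // second cluster; the data streamer batches entries and routes each one
    // to its primary node according to the cache's affinity function.
    public static void pump(Ignite secondCluster, KafkaConsumer<String, byte[]> consumer) {
        try (IgniteDataStreamer<String, byte[]> streamer = secondCluster.dataStreamer("entities")) {
            streamer.allowOverwrite(true); // update existing keys instead of skipping them
            while (!Thread.currentThread().isInterrupted()) {
                for (ConsumerRecord<String, byte[]> rec : consumer.poll(Duration.ofMillis(500)))
                    streamer.addData(rec.key(), rec.value());
                streamer.flush();      // push the current batch before committing offsets
                consumer.commitSync();
            }
        }
    }
}
```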





Re: Communication between Ignite Clusters

2020-10-14 Thread steve.hostettler
Hello and thank you for your answer.

Assume we have two clusters (e.g., because we want to scale them
independently or upgrade them at a different pace) that manipulate
different datasets, with the output of the first cluster being the input of the
second one.







Communication between Ignite Clusters

2020-10-14 Thread steve.hostettler
Hello,

I would like to know the best way to integrate two different Apache
Ignite clusters.
I can see the Ignite client and the JDBC integration.
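The first option, connecting to the other cluster as a client, can be sketched with the thin client API (available since Ignite 2.5). The address and cache name below are assumptions for illustration only:

```java
import org.apache.ignite.Ignition;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;

public class SecondClusterClient {
    public static void main(String[] args) {
        // Hypothetical address of a node of the other cluster (default thin-client port 10800).
        ClientConfiguration cfg = new ClientConfiguration()
            .setAddresses("second-cluster-host:10800");

        // The thin client joins no topology; it just talks to the remote cluster over TCP.
        try (IgniteClient client = Ignition.startClient(cfg)) {
            ClientCache<String, String> cache = client.getOrCreateCache("sharedResults");
            cache.put("k1", "v1");
            System.out.println(cache.get("k1"));
        }
    }
}
```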






Optimise replicated caches in multinodes

2020-08-21 Thread steve.hostettler
Hello,

in order to provide "easy" multinode support, we have set all of our caches to
REPLICATED and FULL_ASYNC. We get an improvement (around 20%) with a second
node, but it is very clear that somehow the grid is waiting (95% CPU
utilization with 1 node, 75% with 2 nodes).

I understand that I should rather go for partitioned caches, but I am trying
here to provide an "easy" first step towards multinode without rewriting
the logic.
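For reference, the setup described above corresponds to a cache declaration along these lines (Spring XML sketch; the cache name is illustrative):

```xml
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <property name="name" value="myReplicatedCache"/>
    <!-- Every node keeps a full copy of the data. -->
    <property name="cacheMode" value="REPLICATED"/>
    <!-- Puts return as soon as the primary copy is updated;
         backup copies on other nodes are updated asynchronously. -->
    <property name="writeSynchronizationMode" value="FULL_ASYNC"/>
</bean>
```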

Looking at the statistics, I get the following:

GridNearAtomicSingleUpdateRequest=766,197
GridDhtAtomicDeferredUpdateResponse=8,634
GridDhtAtomicSingleUpdateRequest=1,681,590
GridNearAtomicUpdateResponse=836,598

I assume that these are linked to the puts into the replicated caches that are
asynchronously sent to the other node.

I would like to understand these numbers, and the differences between them, a
bit better.
Another question: is there a way to increase the batching of messages on the
grid, especially for replicated caches?

Thanks a lot for your help





Re: Kubernetes : Connection timed out (Connection timed out)

2020-07-29 Thread steve.hostettler
Hello Dennis,

so it works again; it was apparently a transient Azure problem when creating
new clusters.
FYI, the documentation and
https://github.com/apache/ignite/tree/master/modules/kubernetes/config/az
are not up to date and gave errors when executed (from a K8S
perspective).





Re: Kubernetes : Connection timed out (Connection timed out)

2020-07-29 Thread steve.hostettler
Hello Denis,

same code, same config; today it works, so it was a transient AKS thing. No
idea what changed.

Best Regards





Re: Kubernetes : Connection timed out (Connection timed out)

2020-07-28 Thread steve.hostettler
Running

 curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -H
"Authorization: Bearer $(cat
/var/run/secrets/kubernetes.io/serviceaccount/token)" -H "Accept:
application/json"
https://kubernetes.default.svc.cluster.local:443/api/v1/namespaces/default/endpoints/processing-engine-pe-v1-ignite

works on my local K8S but not on Azure, where I end up with a timeout.

Any idea?





Re: Kubernetes : Connection timed out (Connection timed out)

2020-07-28 Thread steve.hostettler
So it appears that, since this morning, I am not able to access the cluster
from within a pod in Azure. Very, very strange.

/ # wget http://kubernetes.default.svc.cluster.local:443/api
Connecting to kubernetes.default.svc.cluster.local:443 (10.0.0.1)





Re: Kubernetes : Connection timed out (Connection timed out)

2020-07-28 Thread steve.hostettler
Hello Jose, 

thanks for the answer. Are you on Azure as well, by any chance?





Re: Kubernetes : Connection timed out (Connection timed out)

2020-07-28 Thread steve.hostettler
Hello,

I am making some progress:
I added the missing

hostNetwork: true

and now I get

Caused by: java.net.UnknownHostException:
kubernetes.default.svc.cluster.local
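A common cause of exactly this UnknownHostException is that hostNetwork: true switches the pod to the node's DNS configuration, so cluster-internal names such as kubernetes.default.svc.cluster.local stop resolving. Kubernetes provides dnsPolicy: ClusterFirstWithHostNet for this case; a sketch of the relevant pod-spec fragment:

```yaml
# Pod spec fragment: keep cluster DNS resolution when hostNetwork is enabled.
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet
```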





Kubernetes : Connection timed out (Connection timed out)

2020-07-28 Thread steve.hostettler
Hello,

Since this morning I get a strange timeout on the connection to the cluster,
even with only one node.
It used to work just fine until yesterday evening. At that point I destroyed
the cluster and recreated it this morning.

I do so with a batch script, so it is supposed to be the same every time. That is
what I do not understand: I cannot see what changed to cause the problem.
Here are some logs.

Any idea?

2020-07-28 09:55:44,105 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Node version to set: 2.7.5#20190603-sha1:be4f2a15
2020-07-28 09:55:44,117 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Using parameter [localHost=0.0.0.0]
2020-07-28 09:55:44,117 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Using parameter [localPort=47500]
2020-07-28 09:55:44,117 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Using parameter [localPortRange=0]
2020-07-28 09:55:44,118 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Using parameter [threadPri=10]
2020-07-28 09:55:44,118 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Using parameter [failureDetectionTimeout=1]
2020-07-28 09:55:44,118 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Using parameter [ipFinder=TcpDiscoveryIpFinderAdapter
[shared=true]]
2020-07-28 09:55:44,118 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Using parameter [ipFinderCleanFreq=6]
2020-07-28 09:55:44,118 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Using parameter [metricsUpdateFreq=2000]
2020-07-28 09:55:44,118 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Using parameter [statsPrintFreq=0]
2020-07-28 09:55:44,122 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Registered SPI MBean:
org.apache:clsLdr=407b70da,igniteInstanceName=com.wolterskluwer.processing,group=SPIs,name=TcpDiscoverySpi
2020-07-28 09:55:44,133 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Connection check threshold is calculated: 1
2020-07-28 09:55:44,134 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-msg-worker-#2%com.wolterskluwer.processing%) Grid runnable
started: tcp-disco-msg-worker
2020-07-28 09:55:44,135 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-msg-worker-#2%com.wolterskluwer.processing%) Message worker
started [locNodeId=32c22aa5-430c-4352-a8d3-55e0a7b2671e]
2020-07-28 09:55:44,135 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Successfully bound to TCP port [port=47500,
localHost=0.0.0.0/0.0.0.0, locNodeId=32c22aa5-430c-4352-a8d3-55e0a7b2671e]
2020-07-28 09:55:44,137 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 11) Local node initialized: TcpDiscoveryNode
[id=32c22aa5-430c-4352-a8d3-55e0a7b2671e, addrs=[10.244.0.8, 127.0.0.1],
sockAddrs=[processing-engine-pe-v1.master-864fb658c6-sdw56/10.244.0.8:47500,
/127.0.0.1:47500], discPort=47500, order=0, intOrder=0,
lastExchangeTime=1595930144127, loc=true, ver=2.7.5#20190603-sha1:be4f2a15,
isClient=false]
2020-07-28 09:55:44,138 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-srvr-#3%com.wolterskluwer.processing%) Grid runnable started:
tcp-disco-srvr
2020-07-28 09:55:44,149 DEBUG
[org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder]
(ServerService Thread Pool -- 11) Getting Apache Ignite endpoints from:
https://kubernetes.default.svc.cluster.local:443/api/v1/namespaces/default/endpoints/processing-engine-pe-v1-ignite
2020-07-28 09:55:50,187 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-msg-worker-#2%com.wolterskluwer.processing%) Message has been
added to queue: TcpDiscoveryStatusCheckMessage [creatorNode=TcpDiscoveryNode
[id=32c22aa5-430c-4352-a8d3-55e0a7b2671e, addrs=[10.244.0.8, 127.0.0.1],
sockAddrs=[processing-engine-pe-v1.master-864fb658c6-sdw56/10.244.0.8:47500,
/127.0.0.1:47500], discPort=47500, order=0, intOrder=0,
lastExchangeTime=1595930144127, loc=true, ver=2.7.5#20190603-sha1:be4f2a15,
isClient=false], failedNodeId=null, status=0,
super=TcpDiscoveryAbstractMessage [sndNodeId=null,
id=44e59d49371-32c22aa5-430c-4352-a8d3-55e0a7b2671e, verifierNodeId=null,
topVer=0, pendingIdx=0, failedNodes=null, isClient=false]]
2020-07-28 09:55:50,187 DEBUG
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-msg-worker-#2%com.wolterskluwer.processing%) Processing message
[cls=TcpDiscoveryStatusCheckMessage,
id=44e59d49371-32c22aa5-430c-4352-a8d3-55e0a7b2671e]

Re: How to evaluate memory consumption of indexes

2020-07-21 Thread steve.hostettler
Thanks for the answer; I did not think of using the persistence to assess the
size :)





How to evaluate memory consumption of indexes

2020-07-20 Thread steve.hostettler
Hello,

after adding a new index, I see a surge in off-heap consumption
(+10GB), but since it is a small index, I am surprised by it. Is there a way
to get more details about what is consuming off-heap memory?

Thanks in advance





Re: Ignite on AKS and RBAC issue

2020-07-13 Thread steve.hostettler
I found my mistake: it was indeed a glitch in the deployment YAML, as I forgot
to specify the service account.
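For anyone hitting the same 403: the missing piece described here is the serviceAccountName field in the deployment's pod template, e.g. (names taken from the earlier posts in this thread):

```yaml
# Deployment pod template fragment: run the Ignite pods under the 'ignite'
# service account that the ClusterRoleBinding grants endpoint/pod access to.
spec:
  template:
    spec:
      serviceAccountName: ignite
```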





Re: Ignite on AKS and RBAC issue

2020-07-10 Thread steve.hostettler
Hello Alex, thanks for the tip, but putting everything in the ignite namespace
does not help.
I also rechecked the documentation. I still get the 403.

Additional question: how do the service account and the service relate?

So I have a

1) service account
kubectl describe serviceaccount ignite -n ignite
Name:        ignite
Namespace:   ignite
Labels:  app.kubernetes.io/managed-by=Helm
Annotations: meta.helm.sh/release-name: pe-v1
 meta.helm.sh/release-namespace: ignite
Image pull secrets:  
Mountable secrets:   ignite-token-htqrp
Tokens:  ignite-token-htqrp
Events:  

2) a clusterrole
kubectl describe clusterrole ignite -n ignite
Name: ignite
Labels:   app.kubernetes.io/managed-by=Helm
  release=pe-v1
Annotations:  meta.helm.sh/release-name: pe-v1
  meta.helm.sh/release-namespace: ignite
PolicyRule:
  Resources  Non-Resource URLs  Resource Names  Verbs
  ---------  -----------------  --------------  -----
  endpoints  []                 []              [get list watch]
  pods       []                 []              [get list watch]

3) a clusterrolebinding
kubectl describe clusterrolebinding ignite -n ignite
Name: ignite
Labels:   app.kubernetes.io/managed-by=Helm
  release=pe-v1
Annotations:  meta.helm.sh/release-name: pe-v1
  meta.helm.sh/release-namespace: ignite
Role:
  Kind:  ClusterRole
  Name:  ignite
Subjects:
  Kind            Name    Namespace
  ----            ----    ---------
  ServiceAccount  ignite  ignite

4) a service
kubectl describe svc processing-engine-pe-v1-ignite -n ignite
Name:  processing-engine-pe-v1-ignite
Namespace: ignite
Labels:app.kubernetes.io/managed-by=Helm
Annotations:   meta.helm.sh/release-name: pe-v1
   meta.helm.sh/release-namespace: ignite
Selector:  type=processing-engine-pe-v1.node
Type:              ClusterIP
IP:                None
Port:              service-discovery  47500/TCP
TargetPort:        47500/TCP
Endpoints:         10.244.0.34:47500,10.244.1.31:47500
Session Affinity:  None
Events:

But somehow I still get a 403
2020-07-10 22:08:51,837 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 15) Successfully bound to TCP port [port=47500,
localHost=0.0.0.0/0.0.0.0, locNodeId=c651239a-2964-4b8b-915b-c055bcf410ed]
2020-07-10 22:08:52,029 ERROR
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi] (ServerService Thread
Pool -- 15) Failed to get registered addresses from IP finder on start
(retrying every 2000ms; change 'reconnectDelay' to configure the frequency
of retries).: class org.apache.ignite.spi.IgniteSpiException: Failed to
retrieve Ignite pods IP addresses.
at
org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder.getRegisteredAddresses(TcpDiscoveryKubernetesIpFinder.java:172)
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.registeredAddresses(TcpDiscoverySpi.java:1900)
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.resolvedAddresses(TcpDiscoverySpi.java:1848)

at
org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1377)
at java.lang.Thread.run(Thread.java:748)
at org.jboss.threads.JBossThread.run(JBossThread.java:485)
Caused by: java.io.IOException: Server returned HTTP response code: 403 for
URL:
https://kubernetes.default.svc.cluster.local:443/api/v1/namespaces/ignite/endpoints/processing-engine-pe-v1-ignite
at
sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1900)
2020-07-10 22:13:47,219 ERROR [org.jboss.as.controller.management-operation]
(main) WFLYCTL0190: Step handler
org.jboss.as.server.deployment.DeploymentHandlerUtil$1@63778853 for
operation add at address [("deployment" => "reg.war")] failed handling
operation rollback -- java.util.concurrent.TimeoutException:
java.util.concurrent.TimeoutException
at
org.jboss.as.controller.OperationContextImpl.waitForRemovals(OperationContextImpl.java:523)
at org.wildfly.swarm.bootstrap.Main.main(Main.java:87)

2020-07-10 22:13:52,220 ERROR [org.jboss.as.controller.management-operation]
(main) WFLYCTL0349: Timeout after [5] seconds waiting for service container
stability while finalizing an operation. Process must be restarted. Step
that first updated the service container was 'add' at address
'[("deployment" => "reg.war")]'
2020-07-10 22:13:52,225 ERROR [stderr] (main)
org.wildfly.swarm.container.DeploymentException:
org.wildfly.swarm.container.DeploymentException: THORN0004: Deployment
failed: WFLYCTL0344: Operation timed out awaiting service container
stability
2020-07-10 22:13:52,226 ERROR [stderr] (main)   at
org.wildfly.swarm.container.runtime.RuntimeDeployer.deploy(RuntimeDeployer.java:301)
2020-07-10 22:13:52,230 ERROR [stderr] (main)   at

Ignite on AKS and RBAC issue

2020-07-10 Thread steve.hostettler
Hello,

I am deploying an embedded version of Ignite on AKS and I am getting this
error:
Caused by: java.io.IOException: Server returned HTTP response code: 403 for
URL:
https://kubernetes.default.svc.cluster.local:443/api/v1/namespaces/default/endpoints/processing-engine-pe-v1-ignite
at
sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1900)
at
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)


That sounds like an RBAC problem to me, but I cannot nail it down.
So let me give my current configuration:

NAME  READY   STATUSRESTARTS  
AGE
processing-engine-pe-v1.master-69668fcb5b-zm7m8   1/1 Running   0 
9m6s
processing-engine-pe-v1.worker-7598949c5d-pkbfg   1/1 Running   0 
9m6s

As you can see, there are 2 pods in the default namespace.

So the configuration is





The service is there
kubectl describe  svc processing-engine-pe-v1-ignite
Name:  processing-engine-pe-v1-ignite
Namespace: default
Labels:app.kubernetes.io/managed-by=Helm
Annotations:   meta.helm.sh/release-name: pe-v1
   meta.helm.sh/release-namespace: default
Selector:  type=processing-engine-pe-v1.node
Type:              ClusterIP
IP:                None
Port:              service-discovery  47500/TCP
TargetPort:        47500/TCP
Endpoints:         10.244.0.31:47500,10.244.1.28:47500
Session Affinity:  None
Events:

The service account
kubectl describe serviceaccount ignite
Name:        ignite
Namespace:   default
Labels:  app.kubernetes.io/managed-by=Helm
Annotations: meta.helm.sh/release-name: pe-v1
 meta.helm.sh/release-namespace: default
Image pull secrets:  
Mountable secrets:   **
Tokens:  **
Events:  


The role
kubectl describe clusterrole ignite
Name: ignite
Labels:   app.kubernetes.io/managed-by=Helm
  release=pe-v1
Annotations:  meta.helm.sh/release-name: pe-v1
  meta.helm.sh/release-namespace: default
PolicyRule:
  Resources  Non-Resource URLs  Resource Names  Verbs
  ---------  -----------------  --------------  -----
  endpoints  []                 []              [get list watch]
  pods       []                 []              [get list watch]

The role binding
kubectl describe clusterrolebinding ignite
Name: ignite
Labels:   app.kubernetes.io/managed-by=Helm
  release=pe-v1
Annotations:  meta.helm.sh/release-name: pe-v1
  meta.helm.sh/release-namespace: default
Role:
  Kind:  ClusterRole
  Name:  ignite
Subjects:
  Kind            Name    Namespace
  ----            ----    ---------
  ServiceAccount  ignite  default


Any idea of what I am missing?





Re: Ignite persistence and activation

2020-06-18 Thread steve.hostettler
thanks a lot





Re: Using native persistence to "extend" memory

2020-06-16 Thread steve.hostettler
Thank you both for clarifying this. I usually have large objects (~3KB), so
I thought about increasing the page size to 32KB to reduce the number of
pages, and thus the speed at which we reach the 2/3 dirty-pages threshold.
Good idea?
On top of that, at some point during the process I generate a significant
amount of new objects, so I also kept checkpointPageBufferSize increased to
4GB. I also disabled writeThrottlingEnabled.
Thanks for advising.





Re: Using native persistence to "extend" memory

2020-06-16 Thread steve.hostettler
Thanks a lot for the recommendation. So keeping the WAL, disabling archiving.
I understand all records are kept on disk. 

Thanks again. Anything else?





Using native persistence to "extend" memory

2020-06-16 Thread steve.hostettler
Hello,

I am trying to use Ignite persistence to "extend" memory, that is, to replace
swap, which is not working very well.

Therefore, since I do not care about recovery, I disabled the WAL. Are there
other things you would recommend configuring to use Ignite persistence
as a sort of swap, for instance only persisting the least-used pages most of
the time?
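For reference, a storage configuration along those lines might look like this (Ignite 2.x Spring XML sketch; the region name and size are illustrative). WALMode NONE gives up crash recovery entirely, which is consistent with the swap-like use case described:

```xml
<bean class="org.apache.ignite.configuration.DataStorageConfiguration">
    <!-- No crash recovery needed for a swap-like store, so skip the WAL. -->
    <property name="walMode" value="NONE"/>
    <property name="defaultDataRegionConfiguration">
        <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
            <property name="name" value="swap-like-region"/>
            <!-- Pages beyond maxSize are evicted to the persistence files. -->
            <property name="persistenceEnabled" value="true"/>
            <property name="maxSize" value="#{16L * 1024 * 1024 * 1024}"/>
        </bean>
    </property>
</bean>
```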

Thanks

Steve





Ignite persistence and activation

2020-06-11 Thread steve.hostettler
Hello.

I am trying to implement Ignite persistence, but I stumbled upon the
following problems/questions. It is required to activate the cluster, that
much is clear, but I have bootstrap code that uses technical caches that
I do not want to persist and, more problematic, I need to use
ignite.atomicReference as part of the initialization of the node.

I assume that I need to create another region that is not persisted for
these so-called system caches, but what do I do with ignite.atomicReference?
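The non-persisted region idea can be sketched like this (Spring XML fragment; the region and cache names are illustrative, and the persisted default region is assumed to be configured elsewhere):

```xml
<!-- Extra data region with persistence disabled, alongside the persisted default. -->
<bean class="org.apache.ignite.configuration.DataRegionConfiguration">
    <property name="name" value="volatile-region"/>
    <property name="persistenceEnabled" value="false"/>
</bean>

<!-- A technical cache bound to that region is kept purely in memory. -->
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <property name="name" value="technicalCache"/>
    <property name="dataRegionName" value="volatile-region"/>
</bean>
```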


Thanks in advance





Re: EXCEPTION_ACCESS_VIOLATION when Swap is enabled

2020-06-10 Thread steve.hostettler
No, I will try that now.





Re: REPLICATED caches network overhead?

2020-01-04 Thread steve.hostettler
So, to close this one: the problem was that I did not set the
replicatedOnly flag on the query.
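For readers landing on this thread: the flag in question is setReplicatedOnly(boolean) on the SQL query classes (present in Ignite 2.x, deprecated in later versions). A sketch, with an illustrative table name:

```java
import org.apache.ignite.cache.query.SqlFieldsQuery;

// Hint to the SQL engine that every cache the query touches is REPLICATED,
// so it can run entirely on the local node instead of fanning out to peers.
SqlFieldsQuery qry = new SqlFieldsQuery("SELECT _key, _val FROM Entity_0")
    .setReplicatedOnly(true);
```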





Re: REPLICATED caches network overhead?

2020-01-03 Thread steve.hostettler
So in the meantime, I was able to confirm that there are messages sent to
another node for queries on replicated caches.

I get the following messages sent from node
1509c11f-c627-4696-a508-6b2c6bc99904 to 6b3b721d-611b-488e-bdbf-35a9dc16b8f0

GridH2QueryRequest [reqId=663310, caches=[-912644195],
topVer=AffinityTopologyVersion [topVer=2, minorTopVer=6], parts=null,
qryParts=null, pageSize=1024, qrys=[GridCacheSqlQuery [qry=SELECT
"XXX:0:XXX::5:20170430".__Z0._KEY __C0_0,
"DXX:0:XXX::5:20170430".__Z0._VAL __C0_1
FROM "XXX:0:XXX::5:20170430"."Entity_0" __Z0
WHERE (. , mvccSnapshot=null, txReq=null]


That sounds extremely counter-intuitive to me. Why would a SQL query need
to send a message to another node for a replicated cache?

Does anyone have an idea?





REPLICATED caches network overhead?

2020-01-03 Thread steve.hostettler
Hello,

first, let me wish you a Happy New Year!

I am currently working with replicated and local caches, and I observe a
strange behavior in my application. I do not know whether it is related to
my application or to Ignite at this point, so I am looking for information.

I have a REPLICATED cache that is loaded once and for all; a job then runs
over the partitions and fills in a LOCAL cache.

On one K8S node it takes 61s, but when I add a second node it goes up to
168s for the same amount of work. These numbers are not significant in
themselves, but I fail to understand why this would be the case. If anything,
I expected a speedup, since I double the number of cores and consume locally
(because REPLICATED) and put locally (because LOCAL).

Interestingly enough, with one node the CPU is at 100%, and with 2 it goes
down to 30%, which is usually a sign of I/O waits.

Could you please:
1) Confirm that my assumption is correct, and that reading from a replicated
cache that is not modified after loading, while writing to a LOCAL cache,
should scale pretty linearly with the number of nodes (ignoring the initial
rebalancing of the partitions).

2) Tell me whether a lot of (technical) messages are expected to be exchanged
between the nodes in this configuration.

Thanks a lot for your help





Re: Good metrics to assess affinity quality

2019-12-27 Thread steve.hostettler
Hello,

my objective is to come up with a good affinity to optimise data
distribution, i.e., to reduce local misses by improving co-location. My data
structure is, however, legacy and quite complex, and I do not have a clear
candidate. Therefore, I would like to test several affinities and measure
which is the best. I guess one method is to run all computations with
all candidates, but that would take ages, so I am looking for a metric to
measure the quality of the affinity.

Question: there is an average get time in the cache metrics. Is it correct
to expect it to increase when there are a lot of remote calls to other
nodes to get entities that are not local?





Good metrics to assess affinity quality

2019-12-25 Thread steve.hostettler
Hello Igniters,

first, let me wish you happy holidays. I am working on testing different
affinities, and I was wondering what would be good metrics to assess affinity
quality. I have a near cache to deal with less-than-perfect affinities.

I assume network traffic is a good one, but is there something internal to
Ignite, like the number of messages sent to request remote objects?

Many thanks in advance

Steve





Re: K8S on Azure

2019-12-15 Thread steve.hostettler
Hi all,

sorry for the confusion; I was using the wrong IP finder (via a wrong
Docker image of mine) on some of the nodes. It worked with K8S on
docker-desktop but not with K8S on Azure, because of the way networking is
managed.

Apologies 





Re: K8S on Azure

2019-12-14 Thread steve.hostettler
I continued my investigation: I started with only one pod and then scaled
up.

I get the following messages on the new pod, which prove that the two
nodes started to communicate, but this did not result in the new pod joining
the cluster, as the topology did not change.


2019-12-14 18:04:20,203 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-srvr-#3%com.wolterskluwer.processing%) TCP discovery accepted
incoming connection [rmtAddr=/10.244.0.20, rmtPort=42603]
2019-12-14 18:04:20,217 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-srvr-#3%com.wolterskluwer.processing%) TCP discovery spawning a
new thread for connection [rmtAddr=/10.244.0.20, rmtPort=42603]
2019-12-14 18:04:20,218 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#6%com.wolterskluwer.processing%) Started serving
remote node connection [rmtAddr=/10.244.0.20:42603, rmtPort=42603]
2019-12-14 18:04:20,221 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#6%com.wolterskluwer.processing%) Received ping
request from the remote node
[rmtNodeId=e398622b-119f-4280-8219-3395b183a457, rmtAddr=/10.244.0.20:42603,
rmtPort=42603]
2019-12-14 18:04:20,223 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#6%com.wolterskluwer.processing%) Finished writing
ping response [rmtNodeId=e398622b-119f-4280-8219-3395b183a457,
rmtAddr=/10.244.0.20:42603, rmtPort=42603]
2019-12-14 18:04:20,223 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#6%com.wolterskluwer.processing%) Finished serving
remote node connection [rmtAddr=/10.244.0.20:42603, rmtPort=42603
2019-12-14 18:05:20,280 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-srvr-#3%com.wolterskluwer.processing%) TCP discovery accepted
incoming connection [rmtAddr=/10.244.0.20, rmtPort=49757]
2019-12-14 18:05:20,280 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-srvr-#3%com.wolterskluwer.processing%) TCP discovery spawning a
new thread for connection [rmtAddr=/10.244.0.20, rmtPort=49757]
2019-12-14 18:05:20,281 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#7%com.wolterskluwer.processing%) Started serving
remote node connection [rmtAddr=/10.244.0.20:49757, rmtPort=49757]
2019-12-14 18:05:20,281 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#7%com.wolterskluwer.processing%) Received ping
request from the remote node
[rmtNodeId=e398622b-119f-4280-8219-3395b183a457, rmtAddr=/10.244.0.20:49757,
rmtPort=49757]
2019-12-14 18:05:20,282 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#7%com.wolterskluwer.processing%) Finished writing
ping response [rmtNodeId=e398622b-119f-4280-8219-3395b183a457,
rmtAddr=/10.244.0.20:49757, rmtPort=49757]
2019-12-14 18:05:20,282 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#7%com.wolterskluwer.processing%) Finished serving
remote node connection [rmtAddr=/10.244.0.20:49757, rmtPort=49757
2019-12-14 18:06:20,336 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-srvr-#3%com.wolterskluwer.processing%) TCP discovery accepted
incoming connection [rmtAddr=/10.244.0.20, rmtPort=33781]
2019-12-14 18:06:20,336 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-srvr-#3%com.wolterskluwer.processing%) TCP discovery spawning a
new thread for connection [rmtAddr=/10.244.0.20, rmtPort=33781]
2019-12-14 18:06:20,337 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#8%com.wolterskluwer.processing%) Started serving
remote node connection [rmtAddr=/10.244.0.20:33781, rmtPort=33781]
2019-12-14 18:06:20,338 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#8%com.wolterskluwer.processing%) Received ping
request from the remote node
[rmtNodeId=e398622b-119f-4280-8219-3395b183a457, rmtAddr=/10.244.0.20:33781,
rmtPort=33781]
2019-12-14 18:06:20,338 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#8%com.wolterskluwer.processing%) Finished writing
ping response [rmtNodeId=e398622b-119f-4280-8219-3395b183a457,
rmtAddr=/10.244.0.20:33781, rmtPort=33781]
2019-12-14 18:06:20,338 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-sock-reader-#8%com.wolterskluwer.processing%) Finished serving
remote node connection [rmtAddr=/10.244.0.20:33781, rmtPort=33781
2019-12-14 18:07:20,393 INFO 
[org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi]
(tcp-disco-srvr-#3%com.wolterskluwer.processing%) TCP discovery accepted
incoming connection [rmtAddr=/10.244.0.20, rmtPort=57657]
2019-12-14 18:07:20,394 INFO 

K8S on Azure

2019-12-14 Thread steve.hostettler
Hello,

I am trying to deploy an Ignite-based application on an Azure K8S using Helm
charts.

I test my application locally on a docker-desktop-based K8S with, obviously,
only one node (my laptop).
That works quite well: when I increase the replicas, the new pods join
the cluster and it works. The Ignite service lists all the pods without any
issue.

When I deploy on Azure, though, not all of the pods join the cluster; some
do, and I fail to understand why. I started with only one node.
The Ignite service does list all the pods. I have no error message in any
of the consoles.

Any idea on how to investigate this?



Some log of the different pods:
>>> +--+
>>> Ignite ver. 2.7.5#20190603-sha1:be4f2a158bcf79a52ab4f372a6576f20c4f86954
>>> +--+
>>> OS name: Linux 4.15.0-1061-azure amd64
>>> CPU(s): 1
>>> Heap: 0.5GB
>>> VM name: 1@v1-processing-grid.master-545c9d7594-7pcxz
>>> Ignite instance name: com.wolterskluwer.processing
>>> Local node [ID=AA51A811-7844-4E1D-839F-59832EE6FB2A, order=2,
>>> clientMode=false]
>>> Local node addresses:
>>> [v1-processing-grid.master-545c9d7594-7pcxz/10.244.0.17, /127.0.0.1]
>>> Local ports: TCP:10500 TCP:10800 TCP:11211 TCP:11400 TCP:47100 TCP:47500
INFO  [org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
(ServerService Thread Pool -- 6) Topology snapshot [ver=2, locNode=aa51a811,
servers=2, clients=0, state=ACTIVE, CPUs=2, offheap=0.95GB, heap=1.0GB]

>>> +--+
>>> Ignite ver. 2.7.5#20190603-sha1:be4f2a158bcf79a52ab4f372a6576f20c4f86954
>>> +--+
>>> OS name: Linux 4.15.0-1061-azure amd64
>>> CPU(s): 1
>>> Heap: 0.5GB
>>> VM name: 1@v1-processing-grid.worker-866b666878-2g4j8
>>> Ignite instance name: com.wolterskluwer.processing
>>> Local node [ID=0CD902B3-271E-43FD-9F83-9E76570743C2, order=1,
>>> clientMode=false]
>>> Local node addresses:
>>> [v1-processing-grid.worker-866b666878-2g4j8/10.244.0.13, /127.0.0.1]
>>> Local ports: TCP:10500 TCP:10800 TCP:11211 TCP:11400 TCP:47100 UDP:47400
>>> TCP:47500
INFO  [org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
(ServerService Thread Pool -- 15) Topology snapshot [ver=1,
locNode=0cd902b3, servers=1, clients=0, state=ACTIVE, CPUs=1,
offheap=0.48GB, heap=0.5GB]

>>> +--+
>>> Ignite ver. 2.7.5#20190603-sha1:be4f2a158bcf79a52ab4f372a6576f20c4f86954
>>> +--+
>>> OS name: Linux 4.15.0-1061-azure amd64
>>> CPU(s): 1
>>> Heap: 0.5GB
>>> VM name: 1@v1-processing-grid.worker-866b666878-4kwrq
>>> Ignite instance name: com.wolterskluwer.processing
>>> Local node [ID=6CA4D1A2-BDCD-4F12-B185-A89CA57B4430, order=1,
>>> clientMode=false]
>>> Local node addresses:
>>> [v1-processing-grid.worker-866b666878-4kwrq/10.244.0.11, /127.0.0.1]
>>> Local ports: TCP:10500 TCP:10800 TCP:11211 TCP:11400 TCP:47100 UDP:47400
>>> TCP:47500 
INFO  [org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
(ServerService Thread Pool -- 6) Topology snapshot [ver=2, locNode=aa51a811,
servers=2, clients=0, state=ACTIVE, CPUs=2, offheap=0.95GB,
heap=1.0GB]
-12-14 17:34:46,125 INFO 
[org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
(ServerService Thread Pool -- 6) Topology snapshot [ver=1, locNode=6ca4d1a2,
servers=1, clients=0, state=ACTIVE, CPUs=1, offheap=0.48GB, heap=0.5GB]

>>> +--+
>>> Ignite ver. 2.7.5#20190603-sha1:be4f2a158bcf79a52ab4f372a6576f20c4f86954
>>> +--+
>>> OS name: Linux 4.15.0-1061-azure amd64
>>> CPU(s): 1
>>> Heap: 0.5GB
>>> VM name: 1@v1-processing-grid.worker-866b666878-hdqdf
>>> Ignite instance name: com.wolterskluwer.processing
>>> Local node [ID=B9C318E9-42E2-43F4-BD58-8FC8C42C68B7, order=1,
>>> clientMode=false]
>>> Local node addresses:
>>> [v1-processing-grid.worker-866b666878-hdqdf/10.244.0.15, /127.0.0.1]
>>> Local ports: TCP:10500 TCP:10800 TCP:11211 TCP:11400 TCP:47100 UDP:47400
>>> TCP:47500 
INFO  [org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
(ServerService Thread Pool -- 15) Topology snapshot [ver=1,
locNode=b9c318e9, servers=1, clients=0, state=ACTIVE, CPUs=1,
offheap=0.48GB, heap=0.5GB]

>>> +--+
>>> Ignite ver. 2.7.5#20190603-sha1:be4f2a158bcf79a52ab4f372a6576f20c4f86954
>>> +--+
>>> OS name: Linux 4.15.0-1061-azure amd64
>>> CPU(s): 1
>>> Heap: 0.5GB
>>> VM name: 1@v1-processing-grid.worker-866b666878-m7d8s
>>> Ignite instance name: com.wolterskluwer.processing
>>> Local 

Affinity function and partition aware database loading

2018-10-04 Thread steve.hostettler
Hello,

I would like to enable partition-aware data loading. I do have a composite
business key in the database (Oracle and SQL Server) that happens to be the
key of the object in the cache. The most important part of that key is a
string.

I can very easily compute a good affinity from that key, the problem is that
I would like to limit each load to only its subset of the data. Namely, the
data that will end up on the partitions of that node. 

Optimally, I would be able to compute the affinity along with the select
that loads the data from the database. That does not work, because the Java
hashCode function is usually not implementable as a SELECT.

In
https://apacheignite.readme.io/docs/data-loading#section-partition-aware-data-loading,
it is recommended to add a field with the partition id but that means that
the data are first loaded into the grid and then written to the database. This
is not my case: data are loaded into the DB through an ETL and then we load
them into the grid.

I do not want to add a technical field in the key (the partition), otherwise
it would mean that the business code would have to deal with it.

At this point, I considered several alternatives but none of them perform
correctly:
- Stored procedure in T-SQL and PL-SQL to compute the partition during the
select but it is horribly slow
- Extracting a number from the composed key in the select : SELECT LENGTH(A)
AS L, ascii(SUBSTR(A, LENGTH(A) - 1))+ascii(SUBSTR(A, LENGTH(A) -
2))+ascii(SUBSTR(A, LENGTH(A) - 3))+ascii(SUBSTR(A, LENGTH(A) - 4)) AS K 
FROM (SELECT 'abcdefgh' AS A FROM DUAL) but this is really ugly
- Using ORA_HASH and its SQLServer equivalent but the algorithm is
proprietary and I cannot use it in the affinity.
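
For the record, the second alternative can be written in a less ugly, portable form as a polynomial hash. The class below is my own hypothetical sketch (the partition count and the key shape are assumptions, not the actual affinity function); for keys of bounded length the same arithmetic can be spelled out with ASCII()/SUBSTR()/MOD() on Oracle and ASCII()/SUBSTRING()/% on SQL Server, which is exactly what Object.hashCode() does not allow:

```java
public class PortablePartitionHash {
    static final int PARTS = 1024; // assumption: the affinity partition count

    // Polynomial hash kept non-negative. For a key of known maximum length the
    // same arithmetic can be mirrored in SQL (ASCII/SUBSTR/MOD on Oracle,
    // ASCII/SUBSTRING/% on SQL Server), so the load query can filter on the
    // partitions owned by the loading node.
    public static int partition(String businessKey) {
        int h = 0;
        for (int i = 0; i < businessKey.length(); i++)
            h = (h * 31 + businessKey.charAt(i)) & 0x7fffffff;
        return h % PARTS;
    }

    public static void main(String[] args) {
        System.out.println(partition("abcdefgh"));
    }
}
```

The custom affinity function would then apply the same partition() to the string part of the key on the Java side.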


Does the community have an opinion on how to best solve that?

Thanks a lot in advance



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Simulate Read Only Caches

2018-08-31 Thread steve.hostettler
Precisely my intention :) Will switch to the developer list to ask for some
guidance.

Thanks



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: BinaryMarshaller (micro) Benchmark shows that keepBinary is a bit slower than systematically unmarshall the object

2018-08-29 Thread steve.hostettler
Hello all,

I actually found what the problem was.

I accessed the fields of the BinaryObject in that way:


IgniteCache<Object, BinaryObject> cacheAsBinary =
cache.withNoRetries().withSkipStore().withKeepBinary();

///Do a million times
BinaryObject bo = cacheAsBinary.get(key);
Long field1 = bo.field("field1");

but I should have done

IgniteCache<Object, BinaryObject> cacheAsBinary =
cache.withNoRetries().withSkipStore().withKeepBinary();
BinaryType type = instance.binary().type(ValueObject.class);
BinaryField field = type.field("field1");
///Do a million times
BinaryObject bo = cacheAsBinary.get(key);
Long field1 = field.value(bo);


Of course resolving field1 on every access is costly. This is similar to
looking up a field using reflection.
For the record, on the micro-benchmark, getting a field as binary now takes
half the time of a full unmarshalling.
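
The effect is the same as with plain reflection. The stand-alone sketch below is my own analogy (not Ignite code): the per-access lookup plays the role of BinaryObject.field(String), the cached handle the role of a cached BinaryField:

```java
import java.lang.reflect.Field;

public class CachedLookup {
    public static class Value { public long field1 = 42L; }

    // Resolves the field by name on every call, like bo.field("field1").
    public static long perAccessLookup(Value v) {
        try {
            return v.getClass().getField("field1").getLong(v);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    private static final Field FIELD1;
    static {
        try {
            FIELD1 = Value.class.getField("field1");
        } catch (NoSuchFieldException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Resolves once and reuses the handle, like a cached BinaryField.
    public static long cachedLookup(Value v) {
        try {
            return FIELD1.getLong(v);
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Value v = new Value();
        System.out.println(perAccessLookup(v) == cachedLookup(v)); // true
    }
}
```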





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


BinaryMarshaller (micro) Benchmark

2018-08-28 Thread steve.hostettler
Hello,

I am puzzled by a micro-benchmark I made and that you can find here: 
https://github.com/hostettler/IgnitePerfTest.git
  

To reproduce the below results, just run:

$ mvn clean install
$ cd target
$ java -Xmx512m -Xmx512m -XX:+UseG1GC -jar benchmarks.jar


This yields the following result:
# Run complete. Total time: 00:13:39
Benchmark  Mode  Cnt   ScoreError 
Units
MHBenchmark.igniteRead  avgt  100  1800.052 ± 49.343  ns/op
MHBenchmark.igniteReadKeepBinary  avgt  100  1865.829 ± 45.450  ns/op

So keepBinary is a little bit slower than the standard get.



Here is the question: this very simple test basically compares getting a
value as a BinaryObject and getting it fully deserialized.

My assumption (that I wanted to validate) was that the binary marshaller and
the withKeepBinary would usually be faster than unmarshalling the whole
object.

This micro-benchmarks proves otherwise. I do not doubt that I am missing
something but I cannot understand what I did wrong.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Simulate Read Only Caches

2018-08-28 Thread steve.hostettler
Hello,

I do have a bunch of caches that I would like to have replicated but to keep
them "read only" after a certain point. It is a fairly standard use case.
There are master data (e.g., exchange rates) that are fixed once and for all
for a given set of processes. Once loaded there is no reason to bother with
locking and transactionality.


I looked at the implementation and there are quite a number of gates and
checks in place. I wonder how to work around these.

Is there a way to simulate this? Maybe there are even more things that we can
temporarily disable to speed up common read-only use cases.
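
At the application level, the "fixed once and for all" part can at least be simulated outside Ignite. A minimal sketch (assuming the master data fits in a plain map; this is my own illustration, not an Ignite feature):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class FrozenMasterData {
    // Load once, then publish an unmodifiable view: readers need no locking
    // and any accidental write after the "freeze" fails fast.
    public static Map<String, Double> loadExchangeRates() {
        Map<String, Double> rates = new HashMap<>();
        rates.put("EUR/USD", 1.17); // illustrative values
        rates.put("USD/CHF", 0.92);
        return Collections.unmodifiableMap(rates);
    }

    public static void main(String[] args) {
        Map<String, Double> rates = loadExchangeRates();
        System.out.println(rates.get("EUR/USD"));
        try {
            rates.put("EUR/GBP", 0.9); // write after the freeze
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only");
        }
    }
}
```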



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Ignite and Persistent Collections

2018-08-21 Thread steve.hostettler
Will do thanks



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Ignite and Persistent Collections

2018-08-20 Thread steve.hostettler
Hello,

I would like to know what the position of the committers is towards
Persistent Collections (https://pcollections.org/)?
Apparently they could massively reduce the marshalling/unmarshalling time.


Best Regards



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Business Intelligence connection to Ignite

2018-03-05 Thread steve.hostettler
Hello,

Is there any best practice/recommendation on how to connect third-party
business intelligence tools to Ignite?
For instance, is it possible to connect a BO universe to Ignite?

Thanks for your help



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: copyOnRead to false

2017-09-05 Thread steve.hostettler
Hello,

thanks for the answer. The benchmark is actually our application stressed
with several volumes, some of them quite complex to describe. However, for these
benchmarks we are only using one node.

Basically we are loading a set of caches from the database, do a lot of
querying both ScanQuery (on BinaryObjects) and SQLQueries.

Most of what we are doing is read only with lot of computations (at least we
segregated the caches that are r/w)

Based on what you described, I should witness a performance improvement.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


copyOnRead to false

2017-09-03 Thread steve.hostettler
Hello,

we did some benchmarks and did set the copyOnRead flag to false and it did
increase the processing time and the memory a little bit (10%). I cannot
figure out why this would be the case since my understanding is that if anything
it should reduce the processing time by avoiding a clone.

One note: we heavily use the BinaryMarshaller and BinaryObjects without
deserialization. Maybe this has an impact.
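
For context, here is a plain-Java sketch of what the flag means semantically (my own illustration of the contract, not Ignite's implementation): with copyOnRead=true a get() hands the caller a private copy; with false it hands back the shared on-heap instance, saving the copy but exposing it to caller-side mutation:

```java
import java.util.HashMap;
import java.util.Map;

public class CopyOnReadSketch {
    public static class Rate {
        public double value;
        public Rate(double v) { value = v; }
        Rate copy() { return new Rate(value); }
    }

    private final Map<String, Rate> onHeap = new HashMap<>();
    private final boolean copyOnRead;

    public CopyOnReadSketch(boolean copyOnRead) { this.copyOnRead = copyOnRead; }

    public void put(String key, Rate rate) { onHeap.put(key, rate); }

    // true: private copy per read; false: the shared stored instance.
    public Rate get(String key) {
        Rate rate = onHeap.get(key);
        return copyOnRead ? rate.copy() : rate;
    }

    public static void main(String[] args) {
        CopyOnReadSketch cache = new CopyOnReadSketch(false);
        cache.put("EUR/USD", new Rate(1.17));
        cache.get("EUR/USD").value = 0; // mutates the stored entry!
        System.out.println(cache.get("EUR/USD").value);
    }
}
```

Note that with keepBinary and no deserialization the copy mostly concerns the on-heap representation, which may be why the measured difference is small.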

Many Thanks in advance

Steve




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Apache Ignite, Memory Class Storage, GPUs, and all that

2017-06-13 Thread steve.hostettler
Hello,

Having to explain the choice of Ignite internally, I wonder what is the
"official" position of Apache Ignite towards Storage Class Memory and using
GPUs.

On the SCM story, I guess it is just another way of allocating/freeing
memory in a kind of off-heap mode but on disk.

On the GPUs story, I guess is whether or not a library like jcuda can be
used inside Ignite transparently.

Could you elaborate on these topics? I would like to have more details
if possible.


Best Regards



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Apache-Ignite-Memory-Class-Storage-GPUs-and-all-that-tp13646.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


RE: Write behind and eventual consistency

2017-04-27 Thread steve.hostettler
Hi Val,

the use case is the following

1) Load data into the database from an external system
2) Once ready load it into the grid
3) Process something that does massive write behinds
4) Take a snapshot of the results (or) do a backup of the tables   <<--- At
this point I need the eventual consistency to ...eventually be achieved

At step 4 I cannot afford to have some update still in progress. This is
even more important since because of write behind I cannot maintain
referential integrity (since the insert/update are done in a random order)
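
The missing primitive can be sketched as follows (hypothetical code; as far as I know Ignite 2.x does not expose a public flush/await API for the write-behind store): a queue of deferred DB updates with a drain barrier that step 4 would wait on:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class DrainableWriteBehind {
    private final Queue<Runnable> pending = new ArrayDeque<>();

    // Enqueue the DB update instead of executing it synchronously.
    public void writeBehind(Runnable dbUpdate) { pending.add(dbUpdate); }

    public int pendingCount() { return pending.size(); }

    // The barrier the snapshot/backup step needs: run until nothing is queued.
    public void drain() {
        Runnable next;
        while ((next = pending.poll()) != null) next.run();
    }

    public static void main(String[] args) {
        DrainableWriteBehind store = new DrainableWriteBehind();
        StringBuilder db = new StringBuilder();
        store.writeBehind(() -> db.append("u1;"));
        store.writeBehind(() -> db.append("u2;"));
        store.drain(); // only now is it safe to back up the tables
        System.out.println(db + " pending=" + store.pendingCount());
    }
}
```

With the real CacheStore one would have to track in-flight updates (e.g., a counter decremented in write()/delete()) to build such a barrier.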



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Write-behind-and-eventual-consistency-tp12242p12287.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Write behind and eventual consistency

2017-04-25 Thread steve.hostettler
Hello,

I would like to enable the write-behind mode but since I have to sync the
overall process with other jobs (extract to the data warehouse), I would like
to know how to make sure that there are no more write-behind operations
waiting for completion. I had a look at the API but I do not see anything
like that.
What would you advise to do?

Many thanks in advance

Best Regards



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Write-behind-and-eventual-consistency-tp12242.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Identical objects on the same node

2017-04-09 Thread steve.hostettler
Hi Val,

I definitely considered this alternative ("normalizing" the objects like with
foreign keys) but it requires more work on my side to cope with graphs of
objects. In other words, if I want to see data from the cache as "pure"
objects, I do have to implement the "ORM" layer on top of it with some sort
of lazy loading. Therefore, I would have preferred to have it done at the
Ignite level :)


Thanks for the advice.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Identical-objects-on-the-same-node-tp11830p11845.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Identical objects on the same node

2017-04-09 Thread steve.hostettler
Hello Val, 

thanks for the answer. Would it be theoretically possible from your point of
view?
I mean having some sort of serialized pointers to other byte arrays on the
same node?

What is the rationale for serializing everything on the node? I get that
inter-node communication requires serialization, but if I optimize my
algorithms to be highly local then this inter-node communication is reduced
to a minimum.


Steve



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Identical-objects-on-the-same-node-tp11830p11839.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Identical objects on the same node

2017-04-08 Thread steve.hostettler
Hello,

I would like to understand Ignite's behavior when I put identical objects (from
strings up to graphs of objects) in the same cache (as part of the graph of
different values) on the same node. In other words, is there some sort of
flyweight pattern implemented to reduce memory consumption?
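
Since such a flyweight is (to my knowledge) not built into Ignite, where each cache entry is stored as its own serialized copy, any de-duplication has to happen before the put. A minimal intern-pool sketch (my own illustration):

```java
import java.util.HashMap;
import java.util.Map;

public class InternPool<T> {
    private final Map<T, T> pool = new HashMap<>();

    // Returns the canonical instance for any object equal to a previously
    // interned one, so identical sub-objects are kept in memory only once.
    public T intern(T candidate) {
        T canonical = pool.putIfAbsent(candidate, candidate);
        return canonical == null ? candidate : canonical;
    }

    public static void main(String[] args) {
        InternPool<String> pool = new InternPool<>();
        String a = pool.intern(new String("CHF"));
        String b = pool.intern(new String("CHF"));
        System.out.println(a == b); // same canonical instance: true
    }
}
```

This only helps on-heap before marshalling; once values are serialized into the cache, the duplicates are materialized again.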

Many thanks for your answer



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Identical-objects-on-the-same-node-tp11830.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Renaming a cache

2017-03-02 Thread steve.hostettler
Thanks for the answer. It is then much easier to add a level of indirection
and to have a random name for the cache. 



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Renaming-a-cache-tp10943p10987.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: LoadCache Performance decreases with the size of the cache

2016-12-27 Thread steve.hostettler
Hello Val,

you're right I was too quick to jump to a conclusion. Actually, the problem
comes from my code but it was not obvious to me.

I create the indexes with the following code:
-
Collection<QueryIndex> idxs = new ArrayList<>();

QueryIndex idx1 = new QueryIndex();
LinkedHashMap<String, Boolean> idxFlds1 = new LinkedHashMap<>();
idxFlds1.put("lotTypeFk", true);
idxFlds1.put("validOn", true);
idxFlds1.put("ideCounterpartyRef", true);
idx1.setFields(idxFlds1);
idxs.add(idx1);

QueryIndex idx2 = new QueryIndex("rowId");
idxs.add(idx2); 

qryEntity.setIndexes(idxs);
-

Because I did not specify the type, I thought the index type was SORTED.
Actually, when nothing has been specified the type == null.

The thing is that null is considered to be  FULLTEXT.

From GridQueryProcessor.java
 if (idx.getIndexType() == QueryIndexType.SORTED || idx.getIndexType() ==
QueryIndexType.GEOSPATIAL) {

} else {
assert idx.getIndexType() == QueryIndexType.FULLTEXT;

for (String field : idx.getFields().keySet()) {
String alias = aliases.get(field);

if (alias != null)
field = alias;

d.addFieldToTextIndex(field);
}
}


I solved that problem simply by setting the type explicitly 
idx1.setIndexType(QueryIndexType.SORTED);


I will do a full load using this initialization code. If the problem
persists, I'll do a reproducer. The problem is that it might be a little bit
difficult. 



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/LoadCache-Performance-decreases-with-the-size-of-the-cache-tp9645p9765.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: LoadCache Performance decreases with the size of the cache

2016-12-26 Thread steve.hostettler
Hello,

while investigating, I understood why I do have a lot of locks on Lucene
Documents in Java Mission Control. That is because as soon as there is a
String in the index, it is handled by Lucene even if you do not want
full-text search (the String being an identifier).


--- From IgniteH2Indexing.java
if (type().valueClass() == String.class) {
try {
luceneIdx = new GridLuceneIndex(ctx, schema.offheap,
schema.spaceName, type);
}
catch (IgniteCheckedException e1) {
throw new IgniteException(e1);
}
}


Although I do understand why this has been done that way, I wonder whether
it would not be better to let the user choose. Lucene brings a lot of
features but also has an impact on performance. In my case, I know that
the index on the string will never ever be searched as a substring.

The Lucene index management seems to heavily rely on locks, and therefore the
more threads, the more contention on the index.

Is this a known behavior? Am I missing something?
Furthermore, there seems to be a "rebuildIndexes" method. I am not sure
what this method does, but if it effectively rebuilds the index then I could
do a measure without indexes (t1) and then a measure with indexes (t2). After
that I rebuild the indexes and measure the rebuild time (t3). Hence, if
t1 + t3 < t2 then it means that disabling the indexes during load makes
sense from a performance perspective. Am I correct?






--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/LoadCache-Performance-decreases-with-the-size-of-the-cache-tp9645p9738.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: LoadCache Performance decreases with the size of the cache

2016-12-23 Thread steve.hostettler
Hi Val,

first of all, let me thank you for this great product and incredible mailing
list.

Anonymizing our code will take a bit longer but I can give the following
numbers:

load time 10M records without indexes 10m2s
load time 10M records with a simple row id type index : 12m
load time 10M records with a simple row id type index + a composed index (3
columns : Long, Long, String) :  40 minutes

The VM's heap is 56GB. At the end of the processing, it consumes .. GB with
a peak of 32GB. Maximum GC pause is 2s with a pause every 2 minutes, so GC
does not seem to be an issue.

As for the performance I took 3 snapshots of 5 minutes each.
After 2 min: [screenshot]

After 15 min: [screenshot]

After 35 min: [screenshot]

I was also a bit surprised by the use of Lucene internally to H2.
[screenshot]


Hope you can make sense of all of this.

Thanks and happy holidays



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/LoadCache-Performance-decreases-with-the-size-of-the-cache-tp9645p9725.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: LoadCache Performance decreases with the size of the cache

2016-12-23 Thread steve.hostettler
Hello guys,

I finally understood what the "problem" is. With 20 million records, the
indexing time is very expensive. Is there a way, like for a database, to
disable indexing during load time and to re-enable it after loading?
I guess that rebalancing the b-trees (or whatever is used to implement the
indexes) is a costly operation to perform while the cache is loading.

To my understanding, indexes are declared in the cache config and directly
enabled.




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/LoadCache-Performance-decreases-with-the-size-of-the-cache-tp9645p9719.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: LoadCache Performance decreases with the size of the cache

2016-12-22 Thread steve.hostettler
Hello Val,

Yes I'll try to upload something on github tomorrow. Thanks for the help



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/LoadCache-Performance-decreases-with-the-size-of-the-cache-tp9645p9710.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: LoadCache Performance decreases with the size of the cache

2016-12-22 Thread steve.hostettler
Sure here is the code   

QueryEntity qryEntity = new QueryEntity();
qryEntity.setKeyType("myapp.bpepoc.model.MyModelKey");
qryEntity.setValueType("myapp.bpepoc.model.MyModel");

LinkedHashMap<String, String> fields = new LinkedHashMap<>();
fields.put("rowId", "java.lang.Integer");
fields.put("validOn", "java.lang.Integer");
fields.put("myFk", "java.lang.Integer");
fields.put("rowType", "java.lang.Short");
fields.put("businessRef", "java.lang.String");
fields.put("status", "java.lang.Short");
fields.put("modifDate", "java.sql.Date");
fields.put("ideCreditGroupRef", "java.lang.String");
fields.put("ideSegmentationRef", "java.lang.String");
fields.put("ideInternalPartyRef", "java.lang.String");
fields.put("ideInternalOne", "java.lang.String");

qryEntity.setFields(fields);

Collection<QueryIndex> idxs = new ArrayList<>();
QueryIndex idx = new QueryIndex();
idx.setName("FPK");
LinkedHashMap<String, Boolean> idxFlds = new LinkedHashMap<>();
idxFlds.put("rowId", true);
idxFlds.put("myFk", true);
idxFlds.put("validOn", true);
idxFlds.put("businessRef", true);
idx.setFields(idxFlds);
idxs.add(idx);

qryEntity.setIndexes(idxs);

final List<QueryEntity> queryEntities = new ArrayList<>();
queryEntities.add(qryEntity);

final CacheConfiguration ccfg = new CacheConfiguration<>(cacheName);

ccfg.setQueryEntities(queryEntities);




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/LoadCache-Performance-decreases-with-the-size-of-the-cache-tp9645p9705.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: What are typical load time for millions of records

2016-12-21 Thread steve.hostettler
Hi,

I am using the partition capability with multiple threads to improve the
data loading, but referring to my other post: I see that the performance
is not constant with respect to the size of the cache.

Regards



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/What-are-typical-load-time-for-millions-of-records-tp9648p9682.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: LoadCache Performance decreases with the size of the cache

2016-12-21 Thread steve.hostettler
Hi Val and thanks for the reply.

I narrowed it down a little bit. The problem comes from the indexing. I
tried with no indexing/query fields and then half the fields queryable, and
it turns out that with indexing the loading performance decreases over time.

First, I would like to know whether it is expected for the performance to
decrease with the size of the cache.
Second, I've seen a lot of locks on the document manager of Lucene. Are
there some parameters to configure?

Steve




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/LoadCache-Performance-decreases-with-the-size-of-the-cache-tp9645p9681.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


What are typical load time for millions of records

2016-12-20 Thread steve.hostettler
I'm currently trying to improve the load time. I would like to have some
feedback from other Ignite users as for actual load performance.

In my setup, I have a DB in the same subnet with an Ignite node with 8
cores. It loads 20,000,000 records in 1h with the standard loadCache
implementation, which amounts to 5,555 rows per second = 694 rows per
second per thread.

I know it is very dependent upon the actual topology, size of the objects,
etc., but I would like to understand whether my performance is reasonable. I'm
asking because I must load 180,000,000 records every day and this does not
sound feasible at 5,555 rows per second per node on the grid.
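
The back-of-the-envelope arithmetic behind these numbers (assuming, optimistically, linear scaling across threads and nodes):

```java
public class LoadMath {
    public static long rowsPerSecond(long rows, long seconds) {
        return rows / seconds;
    }

    public static long hoursOnOneNode(long rows, long rowsPerSecond) {
        return rows / rowsPerSecond / 3600;
    }

    public static void main(String[] args) {
        long rps = rowsPerSecond(20_000_000L, 3600L); // 20M rows in 1 h
        System.out.println(rps + " rows/s, " + rps / 8
            + " rows/s per thread (8 cores), daily load of 180M rows takes ~"
            + hoursOnOneNode(180_000_000L, rps) + " h on one node");
    }
}
```

So a single node would need roughly nine hours for the daily load, which is why the per-node rate, not just the node count, has to improve.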





--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/What-are-typical-load-time-for-millions-of-records-tp9648.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


LoadCache Performance decrease with the size of the cache

2016-12-20 Thread steve.hostettler
Hello,

I am trying to increase the performance of the cache loading. I am
witnessing a strange behavior: as the number of objects increases in the
cache, the number of objects loaded per second decreases. The database server
does not seem to be the problem. To get some numbers I copy-pasted the
CacheAbstractJdbcStore implementation and added a couple of logs to
understand what is going on.

In the method public Void call() throws Exception there is a block:

while (rs.next()) {
    K1 key = buildObject(em.cacheName, em.keyType(), em.keyKind(),
        em.keyColumns(), em.keyCols, colIdxs, rs);
    V1 val = buildObject(em.cacheName, em.valueType(), em.valueKind(),
        em.valueColumns(), null, colIdxs, rs);
    clo.apply(key, val);
}

Apparently the performance of the statement clo.apply(key, val) decreases
over time.

I first thought of a problem with the hashCode method that generates
collisions but I made sure that I use a unique row id and that equals and
hashCode are based on it.

Any advice that would help me to understand where the problem comes from?

many thanks in advance




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/LoadCache-Performance-decrease-with-the-size-of-the-cache-tp9645.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Re: Affinity based on the fields NOT in the key

2016-12-09 Thread steve.hostettler
Hello Alexey,

Yes it does, thanks. Now that I know that it is the way of doing it, it's
fine.



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Affinity-based-on-the-fields-NOT-in-the-key-tp9457p9460.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.


Affinity based on the fields NOT in the key

2016-12-09 Thread steve.hostettler
Hello, 

first let me apologize if there is any double submit. I think that my first
post on this subject was not visible because I had not completed the
subscription.


My understanding is that affinities can only be based on fields in the key.
Is this correct? This seems counter-intuitive since most of the time, the
affinity will be based on some sort of central entity and therefore on a
foreign key to that table (surrogate key). 

That would mean that if I have a primary key with a surrogate key, I would
have to add the central table's surrogate key to the object key (which will
break third normal form). The other alternative is to retrieve the object
from the cache in the affinity function, but at that time I do not know
whether the other cache has already been loaded. Furthermore, this would
hit the performance dramatically.

Any advice?

Many thanks in advance 



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Affinity-based-on-the-fields-NOT-in-the-key-tp9457.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.