Hi Kevin,

My considerations:

1) I see that the amount of allocated heap gradually increases over time. Can you confirm that you are using the OFFHEAP configuration you showed several posts earlier?

2) Why do you have several nodes per host? Our recommended approach is a single node per machine.

3) The main thing: you invoke the loadCache() method. This method executes the same query - "SELECT *", as you mentioned earlier - on all nodes. It means that your cluster has to perform the full scan N times, where N is the number of nodes.

I looked closely at our store API and I believe it is not well suited for such cases (millions of rows, lots of nodes in the topology). We will think of a better API to handle such scenarios. My recommendations for now:

1) Start no more than one node per host. This way you will decrease the number of scans from 16 to 2-3, which should make performance much better. Let each node take as much memory as possible.

2) If this approach still doesn't show good enough numbers, consider switching to IgniteDataStreamer instead - https://apacheignite.readme.io/docs/data-streamers
It was designed specifically for efficient bulk data load, employing batching and affinity co-location techniques. You can do that in the following way:
- Create an IgniteDataStreamer for your cache;
- Scan the table using JDBC, Hibernate or any other framework you have;
- For each returned row, create the appropriate cache key and value objects;
- Pass the key-value pair to the streamer: IgniteDataStreamer.addData();
- When the scan is finished, close the streamer: IgniteDataStreamer.close().
This should be done on one node only; a rough sketch of the flow follows below.
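For illustration, here is a minimal sketch of that flow, assuming a plain JDBC source. The cache name ("uniqueFieldCache"), table and column names, JDBC URL, config path and the Integer key are my assumptions for illustration only - map them to your actual schema:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class BulkLoad {
    public static void main(String[] args) throws Exception {
        // Start (or connect to) the single node that drives the load.
        Ignite ignite = Ignition.start("config/example-cache.xml"); // assumed config path

        // try-with-resources flushes and closes the streamer once the scan is done.
        try (IgniteDataStreamer<Integer, UniqueField> streamer = ignite.dataStreamer("uniqueFieldCache");
             Connection conn = DriverManager.getConnection("jdbc:your-db://db-host/db"); // assumed JDBC URL
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, o_id, org_id, guid, msg, num, dt FROM unique_field")) {

            while (rs.next()) {
                // Build the value object for the current row.
                UniqueField field = new UniqueField();

                field.setOId(rs.getString("o_id"));
                field.setOrgId(rs.getString("org_id"));
                field.setGuid(rs.getString("guid"));
                field.setMsg(rs.getString("msg"));
                field.setNum(rs.getBigDecimal("num"));
                field.setDate(rs.getDate("dt"));

                // Entries are buffered and shipped to their affinity nodes in batches.
                streamer.addData(rs.getInt("id"), field);
            }
        }
    }
}

The key point is that only this one node scans the table; the streamer takes care of batching and routing each entry to the node that owns it.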
I believe the second approach should show much better numbers.

Vladimir.

On Wed, Apr 27, 2016 at 6:06 AM, Zhengqingzheng <[email protected]> wrote:

> Hi Vladimir,
>
> Sorry to reply so late. The loadCache process took 8 hours to load all
> the data (this time, no exception occurred, but memory consumption went
> up to 56GB, 80% of all the heap I have defined, which includes 10
> nodes, each node allocating 7GB of heap).
>
> I am giving you all the log files which were located in the work/log/
> folder. Please see the attachment.
>
> From: Vladimir Ozerov [mailto:[email protected]]
> Sent: April 25, 2016 20:21
> To: [email protected]
> Subject: Re: Ignite cache data size problem.
>
> Hi Kevin,
>
> I performed several experiments. Essentially, I put 1M entries of the
> class you provided with fields initialized as follows:
>
> for (int i = 0; i < 1_000_000; i++) {
>     UniqueField field = new UniqueField();
>
>     field.setDate(new Date());
>     field.setGuid(UUID.randomUUID().toString());
>     field.setMsg(String.valueOf(i));
>     field.setNum(BigDecimal.valueOf(ThreadLocalRandom.current().nextDouble()));
>     field.setOId(String.valueOf(i));
>     field.setOrgId(String.valueOf(i));
>
>     cache.put(i, field);
> }
>
> My results are:
>
> 1) Onheap, no indexes - about 400MB is required to store 1M objects, or
> ~20GB for 47M objects.
>
> 2) Onheap, with indexes - about 650MB, or ~30GB for 47M objects.
>
> 3) Offheap, with indexes - about 400MB of offheap memory is required, or
> ~20GB for all your objects.
>
> Could you please provide more information on the error you receive? Also,
> you could try loading entries in batches of a well-defined size (say, 1M)
> and see what happens to the system. I expect you should see similar numbers.
>
> Vladimir.
>
> On Fri, Apr 22, 2016 at 3:26 PM, kevin.zheng <[email protected]> wrote:
>
> BTW, I created 4 + 3 nodes on two servers.
> Each node was started with a command like this: ./ignite.sh -J-Xmx8g -J-Xms8g
>
> Kind regards,
> Kevin
