Following your suggestions, I rewrote the benchmark tests using JMH. Here is the benchmark:
https://github.com/aysheka/ignite-benchmarks/blob/master/test/src/test/java/com/example/ignite/IgniteCacheLocalReadTest.java
Warmup partially resolved the cold-start problem and speed increased by up to 40%, but I don't think that is good enough. Reading 1 million records now takes about 200 ms (~5000 entries per ms). Can I improve these numbers, and how?

With best regards
Alisher Alimov
[email protected]

> On 29 Nov 2016, at 11:41, Yakov Zhdanov <[email protected]> wrote:
>
> I think it is a bad idea to set the number of threads greater than the CPU count,
> given that you do not block on any IO operations.
>
> However, the benchmark itself seems totally incorrect to me. The most important
> issues are:
> 1. The total time is too low. Try to put it in a loop body and measure each
> iteration. You can also (moreover, I would like you to) use jmh for local
> operation benchmarks.
> 2. You don't have any warm-up phase.
>
> Please refer to
> http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java
> or any other resource explaining how to create benchmarks.
>
> --Yakov
>
> 2016-11-29 15:14 GMT+07:00 Vladislav Pyatkov <[email protected]>:
> Hi Alisher,
>
> This looks doubtful to me. You parallelized the job but got a performance decrease.
> I recommend using a Java profiler to isolate the long-running methods.
>
> How do you get the list of local partitions (can it contain excess numbers)?
> And please check whether the ForkJoin pool is large enough:
>
> -Djava.util.concurrent.ForkJoinPool.common.parallelism=1024
>
>
> On Nov 28, 2016 8:39 PM, "Alisher Alimov" <[email protected]> wrote:
> I found only one way to parallelize the read, via ScanQuery:
>
> int[] partitions =
>     this.ignite.affinity("test.cache").primaryPartitions(this.ignite.cluster().node());
>
> startTime = System.currentTimeMillis();
>
> Arrays.stream(partitions)
>     .parallel()
>     .forEach(partition -> {
>         ScanQuery<Object, Object> qry = new ScanQuery<>(partition);
>         qry.setLocal(true);
>         qry.setPageSize(5_000);
>
>         QueryCursor<Cache.Entry<Object, Object>> query = cache.query(qry);
>         List<Cache.Entry<Object, Object>> all = query.getAll();
>     });
>
> System.out.println(String.format("Complete in: %dms",
>     System.currentTimeMillis() - startTime));
>
> But it doesn't help much (speed dropped by 10-20%). Is there
> another good solution?
>
>
> With best regards
> Alisher Alimov
> [email protected]
>
>
>> On 28 Nov 2016, at 19:38, Alexey Goncharuk <[email protected]> wrote:
>>
>> Hi Alisher,
>>
>> As Nicolae suggested, try parallelizing your scan using a per-partition
>> iterator. This should give you almost linear performance growth up to the
>> number of available CPUs.
>> Also make sure to set the CacheConfiguration#copyOnRead flag to false.
>>
>> --AG
>>
>> 2016-11-28 19:31 GMT+03:00 Marasoiu Nicolae <[email protected]>:
>> Regarding CPU load, a single thread of execution exists in the program so
>> (at most) one core is used. So if you have 8 cores, it means that it is 8 to
>> 16 times slower than a program able to use all the cores & CPU redundancy of
>> the machine.
>> In my tests, indeed, a core looks fully utilized.
>> To me, scanning 1M
>> key-values per second is pretty OK, but indeed, if LMAX got 6M transactions
>> per core per second, it can perhaps go up, but something tells me this will
>> not be the limitation of the typical application.
>>
>> Met vriendelijke groeten / Meilleures salutations / Best regards
>>
>> Nicolae Marasoiu
>> Agile Developer
>>
>> E [email protected]
>>
>> CEGEKA 15-17 Ion Mihalache Blvd. Tower Center Building,
>> 4th,5th,6th,8th,9th fl
>> RO-011171 Bucharest (RO), Romania
>> T +40 21 336 20 65
>> WWW.CEGEKA.COM
>>
>> From: Alisher Alimov <[email protected]>
>> Sent: 28 November 2016 15:27
>> To: [email protected]
>> Subject: Performance question
>>
>> Hello!
>>
>> I have written and run a simple performance test to check
>> IgniteCache#localEntries and found that the current method is not fast enough.
>>
>> Ignite ignite = Ignition.start();
>>
>> CacheConfiguration<UUID, UUID> cacheConfiguration = new CacheConfiguration<>();
>> cacheConfiguration.setBackups(0);
>>
>> IgniteCache<UUID, UUID> cache = ignite.getOrCreateCache("test.cache");
>>
>> for (int i = 0; i < 1_000_000; i++) {
>>     cache.put(UUID.randomUUID(), UUID.randomUUID());
>> }
>>
>> long startTime = System.currentTimeMillis();
>>
>> cache.localEntries(CachePeekMode.PRIMARY).forEach(entry -> {
>> });
>>
>> System.out.println(String.format("Complete in: %dms",
>>     System.currentTimeMillis() - startTime));
>>
>> Reading local entries takes about 1 s (1000 rows per ms), which is low.
>> The test was run on a server with the provided configuration and default Ignite
>> configs; load average was about 0 and CPU was not busy more than 10%.
>> Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
>>
>>
>> Maybe I am doing or configuring something wrong, or is the current speed normal?
>>
>>
>> With best regards
>> Alisher Alimov
>> [email protected]
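
The two fixes Yakov asks for in this thread (a warm-up phase, and measuring each iteration of a loop rather than one total) can be illustrated with a minimal plain-JDK sketch. This is not the JMH benchmark linked at the top and it does not use Ignite: a ConcurrentHashMap stands in for the cache, and the class name WarmupTimingSketch and the methods scan, measure and entriesPerMs are hypothetical names made up for illustration. JMH handles forking and dead-code elimination far more carefully, so this only shows the idea.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only: a ConcurrentHashMap stands in for the Ignite cache.
class WarmupTimingSketch {

    /** Converts an entry count and an elapsed time in nanoseconds into entries per millisecond. */
    static double entriesPerMs(long entries, long elapsedNanos) {
        return entries / (elapsedNanos / 1_000_000.0);
    }

    /** Walks over every entry once and returns how many entries were visited. */
    static long scan(Map<UUID, UUID> data) {
        long visited = 0;
        for (Map.Entry<UUID, UUID> entry : data.entrySet()) {
            if (entry.getValue() != null) {
                visited++;
            }
        }
        return visited;
    }

    /** Runs unmeasured warm-up scans, then returns the elapsed nanoseconds of each measured scan. */
    static long[] measure(Map<UUID, UUID> data, int warmups, int iterations) {
        long sink = 0; // accumulates scan results so the loops have an observable effect
        for (int i = 0; i < warmups; i++) {
            sink += scan(data);
        }
        long[] elapsed = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            sink += scan(data);
            elapsed[i] = System.nanoTime() - start;
        }
        if (sink < 0) {
            throw new IllegalStateException("unreachable: visit counts are never negative");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        Map<UUID, UUID> data = new ConcurrentHashMap<>();
        for (int i = 0; i < 100_000; i++) {
            data.put(UUID.randomUUID(), UUID.randomUUID());
        }
        // 5 warm-up scans and 10 measured scans are arbitrary illustrative values.
        for (long nanos : measure(data, 5, 10)) {
            System.out.printf("%.0f entries/ms%n", entriesPerMs(data.size(), nanos));
        }
    }
}
```

Plugging in the figures from the thread, entriesPerMs(1_000_000, 200_000_000L) evaluates to 5000.0, which matches the ~5000 entries per ms reported after warm-up, and entriesPerMs(1_000_000, 1_000_000_000L) evaluates to 1000.0, which matches the original 1 s result.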
