Val, Yakov,

Sorry for the delay, I needed time to think and to run some tests.

Anyway, extending the API and supplying a default implementation is good. It makes the framework more flexible and usable. But the extension you propose will not solve the problem that I have raised. Please read the following with special attention.

The current implementation of IgniteCache.loadCache causes parallel execution of IgniteCache.localLoadCache on each node in the cluster. It's a bad implementation, but it has the *right semantics*. You propose to extend IgniteCache.localLoadCache and use it to load data on all the nodes. That is bad semantics, and it also leads to a bad implementation. Here is why.

When you filter the data with the supplied IgniteBiPredicate, you may access data that must be co-located. Hence, to load the data on all the nodes, every node needs access to all the related data partitioned across the cluster. This leads to great network overhead and overloaded near caches. That is also why I am puzzled that the IgniteBiPredicate is executed for every key supplied by Cache.loadCache, and not only for those keys that will be stored on this node.

My opinion, in conclusion:

localLoadCache should first filter a key by the affinity function and the current cache topology, *then* invoke the predicate, and then store the entity in the cache (possibly by invoking the supplied closure). All associated partitions should be locked for the duration of the load.

IgniteCache.loadCache should perform Cache.loadCache on one node (or a few), then transfer the entities to the remote nodes, and *then* invoke the predicate and closure on the remote nodes.

2016-11-22 2:16 GMT+03:00 Valentin Kulichenko <valentin.kuliche...@gmail.com>:

> Guys,
>
> I created a ticket for this:
> https://issues.apache.org/jira/browse/IGNITE-4255
>
> Feel free to provide comments.
>
> -Val
>
> On Sat, Nov 19, 2016 at 6:56 AM, Yakov Zhdanov <yzhda...@apache.org>
> wrote:
>
> > > Why not store the partition ID in the database and query only local
> > > partitions?
> > > Whatever approach we design with a DataStreamer will be slower
> > > than this.
> >
> > Because this can be some generic DB. Imagine the app migrating to IMDG.
> >
> > I am pretty sure that in many cases approach with data streamer will be
> > faster and in many cases approach with multiple queries will be faster. And
> > the choice should depend on many factors. I like Val's suggestions. I think
> > he goes in the right direction.
> >
> > --Yakov

--
Thanks,
Alexandr Kuramshin
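
P.S. To make the ordering proposed for localLoadCache concrete (affinity filter first, then the predicate, then the store), here is a minimal sketch in plain Java. It does not use the Ignite API: the class LocalLoadSketch, the methods ownerNode and localLoad, and the "hash modulo partitions, partition modulo nodes" affinity are all hypothetical stand-ins, and partition locking is left out. The only point is that the predicate is never invoked for keys owned by other nodes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical illustration only; none of these names come from Ignite.
final class LocalLoadSketch {
    /** Assumed affinity: key -> partition (hash mod partitions) -> node (partition mod nodes). */
    static int ownerNode(Object key, int partitions, int nodes) {
        int partition = Math.floorMod(key.hashCode(), partitions);
        return partition % nodes;
    }

    /**
     * Stores into localCache only the entries owned by localNode that pass the filter.
     * Order of checks: affinity first, then the user predicate, then the store.
     * Returns how many times the predicate was invoked.
     */
    static <K, V> int localLoad(Map<K, V> source, int localNode, int partitions, int nodes,
                                BiPredicate<K, V> filter, Map<K, V> localCache) {
        int predicateCalls = 0;
        for (Map.Entry<K, V> e : source.entrySet()) {
            // 1. Affinity filter: skip keys that belong to another node.
            if (ownerNode(e.getKey(), partitions, nodes) != localNode)
                continue;
            // 2. Predicate runs only for locally owned keys.
            predicateCalls++;
            if (filter.test(e.getKey(), e.getValue()))
                // 3. Store the entity in the local cache.
                localCache.put(e.getKey(), e.getValue());
        }
        return predicateCalls;
    }

    public static void main(String[] args) {
        Map<Integer, String> source = new LinkedHashMap<>();
        for (int i = 0; i < 8; i++)
            source.put(i, "v" + i);

        Map<Integer, String> cache = new LinkedHashMap<>();
        int calls = localLoad(source, 0, 4, 2, (k, v) -> k >= 4, cache);
        System.out.println("predicate calls: " + calls + ", cached: " + cache);
    }
}
```

With 8 source keys and 2 nodes, the predicate is evaluated only for the keys owned by the local node rather than for all 8, which is the saving I am arguing for.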