Hey Mircea,

You're on the right track, but not exactly. I do not use a separate
thread per key. The input keys for map/combine/reduce are split into a
List of Lists, and each list is then submitted to the Executor as a
separate Runnable. Have a look at the submitToExecutor method at
https://github.com/vblagoje/infinispan/tree/t_2284_new
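
To illustrate the chunking idea described above, here is a minimal sketch in plain Java. It is not the code from the branch: the class name ChunkedSubmit and the methods split and processInChunks are hypothetical names chosen for this example, and it uses only java.util.concurrent rather than any Infinispan API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class ChunkedSubmit {

    // Hypothetical helper: split the keys into at most `chunks` contiguous
    // sublists of near-equal size (a "List of Lists"). `chunks` must be > 0.
    public static <K> List<List<K>> split(List<K> keys, int chunks) {
        List<List<K>> result = new ArrayList<>();
        if (keys.isEmpty()) {
            return result;
        }
        int size = (keys.size() + chunks - 1) / chunks; // ceiling division
        for (int from = 0; from < keys.size(); from += size) {
            result.add(keys.subList(from, Math.min(from + size, keys.size())));
        }
        return result;
    }

    // Hypothetical helper: submit one Runnable per sublist (not one per key)
    // and block until every sublist has been processed.
    public static <K> void processInChunks(List<K> keys, int chunks,
                                           ExecutorService executor,
                                           Consumer<K> task) throws Exception {
        List<Future<?>> futures = new ArrayList<>();
        for (List<K> chunk : split(keys, chunks)) {
            Runnable work = () -> chunk.forEach(task);
            futures.add(executor.submit(work));
        }
        for (Future<?> f : futures) {
            f.get(); // propagates any failure from the worker
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<String> keys = Arrays.asList("a", "bb", "ccc", "dddd", "eeeee");
            AtomicInteger totalLength = new AtomicInteger();
            // The "map" step here just adds up key lengths.
            processInChunks(keys, 4, executor, k -> totalLength.addAndGet(k.length()));
            System.out.println("total key length: " + totalLength.get());
        } finally {
            executor.shutdown();
        }
    }
}
```

Submitting one task per sublist keeps the number of allocated task objects proportional to the number of chunks rather than the number of keys.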

What I intend to do is move this code into the DataContainer, because
the minimal overhead of iterating over keys directly inside the
container, combined with iterating in parallel, should give a further
edge.
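
As a rough sketch of what iterating in parallel inside the container could look like, the example below uses the JDK 8 java.util.concurrent.ConcurrentHashMap as a stand-in for the container's backing map. That substitution is an assumption made for illustration, as are the names ParallelContainerScan and countWords; none of this is Infinispan's DataContainer API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class ParallelContainerScan {

    // Hypothetical "map" phase: count the words in every value, letting the
    // map itself drive the iteration in parallel.
    public static long countWords(ConcurrentHashMap<String, String> entries) {
        LongAdder total = new LongAdder();
        // A parallelism threshold of 1 asks the map to split the traversal
        // into as many subtasks as it sees fit; the bulk operation runs on
        // the common ForkJoinPool and returns once all entries are visited.
        entries.forEach(1L, (key, value) -> {
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                total.add(trimmed.split("\\s+").length);
            }
        });
        return total.sum();
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, String> entries = new ConcurrentHashMap<>();
        entries.put("doc1", "parallel map reduce");
        entries.put("doc2", "data container iteration");
        System.out.println("words: " + countWords(entries));
    }
}
```

Here no intermediate list of keys and no per-key task object is created by the caller; the traversal and the splitting of work both happen inside the map.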

Stay tuned,
Vladimir

On 12/6/2013, 11:18 AM, Mircea Markus wrote:
> Thanks Vladimir, I like the hands on approach!
> Adding -dev, there's a lot of interest around the parallel M/R so I think 
> others will have some thoughts on it as well.
>
> So what you're basically doing in your branch is iterate over all the keys in 
> the cache and then for each key invoke the mapping in a separate thread. 
> Whilst this would work, I think it has some drawbacks:
> - the iteration over the keys in the container happens in sequence, even
> though the mapping phases happen in parallel. This speeds things up a bit,
> but not as much as having the iteration happen in parallel, especially when
> the mapper is fast, which I think is pretty common.
> - the StatelessTask + some smaller objects are being created for each 
> iterated key. That's a lot of noise for the GC imo
>
> I think delegating the parallel iteration to the DataContainer (similar to 
> AdvancedCacheLoader.process (Executor)) would be a better approach IMO:
> - the logic is reusable for other components as well, such as querying (to 
> implement full-scan-like search), or as a general-purpose parallel iterator
> over the keys
> - object creation is reduced
> - the DefaultDataContainer uses an EquivalentConcurrentHashMapV8 for holding 
> the entries, which already supports parallel iteration, so the heavy lifting 
> is already in place
>
> On Dec 4, 2013, at 5:16 PM, Vladimir Blagojevic <vblag...@redhat.com> wrote:
>
>> Here is my M/R parallel execution solution updated to master 
>> https://github.com/vblagoje/infinispan/tree/t_2284_new
>>
>> Now, I'll work on your solution, which I am actually starting to like the 
>> more I think about it. Although I have to admit that I would pull some of 
>> your interfaces, like these KeyFilters, out into more prominent packages so 
>> we can all use the same interfaces. Also, I would see if we can genericize 
>> some of your interfaces and implementations.
>>
>> Will keep you updated.
>>
>> Vladimir
> Cheers,

_______________________________________________
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
