Hi Ahmed, Muhammad,

An MToNPartitioningMergingConnectorDescriptor would work. What you need to do
is to give the global aggregator a location constraint with a cardinality of 1.
Everything else should just work.
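
For illustration, here is a minimal sketch of that wiring, assuming field 0 is
an integer grouping key. The node names, field indexes, and hash/comparator
factories below are placeholders, and the exact connector constructor can
differ slightly between Hyracks versions, so treat it as a starting point
rather than a drop-in:

import org.apache.hyracks.api.constraints.PartitionConstraintHelper;
import org.apache.hyracks.api.dataflow.IOperatorDescriptor;
import org.apache.hyracks.api.dataflow.value.IBinaryComparatorFactory;
import org.apache.hyracks.api.dataflow.value.IBinaryHashFunctionFactory;
import org.apache.hyracks.api.job.JobSpecification;
import org.apache.hyracks.data.std.accessors.PointableBinaryComparatorFactory;
import org.apache.hyracks.data.std.accessors.PointableBinaryHashFunctionFactory;
import org.apache.hyracks.data.std.primitive.IntegerPointable;
import org.apache.hyracks.dataflow.common.data.partition.FieldHashPartitionComputerFactory;
import org.apache.hyracks.dataflow.std.connectors.MToNPartitioningMergingConnectorDescriptor;

public class GlobalAggWiring {
    // localAgg/globalAgg are whatever aggregation operators you already build;
    // field 0 is assumed to be the grouping key (an int here).
    static void wire(JobSpecification spec, IOperatorDescriptor localAgg,
            IOperatorDescriptor globalAgg) {
        // Run the local aggregator wherever the input splits live.
        PartitionConstraintHelper.addAbsoluteLocationConstraint(spec, localAgg, "nc1", "nc2");
        // The important part: exactly one instance of the global aggregator.
        PartitionConstraintHelper.addPartitionCountConstraint(spec, globalAgg, 1);

        // With a single consumer partition the hash partitioner is trivial
        // (everything lands on partition 0). The connector merge-sorts the
        // incoming streams on the key, so each local aggregator's output
        // should already be sorted on that key.
        MToNPartitioningMergingConnectorDescriptor conn =
                new MToNPartitioningMergingConnectorDescriptor(
                        spec,
                        new FieldHashPartitionComputerFactory(new int[] { 0 },
                                new IBinaryHashFunctionFactory[] {
                                        PointableBinaryHashFunctionFactory.of(IntegerPointable.FACTORY) }),
                        new int[] { 0 },
                        new IBinaryComparatorFactory[] {
                                PointableBinaryComparatorFactory.of(IntegerPointable.FACTORY) });
        spec.connect(conn, localAgg, 0, globalAgg, 0);
    }
}

If the order of the final output does not matter, a plain
MToNPartitioningConnectorDescriptor with the same count constraint of 1 on the
global aggregator should also work.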

Cheers,
Abdullah.

> On Feb 21, 2018, at 2:00 PM, Ahmed Eldawy <[email protected]> wrote:
> 
> Here's some context about this problem. We are trying to build a simple
> GroupBy-Aggregate function using Hyracks. Consider the following SQL query:
> SELECT id, COUNT(id) FROM dataset GROUP BY id;
> Our design has two operators: a local aggregator and a global aggregator.
> The local aggregator processes one input split at a time and computes its
> histogram as a set of <ID, count> pairs.
> The global aggregator combines all the <ID, count> pairs produced by the
> local aggregators to produce the final output as <ID, SUM(count)>.
> The question is which type of connector we should use to connect the local
> aggregators to the global aggregator. While we know that an M-to-N hash
> connector would work, with each machine combining a subset of the keys, our
> design is to combine all of the keys on a single machine. In other words,
> there must be exactly one instance of the global aggregator, running on a
> single machine.
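> 
> As a purely hypothetical illustration with two splits on the id column,
> S1 = {1, 1, 2} and S2 = {2, 3, 3}:
>   local(S1) = {<1, 2>, <2, 1>}     local(S2) = {<2, 1>, <3, 2>}
>   global    = {<1, 2>, <2, 2>, <3, 2>}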
> 
> 
> On Tue, Feb 20, 2018 at 2:53 PM, Muhammad Abu Bakar Siddique <
> [email protected]> wrote:
> 
>> Hi,
>> I am trying to code a very simple example that computes a single
>> histogram from two different files. I am able to compute a separate
>> histogram for each file using OneToOneConnectorDescriptor. Now, I want
>> to combine these two maps into one map. I could not find any
>> MToOneConnector with which I can combine these two maps into one. Can
>> somebody please guide me on the correct way to do this?
>> What I did:
>> 1. Created two splits for input files
>> 2. Connected input to myOperatorDescriptor using
>> OneToOneConnectorDescriptor
>> 3. Connected myOperatorDescriptor to the output using
>> OneToOneConnectorDescriptor
>> 4. myOperatorDescriptor is reading the files and computing the histogram
>> (in a HashMap) for each file
>> What I need to do:
>> 1. Combine the maps into one.
>> 
> 
> 
> 
> -- 
> 
> Ahmed Eldawy
> Assistant Professor
> http://www.cs.ucr.edu/~eldawy
> Tel: +1 (951) 827-5654
