Hi Ahmed, Muhammad,

An MToNPartitioningMergingConnectorDescriptor would work. What you need to do is set the location constraint of the global aggregator to a cardinality of 1. Everything else should just work.
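
Roughly, the wiring could look like the sketch below. This is untested; the package and class names are the ones from the hyracks-api, hyracks-data-std, and hyracks-dataflow-std modules, the exact constructor signatures may differ slightly between Hyracks versions, and localAgg/globalAgg stand for your two operator descriptors. The group key is assumed to be a UTF8 string in field 0 of the local aggregator's output.

import org.apache.hyracks.api.constraints.PartitionConstraintHelper;
import org.apache.hyracks.api.dataflow.IConnectorDescriptor;
import org.apache.hyracks.api.dataflow.IOperatorDescriptor;
import org.apache.hyracks.api.dataflow.value.IBinaryComparatorFactory;
import org.apache.hyracks.api.dataflow.value.IBinaryHashFunctionFactory;
import org.apache.hyracks.api.job.JobSpecification;
import org.apache.hyracks.data.std.accessors.PointableBinaryComparatorFactory;
import org.apache.hyracks.data.std.accessors.PointableBinaryHashFunctionFactory;
import org.apache.hyracks.data.std.primitive.UTF8StringPointable;
import org.apache.hyracks.dataflow.common.data.partition.FieldHashPartitionComputerFactory;
import org.apache.hyracks.dataflow.std.connectors.MToNPartitioningMergingConnectorDescriptor;

public class SingleGlobalAggregatorWiring {
    // localAgg and globalAgg are your two operator descriptors, already
    // added to spec; the group key is assumed to be field 0 (UTF8 string).
    public static void wire(JobSpecification spec, IOperatorDescriptor localAgg,
            IOperatorDescriptor globalAgg) {
        // Force exactly one instance of the global aggregator.
        PartitionConstraintHelper.addPartitionCountConstraint(spec, globalAgg, 1);

        // Partition on the grouping key; since there is only one consumer
        // partition, every <ID, count> pair from every local aggregator is
        // routed to the single global aggregator. The merging variant also
        // keeps the incoming streams ordered on the key, assuming the local
        // aggregators emit their output sorted on field 0.
        IConnectorDescriptor conn = new MToNPartitioningMergingConnectorDescriptor(
                spec,
                new FieldHashPartitionComputerFactory(
                        new int[] { 0 },
                        new IBinaryHashFunctionFactory[] {
                                PointableBinaryHashFunctionFactory.of(UTF8StringPointable.FACTORY) }),
                new int[] { 0 },
                new IBinaryComparatorFactory[] {
                        PointableBinaryComparatorFactory.of(UTF8StringPointable.FACTORY) });

        spec.connect(conn, localAgg, 0, globalAgg, 0);
    }
}

If you don't care about the merged sort order at the global aggregator, a plain MToNPartitioningConnectorDescriptor built with the same partition computer, plus the same partition-count constraint of 1, should do the same job.
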
Cheers,
Abdullah.

> On Feb 21, 2018, at 2:00 PM, Ahmed Eldawy <[email protected]> wrote:
>
> Here's some context about this problem. We are trying to build a simple
> GroupBy-Aggregate function using Hyracks. Think about the following SQL
> query:
> SELECT id, COUNT(id) FROM dataset GROUP BY id;
> Our design has two operators, a local aggregator and a global aggregator.
> The local aggregator processes one input split at a time and computes its
> histogram in the form of <ID, count>.
> The global aggregator combines all the pairs <ID, count> produced by the
> local aggregators to produce the final output as <ID, Sum(count)>.
> The question is which type of connector we should use to connect the local
> aggregator to the global aggregator. While we know that an M-to-N hash
> connector would work, where each machine combines a subset of the keys, our
> design is to combine all of them on a single machine. In other words, there
> has to be only one instance of the global aggregator running on a single
> machine.
>
> On Tue, Feb 20, 2018 at 2:53 PM, Muhammad Abu Bakar Siddique <
> [email protected]> wrote:
>
>> Hi,
>> I am trying to code a very simple example that computes a single
>> histogram from two different files. I am able to compute a separate
>> histogram for each file using OneToOneConnectorDescriptor. Now, I want
>> to combine these two maps into one map. I could not find any
>> MToOneConnector with which I can combine these two maps into one. Can
>> somebody please guide me on the correct way to do this?
>> What I did:
>> 1. Created two splits for the input files
>> 2. Connected the input to myOperatorDescriptor using
>> OneToOneConnectorDescriptor
>> 3. Connected myOperatorDescriptor to the output using
>> OneToOneConnectorDescriptor
>> 4. myOperatorDescriptor reads the files and computes the histogram
>> (in a HashMap) for each file
>> What I need to do:
>> 1. Combine the maps into one.
>
> --
>
> Ahmed Eldawy
> Assistant Professor
> http://www.cs.ucr.edu/~eldawy
> Tel: +1 (951) 827-5654
