Re: How can I record some position of context in Reduce()?

Michel Segel Wed, 10 Apr 2013 09:06:21 -0700

Not sure what is meant by a non equi join.

Are you saying something like for every row in X, join it to all of the rows in 
Y where Y.a < something?


Is that what you are suggesting?
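
To make the question concrete, here is a minimal sketch in plain Java (no Hadoop classes) of the kind of join I am asking about: every row of X is paired with every row of Y that satisfies a "less than" condition. The class name ThetaJoinSketch, the method lessThanJoin, and the sample values are made up for illustration. It also assumes both inputs fit in memory, which a real job could not count on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ThetaJoinSketch {

    /**
     * Non-equi (theta) join as a nested loop: emits a pair {x, y}
     * for every x in xs and every y in ys where y < x.
     */
    static List<int[]> lessThanJoin(List<Integer> xs, List<Integer> ys) {
        List<int[]> out = new ArrayList<>();
        for (int x : xs) {
            for (int y : ys) {
                if (y < x) {               // the join condition is an inequality
                    out.add(new int[] {x, y});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: X = {3, 5}, Y = {1, 4, 6}
        List<int[]> pairs = lessThanJoin(Arrays.asList(3, 5), Arrays.asList(1, 4, 6));
        for (int[] p : pairs) {
            System.out.println(p[0] + "," + p[1]);
        }
    }
}
```

Note that no single key groups the matching rows here, which is why the comparison needs to see both inputs side by side.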


Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 10, 2013, at 9:11 AM, Vikas Jadhav <[email protected]> wrote:

> How are you going to support a NON EQUI join using MapReduce?
> As I understand it, the only way to do this is
> to bring all the data to one reducer, so that the reducer can compare
> lesser/greater values correctly.
> Correct me if I am wrong.
> Thank You.
>  
>   Regards,
>   Vikas
>  
> 
> 
> On Wed, Apr 10, 2013 at 4:22 PM, Michel Segel <[email protected]> 
> wrote:
>> Can you show an example of your join?
>> All joins are an equality in that the key has to match.
>> Whether it's a one-to-one, one-to-many, or many-to-many remains to be seen.
>> 
>> 
>> Sent from a remote device. Please excuse any typos...
>> 
>> Mike Segel
>> 
>> On Apr 9, 2013, at 10:35 AM, Effyroth Gu <[email protected]> wrote:
>> 
>>> Only equality joins, outer joins, and left semi joins are supported in 
>>> Hive. Hive does not support join conditions that are not equality 
>>> conditions as it is very difficult to express such conditions as a 
>>> map/reduce job. Also, more than two tables can be joined in Hive.
>>> 
>>> 
>>> 2013/4/9 Michael Segel <[email protected]>
>>>> Hi,
>>>> 
>>>> Your cross join is supported in both Pig and Hive (cross and theta 
>>>> joins). 
>>>> 
>>>> So there must be code to do this. 
>>>> 
>>>> Essentially in the reducer you would have your key and then the set of 
>>>> rows that match the key. You would then perform the cross product on the 
>>>> key's result set and output them to the collector as separate rows. 
>>>> 
>>>> I'm not sure why you would need the reduce context. 
>>>> 
>>>> But then again, I'm still on my first cup of coffee. ;-)
>>>> 
>>>> 
>>>> On Apr 9, 2013, at 12:15 AM, Vikas Jadhav <[email protected]> wrote:
>>>> 
>>>>> Hi,
>>>>> I am also working on joins using MapReduce.
>>>>> I think that instead of finding the position of the table in the
>>>>> RawKeyValueIterator, we can modify the context.write method to always
>>>>> write the table name or id as the key.
>>>>> Then we don't need to find the position; we can get the key and value from 
>>>>> "reducerContext".
>>>>>  
>>>>> Before calling reducer.run(reducerContext) in ReduceTask.java, we can add a
>>>>> join method to the Reducer class in Reducer.java and call 
>>>>> reducer.join(reduceContext).
>>>>>  
>>>>>  
>>>>> I just wonder how you are going to support a NON EQUI join.
>>>>>  
>>>>> I also have the same problem: how to do the join if the datasets can't fit into 
>>>>> memory.
>>>>>  
>>>>>  
>>>>> For now I am cloning using the following code:
>>>>>  
>>>>>  
>>>>> // Fragment from inside a Reducer; needs org.apache.hadoop.conf.Configuration
>>>>> // and org.apache.hadoop.util.ReflectionUtils. Hadoop reuses the key and value
>>>>> // objects between iterations, so each one is deep-copied before it is kept.
>>>>> Configuration conf = context.getConfiguration();
>>>>> KEYIN key = context.getCurrentKey();
>>>>> KEYIN outKey = (KEYIN) ReflectionUtils.newInstance(key.getClass(), conf);
>>>>> ReflectionUtils.copy(conf, key, outKey);
>>>>> 
>>>>> Iterable<VALUEIN> values = context.getValues();
>>>>> ArrayList<VALUEIN> myValues = new ArrayList<VALUEIN>();
>>>>> for (VALUEIN value : values) {
>>>>>     VALUEIN outValue = (VALUEIN) ReflectionUtils.newInstance(value.getClass(), conf);
>>>>>     ReflectionUtils.copy(conf, value, outValue);
>>>>>     myValues.add(outValue);  // cache the copy for later use
>>>>> }
>>>>>  
>>>>>  
>>>>> If you have found any other solution, please feel free to share it.
>>>>>  
>>>>> Thank You.
>>>>>  
>>>>>        
>>>>>  
>>>>>  
>>>>> 
>>>>> 
>>>>> On Thu, Mar 14, 2013 at 1:53 PM, Roth Effy <[email protected]> wrote:
>>>>>> In reduce() we have:
>>>>>> 
>>>>>> key1 values1
>>>>>> key2 values2
>>>>>> ...
>>>>>> keyn valuesn
>>>>>> 
>>>>>> So what I want to do is join all the values, as in this SQL:
>>>>>> 
>>>>>> select * from values1,values2...valuesn;
>>>>>> 
>>>>>> If there is not enough memory to cache the values, how can I complete the join 
>>>>>> operation?
>>>>>> My idea is to clone the ReduceContext, but that may not be easy.
>>>>>> 
>>>>>> Any help will be appreciated.
>>>>>> 
>>>>>> 
>>>>>> 2013/3/13 Roth Effy <[email protected]>
>>>>>>> I want an n:n join as a Cartesian product, but DataJoinReducerBase 
>>>>>>> looks like it only supports equal joins.
>>>>>>> I want a non-equal join, but I have no idea how to do it yet.
>>>>>>> 
>>>>>>> 
>>>>>>> 2013/3/13 Azuryy Yu <[email protected]>
>>>>>>>> Do you want an n:n join or a 1:n join?
>>>>>>>> 
>>>>>>>> On Mar 13, 2013 10:51 AM, "Roth Effy" <[email protected]> wrote:
>>>>>>>>> I want to join the data of two tables in the reducer, so I need to find 
>>>>>>>>> the start of each table.
>>>>>>>>> Someone said DataJoinReducerBase can help me. Is that right?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 2013/3/13 Azuryy Yu <[email protected]>
>>>>>>>>>> You cannot use a RecordReader in a Reducer.
>>>>>>>>>>  
>>>>>>>>>> What do you mean by getting the record position? I don't 
>>>>>>>>>> understand. Can you give a simple example?
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Wed, Mar 13, 2013 at 9:56 AM, Roth Effy <[email protected]> 
>>>>>>>>>> wrote:
>>>>>>>>>>> Sorry, I still can't understand how to use a RecordReader in 
>>>>>>>>>>> reduce(), because the input is a RawKeyValueIterator in the class 
>>>>>>>>>>> ReduceContext, so I'm confused.
>>>>>>>>>>> Anyway, thank you.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 2013/3/12 samir das mohapatra <[email protected]>
>>>>>>>>>>>> Through the RecordReader and FileStatus you can get it.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Tue, Mar 12, 2013 at 4:08 PM, Roth Effy <[email protected]> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>> I want to join the k-v pairs in reduce(), but how do I get the 
>>>>>>>>>>>>> record position?
>>>>>>>>>>>>> My current idea is to save the context state, but the class 
>>>>>>>>>>>>> Context doesn't provide a clone method or copy constructor.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Any help will be appreciated.
>>>>>>>>>>>>> Thank you very much.
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> 
>>>>> 
>>>>> Thanx and Regards
>>>>>  Vikas Jadhav
> 
> 
> 
> -- 
> 
> 
> Thanx and Regards
>  Vikas Jadhav
