Re: How can I record some position of context in Reduce()?

Vikas Jadhav Wed, 10 Apr 2013 21:19:16 -0700

I will express it in SQL form:

select * from table1, table2 where table1.attr < table2.attr


This is also called a theta join, where theta can be <, >, <=, >=, or !=.
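
To make the query above concrete, here is a minimal in-memory sketch in plain Java (no Hadoop classes). The class name ThetaJoinSketch and its join method are hypothetical names chosen for illustration only. It is a plain nested-loop join that takes theta as a predicate, not a MapReduce implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical helper, not part of Hadoop: nested-loop theta join over two in-memory lists.
class ThetaJoinSketch {
    // Returns every (a, b) pair with a from table1 and b from table2 where theta holds.
    static List<int[]> join(List<Integer> table1, List<Integer> table2,
                            BiPredicate<Integer, Integer> theta) {
        List<int[]> out = new ArrayList<>();
        for (int a : table1) {
            for (int b : table2) {
                if (theta.test(a, b)) {
                    out.add(new int[] {a, b});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Equivalent of: select * from table1, table2 where table1.attr < table2.attr
        List<int[]> rows = join(Arrays.asList(1, 5, 9), Arrays.asList(4, 6), (a, b) -> a < b);
        for (int[] r : rows) {
            System.out.println(r[0] + " < " + r[1]);
        }
    }
}
```

Swapping the lambda for (a, b) -> a >= b or (a, b) -> !a.equals(b) gives the other theta variants. Every pair is compared, which is why this does not partition by key the way an equi-join does.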



On Wed, Apr 10, 2013 at 9:35 PM, Michel Segel <[email protected]> wrote:

> Not sure what is meant by a non equi join.
>
> Are you saying something like for every row in X, join it to all of the
> rows in Y where Y.a < something?
>
> Is that what you are suggesting?
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Apr 10, 2013, at 9:11 AM, Vikas Jadhav <[email protected]>
> wrote:
>
> How are you going to support a non-equi join using MapReduce?
> As far as I understand, the only way to do this is to bring all the data
> to one reducer, so that the reducer can compare lesser/greater values
> correctly.
> Correct me if I am wrong.
> Thank you.
>
> Regards,
> Vikas
>
>
>
> On Wed, Apr 10, 2013 at 4:22 PM, Michel Segel 
> <[email protected]> wrote:
>
>> Can you show an example of your join?
>> All joins are an equality in that the key has to match.
>> Whether it's one-to-one, one-to-many, or many-to-many remains to be
>> seen.
>>
>>
>> Sent from a remote device. Please excuse any typos...
>>
>> Mike Segel
>>
>> On Apr 9, 2013, at 10:35 AM, Effyroth Gu <[email protected]> wrote:
>>
>> Only equality joins, outer joins, and left semi joins are supported in
>> Hive. Hive does not support join conditions that are not equality
>> conditions as it is very difficult to express such conditions as a
>> map/reduce job. Also, more than two tables can be joined in Hive.
>>
>>
>> 2013/4/9 Michael Segel <[email protected]>
>>
>>> Hi,
>>>
>>> Your cross join is supported in both Pig and Hive (cross and theta
>>> joins).
>>>
>>> So there must be code to do this.
>>>
>>> Essentially in the reducer you would have your key and then the set of
>>> rows that match the key. You would then perform the cross product on the
>>> key's result set and output them to the collector as separate rows.
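
A minimal sketch of the reducer-side cross product described above, written in plain Java so it runs without Hadoop. The class name KeyGroupCrossProduct and the "A:" / "B:" tagging of values by source table are assumptions made for illustration. In a real job the mapper would have to add such a tag, and each row would go to the output collector instead of a list.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the body of one reduce() call; no Hadoop classes are used.
class KeyGroupCrossProduct {
    // Each value is assumed to carry an "A:" or "B:" prefix naming its source table
    // (a hypothetical convention, not something Hadoop provides).
    static List<String> reduce(String key, Iterable<String> taggedValues) {
        List<String> left = new ArrayList<>();
        List<String> right = new ArrayList<>();
        // Split the key's result set by source table.
        for (String v : taggedValues) {
            if (v.startsWith("A:")) {
                left.add(v.substring(2));
            } else if (v.startsWith("B:")) {
                right.add(v.substring(2));
            }
        }
        // Cross product of the two sides, one output row per pair.
        List<String> out = new ArrayList<>();
        for (String l : left) {
            for (String r : right) {
                out.add(key + "\t" + l + "\t" + r);
            }
        }
        return out;
    }
}
```

Both sides are buffered in memory here, so this only works while one key's rows fit in the reducer's heap.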
>>>
>>> I'm not sure why you would need the reduce context.
>>>
>>> But then again, I'm still on my first cup of coffee. ;-)
>>>
>>>
>>> On Apr 9, 2013, at 12:15 AM, Vikas Jadhav <[email protected]>
>>> wrote:
>>>
>>> Hi,
>>> I am also working on joins using MapReduce.
>>> I think that instead of finding the position of the table in
>>> RawKeyValueIterator, we can modify the context.write method to always
>>> write the table name or ID as the key.
>>> Then we don't need to find the position; we can get the key and value
>>> from "reducerContext".
>>>
>>> Before calling reducer.run(reducerContext) in ReduceTask.java, we can add
>>> a join method to the Reducer class in Reducer.java and call
>>> reducer.join(reduceContext).
>>>
>>>
>>> I just wonder how you are going to support a non-equi join.
>>>
>>> I am also having the same problem: how to do the join if the datasets
>>> can't fit into memory.
>>>
>>>
>>> For now I am cloning using the following code:
>>>
>>>
>>> // Needs: java.util.ArrayList, org.apache.hadoop.conf.Configuration,
>>> //        org.apache.hadoop.util.ReflectionUtils
>>> Configuration conf = context.getConfiguration();
>>>
>>> // Deep-copy the current key into a fresh instance.
>>> KEYIN key = context.getCurrentKey();
>>> KEYIN outKey = (KEYIN) ReflectionUtils.newInstance(key.getClass(), conf);
>>> ReflectionUtils.copy(conf, key, outKey);
>>>
>>> // Deep-copy every value and keep the copies in myValues.
>>> Iterable<VALUEIN> values = context.getValues();
>>> ArrayList<VALUEIN> myValues = new ArrayList<VALUEIN>();
>>> for (VALUEIN value : values) {
>>>   VALUEIN outValue =
>>>       (VALUEIN) ReflectionUtils.newInstance(value.getClass(), conf);
>>>   ReflectionUtils.copy(conf, value, outValue);
>>>   myValues.add(outValue);  // without this the list stays empty
>>> }
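
For context on why the copy is needed: Hadoop reuses the key and value objects it passes into the reducer, so caching the references themselves leaves a list where every entry shows the last value. The sketch below imitates that behaviour in plain Java. ReuseSketch, Box, and the reusing() iterator are hypothetical stand-ins, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ReuseSketch {
    // Mutable holder standing in for a Writable value (hypothetical, not a Hadoop class).
    static class Box {
        int v;
        Box(int v) { this.v = v; }
    }

    // Iterable that hands back the same Box on every step, overwriting its content.
    static Iterable<Box> reusing(int... data) {
        Box shared = new Box(0);
        return () -> new Iterator<Box>() {
            int i = 0;
            public boolean hasNext() { return i < data.length; }
            public Box next() { shared.v = data[i++]; return shared; }
        };
    }

    // Caches references only: every cached entry is the same shared object.
    static List<Integer> cacheByReference(Iterable<Box> values) {
        List<Box> cached = new ArrayList<>();
        for (Box b : values) cached.add(b);
        List<Integer> out = new ArrayList<>();
        for (Box b : cached) out.add(b.v);
        return out;
    }

    // Caches a fresh copy per value, which is what the cloning code is after.
    static List<Integer> cacheByCopy(Iterable<Box> values) {
        List<Box> cached = new ArrayList<>();
        for (Box b : values) cached.add(new Box(b.v));
        List<Integer> out = new ArrayList<>();
        for (Box b : cached) out.add(b.v);
        return out;
    }
}
```

With the input 1, 2, 3, cacheByReference returns [3, 3, 3] while cacheByCopy returns [1, 2, 3].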
>>>
>>>
>>> If you have found any other solution, please feel free to share.
>>>
>>> Thank You.
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Mar 14, 2013 at 1:53 PM, Roth Effy <[email protected]> wrote:
>>>
>>>> In reduce() we have:
>>>>
>>>> key1 values1
>>>> key2 values2
>>>> ...
>>>> keyn valuesn
>>>>
>>>> So what I want to do is join all the values, like this SQL:
>>>>
>>>> select * from values1,values2...valuesn;
>>>>
>>>> If there is not enough memory to cache the values, how can I complete the
>>>> join operation?
>>>> My idea is to clone the ReduceContext, but that may not be easy.
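
One possible way around the memory limit, sketched in plain Java and not taken from this thread: spill one side to a local temporary file once, then re-read that file for each bounded block of the other side (a block nested-loop cross product). SpillJoinSketch, cross, and blockSize are hypothetical names. The sketch assumes rows are single-line strings and that the task has local disk available.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class SpillJoinSketch {
    // Emits outerRow + "," + innerRow for every pair while holding at most
    // blockSize outer rows in memory; the inner side lives in a temp file.
    static void cross(Iterator<String> outer, Iterator<String> inner,
                      int blockSize, Consumer<String> emit) {
        Path spill = null;
        try {
            // Write the inner side to disk once, one row per line.
            spill = Files.createTempFile("inner-side", ".txt");
            try (BufferedWriter w = Files.newBufferedWriter(spill, StandardCharsets.UTF_8)) {
                while (inner.hasNext()) {
                    w.write(inner.next());
                    w.newLine();
                }
            }
            List<String> block = new ArrayList<>();
            while (outer.hasNext()) {
                block.add(outer.next());
                // When the block is full (or the outer side ends), rescan the spill file.
                if (block.size() >= blockSize || !outer.hasNext()) {
                    try (BufferedReader r = Files.newBufferedReader(spill, StandardCharsets.UTF_8)) {
                        String line;
                        while ((line = r.readLine()) != null) {
                            for (String o : block) {
                                emit.accept(o + "," + line);
                            }
                        }
                    }
                    block.clear();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (spill != null) {
                try { Files.deleteIfExists(spill); } catch (IOException ignored) { }
            }
        }
    }
}
```

A larger blockSize means fewer passes over the spill file at the cost of more memory. In a reducer, emit would write each row out immediately so that the result is never held in memory.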
>>>>
>>>> Any help will be appreciated.
>>>>
>>>>
>>>> 2013/3/13 Roth Effy <[email protected]>
>>>>
>>>>> I want an n:n join as a Cartesian product, but DataJoinReducerBase looks
>>>>> like it only supports equi-joins.
>>>>> I want a non-equi join, but I have no idea how to do it yet.
>>>>>
>>>>>
>>>>> 2013/3/13 Azuryy Yu <[email protected]>
>>>>>
>>>>>> Do you want an n:n join or a 1:n join?
>>>>>> On Mar 13, 2013 10:51 AM, "Roth Effy" <[email protected]> wrote:
>>>>>>
>>>>>>> I want to join the data of two tables in the reducer, so I need to find
>>>>>>> the start of the table.
>>>>>>> Someone said DataJoinReducerBase can help me. Is that right?
>>>>>>>
>>>>>>>
>>>>>>> 2013/3/13 Azuryy Yu <[email protected]>
>>>>>>>
>>>>>>>> You cannot use RecordReader in a Reducer.
>>>>>>>>
>>>>>>>> What do you mean by getting the record position? I cannot
>>>>>>>> understand; can you give a simple example?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 13, 2013 at 9:56 AM, Roth Effy <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Sorry, I still can't understand how to use RecordReader in
>>>>>>>>> reduce(), because the input is a RawKeyValueIterator in the class
>>>>>>>>> ReduceContext, so I'm confused.
>>>>>>>>> Anyway, thank you.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2013/3/12 samir das mohapatra <[email protected]>
>>>>>>>>>
>>>>>>>>>> Through the RecordReader and FileStatus you can get it.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 12, 2013 at 4:08 PM, Roth Effy <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>> I want to join the k-v pairs in reduce(), but how do I get the
>>>>>>>>>>> record position?
>>>>>>>>>>> What I have thought of so far is saving the context state, but the
>>>>>>>>>>> Context class doesn't provide a clone method or copy constructor.
>>>>>>>>>>>
>>>>>>>>>>> Any help will be appreciated.
>>>>>>>>>>> Thank you very much.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks and regards,
>>> Vikas Jadhav
>>>
>>>
>>>
>>
>
>
> --
> Thanks and regards,
> Vikas Jadhav
>
>


-- 
Regards,
Vikas
