I'm not sure what is meant by a non-equi join. Are you saying something like: for every row in X, join it to all of the rows in Y where Y.a < something?
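
To make the question concrete, here is a minimal sketch in plain Java (no Hadoop classes) of the kind of join I think is being described. The class name ThetaJoinSketch, the method thetaJoin, and the sample values are all made up for illustration, and it assumes both inputs fit in memory, which is exactly the case this thread is worried about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

public class ThetaJoinSketch {

    /**
     * Nested-loop join: pairs every row of xs with every row of ys
     * for which the (possibly non-equality) condition holds.
     * Hypothetical helper, not part of any Hadoop API.
     */
    public static <X, Y> List<String> thetaJoin(List<X> xs, List<Y> ys,
                                                BiPredicate<X, Y> condition) {
        List<String> out = new ArrayList<>();
        for (X x : xs) {
            for (Y y : ys) {
                if (condition.test(x, y)) {
                    out.add(x + "," + y);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> x = List.of(1, 5, 9);
        List<Integer> y = List.of(3, 7);
        // Non-equi condition: keep the pair when the Y value is less than the X value.
        System.out.println(thetaJoin(x, y, (a, b) -> b < a));
    }
}
```

In a MapReduce job this nested loop would run inside a reducer over whatever rows were routed to it. The sketch says nothing about how the rows get partitioned, which is the hard part.
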
Is that what you are suggesting?

Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 10, 2013, at 9:11 AM, Vikas Jadhav wrote:

> How are you going to support a non-equi join using MapReduce?
> As far as I understand, the only way to do this is to bring all the
> data to one reducer, so that the reducer can compare lesser/greater
> values correctly.
> Correct me if I am wrong.
> Thank you.
>
> Regards,
> Vikas
>
> On Wed, Apr 10, 2013 at 4:22 PM, Michel Segel wrote:
>> Can you show an example of your join?
>> All joins are an equality in that the key has to match.
>> Whether it's a one-to-one, one-to-many, or many-to-many remains to be seen.
>>
>> Mike Segel
>>
>> On Apr 9, 2013, at 10:35 AM, Effyroth Gu wrote:
>>
>>> Only equality joins, outer joins, and left semi joins are supported in
>>> Hive. Hive does not support join conditions that are not equality
>>> conditions as it is very difficult to express such conditions as a
>>> map/reduce job. Also, more than two tables can be joined in Hive.
>>>
>>> 2013/4/9 Michael Segel
>>>> Hi,
>>>>
>>>> Your cross join is supported in both Pig and Hive. (Cross and theta
>>>> joins)
>>>>
>>>> So there must be code to do this.
>>>>
>>>> Essentially, in the reducer you would have your key and then the set of
>>>> rows that match the key. You would then perform the cross product on the
>>>> key's result set and output them to the collector as separate rows.
>>>>
>>>> I'm not sure why you would need the reduce context.
>>>>
>>>> But then again, I'm still on my first cup of coffee. ;-)
>>>>
>>>> On Apr 9, 2013, at 12:15 AM, Vikas Jadhav wrote:
>>>>
>>>>> Hi,
>>>>> I am also working on joins using MapReduce.
>>>>> I think instead of finding the position of the table in RawKeyValueIterator,
>>>>> we can modify the context.write method to always write the table
>>>>> name or id as the key.
>>>>> Then we don't need to find the position; we can get the key and value
>>>>> from "reducerContext".
>>>>>
>>>>> Before calling reducer.run(reducerContext) in ReduceTask.java, we can
>>>>> add a join method to the Reducer class in Reducer.java and call
>>>>> reducer.join(reduceContext).
>>>>>
>>>>> I just wonder how you are going to support a non-equi join.
>>>>>
>>>>> I am also having the same problem: how to do the join if the datasets
>>>>> can't fit into memory.
>>>>>
>>>>> For now I am cloning using the following code:
>>>>>
>>>>> // Needs java.util.ArrayList and org.apache.hadoop.util.ReflectionUtils.
>>>>> KEYIN key = context.getCurrentKey();
>>>>> KEYIN outKey = null;
>>>>> try {
>>>>>     outKey = (KEYIN) key.getClass().newInstance();
>>>>> } catch (Exception e) {
>>>>>     // Fail loudly instead of passing a null destination to copy().
>>>>>     throw new RuntimeException("cannot instantiate key class", e);
>>>>> }
>>>>> ReflectionUtils.copy(context.getConfiguration(), key, outKey);
>>>>>
>>>>> Iterable<VALUEIN> values = context.getValues();
>>>>> ArrayList<VALUEIN> myValues = new ArrayList<VALUEIN>();
>>>>> for (VALUEIN value : values) {
>>>>>     VALUEIN outValue = null;
>>>>>     try {
>>>>>         outValue = (VALUEIN) value.getClass().newInstance();
>>>>>     } catch (Exception e) {
>>>>>         throw new RuntimeException("cannot instantiate value class", e);
>>>>>     }
>>>>>     ReflectionUtils.copy(context.getConfiguration(), value, outValue);
>>>>>     // Keep the copy; without this line the list stays empty.
>>>>>     myValues.add(outValue);
>>>>> }
>>>>>
>>>>> If you have found any other solution, please feel free to share.
>>>>>
>>>>> Thank you.
>>>>>
>>>>> On Thu, Mar 14, 2013 at 1:53 PM, Roth Effy wrote:
>>>>>> In reduce() we have:
>>>>>>
>>>>>> key1 values1
>>>>>> key2 values2
>>>>>> ...
>>>>>> keyn valuesn
>>>>>>
>>>>>> So what I want to do is join all the values, like this SQL:
>>>>>>
>>>>>> select * from values1,values2...valuesn;
>>>>>>
>>>>>> If there is not enough memory to cache the values, how can I complete
>>>>>> the join operation?
>>>>>> My idea is to clone the ReduceContext, but that may not be easy.
>>>>>>
>>>>>> Any help will be appreciated.
>>>>>>
>>>>>> 2013/3/13 Roth Effy
>>>>>>> I want an n:n join as a Cartesian product, but DataJoinReducerBase
>>>>>>> looks like it only supports equal joins.
>>>>>>> I want a non-equal join, but I have no idea how to do it yet.
>>>>>>>
>>>>>>> 2013/3/13 Azuryy Yu
>>>>>>>> Do you want an n:n join or a 1:n join?
>>>>>>>>
>>>>>>>> On Mar 13, 2013 10:51 AM, "Roth Effy" wrote:
>>>>>>>>> I want to join the data of two tables in the reducer, so I need to
>>>>>>>>> find the start of each table.
>>>>>>>>> Someone said DataJoinReducerBase can help me. Is that right?
>>>>>>>>>
>>>>>>>>> 2013/3/13 Azuryy Yu
>>>>>>>>>> You cannot use RecordReader in a Reducer.
>>>>>>>>>>
>>>>>>>>>> What do you mean by wanting to get the record position? I cannot
>>>>>>>>>> understand. Can you give a simple example?
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 13, 2013 at 9:56 AM, Roth Effy wrote:
>>>>>>>>>>> Sorry, I still can't understand how to use RecordReader in
>>>>>>>>>>> reduce(), because the input is a RawKeyValueIterator in the
>>>>>>>>>>> ReduceContext class. So I'm confused.
>>>>>>>>>>> Anyway, thank you.
>>>>>>>>>>>
>>>>>>>>>>> 2013/3/12 samir das mohapatra
>>>>>>>>>>>> Through the RecordReader and FileStatus you can get it.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 12, 2013 at 4:08 PM, Roth Effy wrote:
>>>>>>>>>>>>> Hi, everyone,
>>>>>>>>>>>>> I want to join the k-v pairs in reduce(), but how do I get the
>>>>>>>>>>>>> record position?
>>>>>>>>>>>>> What I have thought of so far is to save the context status,
>>>>>>>>>>>>> but class Context doesn't implement a clone constructor.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any help will be appreciated.
>>>>>>>>>>>>> Thank you very much.
>>>>>
>>>>> --
>>>>> Thanx and Regards
>>>>> Vikas Jadhav
>
> --
> Thanx and Regards
> Vikas Jadhav
