I will express it in SQL form:

    select * from table1, table2 where table1.attr < table2.attr

It is also called a theta join, where theta can be <, >, <=, >=, or !=.

On Wed, Apr 10, 2013 at 9:35 PM, Michel Segel <[email protected]> wrote:

> Not sure what is meant by a non equi join.
>
> Are you saying something like for every row in X, join it to all of the
> rows in Y where Y.a < something?
>
> Is that what you are suggesting?
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Apr 10, 2013, at 9:11 AM, Vikas Jadhav <[email protected]> wrote:
>
> How are you going to support a NON EQUI join using MapReduce?
> As per my understanding, the only way to do this is to bring all the
> data to one reducer, so that the reducer will know the lesser/greater
> values correctly.
> Correct me if I am wrong.
> Thank You.
>
> Regards,
> Vikas
>
> On Wed, Apr 10, 2013 at 4:22 PM, Michel Segel <[email protected]> wrote:
>
>> Can you show an example of your join?
>> All joins are an equality in that the key has to match.
>> Whether it's a one to one, one to many, or many to many remains to be
>> seen.
>>
>> Sent from a remote device. Please excuse any typos...
>>
>> Mike Segel
>>
>> On Apr 9, 2013, at 10:35 AM, Effyroth Gu <[email protected]> wrote:
>>
>> Only equality joins, outer joins, and left semi joins are supported in
>> Hive. Hive does not support join conditions that are not equality
>> conditions as it is very difficult to express such conditions as a
>> map/reduce job. Also, more than two tables can be joined in Hive.
>>
>> 2013/4/9 Michael Segel <[email protected]>
>>
>>> Hi,
>>>
>>> Your cross join is supported in both pig and hive. (Cross, and Theta
>>> joins)
>>>
>>> So there must be code to do this.
>>>
>>> Essentially in the reducer you would have your key and then the set of
>>> rows that match the key. You would then perform the cross product on the
>>> key's result set and output them to the collector as separate rows.
>>>
>>> I'm not sure why you would need the reduce context.
>>>
>>> But then again, I'm still on my first cup of coffee. ;-)
>>>
>>> On Apr 9, 2013, at 12:15 AM, Vikas Jadhav <[email protected]> wrote:
>>>
>>> Hi
>>> I am also working on a join using MapReduce.
>>> I think instead of finding the position of the table in
>>> RawKeyValueIterator, what we can do is modify the context.write method
>>> to always write the key as the table name or id;
>>> then we don't need to find the position, we can get the Key and Value
>>> from "reducerContext".
>>>
>>> Before calling reducer.run(reducerContext) in ReduceTask.java we can add
>>> a join method to the Reducer class in Reducer.java and call
>>> reducer.join(reduceContext).
>>>
>>> I just wonder how you are going to support a NON EQUI join.
>>>
>>> I am also having the same problem: how to do the join if the datasets
>>> can't fit into memory.
>>>
>>> For now I am cloning using the following code:
>>>
>>> // Copy the current key into a fresh instance
>>> KEYIN key = context.getCurrentKey();
>>> KEYIN outKey = null;
>>> try {
>>>     outKey = (KEYIN) key.getClass().newInstance();
>>> } catch (Exception e) {
>>>     throw new RuntimeException(e);
>>> }
>>> ReflectionUtils.copy(context.getConfiguration(), key, outKey);
>>>
>>> // Copy every value before keeping a reference to it
>>> Iterable<VALUEIN> values = context.getValues();
>>> ArrayList<VALUEIN> myValues = new ArrayList<VALUEIN>();
>>> for (VALUEIN value : values) {
>>>     VALUEIN outValue = null;
>>>     try {
>>>         outValue = (VALUEIN) value.getClass().newInstance();
>>>     } catch (Exception e) {
>>>         throw new RuntimeException(e);
>>>     }
>>>     ReflectionUtils.copy(context.getConfiguration(), value, outValue);
>>>     myValues.add(outValue); // keep the copy
>>> }
>>>
>>> If you have found any other solution please feel free to share.
>>>
>>> Thank You.
>>>
>>> On Thu, Mar 14, 2013 at 1:53 PM, Roth Effy <[email protected]> wrote:
>>>
>>>> In reduce() we have:
>>>>
>>>> key1 values1
>>>> key2 values2
>>>> ...
>>>> keyn valuesn
>>>>
>>>> so, what I want to do is join all values like a SQL:
>>>>
>>>> select * from values1,values2...valuesn;
>>>>
>>>> if memory is not enough to cache values, how to complete the join
>>>> operation?
>>>> my idea is clone the reducecontext, but it maybe not easy.
>>>>
>>>> Any help will be appreciated.
>>>>
>>>> 2013/3/13 Roth Effy <[email protected]>
>>>>
>>>>> I want a n:n join as Cartesian product, but the DataJoinReducerBase looks
>>>>> like only support equal join.
>>>>> I want a non-equal join, but I have no idea now.
>>>>>
>>>>> 2013/3/13 Azuryy Yu <[email protected]>
>>>>>
>>>>>> you want a n:n join or 1:n join?
>>>>>> On Mar 13, 2013 10:51 AM, "Roth Effy" <[email protected]> wrote:
>>>>>>
>>>>>>> I want to join two table data in reducer. So I need to find the start
>>>>>>> of the table.
>>>>>>> someone said the DataJoinReducerBase can help me, isn't it?
>>>>>>>
>>>>>>> 2013/3/13 Azuryy Yu <[email protected]>
>>>>>>>
>>>>>>>> you cannot use RecordReader in Reducer.
>>>>>>>>
>>>>>>>> what's the mean of you want get the record position? I cannot
>>>>>>>> understand, can you give a simple example?
>>>>>>>>
>>>>>>>> On Wed, Mar 13, 2013 at 9:56 AM, Roth Effy <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> sorry, I still can't understand how to use recordreader in the
>>>>>>>>> reduce(), because the input is a RawKeyValueIterator in the class
>>>>>>>>> reducecontext. so, I'm confused.
>>>>>>>>> anyway, thank you.
>>>>>>>>>
>>>>>>>>> 2013/3/12 samir das mohapatra <[email protected]>
>>>>>>>>>
>>>>>>>>>> Through the RecordReader and FileStatus you can get it.
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 12, 2013 at 4:08 PM, Roth Effy <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi, everyone,
>>>>>>>>>>> I want to join the k-v pairs in Reduce(), but how to get the
>>>>>>>>>>> record position?
>>>>>>>>>>> Now, what I thought is to save the context status, but class
>>>>>>>>>>> Context doesn't implement a clone construct method.
>>>>>>>>>>>
>>>>>>>>>>> Any help will be appreciated.
>>>>>>>>>>> Thank you very much.
>>>
>>> --
>>> Thanx and Regards
>>> Vikas Jadhav
>
> --
> Thanx and Regards
> Vikas Jadhav

--
Regards,
Vikas
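
P.S. On the question of whether a non-equi join forces all data onto one reducer: one commonly described alternative is to arrange the reducers as a grid, send each row of table1 to one grid row and each row of table2 to one grid column, and let every cell evaluate theta on the rows it receives. Each (table1, table2) pair then meets in exactly one cell. Below is a minimal sketch of that idea in plain Java with no Hadoop classes, so the "reducers" are only simulated by nested loops. The class name `ThetaJoinSketch`, the `join` and `theta` methods, and the use of `int` rows are all hypothetical choices made for illustration; this is not how Hive or Pig implement it.

```java
import java.util.ArrayList;
import java.util.List;

public class ThetaJoinSketch {

    // The join predicate (theta). "<" matches the SQL example in this thread.
    public static boolean theta(int left, int right) {
        return left < right;
    }

    // Joins table1 and table2 on theta using a gridRows x gridCols grid of
    // simulated reducers and returns the matching (left, right) pairs.
    public static List<int[]> join(int[] table1, int[] table2, int gridRows, int gridCols) {
        // "Map" side: bucket table1 rows by grid row ...
        List<List<Integer>> leftBuckets = new ArrayList<>();
        for (int r = 0; r < gridRows; r++) {
            leftBuckets.add(new ArrayList<>());
        }
        for (int i = 0; i < table1.length; i++) {
            leftBuckets.get(i % gridRows).add(table1[i]);
        }

        // ... and table2 rows by grid column.
        List<List<Integer>> rightBuckets = new ArrayList<>();
        for (int c = 0; c < gridCols; c++) {
            rightBuckets.add(new ArrayList<>());
        }
        for (int j = 0; j < table2.length; j++) {
            rightBuckets.get(j % gridCols).add(table2[j]);
        }

        // "Reduce" side: cell (r, c) sees one left bucket and one right bucket
        // and checks theta on every pair it holds.
        List<int[]> result = new ArrayList<>();
        for (int r = 0; r < gridRows; r++) {
            for (int c = 0; c < gridCols; c++) {
                for (int left : leftBuckets.get(r)) {
                    for (int right : rightBuckets.get(c)) {
                        if (theta(left, right)) {
                            result.add(new int[] {left, right});
                        }
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> rows = join(new int[] {1, 5, 3}, new int[] {2, 4, 6}, 2, 2);
        for (int[] row : rows) {
            System.out.println(row[0] + " < " + row[1]);
        }
    }
}
```

In a real job the cost is replication: each table1 row would have to be emitted once per grid column and each table2 row once per grid row, so the grid shape trades shuffle volume against the memory needed per reducer.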
