Sorry, I meant `leftJoin`, not `joinLeft`.

On Thu, May 23, 2019 at 3:43 PM Alex Levenson <[email protected]> wrote:
> Scalding's `join` methods (join, joinLeft, etc) are the way to join 2
> large tables. It's implemented as a standard map/reduce shuffle join, and
> scales horizontally, though it does require sending the full dataset across
> the network from the mappers to the reducers.
>
> If you have skew in your keyspace (some keys appear far more often than
> others) you can use a skew join, which has special handling for frequently
> appearing keys. You can tell if you have skew in your keyspace from your
> hadoop counters and from the symptom of a small number of your (many)
> reducers taking much much longer than the others.
>
> On Thu, May 23, 2019 at 3:40 PM Saket Kumar <[email protected]>
> wrote:
>
>> Thanks for replying to this. Is there any other technique in scalding to
>> join two large tables?
>>
>> On Thursday, May 23, 2019 at 3:15:12 PM UTC-7, Alex Levenson wrote:
>>>
>>> Yes, we don't have that feature in scalding unfortunately.
>>>
>>> On Thu, May 23, 2019 at 3:11 PM Rajat Ahuja <[email protected]> wrote:
>>>
>>>> @Alex It is efficient if data sets are already partitioned so that we
>>>> do not pass it through reducers to partition it.
>>>> @Saket Scalding Library does not support sorted bucketed join as of
>>>> now.
>>>>
>>>> Thanks
>>>> Rajat Ahuja
>>>>
>>>> On Fri, May 24, 2019 at 3:28 AM 'Alex Levenson' via Scalding
>>>> Development <[email protected]> wrote:
>>>>
>>>>> I'm not very familiar with that. I did some googling, it looks like
>>>>> that's for merging two already sorted datasets, is that right?
>>>>>
>>>>> On Thu, May 23, 2019 at 2:34 PM Saket Kumar <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> There is a feature in Hive to do Sorted Merge Bucket join. How can
>>>>>> this be implemented in Scalding?
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Scalding Development" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/scalding-dev/fc7f4c54-651c-4ef8-aae1-5798c206b9fa%40googlegroups.com
>>>>>> <https://groups.google.com/d/msgid/scalding-dev/fc7f4c54-651c-4ef8-aae1-5798c206b9fa%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>> For more options, visit https://groups.google.com/d/optout.

--
Alex Levenson
@THISWILLWORK
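
For readers following the thread, here is a rough sketch of what a shuffle join does and why key skew shows up as a few slow reducers. It is a toy in-memory model in plain Scala, not Scalding's implementation and not its API: the object `ShuffleJoinSketch` and the helpers `partition`, `shuffle`, and `reducerLoad` are hypothetical names made up for illustration, and `join` / `leftJoin` only borrow the names used in the thread. It assumes keys are routed to reducers by hash code.

```scala
// Toy model of a reduce-side (shuffle) join. Illustrative only.
object ShuffleJoinSketch {
  // Assumption: a key is assigned to a reducer by its hash code.
  def partition[K](key: K, reducers: Int): Int =
    Math.floorMod(key.hashCode, reducers)

  // "Send" every record to the reducer that owns its key.
  def shuffle[K, V](input: Seq[(K, V)], reducers: Int): Map[Int, Seq[(K, V)]] =
    input.groupBy { case (k, _) => partition(k, reducers) }

  // Inner join: both inputs are shuffled in full, then matched per reducer.
  def join[K, A, B](left: Seq[(K, A)], right: Seq[(K, B)],
                    reducers: Int): Seq[(K, (A, B))] = {
    val l = shuffle(left, reducers)
    val r = shuffle(right, reducers)
    (0 until reducers).flatMap { p =>
      val rightByKey = r.getOrElse(p, Seq.empty).groupBy(_._1)
      for {
        (k, a) <- l.getOrElse(p, Seq.empty)
        (_, b) <- rightByKey.getOrElse(k, Seq.empty)
      } yield (k, (a, b))
    }
  }

  // Left join: left records without a match are kept and paired with None.
  def leftJoin[K, A, B](left: Seq[(K, A)], right: Seq[(K, B)],
                        reducers: Int): Seq[(K, (A, Option[B]))] = {
    val l = shuffle(left, reducers)
    val r = shuffle(right, reducers)
    (0 until reducers).flatMap { p =>
      val rightByKey = r.getOrElse(p, Seq.empty).groupBy(_._1)
      l.getOrElse(p, Seq.empty).flatMap { case (k, a) =>
        rightByKey.get(k) match {
          case Some(matches) =>
            matches.map { case (_, b) => (k, (a, Some(b): Option[B])) }
          case None =>
            Seq((k, (a, Option.empty[B])))
        }
      }
    }
  }

  // Records per reducer. One reducer holding most of the records is the
  // symptom of key skew described in the thread.
  def reducerLoad[K, V](input: Seq[(K, V)], reducers: Int): Map[Int, Int] =
    shuffle(input, reducers).map { case (p, rs) => p -> rs.size }
}
```

Every record with the same key lands on the same reducer, so a single very frequent key puts all of its records on one reducer no matter how many reducers there are. That is the case a skew join is meant to handle.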
