No, it is not a merge of sorted datasets. Hive lets you bucket a table on its join columns, which writes the table out as that many files, one per bucket. When the join is performed, only the matching buckets from each table (the ones with the same bucket number) are joined against each other. This happens on the map side, because a single bucket is small enough to be loaded into memory. https://learning.oreilly.com/library/view/apache-hive-cookbook/9781782161080/ch07s06.html
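To make the idea concrete, here is a rough sketch in plain Python of a bucketed join as described above. This is not Hive's or Scalding's code. The names `bucket_of`, `bucket_table`, `join_bucket`, `bucketed_join` and the constant `NUM_BUCKETS` are all hypothetical, and the CRC32 hash is only a stand-in for whatever hash function the real system uses.

```python
from collections import defaultdict
from zlib import crc32

NUM_BUCKETS = 4  # hypothetical; in Hive the bucket count is fixed when the table is created


def bucket_of(key, num_buckets=NUM_BUCKETS):
    """Map a join key to a bucket number (CRC32 is a stand-in hash, not Hive's)."""
    return crc32(str(key).encode("utf-8")) % num_buckets


def bucket_table(rows, num_buckets=NUM_BUCKETS):
    """Split (key, value) rows into num_buckets lists, one list per 'file'."""
    buckets = [[] for _ in range(num_buckets)]
    for key, value in rows:
        buckets[bucket_of(key, num_buckets)].append((key, value))
    return buckets


def join_bucket(left_bucket, right_bucket):
    """Join one pair of buckets by holding the right-hand bucket in memory."""
    lookup = defaultdict(list)
    for key, value in right_bucket:
        lookup[key].append(value)
    return [(key, lv, rv) for key, lv in left_bucket for rv in lookup.get(key, [])]


def bucketed_join(left_rows, right_rows, num_buckets=NUM_BUCKETS):
    """Inner join that only ever compares bucket i of one table with bucket i of the other."""
    left = bucket_table(left_rows, num_buckets)
    right = bucket_table(right_rows, num_buckets)
    joined = []
    for i in range(num_buckets):
        joined.extend(join_bucket(left[i], right[i]))
    return joined


users = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, "book"), (3, "pen"), (3, "ink"), (4, "cup")]
joined = sorted(bucketed_join(users, orders))
```

Because both tables are bucketed with the same function and the same bucket count, rows with equal keys always land in the same bucket number, so no bucket ever needs to see data from a different bucket. In a real job each `join_bucket` call would correspond to one map task reading one pair of files.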
On Thursday, May 23, 2019 at 2:58:45 PM UTC-7, Alex Levenson wrote:
>
> I'm not very familiar with that. I did some googling, it looks like that's
> for merging two already sorted datasets, is that right?
>
> On Thu, May 23, 2019 at 2:34 PM Saket Kumar <[email protected]> wrote:
>
>> There is a feature in Hive to do Sorted Merge Bucket join. How can this
>> be implemented in Scalding?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Scalding Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/scalding-dev/fc7f4c54-651c-4ef8-aae1-5798c206b9fa%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> Alex Levenson
> @THISWILLWORK
