Re: Bucket join in Scalding

Saket Kumar Thu, 23 May 2019 15:08:11 -0700

No, it is not a merge of sorted datasets. Hive has a concept where you can 
bucket a table on the join columns, which creates one file per bucket. Then, 
when the join is performed, only matching buckets are joined, and this happens 
on the map side because a single bucket is small enough to load into memory. 
https://learning.oreilly.com/library/view/apache-hive-cookbook/9781782161080/ch07s06.html
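
To make the idea concrete, here is a minimal sketch in plain Scala collections. It is not Scalding or Hive API; the names BucketJoinSketch, bucketOf, bucketize and bucketJoin are made up for illustration, and it assumes both sides are bucketed with the same hash function and the same number of buckets. In a real job each bucket would be a file and each pairing would run in its own mapper.

```scala
object BucketJoinSketch {
  // Assign a key to one of numBuckets buckets; floorMod keeps the index non-negative.
  def bucketOf[K](key: K, numBuckets: Int): Int =
    java.lang.Math.floorMod(key.hashCode, numBuckets)

  // Split a dataset into buckets by key, standing in for the files of a bucketed table.
  def bucketize[K, V](rows: Seq[(K, V)], numBuckets: Int): Map[Int, Seq[(K, V)]] =
    rows.groupBy { case (k, _) => bucketOf(k, numBuckets) }

  // Inner join that only ever pairs bucket i of the left side with bucket i of the right side.
  def bucketJoin[K, A, B](
      left: Seq[(K, A)],
      right: Seq[(K, B)],
      numBuckets: Int
  ): Seq[(K, (A, B))] = {
    val leftBuckets = bucketize(left, numBuckets)
    val rightBuckets = bucketize(right, numBuckets)
    (0 until numBuckets).flatMap { i =>
      // Build an in-memory lookup table for this one right-side bucket only.
      val lookup: Map[K, Seq[B]] =
        rightBuckets
          .getOrElse(i, Seq.empty)
          .groupBy(_._1)
          .map { case (k, kvs) => k -> kvs.map(_._2) }
      // Stream the matching left-side bucket against the lookup table.
      for {
        (k, a) <- leftBuckets.getOrElse(i, Seq.empty)
        b <- lookup.getOrElse(k, Seq.empty)
      } yield (k, (a, b))
    }
  }
}
```

Because equal keys always hash to the same bucket index, no pair of rows from different buckets can match, so the result is the same as a full inner join while only one bucket per side is held in memory at a time.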



On Thursday, May 23, 2019 at 2:58:45 PM UTC-7, Alex Levenson wrote:
>
> I'm not very familiar with that. I did some googling, it looks like that's 
> for merging two already sorted datasets, is that right?
>
> On Thu, May 23, 2019 at 2:34 PM Saket Kumar <[email protected]> wrote:
>
>> There is a feature in Hive to do Sorted Merge Bucket join. How can this 
>> be implemented in Scalding? 
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Scalding Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/scalding-dev/fc7f4c54-651c-4ef8-aae1-5798c206b9fa%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
> -- 
> Alex Levenson
> @THISWILLWORK
>

