In our Hive instance, we have one large fact table that joins to several dimension tables on integer keys. I know from reading the Language Manual that when ordering joins it is best to put the largest table last in the sequence in order to minimize memory usage. That doesn't work when the large fact table has to join to more than one dimension, because the dimension tables relate only to the fact table and not to each other. Something like:
select ... from small_table1 join big_table on ... join small_table2 on ...
I have to imagine this is a pretty common pattern. Is there any guidance for doing this sort of star schema join?
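
To make the pattern concrete, here is a rough sketch of the kind of query I mean. The table and column names (sales_fact, dim_store, dim_product, and their columns) are made up for illustration and are not our real schema:

```sql
-- Hypothetical star schema: one large fact table, two small dimensions.
-- The fact table sits in the middle of the join sequence because each
-- dimension joins to it on a different integer key.
SELECT s.region,
       p.category,
       SUM(f.amount) AS total_amount
FROM dim_store s                                      -- small_table1
JOIN sales_fact f ON (f.store_id = s.store_id)        -- big_table
JOIN dim_product p ON (f.product_id = p.product_id)   -- small_table2
GROUP BY s.region, p.category;
```

As far as I can tell, the only way to move sales_fact to the end of the sequence would be to join the two dimensions to each other first, and they have no key in common.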
