Issues with joining across large tables

Ryan LeCompte Sun, 25 Oct 2009 18:40:13 -0700

Hello all,

Should I expect to be able to do a Hive JOIN between two tables that have
about 10 or 15 GB of data each? What I'm noticing (for a simple JOIN) is that
all the map tasks complete, but the reducers hang at around 87% (for the
first set of 4 reducers) and are eventually killed by the cluster for
failing to respond. A JOIN between a large table and a very small table of
10 or so records works just fine.


Any thoughts?

Thanks,
Ryan
