Hi,

> its submitting the whole table to the job. if I use a view with the filter
> baked in, will that help? I don't want to have to jack up the JVM for the
> client/HiveServer2 to accommodate the full table.

Which Hive version are you using?

If you're on a recent version like hive-1.0, this should be a map-reduce-only problem. The LocalTask in map-reduce downloads the entire table to the HiveServer2 gateway node and processes it there in a local task.

Tez has a broadcast edge designed to fix exactly this sort of scalability problem (i.e. HiveServer2 machines dying of CPU).

Cheers,
Gopal
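
For reference, a minimal sketch of switching a session from map-reduce to Tez so the small table is shipped over a broadcast edge rather than built in a LocalTask. This assumes Tez is already installed and configured on the cluster; the settings are standard Hive properties, but whether they are sufficient for your query depends on your setup.

```sql
-- Assumption: Tez is installed and usable by this Hive deployment.
-- Run subsequent queries in this session on Tez instead of map-reduce.
set hive.execution.engine=tez;

-- Let Hive convert joins with a small side into map joins
-- (on Tez these are executed as broadcast joins).
set hive.auto.convert.join=true;
```

Running EXPLAIN on the query afterwards should show whether the plan uses a broadcast edge in place of a local task.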