Hi,

We are making Hive read a few files in HDFS from the reducer(s) of a map-reduce job. This works well when launching a few reducers (say 5). But when we launch more than that, the initial connection to HiveServer2 takes much longer (around 10 minutes). We have configured hive-site.xml to allow parallel execution.

1. Is it advisable to read HDFS data via Hive from reducers? Or what are the best practices for this scenario?
2. Is there a way to increase Hive's concurrent access performance?

Regards,
Malli
