On Mon, Jul 6, 2009 at 11:06 AM, Irfan Mohammed <irfan...@gmail.com> wrote:
> I am working on writing to HDFS files. Will update you by end of day today.
>
> There are always 10 concurrent mappers running. I keep setting the
> setNumMaps(5) and also the following properties in mapred-site.xml to 3 but
> still end up running 10 concurrent maps.
>

Is your input ten files?

> There are 5 regionservers and the online regions are as follows :
>
> m1 : -ROOT-,,0
> m2 : txn_m1,,1245462904101
> m3 : txn_m4,,1245462942282
> m4 : txn_m2,,1245462890248
> m5 : .META.,,1
>      txn_m3,,1245460727203
>

So, that looks like 4 regions from table txn? So that's about 1 region per regionserver?

> I have setAutoFlush(false) and also writeToWal(false) with the same
> behaviour.
>

If you did the above and it still takes 10 minutes, then that would seem to rule out HBase (batching should have a big impact on uploads, and setting writeToWAL to false should double throughput over whatever you were seeing previously).

St.Ack
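
For context on the ten-files question: in Hadoop the requested map count (JobConf.setNumMapTasks) is only a hint, and the real number of map tasks is the number of input splits. The sketch below is a simplified model of how the old-API FileInputFormat sizes splits, assuming file-based input (the thread does not say which input format the job uses). The class name, method name, and the file and block sizes are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;

/** Simplified, illustrative model of old-API FileInputFormat split sizing. */
class SplitCountSketch {

    // Hadoop lets the final split of a file run up to 10% over the target size.
    private static final double SPLIT_SLOP = 1.1;

    /** Returns how many splits (and so map tasks) the given files would produce. */
    static int countSplits(long[] fileLengths, long blockSize, long minSize, int requestedMaps) {
        long totalSize = 0;
        for (long len : fileLengths) {
            totalSize += len;
        }

        // The requested map count only influences the target split size.
        long goalSize = totalSize / Math.max(requestedMaps, 1);
        long splitSize = Math.max(minSize, Math.min(goalSize, blockSize));

        int splits = 0;
        for (long len : fileLengths) {
            if (len == 0) {
                splits++; // an empty file still yields one (empty) split
                continue;
            }
            // Files are split one at a time; a split never spans two files.
            long remaining = len;
            while ((double) remaining / splitSize > SPLIT_SLOP) {
                splits++;
                remaining -= splitSize;
            }
            if (remaining != 0) {
                splits++;
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        // Hypothetical input: ten 20 MB files, 64 MB blocks, 5 maps requested.
        long[] tenFiles = new long[10];
        Arrays.fill(tenFiles, 20 * mb);
        System.out.println("ten files: " + countSplits(tenFiles, 64 * mb, 1, 5));

        // Same total size in a single file, same request.
        System.out.println("one file:  " + countSplits(new long[] {200 * mb}, 64 * mb, 1, 5));
    }
}
```

Under these assumptions ten small files give ten map tasks even with five requested, while one file of the same total size gives five. This counts total map tasks only; how many of them run at the same time is a separate question.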