RE: One weird problem of my MR job upon hbase table.

2013-01-14 Thread Liu, Raymond
Hi, For feedback: after a lot of profiling work, I think I have found the most promising cause of my problem. It's not because one disk is slow or something (though I do have slow disks on different region servers, the lagging-behind pattern does not seem to be related to the disk slowness pattern). It

Re: One weird problem of my MR job upon hbase table.

2013-01-07 Thread Doug Meil
Hi there, The HBase RefGuide has a comprehensive case study on such a case. This might not be the exact problem, but the diagnostic approach should help. http://hbase.apache.org/book.html#casestudies.slownode On 1/4/13 10:37 PM, Liu, Raymond raymond@intel.com wrote: Hi I encounter

Re: One weird problem of my MR job upon hbase table.

2013-01-07 Thread Michael Segel
Where did he mention he was attempting to bond the ports? Sorry if I missed it? On Jan 7, 2013, at 7:37 AM, Doug Meil doug.m...@explorysmedical.com wrote: Hi there, The HBase RefGuide has a comprehensive case study on such a case. This might not be the exact problem, but the diagnostic

One weird problem of my MR job upon hbase table.

2013-01-04 Thread Liu, Raymond
Hi, I have encountered a weird issue with lagging map tasks here: I have a small hadoop/hbase cluster with 1 master node and 4 regionserver nodes, all with 16 CPUs, with map and reduce slots set to 24. A few tables are created with regions distributed evenly across the region nodes (say 16 regions for each region

Re: One weird problem of my MR job upon hbase table.

2013-01-04 Thread Ted Yu
Did you use TableInputFormat in your MR job? Did you use the one from mapred or mapreduce? What version of HBase are you using? Did you take a look at Ganglia to see if there is any bottleneck in your cluster? You mentioned a few changes to the config file shortly before this problem

RE: One weird problem of my MR job upon hbase table.

2013-01-04 Thread Liu, Raymond
Hi Ted, Thanks for your reply. Did you use TableInputFormat in your MR job? No, a custom one which does the same split work, but the input for each map task is the split, and the map task opens the HTable and reads the specific region by itself. Did you use the one from mapred or mapreduce? All

Re: One weird problem of my MR job upon hbase table.

2013-01-04 Thread Ted Yu
Since a custom InputFormat was used, I assume you have verified that the map tasks ran on the region server which hosts the regions being scanned. If you were doing aggregation through this MR job, you can consider using AggregateProtocol. Cheers On Fri, Jan 4, 2013 at 8:08 PM, Liu, Raymond
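
A minimal sketch of the locality check suggested above, assuming you can export two mappings by hand: region to hosting region server (for example from the HBase master UI) and region to the host where its map task ran (for example from the job's task logs). The function names, the input shape, and the host names are all hypothetical; this is plain Python, not an HBase or Hadoop API.

```python
from collections import Counter


def short_host(name):
    """Normalize a host name: lowercase, drop a trailing dot and the domain part.

    Assumes real host names, not IP addresses.
    """
    return name.strip().lower().rstrip(".").split(".")[0]


def locality_report(region_hosts, task_hosts):
    """Compare where each map task ran with the server hosting its region.

    region_hosts: {region_name: region_server_host}
    task_hosts:   {region_name: host_the_map_task_ran_on}
    Returns (counts, remote) where counts tallies local/remote/unknown tasks
    and remote lists (region, region_host, task_host) for non-local tasks.
    """
    counts = Counter()
    remote = []
    for region, task_host in sorted(task_hosts.items()):
        region_host = region_hosts.get(region)
        if region_host is None:
            # No hosting information for this region, so it cannot be classified.
            counts["unknown"] += 1
        elif short_host(region_host) == short_host(task_host):
            counts["local"] += 1
        else:
            counts["remote"] += 1
            remote.append((region, region_host, task_host))
    return counts, remote


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    region_hosts = {"r1": "rs1.example.com", "r2": "rs2.example.com", "r3": "rs3.example.com"}
    task_hosts = {"r1": "RS1", "r2": "rs4.example.com", "r3": "rs3"}
    counts, remote = locality_report(region_hosts, task_hosts)
    print(dict(counts))
    for region, region_host, task_host in remote:
        print(f"{region}: region on {region_host}, task ran on {task_host}")
```

If most tasks come back as remote, the custom InputFormat's split locations are worth a second look; if nearly all are local, the lag probably has another cause.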

RE: One weird problem of my MR job upon hbase table.

2013-01-04 Thread Liu, Raymond
Since a custom InputFormat was used, I assume you have verified that the map tasks ran on the region server which hosts the regions being scanned. Yes, this InputFormat's behavior was verified with the same data before. And, btw, I tried replacing it with the original TableInputFormat. Similar