Re: Poor HBase map-reduce scan performance
I tweaked Enis's snapshot input format and backported it to 0.94.6, and have snapshot scanning functional on my system. Performance is dramatically better, as expected I suppose. I'm seeing about 3.6x faster performance vs. TableInputFormat. Also, HBase doesn't get bogged down during a scan, since the regionserver is being bypassed. I'm very excited by this. There are some issues with file permissions and library dependencies, but nothing that can't be worked out.

On Jun 5, 2013, at 6:03 PM, lars hofhansl la...@apache.org wrote:

That's exactly the kind of pre-fetching I was investigating a bit ago (made a patch, but ran out of time). This pre-fetching is strictly client only: the client keeps the server busy while it is processing the previous batch, filling up a 2nd buffer. -- Lars

From: Sandy Pratt prat...@adobe.com
To: user@hbase.apache.org
Sent: Wednesday, June 5, 2013 10:58 AM
Subject: Re: Poor HBase map-reduce scan performance

Yong, As a thought experiment, imagine how it impacts the throughput of TCP to keep the window size at 1. That means there's only one packet in flight at a time, and total throughput is a fraction of what it could be. That's effectively what happens with RPC. The server sends a batch, then does nothing while it waits for the client to ask for more. During that time, the pipe between them is empty. Increasing the batch size can help a bit, in essence creating a really huge packet, but the problem remains: there will always be stalls in the pipe. What you want is for the window size to be large enough that the pipe is saturated. A streaming API accomplishes that by stuffing data down the network pipe as quickly as possible. Sandy

On 6/5/13 7:55 AM, yonghu yongyong...@gmail.com wrote:

Can anyone explain why client + rpc + server will decrease the performance of scanning? I mean, the Regionserver and Tasktracker are on the same node when you use MapReduce to scan the HBase table.
So, in my understanding, there should be no RPC cost. Thanks! Yong

On Wed, Jun 5, 2013 at 10:09 AM, Sandy Pratt prat...@adobe.com wrote:

https://issues.apache.org/jira/browse/HBASE-8691

On 6/4/13 6:11 PM, Sandy Pratt prat...@adobe.com wrote:

Haven't had a chance to write a JIRA yet, but I thought I'd pop in here with an update in the meantime. I tried a number of different approaches to eliminate latency and bubbles in the scan pipeline, and eventually arrived at adding a streaming scan API to the region server, along with refactoring the scan interface into an event-driven message receiver interface. In so doing, I was able to take scan speed on my cluster from 59,537 records/sec with the classic scanner to 222,703 records/sec with my new scan API. Needless to say, I'm pleased ;) More details forthcoming when I get a chance. Thanks, Sandy

On 5/23/13 3:47 PM, Ted Yu yuzhih...@gmail.com wrote:

Thanks for the update, Sandy. If you can open a JIRA and attach your producer / consumer scanner there, that would be great.

On Thu, May 23, 2013 at 3:42 PM, Sandy Pratt prat...@adobe.com wrote:

I wrote myself a Scanner wrapper that uses a producer/consumer queue to keep the client fed with a full buffer as much as possible. When scanning my table with scanner caching at 100 records, I see about a 24% uplift in performance (~35k records/sec with the ClientScanner and ~44k records/sec with my P/C scanner). However, when I set scanner caching to 5000, it's more of a wash compared to the standard ClientScanner: ~53k records/sec with the ClientScanner and ~60k records/sec with the P/C scanner. I'm not sure what to make of those results. I think next I'll shut down HBase and read the HFiles directly, to see if there's a drop-off in performance between reading them directly vs. via the RegionServer. I still think that to really solve this there needs to be a sliding window of records in flight between disk and RS, and between RS and client.
I'm thinking there's probably a single batch of records in flight between RS and client at the moment. Sandy

On 5/23/13 8:45 AM, Bryan Keller brya...@gmail.com wrote:

I am considering scanning a snapshot instead of the table. I believe this is what the ExportSnapshot class does. If I could use the scanning code from ExportSnapshot then I would be able to scan the HDFS files directly and bypass the regionservers. This could potentially give me a huge boost in performance for full table scans. However, it doesn't really address the poor scan performance against a table.
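Sandy's producer/consumer scanner wrapper isn't attached to this thread, but the idea it describes can be sketched with only the JDK. The sketch below is an assumption of how such a wrapper could look (class and method names are hypothetical, not HBase API): a background thread keeps fetching the next batch while the caller processes the current one, so the pipe between server and client stays full. In the real HBase case, `fetchBatch` would wrap `ResultScanner.next(caching)`.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Producer/consumer prefetch wrapper: one producer thread fills a bounded
// queue of batches; the consumer drains it. An empty batch marks end of scan.
public class PrefetchScanner<T> implements AutoCloseable {
    private final BlockingQueue<List<T>> queue;
    private final Thread producer;

    public PrefetchScanner(Supplier<List<T>> fetchBatch, int bufferedBatches) {
        this.queue = new ArrayBlockingQueue<>(bufferedBatches);
        this.producer = new Thread(() -> {
            try {
                List<T> batch;
                while (!(batch = fetchBatch.get()).isEmpty()) {
                    queue.put(batch);                 // blocks when buffer is full
                }
                queue.put(Collections.emptyList());   // empty batch = end of scan
            } catch (InterruptedException ignored) {
                // close() was called; stop producing
            }
        });
        producer.setDaemon(true);
        producer.start();
    }

    /** Returns the next batch, or an empty list once the scan is exhausted. */
    public List<T> nextBatch() {
        try {
            List<T> batch = queue.take();
            if (batch.isEmpty()) {
                queue.put(batch); // keep the end marker for any further calls
            }
            return batch;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Collections.emptyList();
        }
    }

    @Override public void close() { producer.interrupt(); }
}
```

With a small buffer (2-3 batches) this matches Lars's description: the server stays busy filling the next buffer while the client processes the previous one.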
HBase Multiget taking more time
Hi All, The HBase multiget call is taking a long time and throwing a timeout exception. I am retrieving only 50 records in one call. The size of each record is 20 KB.

java.net.SocketTimeoutException: 6 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.50.122:48695 remote=ct-0096/192.168.50.177:60020]

hTable = new HTable(conf, tableName);
results = hTable.get(rows);

Cluster detail: 1 master, 1 regionserver, and 8 regions. -- Thanks, Ankit Jain
NullPointerException when opening a region on new table creation
Hi, I created a new table on my cluster today and hit a weird issue which I have not come across before. I wanted to run it by the list and see if anyone has seen this issue before and, if not, whether I should open a JIRA for it. It's still unclear why it would happen. I create the table programmatically using the HBaseAdmin APIs and not through the shell.

hbase: 0.94.4
hadoop: 1.0.4

There are 2 stack traces back to back and I think one might be leading to the other, but I have to dive in deeper to confirm this. Thanks, Viral

===StackTrace===
2013-06-25 09:58:46,041 DEBUG org.apache.hadoop.hbase.util.FSTableDescriptors: Exception during readTableDecriptor. Current table name = test_table_id
org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under hdfs://ec2-54-242-168-35.compute-1.amazonaws.com:8020/hbase/test_table_id
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorModtime(FSTableDescriptors.java:416)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorModtime(FSTableDescriptors.java:408)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:163)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2829)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2802)
    at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
2013-06-25 09:58:46,094 WARN org.apache.hadoop.hbase.util.FSTableDescriptors: The following folder is in HBase's root directory and doesn't contain a table descriptor, do consider deleting it: test_table_id
2013-06-25 09:58:46,094 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0xaa3f4e93eed83504 Attempting to transition node 8eac5d6cf6ce4c61fb47bf357af60213 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-06-25 09:58:46,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0xaa3f4e93eed83504 Successfully transitioned node 8eac5d6cf6ce4c61fb47bf357af60213 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-06-25 09:58:46,151 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME => 'test_table_id,,1372103768783.8eac5d6cf6ce4c61fb47bf357af60213.', STARTKEY => '', ENDKEY => '', ENCODED => 8eac5d6cf6ce4c61fb47bf357af60213,}
2013-06-25 09:58:46,152 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: System coprocessor org.apache.hadoop.hbase.coprocessors.GroupBy was loaded successfully with priority (536870911).
2013-06-25 09:58:46,152 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=test_table_id,,1372103768783.8eac5d6cf6ce4c61fb47bf357af60213., starting to roll back the global memstore size.
java.lang.IllegalStateException: Could not instantiate a region instance.
    at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3776)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3954)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedConstructorAccessor42.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3773)
    ... 7 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:159)
    at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.<init>(RegionCoprocessorHost.java:151)
    at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:455)
    ... 11 more
Re: NullPointerException when opening a region on new table creation
Hi Viral, This exception happens when you try to read/open the table, right? Were there any exceptions when you created it? Your table has not been fully created, so it's just normal for HBase not to open it. The issue is at creation time. Do you still have the logs? Thanks, JM

2013/6/25 Viral Bajaria viral.baja...@gmail.com:
Re: HBase Multiget taking more time
Hi Ankit, Can you please provide more information? HBase version, Hadoop version, logs from the region servers where your cells are, logs from the master, etc.? With only one single region server, all the queries are going to it, so the multiget most probably overwhelmed it. Do you have metrics from it, like GC time, CPU/memory usage, etc.? Thanks, JM

2013/6/25 Ankit Jain ankitjainc...@gmail.com:
Re: HBase Multiget taking more time
Double your timeout from 60K to 120K. While I don't think it's the problem... it's just a good idea. What happens if you drop the 50 down to 25? Do you still fail? If not, go to 35, etc., until you hit a point where you fail. As Jean-Marc said, we need a bit more information, including some hardware info too. Thx -Mike

On Jun 25, 2013, at 4:03 AM, Ankit Jain ankitjainc...@gmail.com wrote:
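Michael's suggestion to shrink the multiget can be mechanized: instead of one `HTable.get(rows)` of 50 rows, split the key list into smaller chunks so each RPC stays well under the timeout. A minimal helper for that split (the class name is hypothetical; in real code each chunk would be passed to `HTable.get(List<Get>)`):

```java
import java.util.ArrayList;
import java.util.List;

// Splits a list of keys into chunks of at most batchSize, so a large
// multiget can be issued as several smaller RPCs.
public class GetBatcher {
    public static <T> List<List<T>> chunk(List<T> keys, int batchSize) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            // subList is a view; copy it if chunks outlive the source list
            chunks.add(keys.subList(i, Math.min(i + batchSize, keys.size())));
        }
        return chunks;
    }
}
```

Bisecting the batch size this way (50, 25, 35, ...) also narrows down the point at which the region server starts timing out.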
Re: HBase Multiget taking more time
Hi Jean-Marc/Michael, Thanks for the reply.

Hardware detail: Processor: 8 core. RAM: 16 GB. We have allotted 4 GB of RAM to HBase, and we are also ingesting data into HBase in parallel at a rate of 50 records (each record is 20 KB) per second. Please find the attached GC log.

Thanks, Ankit

On Tue, Jun 25, 2013 at 5:43 PM, Michael Segel michael_se...@hotmail.com wrote:
Re: HBase Multiget taking more time
Hi Ankit, Attachments don't work well with the mailing list. Can you post them on pastebin? Also, can you please post the server (region and master) logs? Thanks, JM

2013/6/25 Ankit Jain ankitjainc...@gmail.com:
Re: HBase Multiget taking more time
Hi Jean-Marc, Below are the GC logs:

2013-06-25T17:54:03.408+0530: 81287.747: [GC [PSYoungGen: 107898K->11598K(112896K)] 917679K->842276K(1117696K), 0.0055720 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
2013-06-25T17:54:09.152+0530: 81293.491: [GC [PSYoungGen: 94337K->1381K(83008K)] 925015K->848523K(1087808K), 0.0046760 secs] [Times: user=0.03 sys=0.01, real=0.00 secs]
2013-06-25T17:54:11.803+0530: 81296.142: [GC [PSYoungGen: 82929K->16820K(111552K)] 930071K->864126K(1116352K), 0.0046660 secs] [Times: user=0.02 sys=0.01, real=0.00 secs]
2013-06-25T17:54:14.055+0530: 81298.394: [GC [PSYoungGen: 97249K->23006K(112064K)] 944555K->888959K(1116864K), 0.0060900 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
2013-06-25T17:54:17.851+0530: 81302.190: [GC [PSYoungGen: 101799K->4420K(110784K)] 967752K->890974K(1115584K), 0.0061190 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
2013-06-25T17:54:21.291+0530: 81305.630: [GC [PSYoungGen: 82640K->21774K(111296K)] 969194K->908649K(1116096K), 0.0037320 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2013-06-25T17:54:23.920+0530: 81308.259: [GC [PSYoungGen: 98984K->23970K(110848K)] 985859K->923253K(1115648K), 0.0132180 secs] [Times: user=0.07 sys=0.00, real=0.02 secs]
2013-06-25T17:54:25.903+0530: 81310.242: [GC [PSYoungGen: 99601K->17204K(91904K)] 998884K->935208K(1096704K), 0.0082780 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2013-06-25T17:54:28.685+0530: 81313.024: [GC [PSYoungGen: 91832K->14296K(104512K)] 1009836K->963108K(1109312K), 0.0063080 secs] [Times: user=0.05 sys=0.00, real=0.00 secs]
2013-06-25T17:54:32.422+0530: 81316.761: [GC [PSYoungGen: 88012K->17955K(90752K)] 1036825K->981292K(1095552K), 0.0061840 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
2013-06-25T17:54:35.643+0530: 81319.982: [GC [PSYoungGen: 90685K->5938K(103424K)] 1054022K->985778K(1108224K), 0.0047240 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2013-06-25T17:54:35.648+0530: 81319.987: [Full GC [PSYoungGen: 5938K->0K(103424K)] [ParOldGen: 979840K->780760K(1011392K)] 985778K->780760K(1114816K) [PSPermGen: 26558K->26558K(26816K)], 0.1978330 secs] [Times: user=0.90 sys=0.03, real=0.20 secs]
2013-06-25T17:54:38.959+0530: 81323.298: [GC [PSYoungGen: 71862K->3092K(74112K)] 852623K->783853K(1085504K), 0.0020070 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2013-06-25T17:54:42.335+0530: 81326.674: [GC [PSYoungGen: 74060K->2528K(101696K)] 854821K->783289K(1113088K), 0.0022380 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2013-06-25T17:54:45.300+0530: 81329.639: [GC [PSYoungGen: 72589K->6572K(75776K)] 853350K->787333K(1087168K), 0.0035930 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
2013-06-25T17:54:47.950+0530: 81332.289: [GC [PSYoungGen: 75518K->16388K(98432K)] 856279K->797149K(1109824K), 0.0052960 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2013-06-25T17:54:50.034+0530: 81334.374: [GC [PSYoungGen: 84647K->31290K(98752K)] 865408K->822643K(1110144K), 0.0062490 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
2013-06-25T17:54:52.452+0530: 81336.791: [GC [PSYoungGen: 98695K->40435K(96384K)] 890048K->844556K(1107776K), 0.0096790 secs] [Times: user=0.05 sys=0.01, real=0.01 secs]
2013-06-25T17:54:54.979+0530: 81339.318: [GC [PSYoungGen: 96368K->27980K(83392K)] 900489K->840470K(1094784K), 0.0061010 secs] [Times: user=0.05 sys=0.00, real=0.00 secs]
2013-06-25T17:54:57.448+0530: 81341.787: [GC [PSYoungGen: 83285K->27164K(105344K)] 895774K->839653K(1116736K), 0.0055360 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
2013-06-25T17:55:00.187+0530: 81344.526: [GC [PSYoungGen: 81227K->20893K(75136K)] 893716K->833382K(1086528K), 0.0072650 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
2013-06-25T17:55:02.649+0530: 81346.988: [GC [PSYoungGen: 75054K->33701K(100672K)] 887543K->846190K(1112064K), 0.0052480 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
2013-06-25T17:55:04.561+0530: 81348.900: [GC [PSYoungGen: 85784K->47890K(102272K)] 898273K->860380K(1113664K), 0.0074260 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
2013-06-25T17:55:06.052+0530: 81350.391: [GC [PSYoungGen: 99402K->31269K(106432K)] 911892K->862255K(1117824K), 0.0073410 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
2013-06-25T17:55:07.678+0530: 81352.017: [GC [PSYoungGen: 80828K->19721K(70336K)] 911813K->867359K(1081728K), 0.0069240 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
2013-06-25T17:55:09.710+0530: 81354.050: [GC [PSYoungGen: 70202K->19123K(99840K)] 917840K->867139K(232K), 0.0056040 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]
2013-06-25T17:55:11.601+0530: 81355.940: [GC [PSYoungGen: 69114K->33268K(101952K)] 917130K->895918K(1113344K), 0.0112820 secs] [Times: user=0.07 sys=0.00, real=0.01 secs]
2013-06-25T17:55:13.459+0530: 81357.798: [GC [PSYoungGen: 82859K->28418K(98560K)] 945509K->893292K(1109952K), 0.0062720 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
2013-06-25T17:55:15.101+0530: 81359.441: [GC [PSYoungGen: 77529K->18880K(67648K)] 942403K->906685K(1079040K), 0.0063850 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
HBase Merge
Hi, Is there a way we can merge two HBase tables using a self-defined merging function on each row?
Re: HBase Replication is talking to the wrong peer
Did you find what the issue was? From your other thread it looks like you got it working. Thx, J-D

On Mon, Jun 17, 2013 at 11:48 PM, Asaf Mesika asaf.mes...@gmail.com wrote:

Hi, I have a two-cluster setup in a lab; each cluster has 1 master and 3 RS. I'm inserting roughly 15GB into the master cluster, but I see a 5-10 minute delay (ageOfLastShippedOp) between the master and slave clusters. On my Graphite I see that replicateLogEntries_num_ops is increasing on one region server (IP 85) of the slave cluster, out of 3 (IPs 83, 84, 85). I ran a grep on the logs of each region server of the master cluster, and saw "Chosen peer" messages saying the following:

RS ip 74: Chosen peer 83
RS ip 75: Chosen peer 85
RS ip 76: Chosen peer 85

So, first problem: why are only two slave RS (83, 85) receiving replicated log entries instead of 3? Second and biggest problem: I ran netstat -tnp and grepped for 83, 84, 85 on RS ip 74, and saw that it is in fact talking with RS 85! This was correlated with the Graphite graph of replicateLogEntries_num_ops, which showed that only RS 85 was receiving replicated log entries. To me it looks like a bug. Does anyone have any ideas how to solve these two issues?
Re: HBase Merge
Hum, I don't know of any ready-made way to do that. You will most probably have to run an MR job over one table, and query the 2nd table in the map method to produce the merged row as the output... JM

2013/6/25 Bochun Zhang boc...@umich.edu:

Hi, Is there a way we can merge two HBase tables using a self-defined merging function on each row?
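The row-wise merge logic JM describes can be sketched independently of MapReduce. The sketch below is an assumption (class and method names hypothetical) that models each table as a sorted map of row key to value; in the real MR job, the `map()` method would receive rows of table A and issue a `Get` against table B before applying the user-defined merge function:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

// Merges two "tables" row by row: rows present in both are combined with
// a user-supplied merge function; rows present in only one pass through.
public class TableMerger {
    public static <V> Map<String, V> merge(Map<String, V> a, Map<String, V> b,
                                           BinaryOperator<V> mergeFn) {
        Map<String, V> out = new TreeMap<>(a);
        // Map.merge calls mergeFn(valueFromA, valueFromB) on key collisions
        b.forEach((row, v) -> out.merge(row, v, mergeFn));
        return out;
    }
}
```

The same shape works inside a mapper: the emitted Put for each row is `mergeFn(rowFromTableA, getResultFromTableB)`.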
Re: coprocessorExec got stucked with generic type
I think I have run into a similar situation to Pavel's. My method returns Map<Long, ArrayList<Foo>>, where Foo is:

public class Foo implements Writable {
  String something;
  long counter1;
  long counter2;
  ...
};

And I got the following exception when I called my coprocessor method from a client:

java.io.NotSerializableException: Foo
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1164)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:330)
    at java.util.ArrayList.writeObject(ArrayList.java:570)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:945)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1469)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1400)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1158)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:330)
    at java.util.HashMap.writeObject(HashMap.java:1001)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:945)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1469)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1400)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1158)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:330)
    at org.apache.hadoop.hbase.io.HbaseObjectWritable.writeObject(HbaseObjectWritable.java:540)
    at org.apache.hadoop.hbase.client.coprocessor.ExecResult.write(ExecResult.java:76)
    at org.apache.hadoop.hbase.io.HbaseObjectWritable.writeObject(HbaseObjectWritable.java:525)
    at org.apache.hadoop.hbase.io.HbaseObjectWritable.write(HbaseObjectWritable.java:335)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Call.setResponse(HBaseServer.java:365)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1442)

What I do not understand is: how can the predicate at line 526 in HbaseObjectWritable be true if Foo does not implement Serializable?

} else if (Serializable.class.isAssignableFrom(declClass))

BTW, making Foo serializable worked. Regards, Kim

On Tue, Jun 11, 2013 at 9:48 AM, Gary Helmling ghelml...@gmail.com wrote:

Does your NameAndDistance class implement org.apache.hadoop.io.Writable? If so, it _should_ be serialized correctly. There was a past issue handling generic types in coprocessor endpoints, but that was fixed way back (long before 0.94.2). So, as far as I know, this should all be working, assuming that NameAndDistance can be serialized.

On Mon, Jun 10, 2013 at 9:36 AM, Pavel Hančar pavel.han...@gmail.com wrote:

It's org.apache.commons.lang.SerializationUtils. I have it in hbase-0.94.2-cdh4.2.1/lib/commons-lang-2.5.jar. Pavel

2013/6/10 Ted Yu yuzhih...@gmail.com

I searched for the SerializationUtils class in hadoop (both branch-1 and branch-2). I also searched for SerializationUtils in the hbase codebase. I didn't seem to find it. Is it an internal class of your project? Cheers

On Mon, Jun 10, 2013 at 6:11 AM, Pavel Hančar pavel.han...@gmail.com wrote:

I see, it's probably a big nonsense to return an ArrayList (or array) of other classes from a coprocessor, because it's a list of pointers. The solution is to serialize it to byte[] with SerializationUtils.serialize(Serializable obj).

Pavel

2013/6/10 Pavel Hančar pavel.han...@gmail.com

Hello, can I return a generic type from an EndPoint? I try to return ArrayList<NameAndDistance> from an EndPoint method (where NameAndDistance is a simple class with two public variables, name and distance). But when I return a non-empty ArrayList, the coprocessorExec call gets stuck. Thanks, Pavel Hančar
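The fix Gary points at is implementing the Writable contract so HBase never falls back to Java serialization. The sketch below shows what that looks like for a class shaped like Kim's Foo, using only the stdlib DataInput/DataOutput types that `org.apache.hadoop.io.Writable`'s `write()`/`readFields()` methods accept (the real class would additionally declare `implements Writable` and must keep a no-arg constructor so it can be instantiated reflectively; the `roundTrip` helper is added here only for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Writable-style serialization: write() and readFields() must serialize
// and deserialize the fields in exactly the same order.
public class Foo {
    String something = "";
    long counter1;
    long counter2;

    public void write(DataOutput out) throws IOException {
        out.writeUTF(something);
        out.writeLong(counter1);
        out.writeLong(counter2);
    }

    public void readFields(DataInput in) throws IOException {
        something = in.readUTF();
        counter1 = in.readLong();
        counter2 = in.readLong();
    }

    // Round-trip helper for a quick sanity check of the field order.
    public static Foo roundTrip(Foo f) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            f.write(new DataOutputStream(buf));
            Foo copy = new Foo();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen with byte-array streams
        }
    }
}
```

With a proper Writable element type, returning `Map<Long, ArrayList<Foo>>` avoids the `NotSerializableException` path through ObjectOutputStream entirely.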
Question about local reads and HDFS 347
I was looking at HDFS-347, with its nice long story and impressive benchmarks, and it seems it should really help with region server performance. The question I had was whether it would still help if we were already using the short-circuit local reads setting provided by HBase. Are there any other significant improvements there? Thanks, Varun
Re: Question about local reads and HDFS 347
You also need to enable local reads on the HDFS side, even if you have configured it in HBase. --Sent from my Sony mobile.

On Jun 26, 2013 6:11 AM, Varun Sharma va...@pinterest.com wrote:
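Concretely, the HDFS-side switch lives in hdfs-site.xml. The fragment below is a sketch using the Hadoop 2.x / HDFS-347-style property names; the socket path value is a common packaging default and should be adjusted for your install (the older HDFS-2246 mechanism used `dfs.block.local-path-access.user` instead):

```xml
<!-- hdfs-site.xml, on the DataNodes and in the HBase RegionServers'
     client configuration. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/run/hadoop-hdfs/dn._PORT</value>
</property>
```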
Re: NullPointerException when opening a region on new table creation
Hi JM,
Yeah, you are right about when the exception happens. I just went through all the logs of the table creation and don't see an exception, though there was a LONG pause while doing the create table. I cannot find any logs on the HBase side explaining why the long pause happened around that time. The bigger problem now is that the table does not show up in the HBase UI and I can't drop it, while the regionserver logs are flooded with that exception. I think I will have to muck around with ZK and remove traces of that table. I will try to repro this issue, but it seems weird since I am able to create other tables with no issue.
Thanks,
Viral

On Tue, Jun 25, 2013 at 4:22 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:
Hi Viral,
This exception happens when you try to read/open the table, right? Any exception when you created it? Your table has not been fully created, so it's just normal for HBase to not open it. The issue is at creation time. Do you still have the logs?
Thanks,
JM

2013/6/25 Viral Bajaria viral.baja...@gmail.com:
Hi,
I created a new table on my cluster today and hit a weird issue which I have not come across before. I wanted to run it by the list and see if anyone has seen this issue before; if not, should I open a JIRA for it? It's still unclear why it would happen. I create the table programmatically using the HBaseAdmin APIs and not through the shell.
hbase: 0.94.4
hadoop: 1.0.4
There are 2 stack traces back to back and I think one might be leading to the other, but I have to dive in deeper to confirm this.
Thanks,
Viral
=== StackTrace ===
2013-06-25 09:58:46,041 DEBUG org.apache.hadoop.hbase.util.FSTableDescriptors: Exception during readTableDecriptor.
Current table name = test_table_id
org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under hdfs://ec2-54-242-168-35.compute-1.amazonaws.com:8020/hbase/test_table_id
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorModtime(FSTableDescriptors.java:416)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorModtime(FSTableDescriptors.java:408)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:163)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2829)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2802)
    at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
2013-06-25 09:58:46,094 WARN org.apache.hadoop.hbase.util.FSTableDescriptors: The following folder is in HBase's root directory and doesn't contain a table descriptor, do consider deleting it: test_table_id
2013-06-25 09:58:46,094 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0xaa3f4e93eed83504 Attempting to transition node 8eac5d6cf6ce4c61fb47bf357af60213 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-06-25 09:58:46,151 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0xaa3f4e93eed83504 Successfully transitioned node 8eac5d6cf6ce4c61fb47bf357af60213 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-06-25 09:58:46,151 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: {NAME = 'test_table_id,,1372103768783.8eac5d6cf6ce4c61fb47bf357af60213.', STARTKEY = '', ENDKEY = '', ENCODED = 8eac5d6cf6ce4c61fb47bf357af60213,}
2013-06-25 09:58:46,152 INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: System coprocessor org.apache.hadoop.hbase.coprocessors.GroupBy was loaded successfully with priority (536870911).
2013-06-25 09:58:46,152 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=test_table_id,,1372103768783.8eac5d6cf6ce4c61fb47bf357af60213., starting to roll back the global memstore size.
java.lang.IllegalStateException: Could not instantiate a region instance.
    at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3776)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3954)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
    at
Re: Question about local reads and HDFS 347
On Tue, Jun 25, 2013 at 3:10 PM, Varun Sharma va...@pinterest.com wrote:
I was looking at HDFS-347 and the nice long story with impressive benchmarks suggesting it should really help region server performance. The question I had was whether it would still help if we were already using the short-circuit local reads setting already provided by HBase. Are there any other significant improvements there?

High-level, short-circuit reads no longer require all-access by a special hbase user [1]. HDFS-347, in the issue, also reports some improvement over the current SSR.
St.Ack
1. http://hadoop-common.472056.n3.nabble.com/HDFS-347-and-HDFS-2246-issues-different-td3998413.html
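[Editor's note] For reference, the configuration difference discussed above looks roughly like the hdfs-site.xml sketch below. The socket path is only an example value; exact keys and defaults vary by Hadoop version, so check your distribution's documentation:

```
<!-- Older short-circuit local reads (HDFS-2246 style): the client flag plus
     a special user granted direct block-path access. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase</value> <!-- the "special hbase user" St.Ack mentions -->
</property>

<!-- HDFS-347 style: same client flag, but the DataNode passes open file
     descriptors over a Unix domain socket, so no special user is needed. -->
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value> <!-- example path -->
</property>
```

Note this must be enabled on the HDFS side (DataNode and client config), not only in hbase-site.xml, which is the point of the reply above.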