Re: how to pre split a table whose row key is MD5(url)?
On Tue, May 13, 2014 at 9:58 AM, Liam Slusser lslus...@gmail.com wrote: You can also create a table via the hbase shell with pre-split tables like this... Here is a 32-byte split into 16 different regions, using base16 (i.e. an MD5 hash) for the key-type. create 't1', {NAME => 'f1'}, {SPLITS => ['1000', '2000', '3000', '4000', '5000', '6000', '7000', '8000', '9000', 'a000', 'b000', 'c000', 'd000', 'e000', 'f000']} To make this easier to type, you don't even need the 0 padding. Just '1', '2', '3', ... 'f' is enough :) thanks, liam On Tue, May 13, 2014 at 6:49 AM, sudhakara st sudhakara...@gmail.com wrote: you can pre-split a table using hex character strings for the start key and end key, and the number of regions to split into: HTableDescriptor tableDes = new HTableDescriptor(tableName); tableDes.setValue(HTableDescriptor.SPLIT_POLICY, KeyPrefixRegionSplitPolicy.class.getName()); byte[][] splits = getHexSplits(SPLIT_START_KEY, SPLIT_END_KEY, NUM_OF_REGION_SPLIT); admin.createTable(tableDes, splits); private byte[][] getHexSplits(String startKey, String endKey, int numRegions) { byte[][] splits = new byte[numRegions - 1][]; BigInteger lowestKey = new BigInteger(startKey, 16); // radix 16: the keys are hex strings (first 8 bytes used for splitting) BigInteger highestKey = new BigInteger(endKey, 16); BigInteger range = highestKey.subtract(lowestKey); BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions)); lowestKey = lowestKey.add(regionIncrement); for (int i = 0; i < numRegions - 1; i++) { BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i))); byte[] b = String.format("%016x", key).getBytes(); splits[i] = b; } return splits; } On Mon, May 12, 2014 at 7:07 AM, Li Li fancye...@gmail.com wrote: thanks. I will try this. by the way, byte range is -128 to 127 On Mon, May 12, 2014 at 6:13 AM, Michael Segel michael_se...@hotmail.com wrote: Simple answer… you really can't.
The best thing you can do is to pre-split the table into 4 regions based on splitting the first byte into 4 equal ranges (0-63, 64-127, 128-191, 192-255). And hope that you'll have an even split. In theory, over time you will. On May 8, 2014, at 1:58 PM, Li Li fancye...@gmail.com wrote: say I have 4 region servers. How to pre-split a table using MD5 as row key? -- Regards, ...sudhakara
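The first-byte split that Michael describes can be sketched in plain Java. This is an illustration only (class and method names are mine, not an HBase API); with the HBase client the resulting byte[][] would be handed to HBaseAdmin.createTable(desc, splits). Note that Java bytes are signed (-128 to 127, as Li Li points out) while HBase compares keys as unsigned bytes, hence the & 0xff when inspecting values.

```java
// Hypothetical helper: compute evenly spaced single-byte split points for a
// table whose row key is a raw (binary) MD5 hash, as suggested in the thread.
public class FirstByteSplits {
    // Returns numRegions - 1 split keys dividing the unsigned byte range
    // 0..255 into numRegions equal slices, e.g. 64, 128, 192 for 4 regions.
    public static byte[][] firstByteSplits(int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // The cast truncates to a signed byte; HBase still orders it as unsigned.
            splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
        }
        return splits;
    }
}
```

With 4 region servers this yields the ranges (0-63, 64-127, 128-191, 192-255) from the reply above; since MD5 output is uniform, each region should receive roughly a quarter of the writes.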
Re: Error loading SHA-1 keys with load bulk
Are you using HFileOutputFormat.configureIncrementalLoad() to set up the partitioner and the reducers? That will take care of ordering your keys. J-D On Thu, May 1, 2014 at 5:38 AM, Guillermo Ortiz konstt2...@gmail.com wrote: I have been looking at the code in HBase, but I don't really understand why this error happens. Why can't I put those keys in HBase? 2014-04-30 17:57 GMT+02:00 Guillermo Ortiz konstt2...@gmail.com : I'm using HBase with MapReduce to load a lot of data, so I have decided to do it with bulk load. I hash my keys with SHA-1, but when I try to load them, I get this exception. java.io.IOException: Added a key not lexically larger than previous key=\x00(6e9e59f36a7ec2ac54635b2d353e53e677839046\x01l\x00\x00\x01E\xB3\xC9\xC7\x0E, lastkey=\x00(b313a9f1f57c8a07c81dc3221c6151cf3637506a\x01l\x00\x00\x01E\xAE\x18k\x87\x0E at org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.checkKey(AbstractHFileWriter.java:207) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:324) at org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:289) at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:1206) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat$1.write(HFileOutputFormat.java:168) at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat$1.write(HFileOutputFormat.java:124) at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:551) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) I work with HBase 0.94.6. I have been looking into whether I need to define a reducer, since I have defined none. I have read something about KeyValueSortReducer, but I don't know if there's something that extends TableReducer or if I'm looking at this the wrong way.
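The exception above comes from the HFile writer's invariant: HFiles are append-only and every key must compare strictly greater than the previous one under unsigned byte-wise comparison, which is why configureIncrementalLoad() installs a total-order partitioner and sort. A self-contained sketch of that comparison (illustrative names; HBase's own version is Bytes.compareTo):

```java
// Demonstrates the ordering rule an HFile writer enforces: keys must be
// appended in strictly increasing unsigned lexicographic order.
public class KeyOrderCheck {
    // Unsigned lexicographic comparison of two byte arrays.
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // True if the keys arrive in the order an HFile writer would accept.
    public static boolean isSorted(byte[][] keys) {
        for (int i = 1; i < keys.length; i++) {
            if (compareUnsigned(keys[i - 1], keys[i]) >= 0) return false;
        }
        return true;
    }
}
```

In the error above, a key starting with 6e9e... arrived after lastkey b313..., which is out of order ('6' < 'b'), so the writer refused it; a sorting reducer fixes exactly this.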
Re: Re: replication verifyrep
On Tue, Apr 15, 2014 at 12:17 AM, Hansi Klose hansi.kl...@web.de wrote: Hi Jean-Daniel, thank you for your answer and for bringing some light into the darkness. You're welcome! You can see the bad rows listed in the user logs for your MR job. What log do you mean? The output from the command line? I only see the count of GOOD or BAD rows. Are the bad rows which are not replicated listed in that log? You started VerifyReplication via hadoop jar, so it's a MapReduce job. Go to your JobTracker's web UI, you should see your jobs there, then check out one of them, click on one of the completed maps and look for the log. The bad rows are listed in that output. J-D
Re: replication verifyrep
Yeah you should use endtime, it was fixed as part of https://issues.apache.org/jira/browse/HBASE-10395. You can see the bad rows listed in the user logs for your MR job. J-D On Mon, Apr 14, 2014 at 3:06 AM, Hansi Klose hansi.kl...@web.de wrote: Hi, I wrote a little script which monitors the running replication. The script is triggered by cron and executes the following command with the current time stamp as endtime and a time stamp = endtime - 10800000 milliseconds, so the time frame is 3 hours. hadoop jar /usr/lib/hbase/hbase.jar verifyrep --starttime=1397217601927 --endtime=1397228401927 --families=t 1 tablename 2>&1 After some runs the script found some BADROWS. 14/04/11 17:04:05 INFO mapred.JobClient: BADROWS=176 14/04/11 17:04:05 INFO mapred.JobClient: GOODROWS=2 I executed the same command 20 minutes later in the shell and got: hadoop jar /usr/lib/hbase/hbase.jar verifyrep --starttime=1397217601927 --endtime=1397228401927 --families=t 1 tablename 2>&1 14/04/11 17:21:03 INFO mapred.JobClient: BADROWS=178 After that I ran the command with the same start time and the current timestamp as end time, so the time frame is larger but with the same start time. And now I got: hadoop jar /usr/lib/hbase/hbase.jar verifyrep --starttime=1397217601927 --endtime=1397230074876 --families=t 1 tablename 2>&1 14/04/11 17:28:28 INFO mapred.JobClient: GOODROWS=184 Is there something wrong with the command? In our metrics I could not see that there is an issue at that time. We are a little bit confused about the endtime. In all documents they talk about stoptime. But we found that in the job configuration there is no parameter called stoptime. We found verifyrep.startTime, which holds the value of the starttime in our command, and verifyrep.endTime, which is always 0 when we use stoptime in the command.
So we decided to use endtime. Even in the code http://hbase.apache.org/xref/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.html they use: static long endTime = Long.MAX_VALUE; Which name is the right one? endtime or stoptime? We use cdh 4.2.0. Regards Hansi
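The window arithmetic the cron script intends can be sketched as below (illustrative class and method names; the real tool only consumes the two millisecond values on its command line). The timestamps in the thread differ by exactly 10,800,000 ms, i.e. 3 hours:

```java
// Hypothetical helper: compute the --starttime/--endtime pair for a sliding
// 3 hour verifyrep window, as the cron script in this thread intends.
public class VerifyRepWindow {
    static final long WINDOW_MS = 3L * 60 * 60 * 1000; // 3 hours = 10,800,000 ms

    // Returns {starttime, endtime} in epoch milliseconds.
    public static long[] window(long nowMillis) {
        return new long[] { nowMillis - WINDOW_MS, nowMillis };
    }

    public static void main(String[] args) {
        long[] w = window(System.currentTimeMillis());
        System.out.println("--starttime=" + w[0] + " --endtime=" + w[1]);
    }
}
```

Feeding the thread's own endtime of 1397228401927 back through this yields the starttime 1397217601927 used in the commands above.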
Re: How to decide the next HMaster?
It's a simple leader election via ZooKeeper. J-D On Tue, Apr 8, 2014 at 7:18 AM, gortiz gor...@pragsis.com wrote: Could someone explain me which it's the process to select the next HMaster when the current one is gone down?? I've been looking for information about it in the documentation, but, I haven't found anything.
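The election can be illustrated with a toy model of the standard ZooKeeper recipe: each candidate registers under a monotonically increasing sequence number (standing in for an ephemeral sequential znode), the lowest live sequence is the active master, and when it disappears the next lowest takes over. This is a sketch of the recipe only; HBase's actual implementation races to create a single ephemeral /hbase/master znode and the losers watch it.

```java
import java.util.TreeMap;

// Toy in-memory model of ZooKeeper-style leader election for HMaster failover.
public class MasterElection {
    private final TreeMap<Long, String> registered = new TreeMap<>();
    private long nextSeq = 0;

    // "Create an ephemeral sequential znode" for this master; returns its sequence.
    public long register(String master) {
        long seq = nextSeq++;
        registered.put(seq, master);
        return seq;
    }

    // "Session expired": the master's znode vanishes.
    public void fail(long seq) {
        registered.remove(seq);
    }

    // The lowest live sequence number is the active master.
    public String activeMaster() {
        return registered.isEmpty() ? null : registered.firstEntry().getValue();
    }
}
```

Failover falls out of the data structure: when the active master's entry is removed, firstEntry() simply points at the next candidate, with no coordination beyond ZooKeeper's ordering guarantee.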
Re: block cache size questions
On Mon, Mar 17, 2014 at 6:01 AM, Linlin Du linlindu2...@hotmail.com wrote: Hi all, First question: According to documentation, hfile.block.cache.size is by default 40 percentage of maximum heap (-Xmx setting). If -Xmx is not used and only -Xms is used, what will it be in this case? Second question: This 40% heap space is shared by all entities (stores). How much block cache is used by each store (entity)? Is it allocated on demand? If a region has never been read after the region server is up, will block cache be allocated for it? The block cache stores blocks, which are chunks of an HFile, and are by default around 64KB in size. Allocation is on-demand. Learn more here http://hbase.apache.org/book.html#block.cache Third question: Is it possible to tell from .META when a region was last read? If so, how? Many thanks, Linlin
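The sizing described above is simple arithmetic, sketched here with illustrative names: the cache budget is hfile.block.cache.size (default 0.40) of the maximum heap, and it fills on demand in HFile-block units (~64 KB by default) rather than being reserved per region or per store.

```java
// Back-of-envelope helper for the block cache defaults discussed above.
public class BlockCacheMath {
    // Cache budget = fraction of -Xmx (default fraction is 0.40).
    public static long blockCacheBytes(long maxHeapBytes, double cacheFraction) {
        return (long) (maxHeapBytes * cacheFraction);
    }

    // How many HFile blocks of the given size fit in that budget.
    public static long blocksThatFit(long cacheBytes, long blockSizeBytes) {
        return cacheBytes / blockSizeBytes;
    }
}
```

So a region that is never read contributes nothing to the cache; blocks are only inserted when a read misses and fetches them from HDFS.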
Re: latest stable hbase-0.94.13 cannot start master: java.lang.RuntimeException: Failed suppression of fs shutdown hook
Resurrecting this old thread. The following error: java.lang.RuntimeException: Failed suppression of fs shutdown hook Is caused when HBase is compiled against Hadoop 1 and has Hadoop 2 jars on its classpath. Someone on IRC just had the same issue and I was able to repro after seeing the classpath. J-D On Wed, Nov 13, 2013 at 7:00 AM, Ted Yu yuzhih...@gmail.com wrote: Your hbase.rootdir config parameter points to file: instead of hdfs: Where is hadoop-2.2.0 running ? You also need to build the tar ball using the hadoop 2 profile. See the following in pom.xml: <!-- profile for building against Hadoop 2.0.0-alpha. Activate using: mvn -Dhadoop.profile=2.0 --> <profile> <id>hadoop-2.0</id> On Wed, Nov 13, 2013 at 6:13 AM, jason_vasd...@mcafee.com wrote: Good day - I'm an hadoop hbase newbie, so please excuse me if this is a known issue - hoping someone might send me a simple fix ! I installed the latest stable tarball : hbase-0.94.13.tar.gz , and followed the instructions at docs/book/quickstart.html . (After installing hadoop-2.2.0, and running the resourcemanager and nodemanager, which are both running and presenting web-pages at the configured ports OK). My hbase-site.xml now looks like: <configuration> <property> <name>hbase.rootdir</name> <value>file:///home/jason/3P/hbase/data</value> </property> <property> <name>hbase.zookeeper.property.dataDir</name> <value>/home/jason/3P/hbase/zookeeper-data</value> </property> </configuration> I try to start hbase as instructed in the QuickStart guide: $ bin/hbase-start.sh starting master, logging to /home/jason/3P/hbase-0.94.13/logs/hbase-jason-master-jvds.out But the master does NOT start . I think it is a bug that the hbase-start.sh script does not complain that hbase failed to start. Shall I raise a JIRA issue on this ?
Anyway, when I look in the logs/hbase-jason-master-jvds.log file, I see that a Java exception occurred : 2013-11-13 13:52:06,316 INFO org.apache.hadoop.hbase.master.ActiveMasterManager: Deleting ZNode for /hbase/backup-masters/jvds,52926,1384350725521 from backup master directory 2013-11-13 13:52:06,318 INFO org.apache.zookeeper.server.PrepRequestProcessor: Got user-level KeeperException when processing sessionid:0x14251bbb3d4 type:delete cxid:0x13 zxid:0xb txntype:-1 reqpath:n/a Error Path:/hbase/backup-masters/jvds,52926,1384350725521 Error:KeeperErrorCode = NoNode for /hbase/backup-masters/jvds,52926,1384350725521 2013-11-13 13:52:06,320 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hbase/backup-masters/jvds,52926,1384350725521 already deleted, and this is not a retry 2013-11-13 13:52:06,320 INFO org.apache.hadoop.hbase.master.ActiveMasterManager: Master=jvds,52926,1384350725521 2013-11-13 13:52:06,348 INFO org.apache.hadoop.hbase.master.SplitLogManager: timeout = 30 2013-11-13 13:52:06,348 INFO org.apache.hadoop.hbase.master.SplitLogManager: unassigned timeout = 18 2013-11-13 13:52:06,348 INFO org.apache.hadoop.hbase.master.SplitLogManager: resubmit threshold = 3 2013-11-13 13:52:06,352 INFO org.apache.hadoop.hbase.master.SplitLogManager: found 0 orphan tasks and 0 rescan nodes 2013-11-13 13:52:06,385 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library 2013-11-13 13:52:06,385 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master java.lang.RuntimeException: Failed suppression of fs shutdown hook: Thread[Thread-27,5,main] at org.apache.hadoop.hbase.regionserver.ShutdownHook.suppressHdfsShutdownHook(ShutdownHook.java:196) at org.apache.hadoop.hbase.regionserver.ShutdownHook.install(ShutdownHook.java:83) at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:191) at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:420) at 
org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:149) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2120) 2013-11-13 13:52:06,386 ERROR org.apache.hadoop.io.nativeio.NativeIO: Unable to initialize NativeIO libraries java.lang.NoSuchFieldError: workaroundNonThreadSafePasswdCalls at org.apache.hadoop.io.nativeio.NativeIO.initNative(Native Method) at org.apache.hadoop.io.nativeio.NativeIO.<clinit>(NativeIO.java:58) at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:653) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509) at
Re: Who creates hbase root.dir ?
IIRC it used to be an issue if the folder already existed, even if empty. It's not the case anymore. J-D On Fri, Feb 7, 2014 at 3:38 PM, Jay Vyas jayunit...@gmail.com wrote: Hi hbase. In normal installations, I'm wondering who should create hbase.rootdir. 1) I have seen pseudo-distributed mode docs implying that HBase is smart enough to do it by itself: Let HBase create the hbase.rootdir directory. If you don't, you'll get a warning saying HBase needs a migration run because the directory is missing files expected by HBase (it'll create them if you let it). 2) But in bigtop, I see mkdir in the init-hdfs.sh : su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir /hbase' su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown hbase:hbase /hbase' So what's the right way to maintain hbase-root ? -- Jay Vyas http://jayunit100.blogspot.com
Re: MultiMaster HBase: --backup really needed ?
The problem with having a bunch of masters racing is that it's not evident for the operator which one won, so specifying --backup to all but one master ensures that you always easily know where the master is. Relevant code from HMaster.java: // If we're a backup master, stall until a primary to writes his address if (!c.getBoolean(HConstants.MASTER_TYPE_BACKUP, HConstants.DEFAULT_MASTER_TYPE_BACKUP)) { return; } J-D On Mon, Dec 9, 2013 at 9:37 AM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote: I've run HBase from version 0.90.2 to our current 0.94.6 (CDH 4.3) and have never specified a --backup option on any of my commands with regard to the master. You're correct that they race to be active, and failover is completely automatic in the case of one master going down. TBH I've never even heard of a --backup argument, so I'm wondering if it is something extremely old or extremely new :) On Mon, Dec 9, 2013 at 6:24 AM, Manuel de Ferran manuel.defer...@gmail.com wrote: Greetings, I'm playing with MultiMaster, and I was wondering if --backup is really needed. As far as I have observed, masters race to be the active one. Is there any drawback in not mentioning --backup on additional nodes ? Regards, -- Manuel DE FERRAN
Re: Question about the HBase thrift server
That's right, round robin should only be applied when you start answering some client request, and you stick to that server until you're done. J-D On Fri, Dec 6, 2013 at 9:17 PM, Varun Sharma va...@pinterest.com wrote: Hi everyone, I have a question about the hbase thrift server and running scans in particular. The thrift server maintains a map of int -> ResultScanner(s). These integers are passed back to the client. Now in a typical setting people run many thrift servers and round-robin rpc(s) to them. It seems that for scans, such a technique of just round-robining is simply not going to work. If a scanner integer ID has been retrieved from a certain thrift server A, all the next() calls and close calls should fall on that server. I just wanted to make sure I got this thinking right and there isn't really a way around this, because scans, unlike gets, have associated state. Thanks ! Varun
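The stickiness constraint can be sketched as client-side bookkeeping (all names here are illustrative, not part of the Thrift gateway API): round-robin only the openScanner call, then pin every subsequent next()/close() for that scanner id to the server that created it, because the id indexes that server's private int-to-ResultScanner map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical client-side router: scanner ids are only meaningful on the
// Thrift server that issued them, so calls for an id must stay on that server.
public class StickyScannerRouter {
    private final String[] servers;
    private int rr = 0;                                   // round-robin cursor for new scans
    private final Map<Integer, String> scannerHome = new HashMap<>();
    private int nextScannerId = 0;

    public StickyScannerRouter(String[] servers) { this.servers = servers; }

    // openScanner: pick the next server round-robin and remember the mapping.
    public int openScanner() {
        String server = servers[rr++ % servers.length];
        int id = nextScannerId++;
        scannerHome.put(id, server);
        return id;
    }

    // next()/close() for a scanner id must go back to its home server.
    public String serverFor(int scannerId) {
        return scannerHome.get(scannerId);
    }
}
```

Gets, by contrast, carry no server-side state, which is why plain round-robin works for them.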
Re: You Are Dead Exception due to promotion failure
It reads that it spent 89 seconds doing a CMS concurrent mark, but really just spent 14 seconds of user CPU and 4 seconds of system CPU doing it. Where are the other 70 seconds? It's often just swapping, and less likely it can also be CPU starvation. J-D On Fri, Nov 1, 2013 at 1:40 AM, Asaf Mesika asaf.mes...@gmail.com wrote: Can you please explain why is this suspicious? On Monday, October 7, 2013, Jean-Daniel Cryans wrote: This line: [CMS-concurrent-mark: 12.929/88.767 secs] [Times: user=14.30 sys=3.74, real=88.77 secs] Is suspicious. Are you swapping? J-D On Mon, Oct 7, 2013 at 8:34 AM, prakash kadel prakash.ka...@gmail.com wrote: Also, why is the CMS not kicking in early, i have set -XX:+UseCMSInitiatingOccupancyOnly??? Sincerely, Prakash On Tue, Oct 8, 2013 at 12:32 AM, prakash kadel prakash.ka...@gmail.com wrote: Hello, I am getting this YADE all the time. HBASE_HEAPSIZE=8000 Settings: -ea -XX:+UseConcMarkSweepGC -XX:MaxGCPauseMillis=200 -XX:+HeapDumpOnOutOfMemoryError -XX:+CMSIncrementalMode -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=50 -XX:+UseCMSInitiatingOccupancyOnly -XX:NewSize=256m -XX:MaxNewSize=256m It seems there is a promotion failure and the CMS takes too long:
2013-10-07T01:22:55.784+0900: [GC [ParNew: 235968K->26176K(235968K), 0.3219980 secs] 7709485K->7538063K(8165824K) icms_dc=0 , 0.3221100 secs] [Times: user=0.27 sys=0.01, real=0.33 secs]
2013-10-07T01:23:07.361+0900: [GC [ParNew: 235842K->26176K(235968K), 0.1899680 secs] 7747729K->7578713K(8165824K) icms_dc=0 , 0.1900700 secs] [Times: user=0.26 sys=0.02, real=0.19 secs]
2013-10-07T01:23:20.154+0900: [GC [ParNew: 235803K->26176K(235968K), 0.2428200 secs] 7788341K->7615284K(8165824K) icms_dc=0 , 0.2429570 secs] [Times: user=0.25 sys=0.02, real=0.24 secs]
2013-10-07T01:23:34.594+0900: [GC [ParNew: 235889K->26176K(235968K), 0.2440980 secs] 7824998K->7651179K(8165824K) icms_dc=0 , 0.2442130 secs] [Times: user=0.20 sys=0.03, real=0.25 secs]
2013-10-07T01:23:47.666+0900: [GC [ParNew: 235906K->26176K(235968K), 0.2998100 secs] 7860909K->7686832K(8165824K) icms_dc=3 , 0.3020280 secs] [Times: user=0.23 sys=0.04, real=0.30 secs]
2013-10-07T01:23:57.216+0900: [GC [1 CMS-initial-mark: 7660656K(7929856K)] 7788778K(8165824K), 3.7665320 secs] [Times: user=0.07 sys=0.06, real=3.77 secs]
2013-10-07T01:24:05.508+0900: [GC [ParNew: 235811K->26176K(235968K), 0.4632860 secs] 7896468K->7721167K(8165824K) icms_dc=3 , 0.4634100 secs] [Times: user=0.21 sys=0.03, real=0.46 secs]
2013-10-07T01:24:19.889+0900: [GC [ParNew: 235812K->26176K(235968K), 0.3531980 secs] 7930804K->7755633K(8165824K) icms_dc=3 , 0.3533230 secs] [Times: user=0.24 sys=0.06, real=0.35 secs]
2013-10-07T01:24:32.832+0900: [GC [ParNew: 235968K->26176K(235968K), 0.6298370 secs] 7965425K->7790643K(8165824K) icms_dc=3 , 0.6299530 secs] [Times: user=0.23 sys=0.03, real=0.63 secs]
2013-10-07T01:24:43.629+0900: [GC [ParNew: 235800K->26176K(235968K), 0.3190580 secs] 8000268K->782K(8165824K) icms_dc=3 , 0.3191840 secs] [Times: user=0.24 sys=0.02, real=0.32 secs]
2013-10-07T01:24:56.005+0900: [GC [ParNew: 235848K->26176K(235968K), 0.4839400 secs] 8035228K->7860300K(8165824K) icms_dc=3 , 0.4840480 secs] [Times: user=0.31 sys=0.03, real=0.49 secs]
2013-10-07T01:25:07.282+0900: [GC [ParNew: 235750K->26176K(235968K), 0.3423250 secs] 8069875K->7895852K(8165824K) icms_dc=9 , 0.3424380 secs] [Times: user=0.21 sys=0.06, real=0.34 secs]
2013-10-07T01:25:19.853+0900: [GC [ParNew (promotion failed): 235745K->235745K(235968K), 0.3339710 secs][CMS2013-10-07T01:25:29.750+0900: [CMS-concurrent-mark: 12.929/88.767 secs] [Times: user=14.30 sys=3.74, real=88.77 secs] (concurrent mode failure): 7899125K->2882954K(7929856K), 42.8279810 secs] 8105422K->2882954K(8165824K), [CMS Perm : 31956K->31861K(53340K)] icms_dc=9 , 43.1621090 secs] [Times: user=10.40 sys=1.89, real=43.16 secs]
2013-10-07T01:26:08.288+0900: [GC [1 CMS-initial-mark: 2882954K(7929856K)] 2978434K(8165824K), 0.0965830 secs] [Times: user=0.04 sys=0.00, real=0.09 secs]
Heap par new generation total 235968K, used 197697K [0x000606e0, 0x000616e0, 0x000616e0) eden space 209792K, 94% used [0x000606e0, 0x000612f10718, 0x000613ae) from space 26176K, 0% used [0x00061547, 0x00061547,
Re: HBase ShutdownHook problem
(LocalHBaseCluster.java:420) at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:149) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2100) at HBase.HMasterThread.run(HMasterThread.java:19) Salih Kardan On Fri, Oct 25, 2013 at 9:00 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: What's happening before this stack trace in the log? J-D On Fri, Oct 25, 2013 at 6:10 AM, Salih Kardan karda...@gmail.com wrote: Hi all I am getting the error below while starting hbase (hbase 0.94.11). I guess since hbase cannot connect to hadoop, I get this error. java.lang.RuntimeException: Failed suppression of fs shutdown hook: Thread[Thread-8,5,main] at org.apache.hadoop.hbase.regionserver.ShutdownHook.suppressHdfsShutdownHook(ShutdownHook.java:196) at org.apache.hadoop.hbase.regionserver.ShutdownHook.install(ShutdownHook.java:83) at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:191) at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:420) at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:149) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2100) my /etc/hosts file only contains (127.0.0.1 - machine name).
Re: HBase ShutdownHook problem
What's happening before this stack trace in the log? J-D On Fri, Oct 25, 2013 at 6:10 AM, Salih Kardan karda...@gmail.com wrote: Hi all I am getting the error below while starting hbase (hbase 0.94.11). I guess since hbase cannot connect to hadoop, I get this error. java.lang.RuntimeException: Failed suppression of fs shutdown hook: Thread[Thread-8,5,main] at org.apache.hadoop.hbase.regionserver.ShutdownHook.suppressHdfsShutdownHook(ShutdownHook.java:196) at org.apache.hadoop.hbase.regionserver.ShutdownHook.install(ShutdownHook.java:83) at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:191) at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:420) at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:149) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2100) my /etc/hosts file only contains (127.0.0.1 - machine name).
Re: HBase Random Read latency > 100ms
On Wed, Oct 9, 2013 at 10:59 AM, Vladimir Rodionov vrodio...@carrieriq.com wrote: I can't say for SCR. There is a possibility that the feature is broken, of course. But the fact that hbase.regionserver.checksum.verify does not affect performance means that the OS effectively caches HDFS checksum files. See OS cache + SCR VS HBase CRC over OS cache+SCR in this document I shared some time ago: https://docs.google.com/spreadsheet/pub?key=0Ao87IrzZJSaydENaem5USWg4TlRKcHl0dEtTS2NBOUE&output=html In an all-in-memory test it shows a pretty big difference. J-D Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com From: Ramu M S [ramu.ma...@gmail.com] Sent: Wednesday, October 09, 2013 12:11 AM To: user@hbase.apache.org; lars hofhansl Subject: Re: HBase Random Read latency > 100ms Hi All, Sorry. There was some mistake in the tests (clients were not reduced; I forgot to change the parameter before running the tests). With 8 clients: SCR enabled: average latency is 25 ms, IO wait % is around 8. SCR disabled: average latency is 10 ms, IO wait % is around 2. Still, SCR disabled gives better results, which confuses me. Can anyone clarify? Also, I tried setting the parameter Lars suggested (hbase.regionserver.checksum.verify = true) with SCR disabled. Average latency is around 9.8 ms, a fraction less. Thanks Ramu On Wed, Oct 9, 2013 at 3:32 PM, Ramu M S ramu.ma...@gmail.com wrote: Hi All, I just ran only 8 parallel clients. With SCR enabled: average latency is 80 ms, IO wait % is around 8. With SCR disabled: average latency is 40 ms, IO wait % is around 2. I always thought SCR enabled allows a client co-located with the DataNode to read HDFS file blocks directly. This gives a performance boost to distributed clients that are aware of locality. Is my understanding wrong, or does it not apply to my scenario? Meanwhile I will try setting the parameter suggested by Lars and post the results.
Thanks, Ramu On Wed, Oct 9, 2013 at 2:29 PM, lars hofhansl la...@apache.org wrote: Good call. You could try to enable hbase.regionserver.checksum.verify, which will cause HBase to do its own checksums rather than relying on HDFS (and which saves 1 IO per block get). I do think you can expect the index blocks to be cached at all times. -- Lars From: Vladimir Rodionov vrodio...@carrieriq.com To: user@hbase.apache.org Sent: Tuesday, October 8, 2013 8:44 PM Subject: RE: HBase Random Read latency > 100ms Upd. Each HBase Get = 2 HDFS read IOs (index block + data block) = 4 file IOs (data + .crc) in the worst case. I think if Bloom filters are enabled then it is going to be 6 file IOs in the worst case (large data set); therefore you will have not 5 IO requests in the queue but up to 20-30. This definitely explains the 100 ms avg latency. Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com From: Vladimir Rodionov Sent: Tuesday, October 08, 2013 7:24 PM To: user@hbase.apache.org Subject: RE: HBase Random Read latency > 100ms Ramu, You have 8 server boxes and 10 client boxes. You have 40 requests in parallel - 5 per RS/DN? You have 5 random-read requests in the IO queue of your single RAID1. With an avg read latency of 10 ms, 5 requests in the queue will give us 30 ms. Add some overhead of HDFS + HBase and you will probably have your issue explained? Your bottleneck is your disk system, I think. When you serve most requests from disk, as in your large data set scenario, make sure you have an adequate disk sub-system and that it is configured properly. Block cache and OS page cache cannot help you in this case, as the working data set is larger than both caches. Good performance numbers in the small data set scenario are explained by the fact that the data fits into the OS page cache and block cache - you do not read data from disk even if you disable the block cache.
Best regards, Vladimir Rodionov Principal Platform Engineer Carrier IQ, www.carrieriq.com e-mail: vrodio...@carrieriq.com From: Ramu M S [ramu.ma...@gmail.com] Sent: Tuesday, October 08, 2013 6:00 PM To: user@hbase.apache.org Subject: Re: HBase Random Read latency > 100ms Hi All, After a few suggestions from the earlier mails I changed the following: 1. Heap size to 16 GB 2. Block size to 16 KB 3. HFile size to 8 GB (the table now has 256 regions, 32 per server) 4. Data locality index is 100 in all RS I have clients running on 10 machines, each with 4 threads, so 40 total. This is the same in all tests. Result: 1. Average latency is still 100 ms.
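Vladimir's queueing estimate above is simple arithmetic and can be sketched as follows (illustrative names): with q outstanding requests on one spindle and s ms per read, the i-th request in line completes after i*s ms, so the mean completion time is s*(q+1)/2.

```java
// Back-of-envelope model of the single-RAID1 queue discussed above.
public class DiskQueueEstimate {
    // Average completion time when queueDepth requests each take serviceTimeMs:
    // the i-th request finishes after i * serviceTimeMs, mean = s * (q + 1) / 2.
    public static double avgLatencyMs(int queueDepth, double serviceTimeMs) {
        return serviceTimeMs * (queueDepth + 1) / 2.0;
    }
}
```

Plugging in the thread's numbers (5 queued requests at 10 ms each) gives the 30 ms figure Vladimir cites; with Bloom filters and .crc files multiplying the per-Get IOs, queue depths of 20-30 push the average past 100 ms.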
Re: You Are Dead Exception due to promotion failure
This line: [CMS-concurrent-mark: 12.929/88.767 secs] [Times: user=14.30 sys=3.74, real=88.77 secs] Is suspicious. Are you swapping? J-D On Mon, Oct 7, 2013 at 8:34 AM, prakash kadel prakash.ka...@gmail.com wrote: Also, why is the CMS not kicking in early, i have set -XX:+UseCMSInitiatingOccupancyOnly??? Sincerely, Prakash On Tue, Oct 8, 2013 at 12:32 AM, prakash kadel prakash.ka...@gmail.com wrote: Hello, I am getting this YADE all the time. HBASE_HEAPSIZE=8000 Settings: -ea -XX:+UseConcMarkSweepGC -XX:MaxGCPauseMillis=200 -XX:+HeapDumpOnOutOfMemoryError -XX:+CMSIncrementalMode -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=50 -XX:+UseCMSInitiatingOccupancyOnly -XX:NewSize=256m -XX:MaxNewSize=256m It seems there is a promotion failure and the CMS takes too long:
2013-10-07T01:22:55.784+0900: [GC [ParNew: 235968K->26176K(235968K), 0.3219980 secs] 7709485K->7538063K(8165824K) icms_dc=0 , 0.3221100 secs] [Times: user=0.27 sys=0.01, real=0.33 secs]
2013-10-07T01:23:07.361+0900: [GC [ParNew: 235842K->26176K(235968K), 0.1899680 secs] 7747729K->7578713K(8165824K) icms_dc=0 , 0.1900700 secs] [Times: user=0.26 sys=0.02, real=0.19 secs]
2013-10-07T01:23:20.154+0900: [GC [ParNew: 235803K->26176K(235968K), 0.2428200 secs] 7788341K->7615284K(8165824K) icms_dc=0 , 0.2429570 secs] [Times: user=0.25 sys=0.02, real=0.24 secs]
2013-10-07T01:23:34.594+0900: [GC [ParNew: 235889K->26176K(235968K), 0.2440980 secs] 7824998K->7651179K(8165824K) icms_dc=0 , 0.2442130 secs] [Times: user=0.20 sys=0.03, real=0.25 secs]
2013-10-07T01:23:47.666+0900: [GC [ParNew: 235906K->26176K(235968K), 0.2998100 secs] 7860909K->7686832K(8165824K) icms_dc=3 , 0.3020280 secs] [Times: user=0.23 sys=0.04, real=0.30 secs]
2013-10-07T01:23:57.216+0900: [GC [1 CMS-initial-mark: 7660656K(7929856K)] 7788778K(8165824K), 3.7665320 secs] [Times: user=0.07 sys=0.06, real=3.77 secs]
2013-10-07T01:24:05.508+0900: [GC [ParNew: 235811K->26176K(235968K), 0.4632860 secs] 7896468K->7721167K(8165824K) icms_dc=3 , 0.4634100 secs] [Times: user=0.21 sys=0.03, real=0.46 secs]
2013-10-07T01:24:19.889+0900: [GC [ParNew: 235812K->26176K(235968K), 0.3531980 secs] 7930804K->7755633K(8165824K) icms_dc=3 , 0.3533230 secs] [Times: user=0.24 sys=0.06, real=0.35 secs]
2013-10-07T01:24:32.832+0900: [GC [ParNew: 235968K->26176K(235968K), 0.6298370 secs] 7965425K->7790643K(8165824K) icms_dc=3 , 0.6299530 secs] [Times: user=0.23 sys=0.03, real=0.63 secs]
2013-10-07T01:24:43.629+0900: [GC [ParNew: 235800K->26176K(235968K), 0.3190580 secs] 8000268K->782K(8165824K) icms_dc=3 , 0.3191840 secs] [Times: user=0.24 sys=0.02, real=0.32 secs]
2013-10-07T01:24:56.005+0900: [GC [ParNew: 235848K->26176K(235968K), 0.4839400 secs] 8035228K->7860300K(8165824K) icms_dc=3 , 0.4840480 secs] [Times: user=0.31 sys=0.03, real=0.49 secs]
2013-10-07T01:25:07.282+0900: [GC [ParNew: 235750K->26176K(235968K), 0.3423250 secs] 8069875K->7895852K(8165824K) icms_dc=9 , 0.3424380 secs] [Times: user=0.21 sys=0.06, real=0.34 secs]
2013-10-07T01:25:19.853+0900: [GC [ParNew (promotion failed): 235745K->235745K(235968K), 0.3339710 secs][CMS2013-10-07T01:25:29.750+0900: [CMS-concurrent-mark: 12.929/88.767 secs] [Times: user=14.30 sys=3.74, real=88.77 secs] (concurrent mode failure): 7899125K->2882954K(7929856K), 42.8279810 secs] 8105422K->2882954K(8165824K), [CMS Perm : 31956K->31861K(53340K)] icms_dc=9 , 43.1621090 secs] [Times: user=10.40 sys=1.89, real=43.16 secs]
2013-10-07T01:26:08.288+0900: [GC [1 CMS-initial-mark: 2882954K(7929856K)] 2978434K(8165824K), 0.0965830 secs] [Times: user=0.04 sys=0.00, real=0.09 secs]
Heap par new generation total 235968K, used 197697K [0x000606e0, 0x000616e0, 0x000616e0) eden space 209792K, 94% used [0x000606e0, 0x000612f10718, 0x000613ae) from space 26176K, 0% used [0x00061547, 0x00061547, 0x000616e0) to space 26176K, 0% used [0x000613ae, 0x000613ae, 0x00061547) concurrent mark-sweep generation total 7929856K, used 2882954K [0x000616e0, 0x0007fae0, 0x0007fae0) concurrent-mark-sweep perm gen total 53340K, used 31960K [0x0007fae0, 0x0007fe217000, 0x0008) What is wrong here? please give me some suggestions. Sincerely, Prakash
Re: Upcoming HBase bay area user and dev meetups
While we're on the topic of upcoming meetups, there's also a meetup at Facebook's NYC office the week of Strata/Hadoop World (10/28). There's still room for about 50 attendees. http://www.meetup.com/HBase-NYC/events/135434632/ J-D On Mon, Oct 7, 2013 at 2:10 PM, Enis Söztutar e...@apache.org wrote: Hi guys, I just wanted to give a heads up on upcoming bay area user and dev meetups which will happen on the same day, October 24th. ( special thanks Stack for pushing this.) The user meetup will start at 6:30, and the talks scheduled so far are: + Steven Noels will talk about using the Lily Indexer to search your HBase content: http://ngdata.github.io/hbase-indexer/ + St.Ack will talk about what is in hbase-0.96.0 + Enis will talk about Mapreduce over HBase snapshots (HBASE-8369) There will be food and beers as usual. The event page is at http://www.meetup.com/hbaseusergroup/events/140759692/. Please write me or Stack off-list if you want to give a talk. There is still room for one more talk. The dev meetup will start at 4pm. Some of the suggested topics include: + When is 0.98.0? + When is 1.0? What makes for an HBase 1.0. + Assignment Manager + What next on MTTR? The event page is at: http://www.meetup.com/hackathon/events/144366512/. Feel free to suggest / bring up topics that you think is important for post-0.96. Enis
Re: You Are Dead Exception due to promotion failure
Swapping and Java simply don't go well together. You need to ensure that the committed memory is smaller than the available memory. Also see http://hbase.apache.org/book.html#perf.os.swap I haven't looked closely at your GC output, but even if CMS was kicking in as early as it's supposed to, the fact that you are swapping might just screw up everything. J-D On Mon, Oct 7, 2013 at 3:13 PM, prakash kadel prakash.ka...@gmail.com wrote: BTW, what will happen in the above situation if I disable swap entirely? Currently it starts swapping at 90%. Sincerely On Tue, Oct 8, 2013 at 7:09 AM, prakash kadel prakash.ka...@gmail.com wrote: Thanks, yup, it seems so. I have 48 GB of memory. I see it swaps at that point. BTW, why is the CMS not kicking in early? Do you have any idea? Sincerely On Tue, Oct 8, 2013 at 3:00 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: This line: [CMS-concurrent-mark: 12.929/88.767 secs] [Times: user=14.30 sys=3.74, real=88.77 secs] is suspicious. Are you swapping? J-D On Mon, Oct 7, 2013 at 8:34 AM, prakash kadel prakash.ka...@gmail.com wrote: Also, why is the CMS not kicking in early? I have set -XX:+UseCMSInitiatingOccupancyOnly.
Sincerely, Prakash On Tue, Oct 8, 2013 at 12:32 AM, prakash kadel prakash.ka...@gmail.com wrote: Hello, I am getting this YADE all the time. HBASE_HEAPSIZE=8000 Settings: -ea -XX:+UseConcMarkSweepGC -XX:MaxGCPauseMillis=200 -XX:+HeapDumpOnOutOfMemoryError -XX:+CMSIncrementalMode -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=50 -XX:+UseCMSInitiatingOccupancyOnly -XX:NewSize=256m -XX:MaxNewSize=256m It seems there is a promotion failure and the CMS takes too long:
2013-10-07T01:22:55.784+0900: [GC [ParNew: 235968K->26176K(235968K), 0.3219980 secs] 7709485K->7538063K(8165824K) icms_dc=0 , 0.3221100 secs] [Times: user=0.27 sys=0.01, real=0.33 secs]
2013-10-07T01:23:07.361+0900: [GC [ParNew: 235842K->26176K(235968K), 0.1899680 secs] 7747729K->7578713K(8165824K) icms_dc=0 , 0.1900700 secs] [Times: user=0.26 sys=0.02, real=0.19 secs]
2013-10-07T01:23:20.154+0900: [GC [ParNew: 235803K->26176K(235968K), 0.2428200 secs] 7788341K->7615284K(8165824K) icms_dc=0 , 0.2429570 secs] [Times: user=0.25 sys=0.02, real=0.24 secs]
2013-10-07T01:23:34.594+0900: [GC [ParNew: 235889K->26176K(235968K), 0.2440980 secs] 7824998K->7651179K(8165824K) icms_dc=0 , 0.2442130 secs] [Times: user=0.20 sys=0.03, real=0.25 secs]
2013-10-07T01:23:47.666+0900: [GC [ParNew: 235906K->26176K(235968K), 0.2998100 secs] 7860909K->7686832K(8165824K) icms_dc=3 , 0.3020280 secs] [Times: user=0.23 sys=0.04, real=0.30 secs]
2013-10-07T01:23:57.216+0900: [GC [1 CMS-initial-mark: 7660656K(7929856K)] 7788778K(8165824K), 3.7665320 secs] [Times: user=0.07 sys=0.06, real=3.77 secs]
2013-10-07T01:24:05.508+0900: [GC [ParNew: 235811K->26176K(235968K), 0.4632860 secs] 7896468K->7721167K(8165824K) icms_dc=3 , 0.4634100 secs] [Times: user=0.21 sys=0.03, real=0.46 secs]
2013-10-07T01:24:19.889+0900: [GC [ParNew: 235812K->26176K(235968K), 0.3531980 secs] 7930804K->7755633K(8165824K) icms_dc=3 , 0.3533230 secs] [Times: user=0.24 sys=0.06, real=0.35 secs]
2013-10-07T01:24:32.832+0900: [GC [ParNew: 235968K->26176K(235968K), 0.6298370 secs] 7965425K->7790643K(8165824K) icms_dc=3 , 0.6299530 secs] [Times: user=0.23 sys=0.03, real=0.63 secs]
2013-10-07T01:24:43.629+0900: [GC [ParNew: 235800K->26176K(235968K), 0.3190580 secs] 8000268K-782K(8165824K) icms_dc=3 , 0.3191840 secs] [Times: user=0.24 sys=0.02, real=0.32 secs]
2013-10-07T01:24:56.005+0900: [GC [ParNew: 235848K->26176K(235968K), 0.4839400 secs] 8035228K->7860300K(8165824K) icms_dc=3 , 0.4840480 secs] [Times: user=0.31 sys=0.03, real=0.49 secs]
2013-10-07T01:25:07.282+0900: [GC [ParNew: 235750K->26176K(235968K), 0.3423250 secs] 8069875K->7895852K(8165824K) icms_dc=9 , 0.3424380 secs] [Times: user=0.21 sys=0.06, real=0.34 secs]
2013-10-07T01:25:19.853+0900: [GC [ParNew (promotion failed): 235745K->235745K(235968K), 0.3339710 secs][CMS2013-10-07T01:25:29.750+0900: [CMS-concurrent-mark: 12.929/88.767 secs] [Times: user=14.30 sys=3.74, real=88.77 secs] (concurrent mode failure): 7899125K->2882954K(7929856K), 42.8279810 secs] 8105422K->2882954K(8165824K), [CMS Perm : 31956K->31861K(53340K)] icms_dc=9 , 43.1621090 secs] [Times: user=10.40 sys=1.89, real=43.16 secs]
2013-10-07T01:26:08.288+0900: [GC [1 CMS-initial-mark: 2882954K(7929856K)] 2978434K(8165824K), 0.0965830 secs] [Times: user=0.04 sys=0.00, real=0.09 secs]
Heap
 par new generation total 235968K, used 197697K [0x000606e0, 0x000616e0, 0x000616e0)
  eden space 209792K, 94% used [0x000606e0
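One note on the flags quoted in this thread: -XX:+CMSIncrementalMode (iCMS, visible as the icms_dc entries in the log) changes how CMS schedules its concurrent work and has often been advised against on multi-core server hardware, which fits a concurrent mark that takes 88 seconds of wall time. A hedged sketch of the direction the hbase-env.sh settings could take (flag values are illustrative starting points, not tuned recommendations):

```shell
# hbase-env.sh (sketch, assumption: multi-core server hardware):
# drop incremental CMS so the occupancy fraction is honored by the
# full-time concurrent collector, and keep ParNew for the young gen.
export HBASE_OPTS="-ea -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly \
  -XX:+HeapDumpOnOutOfMemoryError"
```

Together with the swap advice above (committed memory below available memory), this removes two of the usual suspects for multi-second young-gen pauses.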
Re: hbase.master parameter?
hbase.master was removed when we added zookeeper, so now a client will do a lookup in ZK instead of talking to a pre-determined master. So in a way, hbase.zookeeper.quorum is what replaces hbase.master. FWIW that was done in 0.20.0, which was released in September of 2009, so hbase.master was removed 4 years ago. J-D On Fri, Oct 4, 2013 at 8:11 AM, Jay Vyas jayunit...@gmail.com wrote: Oh wow. Looking in the source, this really is an old parameter. It appears that the answer to my question is that these are the master parameters:
src/main/resources/hbase-default.xml:<name>hbase.master.port</name>
src/main/resources/hbase-default.xml: <name>hbase.master.info.port</name>
src/main/resources/hbase-default.xml: <name>hbase.master.info.bindAddress</name>
src/main/resources/hbase-default.xml: <name>hbase.master.dns.interface</name>
src/main/resources/hbase-default.xml: <name>hbase.master.dns.nameserver</name>
src/main/resources/hbase-default.xml: <name>hbase.master.logcleaner.ttl</name>
src/main/resources/hbase-default.xml: <name>hbase.master.logcleaner.plugins</name>
src/main/resources/hbase-default.xml: <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner</value>
src/main/resources/hbase-default.xml: <name>hbase.master.keytab.file</name>
src/main/resources/hbase-default.xml: <name>hbase.master.kerberos.principal</name>
src/main/resources/hbase-default.xml: <name>hbase.master.hfilecleaner.plugins</name>
src/main/resources/hbase-default.xml: <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner</value>
src/test/resources/hbase-site.xml: <name>hbase.master.event.waiting.time</name>
src/test/resources/hbase-site.xml:<name>hbase.master.info.port</name>
src/test/resources/hbase-site.xml:<description>The port for the hbase master web UI
src/test/resources/hbase-site.xml: <name>hbase.master.lease.thread.wakefrequency</name>
On Fri, Oct 4, 2013 at 11:06 AM, Jay Vyas jayunit...@gmail.com wrote: Thanks for the feedback! So - are you sure it has no effect? By obsolete - do we mean deprecated?
In this case - which parameters have replaced it, and how [specifically]? Any help on this issue would be appreciated, because I'm seeing an effect when I have the parameter; I will check my HBase version and confirm. On Fri, Oct 4, 2013 at 2:05 AM, Harsh J ha...@cloudera.com wrote: That property hasn't been in effect since 0.90 (as far as I can remember). Ever since we switched master discovery to ZK, the property has been obsolete. On Fri, Oct 4, 2013 at 5:13 AM, Jay Vyas jayunit...@gmail.com wrote: What happened to the hbase.master parameter? I don't see it in the docs... was it deprecated? It appears to still have an effect in 0.94.7 -- Jay Vyas http://jayunit100.blogspot.com -- Harsh J -- Jay Vyas http://jayunit100.blogspot.com -- Jay Vyas http://jayunit100.blogspot.com
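Put concretely, a modern client configuration carries no hbase.master property at all; the client finds the active master and the regions through ZooKeeper. A minimal hbase-site.xml sketch (host names are placeholders):

```xml
<configuration>
  <!-- The client asks ZooKeeper where the master and regions live;
       there has been no hbase.master property since 0.20.0. -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
```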
Re: HBase stucked because HDFS fails to replicate blocks
I like the way you were able to dig down into multiple logs and present us the information, but it looks more like GC than an HDFS failure. In your region server log, go back to the first FATAL and see if it got a session expired from ZK, and other messages like a client not being able to talk to a server for some amount of time. If that's the case, then what you are seeing is the result of IO fencing by the master. J-D On Wed, Oct 2, 2013 at 10:15 AM, Ionut Ignatescu ionut.ignate...@gmail.com wrote: Hi, I have a Hadoop/HBase cluster that runs Hadoop 1.1.2 and HBase 0.94.7. I noticed an issue that stops normal cluster operation. My use case: I have several MR jobs that read data from one HBase table in the map phase and write data into 3 different tables during the reduce phase. I create the table handlers on my own; I don't use TableOutputFormat. The only way out I found is to restart the region server daemon on the region server with problems. On the namenode: cat namenode.2013-10-02 | grep blk_3136705509461132997_43329
Wed Oct 02 13:32:17 2013 GMT namenode 3852-0@namenode:0 [INFO] (IPC Server handler 29 on 22700) org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hbase/.logs/datanode1,60020,1380637389766/datanode1%2C60020%2C1380637389766.1380720737247. blk_3136705509461132997_43329
Wed Oct 02 13:33:38 2013 GMT namenode 3852-0@namenode:0 [INFO] (IPC Server handler 13 on 22700) org.apache.hadoop.hdfs.server.namenode.FSNamesystem: commitBlockSynchronization(lastblock=blk_3136705509461132997_43329, newgenerationstamp=43366, newlength=40045568, newtargets=[10.81.18.101:50010], closeFile=false, deleteBlock=false)
On the region server: cat regionserver.2013-10-02 | grep 1380720737247
Wed Oct 02 13:32:17 2013 GMT regionserver 5854-0@datanode1:0 [INFO] (regionserver60020.logRoller) org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/datanode1,60020,1380637389766/datanode1%2C60020%2C1380637389766.1380720701436, entries=149, filesize=63934833.
for /hbase/.logs/datanode1,60020,1380637389766/datanode1%2C60020%2C1380637389766.1380720737247 Wed Oct 02 13:33:37 2013 GMT regionserver 5854-0@datanode1:0 [WARN] (DataStreamer for file /hbase/.logs/datanode1,60020,1380637389766/datanode1%2C60020%2C1380637389766.1380720737247 block blk_3136705509461132997_43329) org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_3136705509461132997_43329 bad datanode[0] 10.80.40.176:50010 Wed Oct 02 13:33:37 2013 GMT regionserver 5854-0@datanode1:0 [WARN] (DataStreamer for file /hbase/.logs/datanode1,60020,1380637389766/datanode1%2C60020%2C1380637389766.1380720737247 block blk_3136705509461132997_43329) org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_3136705509461132997_43329 in pipeline 10.80.40.176:50010, 10.81.111.8:50010, 10.81.18.101:50010: bad datanode 10.80.40.176:50010 Wed Oct 02 13:33:43 2013 GMT regionserver 5854-0@datanode1:0 [INFO] (regionserver60020.logRoller) org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/.logs/datanode1,60020,1380637389766/datanode1%2C60020%2C1380637389766.1380720737247 retrying... Wed Oct 02 13:33:43 2013 GMT regionserver 5854-0@datanode1:0 [INFO] (regionserver60020.logRoller) org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/.logs/datanode1,60020,1380637389766/datanode1%2C60020%2C1380637389766.1380720737247 retrying... Wed Oct 02 13:33:44 2013 GMT regionserver 5854-0@datanode1:0 [INFO] (regionserver60020.logRoller) org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/.logs/datanode1,60020,1380637389766/datanode1%2C60020%2C1380637389766.1380720737247 retrying... 
cat regionserver.2013-10-02 | grep 1380720737247 | grep 'Could not complete' | wc -l 5640 In datanode logs, that runs on the same host with region server: cat datanode.2013-10-02 | grep blk_3136705509461132997_43329 Wed Oct 02 13:32:17 2013 GMT datanode 5651-0@datanode1:0 [INFO] (org.apache.hadoop.hdfs.server.datanode.DataXceiver@ca6b1e3) org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_3136705509461132997_43329 src: /10.80.40.176:36721 dest: / 10.80.40.176:50010 Wed Oct 02 13:33:37 2013 GMT datanode 5651-0@datanode1:0 [INFO] (org.apache.hadoop.hdfs.server.datanode.DataXceiver@ca6b1e3) org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration( 10.80.40.176:50010, storageID=DS-812180968-10.80.40.176-50010-1380263000454, infoPort=50075, ipcPort=50020): Exception writing block blk_3136705509461132997_43329 to mirror 10.81.111.8:50010 Wed Oct 02 13:33:37 2013 GMT datanode 5651-0@datanode1:0 [INFO] (org.apache.hadoop.hdfs.server.datanode.DataXceiver@ca6b1e3) org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_3136705509461132997_43329 java.io.IOException: Connection reset by peer Wed Oct 02 13:33:38 2013 GMT datanode 5651-0@datanode1:0 [INFO] (PacketResponder 2 for Block blk_3136705509461132997_43329)
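The "go back to the first FATAL" advice above is easy to script. A sketch of the kind of grep that surfaces the fencing evidence (the sample log lines below are fabricated for illustration; the pattern is the useful part):

```shell
# Fabricated sample of what a GC-paused region server log can look like
# (real logs would be grepped in place, e.g. regionserver.2013-10-02).
cat > rs-sample.log <<'EOF'
Wed Oct 02 13:33:36 2013 GMT regionserver [WARN] util.Sleeper: We slept 52340ms instead of 3000ms
Wed Oct 02 13:33:37 2013 GMT regionserver [FATAL] regionserver.HRegionServer: ABORTING region server: unhandled exception
Wed Oct 02 13:33:37 2013 GMT regionserver [INFO] zookeeper.ClientCnxn: Unable to reconnect, session has expired
EOF

# Find the first FATAL plus the usual GC-pause companions:
# a "We slept" warning and a ZK session expiry around the same time.
grep -n -E 'FATAL|session.*expired|We slept' rs-sample.log
```

If all three show up within a few seconds of each other, the block-recovery noise in the HDFS logs is the symptom, not the cause.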
Re: Replication
That means that the master cluster isn't able to see any region servers in the slave cluster... is cluster b up? Can you create tables? J-D On Fri, Sep 27, 2013 at 3:23 AM, Arnaud Lamy al...@ltutech.com wrote: Hi, I tried to configure replication with 2 boxes (a and b). A hosts HBase and ZK, and b only HBase. A is on zk:/hbase and b on zk:/hbase_b. I used the start-hbase.sh script to start HBase, and I changed HBASE_MANAGES_ZK=false on both. A is master and B is slave. I added a peer on A, and when I list it I have: 1 localhost:2181:/hbase_b ENABLED I created my table on A and B, and added some data on A, but nothing arrived on B. When I look at my logs I have: 2013-09-15 23:59:43,682 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Getting 0 rs from peer cluster # 1 That means there's no slave plugged into my master. There's no time difference between A and B, for info. I'm stuck (can't find anything on Google). Do you have any idea why it doesn't work? Arnaud
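For comparison, a typical 0.94-era setup on the master cluster looks like the hbase shell session below (the peer id, quorum host, and table/family names are illustrative; hbase.replication must also be set to true in hbase-site.xml on both clusters before the region servers start):

```ruby
# hbase shell on cluster A (sketch, names are placeholders)
add_peer '1', 'b-zk-host:2181:/hbase_b'
list_peers

# Edits only ship for families explicitly scoped for replication:
disable 'mytable'
alter 'mytable', {NAME => 'f1', REPLICATION_SCOPE => '1'}
enable 'mytable'
```

A peer that lists as ENABLED but yields "Getting 0 rs from peer cluster" usually means the quorum/znode triplet in add_peer doesn't point at a live slave cluster, which is exactly what J-D is probing above.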
Re: What is causing my mappers to execute so damn slow?
Your details are missing important bits like your configuration, Hadoop/HBase versions, etc. Doing those random reads inside your MR job, especially if they are reading cold data, will indeed make it slower. Just to get an idea, if you skip doing the Gets, how fast does it become? J-D On Fri, Sep 27, 2013 at 10:33 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi everyone, I posted this question many times before, and I've given full details on Stack Overflow: http://stackoverflow.com/q/19056712/938959 Please, I need someone to guide me in the right direction here. Help much appreciated! -- Regards- Pavan
Re: What is causing my mappers to execute so damn slow?
I don't think there's a CDH that includes Hadoop 1.2.1. So either your code is doing something slow or it's the reading itself. For the latter, make sure you go through http://hbase.apache.org/book.html#perf.reading and we also recently had this thread on the list where you can see some live performance debugging: http://www.mail-archive.com/user@hbase.apache.org/msg27174.html. For example, make sure you're not running on the local job tracker. J-D On Fri, Sep 27, 2013 at 11:07 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi Jean, HBase 0.94.6 and Hadoop 1.2.1, Cloudera distributions. I in fact tried that out: in place of doing the get operations, I created stub data and returned that instead. It was practically the same speed. Nothing changed. After 20 mins or so, when I check the job status, it has hardly reached 1,000,000 rows. On Fri, Sep 27, 2013 at 11:12 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Your details are missing important bits like your configuration, Hadoop/HBase versions, etc. Doing those random reads inside your MR job, especially if they are reading cold data, will indeed make it slower. Just to get an idea, if you skip doing the Gets, how fast does it become? J-D On Fri, Sep 27, 2013 at 10:33 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi everyone, I posted this question many times before, and I've given full details on Stack Overflow: http://stackoverflow.com/q/19056712/938959 Please, I need someone to guide me in the right direction here. Help much appreciated! -- Regards- Pavan -- Regards- Pavan
Re: Export API using start and stop row key !
You'd need to use 0.94 (or CDH4.2+, since you mention being on CDH) to have access to TableInputFormat.SCAN_ROW_START and SCAN_ROW_STOP; then all you need to do is copy Export's code and add what you're missing. J-D On Tue, Sep 24, 2013 at 5:42 PM, karunakar lkarunaka...@gmail.com wrote: Hi Experts, I would like to fetch data from an HBase table using the map reduce Export API. I see that I can fetch data using start and stop time, but I don't see any information regarding start and stop row keys. Can any expert guide me or give me an example in order to fetch the first 1000 rows (or a start and stop row key) using the Export API, which I can then import into a different table? Hadoop 2.0.0-cdh4.1.2 HBase 0.92.1-cdh4.1.2 Please let me know if you need more information. Thank you. -- View this message in context: http://apache-hbase.679495.n3.nabble.com/Export-API-using-start-and-stop-row-key-tp4051182.html Sent from the HBase User mailing list archive at Nabble.com.
Re: Hbase Compression
On flushing we do some cleanup, like removing deleted data that was already in the MemStore, or extra versions. Could it be that you are overwriting recently written data? 48MB is the size of the MemStore that accumulated while the flushing happened. J-D On Tue, Sep 24, 2013 at 3:50 AM, aiyoh79 tcheng...@gmail.com wrote: Hi, I am using hbase 0.94.11 and I feel a bit confused when looking at the log file below:
13/09/24 13:11:00 INFO regionserver.Store: Flushed , sequenceid=687077, memsize=122.1m, into tmp file hdfs://192.168.123.123:54310/hbase/usertable/b19289cf9b1400c6daddc347337bac03/.tmp/13f0d91efe784372a796585a6c1e05d3
13/09/24 13:11:00 INFO regionserver.Store: Added hdfs://192.168.123.123:54310/hbase/usertable/b19289cf9b1400c6daddc347337bac03/family/13f0d91efe784372a796585a6c1e05d3, entries=432620, sequenceid=687077, filesize=64.4m
13/09/24 13:11:00 INFO regionserver.HRegion: Finished memstore flush of ~128.2m/134402240, currentsize=48.0m/50366240 for region usertable,user4\xB4\xB0,1379998895119.b19289cf9b1400c6daddc347337bac03. in 1163ms, sequenceid=687077, compaction requested=false
It seems like it will first flush into a tmp file and the memsize is 122.1m, but when it is finally added, the size is 64.4m. Lastly, there are 2 more parameters, which are 128.2m and 48.0m for currentsize. I never specify the hbase.regionserver.codecs property in my hbase-site.xml file, so is the size difference still because of compression? Thanks, aiyoh79 -- View this message in context: http://apache-hbase.679495.n3.nabble.com/Hbase-Compression-tp4051122.html Sent from the HBase User mailing list archive at Nabble.com.
Re: Hbase ports
On Mon, Sep 23, 2013 at 9:14 AM, John Foxinhead john.foxinh...@gmail.com wrote: Hi all. I'm doing a project for my university, so I have to know exactly how all the HBase ports work. Studying the documentation I found that ZooKeeper accepts connections on port 2181, the HBase master on port 60000, and the HBase region servers on port 60020. I didn't understand the importance of port 60010 on the master and port 60030 on the region servers. Can I not use them? From the documentation (http://hbase.apache.org/book.html#config.files): hbase.regionserver.info.port The port for the HBase RegionServer web UI. Set to -1 if you do not want the RegionServer UI to run. Default: 60030 You can look for the other port in there too. More important: if I launch HBase in pseudo-distributed mode, running all processes on localhost, what ports are used for each of the processes if I launch 1, 2, 3 or more backup masters, and if I launch a few region servers (less than 10) or a lot of region servers (10, 20, 100)? It'll clash; you'll have to have a different hbase-site.xml for each process you want to start. J-D
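So each extra process on localhost needs its own RPC and info ports. A sketch of the properties a second region server's hbase-site.xml would override (values are illustrative offsets from the defaults):

```xml
<!-- hbase-site.xml overrides for a second region server on the same host -->
<property>
  <name>hbase.regionserver.port</name>
  <value>60021</value>  <!-- default is 60020 -->
</property>
<property>
  <name>hbase.regionserver.info.port</name>
  <value>60031</value>  <!-- default is 60030 (web UI); -1 disables the UI -->
</property>
```

A backup master would likewise override hbase.master.port (default 60000) and hbase.master.info.port (default 60010). If your distribution ships bin/local-master-backup.sh and bin/local-regionservers.sh, those scripts apply this kind of port offset for you.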
Re: openTSDB lose large amount of data when the client are writing
Could happen if a region moves since locks aren't persisted, but if I were you I'd ask on the opentsdb mailing list first. J-D On Thu, Sep 19, 2013 at 10:09 AM, Tianying Chang tich...@ebaysf.com wrote: Hi, I have a customer who use openTSDB. Recently we found that only less than 10% data are written, rest are are lost. By checking the RS log, there are many row lock related issues, like below. It seems large amount of write to tsdb that need row lock caused the problem. Anyone else see similar problem? Is it a bug of openTSDB? Or it is due to HBase exposed a vulnerable API? org.apache.hadoop.hbase.UnknownRowLockException: Invalid row lock at org.apache.hadoop.hbase.regionserver.HRegionServer.getLockFromId(HRegionServer.java:2732) at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2071) at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426) 13/09/18 12:08:30 ERROR regionserver.HRegionServer: org.apache.hadoop.hbase.UnknownRowLockException: -6180307918863136448 at org.apache.hadoop.hbase.regionserver.HRegionServer.unlockRow(HRegionServer.java:2765) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320) Thanks Tian-Ying
Re: Bulkload into empty table with configureIncrementalLoad()
You need to create the table with pre-splits, see http://hbase.apache.org/book.html#perf.writing J-D On Thu, Sep 19, 2013 at 9:52 AM, Dolan Antenucci antenucc...@gmail.comwrote: I have about 1 billion values I am trying to load into a new HBase table (with just one column and column family), but am running into some issues. Currently I am trying to use MapReduce to import these by first converting them to HFiles and then using LoadIncrementalHFiles.doBulkLoad(). I also use HFileOutputFormat.configureIncrementalLoad() as part of my MR job. My code is essentially the same as this example: https://github.com/Paschalis/HBase-Bulk-Load-Example/blob/master/src/cy/ac/ucy/paschalis/hbase/bulkimport/Driver.java The problem I'm running into is that only 1 reducer is created by configureIncrementalLoad(), and there is not enough space on this node to handle all this data. configureIncrementalLoad() should start one reducer for every region the table has, so apparently the table only has 1 region -- maybe because it is empty and brand new (my understanding of how regions work is not crystal clear)? The cluster has 5 region servers, so I'd at least like that many reducers to handle this loading. On a side note, I also tried the command line tool, completebulkload, but am running into other issues with this (timeouts, possible heap issues) -- probably due to only one server being assigned the task of inserting all the records (i.e. I look at the region servers' logs, and only one of the servers has log entries; the rest are idle). Any help is appreciated -Dolan Antenucci
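Since configureIncrementalLoad() starts one reducer per region, an empty, freshly created table (one region) gives exactly one reducer. Computing evenly spaced split keys for the pre-split table is simple arithmetic; below is a runnable sketch in plain Python (a corrected rendering of the getHexSplits idea that has circulated on this list, assuming hex row keys such as MD5 hashes):

```python
def hex_splits(start_key, end_key, num_regions, width=16):
    """Return num_regions - 1 evenly spaced hex split keys over
    [start_key, end_key). Region i covers [lo + i*step, lo + (i+1)*step);
    the returned keys are the interior boundaries, zero-padded to width."""
    lo = int(start_key, 16)
    hi = int(end_key, 16)
    step = (hi - lo) // num_regions
    return ["%0*x" % (width, lo + step * i) for i in range(1, num_regions)]

# 4 regions over the full 64-bit hex key space -> 3 split keys
splits = hex_splits("0000000000000000", "ffffffffffffffff", 4)
print(splits)
```

Each returned string would become one element of the byte[][] splits passed to HBaseAdmin.createTable(desc, splits), so the reducers (one per region) get balanced key ranges from the start.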
Re: HBase Negation or NOT operator
You can always remove the NOT clause by changing the statement, but I'm wondering what your use case really is. HBase doesn't have secondary indexes so, unless you are doing a short-ish scan (let's say a million rows), it means you want to do a full table scan and that doesn't scale. J-D On Tue, Sep 17, 2013 at 1:34 AM, Ashwin Jain ashvyn.j...@gmail.com wrote: Hello All, Does HBase not support an SQL NOT operator on complex filters? I would like to filter out whatever matches a complex nested filter. my use case is to parse a query like this(below) and build a HBase filter from it. (field1=value1 AND NOT ((field2=value2 OR field3=value3) AND field4=value4)) How to go about this , any ideas? What will be a better approach - implement a custom filter that excludes a row qualified by another filter or to convert input query into an opposite query. Thanks, Ashwin
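Removing the NOT clause is mechanical: push the negation inward with De Morgan's laws until only leaf comparisons are negated, and a negated equality becomes a not-equals (which HBase can express per leaf, e.g. via CompareOp.NOT_EQUAL, with AND/OR mapping to FilterList MUST_PASS_ALL/MUST_PASS_ONE). A runnable sketch in plain Python over a tuple-based expression tree (the tree shape is illustrative, not an HBase API):

```python
def negate(expr):
    """Push NOT through an expression tree using De Morgan's laws.
    Leaves look like ('=', field, value) or ('!=', field, value);
    internal nodes are ('AND', ...) or ('OR', ...)."""
    op = expr[0]
    if op == 'AND':                      # NOT(a AND b) -> NOT a OR NOT b
        return ('OR',) + tuple(negate(e) for e in expr[1:])
    if op == 'OR':                       # NOT(a OR b) -> NOT a AND NOT b
        return ('AND',) + tuple(negate(e) for e in expr[1:])
    if op == '=':                        # NOT(f = v) -> f != v
        return ('!=', expr[1], expr[2])
    if op == '!=':                       # NOT(f != v) -> f = v
        return ('=', expr[1], expr[2])
    raise ValueError("unknown operator: %r" % op)

# The negated part of the example query: NOT ((f2=v2 OR f3=v3) AND f4=v4)
q = ('AND', ('OR', ('=', 'f2', 'v2'), ('=', 'f3', 'v3')), ('=', 'f4', 'v4'))
print(negate(q))
```

The caveat in the reply stands regardless of the rewrite: without secondary indexes, the resulting filter still drives a scan.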
Re: user_permission ERROR: Unknown table
Ah I see, well unless you setup Secure HBase there won't be any perms enforcement. So in which way is your application failing to use Selector? Do you have an error message or stack trace handy? J-D On Tue, Sep 17, 2013 at 5:43 AM, BG bge...@mitre.org wrote: Well we are trying to find out why our application works when we use 'Selectors' table. When we use 'Selectors2' it works just fine. So we wanted to see if it was a permission error. That is why we tried out user_permissions, but when they gave errors we wondered if that might enforce that maybe it is a permissions problem. bg -- View this message in context: http://apache-hbase.679495.n3.nabble.com/user-permission-ERROR-Unknown-table-tp4050797p4050838.html Sent from the HBase User mailing list archive at Nabble.com.
Re: show processlist equivalent in Hbase
(putting cdh user in BCC, please don't cross-post) The web UIs for both the master and the region server have a section called Tasks, which has a bunch of links like this: Tasks: Show All Monitored Tasks | Show non-RPC Tasks | Show All RPC Handler Tasks | Show Active RPC Calls | Show Client Operations | View as JSON J-D On Tue, Sep 17, 2013 at 5:41 AM, Dhanasekaran Anbalagan bugcy...@gmail.com wrote: Hi Guys, I want to know: is there a tool equivalent to MySQL's "show processlist" in HBase? The HBase Master webpage only shows requestsPerSecond and table details. I want to know which processes are generating load. Please guide me. -Dhanasekaran. Did I learn something today? If not, I wasted it.
Re: Command to delete based on column Family + rowkey
HBASE-8753 doesn't seem related. Right now there's nothing in the shell that does the equivalent of this: Delete.deleteFamily(byte [] family) But it's possible to run java code in the jruby shell so in the end you can still do it, just takes more lines. J-D On Mon, Sep 16, 2013 at 1:45 AM, Ted Yu yuzhih...@gmail.com wrote: Have you looked at https://issues.apache.org/jira/browse/HBASE-8753 ? Cheers On Sep 16, 2013, at 12:37 AM, Ramasubramanian ramasubramanian.naraya...@gmail.com wrote: Hi, Thanks…but the requirement is to delete the fields for a single row key… can u pls help? regards, Rams On 10-Sep-2013, at 4:56 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: This? hbase(main):002:0 help alter Alter column family schema; pass table name and a dictionary specifying new column family schema. Dictionaries are described on the main help command output. Dictionary must include name of column family to alter. For example, To change or add the 'f1' column family in table 't1' from defaults to instead keep a maximum of 5 cell VERSIONS, do: hbase alter 't1', NAME = 'f1', VERSIONS = 5 To delete the 'f1' column family in table 't1', do: hbase alter 't1', NAME = 'f1', METHOD = 'delete' or a shorter version: hbase alter 't1', 'delete' = 'f1' 2013/9/10 Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com Manish, I need to delete all the columns for a particular column family of a given rowkey... I don't want to specify the column name (qualifier name) one by one to delete. Pls let me know is there any way to delete like that... regards, Rams On Tue, Sep 10, 2013 at 2:06 PM, manish dunani manishd...@gmail.com wrote: If you want to delete rowkey for particular columnfamily then you need to mention individually:: delete 't','333','TWO:qualifier_name' This will definitely delete the records which you are looking for. Please revert back if it is not work. 
On Tue, Sep 10, 2013 at 1:40 PM, manish dunani manishd...@gmail.com wrote: hey rama, Try this:: *deleteall 't','333'* * * I hope it will definitely works for you!! On Tue, Sep 10, 2013 at 1:31 PM, Ramasubramanian Narayanan ramasubramanian.naraya...@gmail.com wrote: Dear All, Requirement is to delete all columns which belongs to a column family and for a particular rowkey. Have tried with the below command but record is not getting deleted. * hbase deleteall 't1', 'r1', 'c1'* * * *Test result :* * * 3) Scan the table 't' hbase(main):025:0 scan 't' ROW COLUMN+CELL 111 column=ONE:ename, timestamp=1378459582478, value= 111 column=ONE:eno, timestamp=1378459582335, value=1000 111 column=ONE:sal, timestamp=1378459582515, value=1500 111 column=TWO:ename, timestamp=1378459582655, value= 111 column=TWO:eno, timestamp=1378459582631, value=4000 222 column=ONE:ename, timestamp=1378459582702, value= 222 column=ONE:eno, timestamp=1378459582683, value=2000 222 column=ONE:sal, timestamp=1378459582723, value=2500 222 column=TWO:ename, timestamp=1378459582779, value= 222 column=TWO:eno, timestamp=1378459582754, value=4000 222 column=TWO:sal, timestamp=1378459582798, value=7500 333 column=ONE:ename, timestamp=1378459582880, value=sss 333 column=ONE:eno, timestamp=1378459582845, value=9000 333 column=ONE:sal, timestamp=1378459582907, value=6500 333 column=TWO:ename, timestamp=1378459582950, value=zzz 333 column=TWO:eno, timestamp=1378459582931, value= 333 column=TWO:sal, timestamp=1378459582968, value=6500 3 row(s) in 0.0440 seconds - 4) Delete the records from the table 't' in the rowkey '333' in the column family 'TWO' hbase(main):027:0 deleteall 't','333','TWO' 0 row(s) in 0.0060 seconds - 5) After deleting scan the table
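J-D's suggestion of calling the Java client API from the jruby shell can be sketched as follows, using the table name, rowkey, and family from this thread (an untested sketch against the 0.94 client API, to be typed into a running hbase shell):

```ruby
# hbase shell is JRuby, so the Java client classes are directly available.
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.client.Delete
import org.apache.hadoop.hbase.util.Bytes

conf = HBaseConfiguration.create
table = HTable.new(conf, 't')

# Delete.deleteFamily marks every column of family 'TWO' in row '333'
d = Delete.new(Bytes.toBytes('333'))
d.deleteFamily(Bytes.toBytes('TWO'))
table.delete(d)
table.close
```

This does in a few lines what the shell's deleteall cannot express in this version: dropping one whole column family for a single rowkey without naming each qualifier.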
Re: user_permission ERROR: Unknown table
What are you trying to do bg? If you want to setup user permissions you also need to have a secure HBase (the link that Ted posted) which involves Kerberos. J-D On Mon, Sep 16, 2013 at 1:33 PM, Ted Yu yuzhih...@gmail.com wrote: See http://hbase.apache.org/book.html#d0e5135 On Mon, Sep 16, 2013 at 1:06 PM, BG bge...@mitre.org wrote: Thanks.. Do I need to do this. We do NOT have kerberos running. bg -- View this message in context: http://apache-hbase.679495.n3.nabble.com/user-permission-ERROR-Unknown-table-tp4050797p4050804.html Sent from the HBase User mailing list archive at Nabble.com.
Re: Information about hbase 0.96
Release date is: when it gets released. We are currently going through release candidates and as soon as one gets accepted we release it. I'd like to say it's gonna happen this month but who knows. There's probably one or two presentations online that explain what's in 0.96.0, but the source of truth at the moment is: https://issues.apache.org/jira/issues/?jql=project%20%3D%20HBASE%20AND%20(%20fixVersion%20%3D%20%220.96.0%22%20or%20fixVersion%20%3D%20%220.95.0%22%20%20or%20fixVersion%20%3D%20%220.95.1%22%20or%20fixVersion%20%3D%20%220.95.2%22%20)%20AND%20(status%20%3D%20Resolved%20OR%20status%20%3D%20Closed)%20ORDER%20BY%20issuetype%20DESC%2C%20priority%20DESC J-D On Thu, Sep 12, 2013 at 10:05 AM, Vimal Jain vkj...@gmail.com wrote: Hi, Where can i get information about hbase 0.96 like what are its additional features , its release date ? -- Thanks and Regards, Vimal Jain
Re: High cpu usage on a region server
Or roll back to CDH 4.2's HBase. They are fully compatible. J-D On Thu, Sep 12, 2013 at 10:25 AM, lars hofhansl la...@apache.org wrote: Not that I am aware of. Reducing the HFile block size will lessen this problem (but then cause other issues). It's just a fix to the RegexStringFilter. You can just recompile that and deploy it to the RegionServers (need to make sure it's in the class path before the HBase jars). Probably easier to roll a new release. It's a shame we did not see this earlier. -- Lars From: OpenSource Dev dev.opensou...@gmail.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Thursday, September 12, 2013 9:52 AM Subject: Re: High cpu usage on a region server Thanks Lars. Are there any other workarounds for this issue until we get the fix? If not we might have to do the patch and roll out a custom pkg. On Thu, Sep 12, 2013 at 8:36 AM, lars hofhansl la...@apache.org wrote: Yep... Very likely HBASE-9428:
8 threads: java.lang.Thread.State: RUNNABLE at java.util.Arrays.copyOf(Arrays.java:2786) at java.lang.StringCoding.decode(StringCoding.java:178) at java.lang.String.<init>(String.java:483) at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96) ...
4 threads: java.lang.Thread.State: RUNNABLE at sun.nio.cs.ISO_8859_1$Decoder.decodeArrayLoop(ISO_8859_1.java:79) at sun.nio.cs.ISO_8859_1$Decoder.decodeLoop(ISO_8859_1.java:106) at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:544) at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:140) at java.lang.StringCoding.decode(StringCoding.java:179) at java.lang.String.<init>(String.java:483) at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)
It's also consistent with what you see: lots of garbage (hence tweaking your GC options had a significant effect). The fix is in 0.94.12, which is in RC right now, probably to be released early next week.
-- Lars From: OpenSource Dev dev.opensou...@gmail.com To: user@hbase.apache.org Sent: Thursday, September 12, 2013 8:15 AM Subject: Re: High cpu usage on a region server A server started getting busy last night, but this time it took ~5 hrs to get from 15% busy to 75% busy. It is not running 80% flat-out yet. But this is still very high compared to other servers that are running under ~25% cpu usage. The only change I made yesterday was the addition of -XX:+UseParNewGC to the hbase startup command. http://pastebin.com/VRmujgyH On Wed, Sep 11, 2013 at 2:28 PM, Stack st...@duboce.net wrote: Can you thread dump the busy server and pastebin it? Thanks, St.Ack On Wed, Sep 11, 2013 at 1:49 PM, OpenSource Dev dev.opensou...@gmail.com wrote: Hi, I'm using HBase 0.94.6 (CDH 4.3) for Opentsdb. So far I have had no issues with writes/puts. The system handles up to 800k puts per second without issue. On average we do 250k puts per second. I am having a problem with Reads; I've isolated where the problem is but have not been able to find the root cause. I have 16 machines running hbase-regionserver, each has ~35 regions. Once in a while CPU goes flat out at 80% on 1 region server. These are the things I've noticed in ganglia: hbase.regionserver.request - evenly distributed. Not seeing any spikes on the busy server hbase.regionserver.blockCacheSize - between 500MB and 1000MB hbase.regionserver.compactionQueueSize - avg 2 or less hbase.regionserver.blockCacheHitRatio - 30% on busy node, 60% on other nodes JVM Heap size is set to 16GB and I'm using -XX:+UseParNewGC -XX:+UseConcMarkSweepGC I've noticed the system load moves to a different region server, sometimes within a minute, if the busy region server is restarted. Any suggestions on what could be causing the load and/or what other metrics I should check? Thank you!
Re: Performance analysis in Hbase
Yeah there isn't a whole lot of documentation about metrics. Could it be that you are still running on a default 1GB heap and you are pounding it with multiple clients? Try raising the heap size? FWIW I gave a presentation at HBaseCon with Kevin O'Dell about HBase operations which could shed some light: Video: http://www.cloudera.com/content/cloudera/en/resources/library/hbasecon/hbasecon-2013--apache-hbase-meet-ops-ops-meet-apache-hbase-video.html Slides: http://www.slideshare.net/cloudera/operations-session-6 J-D On Tue, Sep 10, 2013 at 8:40 AM, Vimal Jain vkj...@gmail.com wrote: Can someone please throw some light on this aspect of Hbase? On Thu, Sep 5, 2013 at 11:04 AM, Vimal Jain vkj...@gmail.com wrote: Just to add more information, I got the following link which explains metrics related to RS: http://hbase.apache.org/book.html#rs_metrics Is there any resource which explains these metrics in detail? (In the official guide, there is just one line for each metric.) On Thu, Sep 5, 2013 at 10:06 AM, Vimal Jain vkj...@gmail.com wrote: Hi, I am running Hbase in *pseudo distributed mode on top of HDFS.* So far, it's been running fine. In the past I had some memory-related issues (long GC pauses). So I wanted to know if there is a way through the GUI (web UI on 60010, 60030) or CLI (shell) to get the health of Hbase (with reference to its memory consumption, CPU starvation if any). Please provide some resources where I can look for this information. -- Thanks and Regards, Vimal Jain
Re: Getting column values in batches for a single row
Scan.setBatch does what you are looking for, since with a Get there's no way to iterate over multiple calls: https://github.com/apache/hbase/blob/0.94.2/src/main/java/org/apache/hadoop/hbase/client/Scan.java#L306 Just make sure to make the Scan start at the row you want and stop right after it. J-D On Mon, Sep 9, 2013 at 12:28 PM, Sam William sa...@stumbleupon.com wrote: Hi, I have a table which is wide (with a single family) and the column qualifiers are timestamps. I'd like to do a get on a rowkey, but I don't need to read all of the columns. I want to read the first n values and then read more in batches if need be. Is there a way to do this? I'm on version 0.94.2. Thanks
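The "start at the row you want and stop right after it" advice can be sketched in plain Java. The only non-obvious part is the exclusive stop row: the smallest key strictly greater than a row is the row key with a 0x00 byte appended. The commented Scan lines are illustrative 0.94-era client calls and are not compiled here:

```java
import java.util.Arrays;

public class SingleRowScan {
    // The smallest row key strictly greater than `row` is `row` with a 0x00
    // byte appended; using it as the (exclusive) stop row confines a Scan to
    // exactly one row.
    static byte[] stopRowFor(byte[] row) {
        return Arrays.copyOf(row, row.length + 1); // copyOf pads with 0x00
    }

    public static void main(String[] args) {
        byte[] row = "user42".getBytes();
        byte[] stop = stopRowFor(row);
        System.out.println(stop.length);           // one byte longer than the row
        System.out.println(stop[stop.length - 1]); // trailing 0x00
        // Illustrative 0.94-era client usage (assumed, not compiled here):
        //   Scan scan = new Scan(row, stopRowFor(row));
        //   scan.setBatch(100); // at most 100 columns per call to next()
        //   ResultScanner scanner = table.getScanner(scan);
    }
}
```

With setBatch set, each call to ResultScanner.next() returns at most that many columns of the wide row, which gives the batched read the poster asked for.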
Re: HBase distributed mode issue
What's your /etc/hosts on the master like? HBase does a simple lookup to get the machine's hostname and it seems your node reports itself as being localhost. On Tue, Sep 3, 2013 at 6:23 AM, Omkar Joshi omkar.jo...@lntinfotech.com wrote: I'm trying to set up a 2-node HBase cluster in distributed mode. Somehow, my regionserver/slave is connecting to 'localhost' for the master despite adding the appropriate property for the master in hbase-site.xml. The detailed thread (in the mail, the files etc. would look cluttered, hence, providing the thread to an external site) is here: http://stackoverflow.com/questions/18587512/hbase-distributed-mode Regards, Omkar Joshi The contents of this e-mail and any attachment(s) may contain confidential or privileged information for the intended recipient(s). Unintended recipients are prohibited from taking action on the basis of information in this e-mail and using or disseminating the information, and must notify the sender and delete it from their system. LT Infotech will not accept responsibility or liability for the accuracy or completeness of, or the presence of any virus or disabling code in this e-mail
Re: counter Increment gives DonotRetryException
You probably put a string in there that was a number, and increment expects an 8-byte long. For example, if you did: put 't1', '9row27', 'columnar:column1', '1' Then did an increment on that, it would fail. J-D On Thu, Aug 29, 2013 at 4:42 AM, yeshwanth kumar yeshwant...@gmail.com wrote: I am a newbie to HBase, going through the Counters topic. Whenever I perform an increment like incr 't1','9row27','columnar:column1',1 it gives an ERROR: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: Attempted to increment field that isn't 64 bits wide looking for some help
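A quick way to see the width mismatch J-D describes, using only the JDK (ByteBuffer stands in here for HBase's Bytes.toBytes(long), which produces the same big-endian 8-byte encoding):

```java
import java.nio.ByteBuffer;

public class CounterWidth {
    public static void main(String[] args) {
        // What the shell's  put 't1','9row27','columnar:column1','1'  stores:
        // the one-byte string "1".
        byte[] asString = "1".getBytes();
        // What incr expects: an 8-byte big-endian long, the same encoding
        // HBase's Bytes.toBytes(1L) would produce (ByteBuffer used so this
        // example needs only the JDK).
        byte[] asLong = ByteBuffer.allocate(8).putLong(1L).array();
        System.out.println(asString.length + " vs " + asLong.length);
    }
}
```

This prints "1 vs 8": the shell put stored a 1-byte value, and increment refuses anything that isn't exactly 64 bits wide, hence the DoNotRetryIOException.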
Re: [Question: replication] why only one regionserver is used during replication? 0.94.9
Region servers replicate data written to them, so look at how your regions are distributed. J-D On Tue, Aug 27, 2013 at 11:29 AM, Demai Ni nid...@gmail.com wrote: hi, guys, I am using hbase 0.94.9 and set up replication from a 4-node master (3 regservers) to a 3-node slave (2 regservers). I can tell that all source regservers can successfully replicate data. However, it seems for each particular table, only one regserver will handle its replication at any given time. For example, I am using YCSB to load 1,000,000 rows with workloada, with 16 threads. During the load period, I looked at the ageOfLastShippedOp and sizeOfLogQueue. I can tell one of the regservers from the Master is doing the replication. While the values of both age and sizeOfLog are growing, the other two regservers don't come in to help. So does that mean: for each table and process, only one regionserver will do the replication regardless of how long the queue is? Or did I miss some setup configuration? Thanks. Demai
Re: Is downgrade from 0.96.0 to 0.94.6 possible?
FYI you'll be in the same situation with 0.95.2, actually worse since it's really just a developer preview release. But if you meant try in its strict sense, i.e. use it on a test cluster, then yes please do. The more people we get to try it out the better 0.96.0 will be. J-D On Thu, Aug 22, 2013 at 9:58 PM, Xiong LIU liuxiongh...@gmail.com wrote: Thanks, Stack. I will try 0.95.2 ahead. Best Wishes On Fri, Aug 23, 2013 at 11:28 AM, Stack st...@duboce.net wrote: On Thu, Aug 22, 2013 at 8:00 PM, Xiong LIU liuxiongh...@gmail.com wrote: We are considering upgrading our hbase cluster from version 0.94.6 to 0.96.0 once 0.96.0 is out. I want to know whether any possible failure may happen during the upgrade process, and if it does happen, is it possible to downgrade to 0.94.6? No. We do not have anyone working on making it so you can roll back. Is there any best practice for upgrading 0.94.x to 0.96.0? Don't be the first (smile). Ask later after we get some experience moving folks through the upgrade. The upgrade process has had little exercise as of this date. St.Ack
Re: Replication queue?
You can find a lot here: http://hbase.apache.org/replication.html And how many logs you can queue is how much disk space you have :) On Tue, Aug 20, 2013 at 7:23 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi, If I have a master - slave replication, and the master went down, replication will start back where it was when the master comes back online. Fine. If I have a master - slave replication, and the slave went down, is the data queued until the slave comes back online and then sent? If so, how big can this queue be, and how long can the slave be down? Same questions for master - master... I guess for this one, it's like the 1 line above and it's fine, right? Thanks, JM
Re: Major Compaction in 0.90.6
On Mon, Aug 19, 2013 at 11:52 PM, Monish r monishs...@gmail.com wrote: Hi Jean, s/Jean/Jean-Daniel ;) Thanks for the explanation. Just a clarification on the third answer: in our current cluster (0.90.6), I find that irrespective of whether TTL is set or not, major compaction rewrites the hfile for the region (there is only one hfile for that region) on every manual major compaction trigger. Can you enable DEBUG logs? You'd see why the major compaction is triggered. log: 2013-08-19 14:15:29,926 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 1 file(s), new file=hdfs://x.x.x.x:9000/hbase/NOTIFICATION_HISTORY/b00086bca62ee55796a960002291aca4/n/4754838096619480671 I find a new file is created for every major compaction trigger. Regards, R.Monish On Mon, Aug 19, 2013 at 11:52 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Inline. J-D On Mon, Aug 19, 2013 at 2:48 AM, Monish r monishs...@gmail.com wrote: Hi guys, I have the following questions about HBASE 0.90.6: 1. Does hbase use only one compaction thread to handle both major and minor compactions? Yes, look at CompactSplitThread 2. If hbase uses multiple compaction threads, which configuration parameter defines the number of compaction threads? It doesn't in 0.90.6, but CompactSplitThread lists those for 0.92+: hbase.regionserver.thread.compaction.large hbase.regionserver.thread.compaction.small 3. After hbase.majorcompaction.interval from the last major compaction, if major compaction is executed on a table already major compacted, does hbase skip all the table regions from major compaction? Determining if something is major-compacted is definitely not at the table level. In 0.90.6, MajorCompactionChecker will ask HRegion.isMajorCompaction() to check if it needs to major compact again, which in turn checks every Store.
FWIW if you have TTL turned on it will still major compact a major compacted file; HFiles don't have an index of what's deleted or TTL'd and it doesn't do a full read of each file to check. Regards, R.Monish
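To follow up on the "enable DEBUG logs" suggestion above: in this era of HBase the log level is controlled through log4j.properties on each region server. A minimal fragment (an assumed sketch, not quoted from the thread; restart the region server to apply):

```properties
# log4j.properties on each region server: turn on DEBUG for HBase classes
# so the reason for each major compaction shows up in the logs
log4j.logger.org.apache.hadoop.hbase=DEBUG
```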
Re: HDFS Restart with Replication
Doing a bin/stop-hbase.sh is the way to go, then on the Hadoop side you do stop-all.sh. I think your ordering is correct but I'm not sure you are using the right commands. J-D On Fri, Aug 2, 2013 at 8:27 AM, Patrick Schless patrick.schl...@gmail.com wrote: Ah, I bet the issue is that I stopped the HMaster, but not the Region Servers, then restarted HDFS. What's the correct order of operations for bouncing everything? On Thu, Aug 1, 2013 at 5:21 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Can you follow the life of one of those blocks through the Namenode and datanode logs? I'd suggest you start by doing a fsck on one of those files with the option that gives the block locations first. By the way, why do you have split logs? Are region servers dying every time you try out something? On Thu, Aug 1, 2013 at 3:16 PM, Patrick Schless patrick.schl...@gmail.com wrote: Yup, 14 datanodes, all check back in. However, all of the corrupt files seem to be splitlogs from data05. This is true even though I've done several restarts (each restart adding a few missing blocks). There's nothing special about data05, and it seems to be in the cluster, the same as anyone else. On Thu, Aug 1, 2013 at 5:04 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: I can't think of a way your missing blocks would be related to HBase replication; there's something else going on. Are all the datanodes checking back in? J-D On Thu, Aug 1, 2013 at 2:17 PM, Patrick Schless patrick.schl...@gmail.com wrote: I'm running: CDH4.1.2 HBase 0.92.1 Hadoop 2.0.0 Is there an issue with restarting a standby cluster with replication running? I am doing the following on the standby cluster: - stop hmaster - stop name_node - start name_node - start hmaster When the name node comes back up, it's reliably missing blocks.
I started with 0 missing blocks, and have run through this scenario a few times, and am up to 46 missing blocks, all from the table that is the standby for our production table (in a different datacenter). The missing blocks all are from the same table, and look like: blk_-2036986832155369224 /hbase/splitlog/data01.sea01.staging.tdb.com ,60020,1372703317824_hdfs%3A%2F%2Fname-node.sea01.staging.tdb.com %3A8020%2Fhbase%2F.logs%2Fdata05.sea01.staging.tdb.com %2C60020%2C1373557074890-splitting%2Fdata05.sea01.staging.tdb.com %252C60020%252C1373557074890.1374960698485/tempodb-data/c9cdd64af0bfed70da154c219c69d62d/recovered.edits/01366319450.temp Do I have to stop replication before restarting the standby? Thanks, Patrick
Re: HDFS Restart with Replication
Ah then doing bin/hbase-daemon.sh stop master on the master node is the equivalent, but don't stop the region server themselves as the master will take care of it. Doing a stop on the master and the region servers will screw things up. J-D On Fri, Aug 2, 2013 at 3:28 PM, Patrick Schless patrick.schl...@gmail.com wrote: Doesn't stop-hbase.sh (and its ilk) require the server to be able to manage the clients (using unpassworded SSH keys, for instance)? I don't have that set up (for security reasons). I use capistrano for all these sort of coordination tasks. On Fri, Aug 2, 2013 at 12:07 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote: Doing a bin/stop-hbase.sh is the way to go, then on the Hadoop side you do stop-all.sh. I think your ordering is correct but I'm not sure you are using the right commands. J-D On Fri, Aug 2, 2013 at 8:27 AM, Patrick Schless patrick.schl...@gmail.com wrote: Ah, I bet the issue is that I'm stopped the HMaster, but not the Region Servers, then restarting HDFS. What's the correct order of operations for bouncing everything? On Thu, Aug 1, 2013 at 5:21 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Can you follow the life of one of those blocks though the Namenode and datanode logs? I'd suggest you start by doing a fsck on one of those files with the option that gives the block locations first. By the way why do you have split logs? Are region servers dying every time you try out something? On Thu, Aug 1, 2013 at 3:16 PM, Patrick Schless patrick.schl...@gmail.com wrote: Yup, 14 datanodes, all check back in. However, all of the corrupt files seem to be splitlogs from data05. This is true even though I've done several restarts (each restart adding a few missing blocks). There's nothing special about data05, and it seems to be in the cluster, the same as anyone else. 
On Thu, Aug 1, 2013 at 5:04 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: I can't think of a way how your missing blocks would be related to HBase replication, there's something else going on. Are all the datanodes checking back in? J-D On Thu, Aug 1, 2013 at 2:17 PM, Patrick Schless patrick.schl...@gmail.com wrote: I'm running: CDH4.1.2 HBase 0.92.1 Hadoop 2.0.0 Is there an issue with restarting a standby cluster with replication running? I am doing the following on the standby cluster: - stop hmaster - stop name_node - start name_node - start hmaster When the name node comes back up, it's reliably missing blocks. I started with 0 missing blocks, and have run through this scenario a few times, and am up to 46 missing blocks, all from the table that is the standby for our production table (in a different datacenter). The missing blocks all are from the same table, and look like: blk_-2036986832155369224 /hbase/splitlog/ data01.sea01.staging.tdb.com ,60020,1372703317824_hdfs%3A%2F%2Fname-node.sea01.staging.tdb.com %3A8020%2Fhbase%2F.logs%2Fdata05.sea01.staging.tdb.com %2C60020%2C1373557074890-splitting%2Fdata05.sea01.staging.tdb.com %252C60020%252C1373557074890.1374960698485/tempodb-data/c9cdd64af0bfed70da154c219c69d62d/recovered.edits/01366319450.temp Do I have to stop replication before restarting the standby? Thanks, Patrick
Re: HDFS Restart with Replication
I can't think of a way how your missing blocks would be related to HBase replication, there's something else going on. Are all the datanodes checking back in? J-D On Thu, Aug 1, 2013 at 2:17 PM, Patrick Schless patrick.schl...@gmail.com wrote: I'm running: CDH4.1.2 HBase 0.92.1 Hadoop 2.0.0 Is there an issue with restarting a standby cluster with replication running? I am doing the following on the standby cluster: - stop hmaster - stop name_node - start name_node - start hmaster When the name node comes back up, it's reliably missing blocks. I started with 0 missing blocks, and have run through this scenario a few times, and am up to 46 missing blocks, all from the table that is the standby for our production table (in a different datacenter). The missing blocks all are from the same table, and look like: blk_-2036986832155369224 /hbase/splitlog/data01.sea01.staging.tdb.com ,60020,1372703317824_hdfs%3A%2F%2Fname-node.sea01.staging.tdb.com %3A8020%2Fhbase%2F.logs%2Fdata05.sea01.staging.tdb.com %2C60020%2C1373557074890-splitting%2Fdata05.sea01.staging.tdb.com %252C60020%252C1373557074890.1374960698485/tempodb-data/c9cdd64af0bfed70da154c219c69d62d/recovered.edits/01366319450.temp Do I have to stop replication before restarting the standby? Thanks, Patrick
Re: HDFS Restart with Replication
Can you follow the life of one of those blocks through the Namenode and datanode logs? I'd suggest you start by doing a fsck on one of those files with the option that gives the block locations first. By the way, why do you have split logs? Are region servers dying every time you try out something? On Thu, Aug 1, 2013 at 3:16 PM, Patrick Schless patrick.schl...@gmail.com wrote: Yup, 14 datanodes, all check back in. However, all of the corrupt files seem to be splitlogs from data05. This is true even though I've done several restarts (each restart adding a few missing blocks). There's nothing special about data05, and it seems to be in the cluster, the same as anyone else. On Thu, Aug 1, 2013 at 5:04 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: I can't think of a way your missing blocks would be related to HBase replication; there's something else going on. Are all the datanodes checking back in? J-D On Thu, Aug 1, 2013 at 2:17 PM, Patrick Schless patrick.schl...@gmail.com wrote: I'm running: CDH4.1.2 HBase 0.92.1 Hadoop 2.0.0 Is there an issue with restarting a standby cluster with replication running? I am doing the following on the standby cluster: - stop hmaster - stop name_node - start name_node - start hmaster When the name node comes back up, it's reliably missing blocks. I started with 0 missing blocks, and have run through this scenario a few times, and am up to 46 missing blocks, all from the table that is the standby for our production table (in a different datacenter).
The missing blocks all are from the same table, and look like: blk_-2036986832155369224 /hbase/splitlog/data01.sea01.staging.tdb.com ,60020,1372703317824_hdfs%3A%2F%2Fname-node.sea01.staging.tdb.com %3A8020%2Fhbase%2F.logs%2Fdata05.sea01.staging.tdb.com %2C60020%2C1373557074890-splitting%2Fdata05.sea01.staging.tdb.com %252C60020%252C1373557074890.1374960698485/tempodb-data/c9cdd64af0bfed70da154c219c69d62d/recovered.edits/01366319450.temp Do I have to stop replication before restarting the standby? Thanks, Patrick
Re: Can't solve the Unable to load realm info from SCDynamicStore error
Unable to load realm info from SCDynamicStore is only a warning and a red herring. What seems to be happening is that your shell can't reach zookeeper. Are Zookeeper and HBase running? What other health checks have you done? J-D On Tue, Jul 30, 2013 at 10:28 PM, Seth Edwards sethaedwa...@gmail.com wrote: I am somewhat new to HBase and was using it fine locally. At some point I started getting Unable to load realm info from SCDynamicStore when I would try to run HBase in standalone mode. I'm on Mac OSX 10.8.4. I have gone through many steps mentioned on Stack Overflow, changing configurations in hbase-env.sh. I've tried this on hbase version 0.94.7 and 0.94.9. Here is a gist of the stack trace I receive when trying to create a table with the shell https://gist.github.com/Sedward/2570beade8c9528682c3
Re: Excessive .META scans
Can you tell who's doing it? You could enable IPC debug for a few secs to see who's coming in with scans. You could also try to disable pre-fetching: set hbase.client.prefetch.limit to 0. Also, is it even causing a problem, or are you just worried it might since it doesn't look normal? J-D On Mon, Jul 29, 2013 at 10:32 AM, Varun Sharma va...@pinterest.com wrote: Hi folks, We are seeing an issue with hbase 0.94.3 on CDH 4.2.0 with excessive .META. reads... In the steady state where there are no client crashes and there are no region server crashes/region movement, the server holding .META. is serving an incredibly large # of read requests on the .META. table. From my understanding, in the steady state, region locations should be indefinitely cached in the client. The client is running a workload of multiput(s), puts, gets and coprocessor calls. Thanks Varun
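The prefetch knob J-D mentions is a client-side setting; a minimal hbase-site.xml fragment on the client (property name as given in the thread) would look like:

```xml
<property>
  <name>hbase.client.prefetch.limit</name>
  <value>0</value>
  <!-- 0 disables prefetching of region locations from .META. -->
</property>
```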
Re: Altering table column family attributes without disabling the table
You could always set hbase.online.schema.update.enable to true on your master, restart it (but not the cluster), and you could do what you are describing... but it's a risky feature to use before 0.96.0. Did you also set hbase.replication to true? If not, you'll have to do it on the region servers and the master via a rolling restart. J-D
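A sketch of the two properties mentioned above, as hbase-site.xml fragments (assumed placement; as J-D notes, the online schema switch was risky to use before 0.96.0):

```xml
<!-- on the master: allow altering column-family attributes without
     disabling the table first -->
<property>
  <name>hbase.online.schema.update.enable</name>
  <value>true</value>
</property>

<!-- on the master and every region server (rolling restart to apply) -->
<property>
  <name>hbase.replication</name>
  <value>true</value>
</property>
```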
Re: Bulk Load on HBase 0.95.1-hadoop1
0.95.1 is a developer preview release; if you are just starting with HBase please grab the stable release from 0.94, for example http://mirrors.sonic.net/apache/hbase/stable/ J-D On Thu, Jul 18, 2013 at 1:51 PM, Jonathan Cardoso jonathancar...@gmail.com wrote: I was trying to follow the instructions from this website (http://www.thecloudavenue.com/2013/04/bulk-loading-data-in-hbase.html) to insert a lot of data into HBase using MapReduce, but as with other approaches I found on the web I have always the same problem: I have compile errors because classes from the package org.apache.hadoop.hbase.mapreduce.* cannot be found. To run a mapreduce task using HBase do I have to downgrade the version of my HBase to, for example, 0.92? *Jonathan Cardoso** ** Universidade Federal de Goias*
Re: several doubts about region split?
Inline. J-D On Wed, Jul 17, 2013 at 7:10 AM, yonghu yongyong...@gmail.com wrote: Thanks for your quick response! For question one, what will be the latency? How long do we need to wait until the daughter regions are again online? Usually a matter of 1-2 seconds. regards! Yong On Wed, Jul 17, 2013 at 4:05 PM, Ted Yu yuzhih...@gmail.com wrote: bq. Does it mean the region which will be split is not available anymore? Right. bq. What happened to the read and write requests to that region? The requests wouldn't be served by the hosting region server until the daughter regions become online. Will try to dig up an answer to question #2. In short, the load balancer is supposed to offload one of the daughter regions if continuous write load incurs. Cheers On Wed, Jul 17, 2013 at 6:53 AM, yonghu yongyong...@gmail.com wrote: Dear all, From the HBase reference book, it mentions that when a RegionServer splits regions, it will offline the split region and then add the daughter regions to META, open the daughters on the parent's hosting RegionServer and then report the split to the Master. I have several questions: 1. What does offline mean? Does it mean the region which will be split is not available anymore? What happens to the read and write requests to that region? 2. From the description, if I understand right, it means that now the RegionServer will contain two Regions (one RegionServer for both daughter and parent regions) instead of one RegionServer for the daughter and one for the parent. If so, what are the benefits of this approach? The hot-spot problem is still there. It's not a load problem, it's a data problem. We're splitting when we have enough data. Then HBase relies on the master doing some balancing on the cluster. Moreover, this approach will be a big problem if we use the HBase default split approach. Suppose we bulk load data into an HBase cluster; initially every write request will be accepted by only one RegionServer.
After some write requests, the RegionServer cannot respond to any write requests as it reaches its disk volume threshold. Hence, some data must be moved from one RegionServer to the other RegionServer. The question is: why don't we do it at region split time? Since you read the reference book, you will also find in there that we recommend never bulk loading data into a table with only 1 region. You should always create your tables with pre-defined splits if you plan on importing a lot of data. J-D
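The pre-splitting advice can be sketched as a small helper that computes evenly spaced hex split keys for uniformly distributed (e.g. MD5-prefixed) row keys. HexSplits is a hypothetical helper, and the Admin call in the comment is illustrative, not compiled here:

```java
import java.math.BigInteger;

public class HexSplits {
    // Hypothetical helper: numRegions - 1 evenly spaced 16-hex-digit split
    // keys over the full 64-bit space, for row keys that start with a
    // uniformly distributed hash such as MD5.
    static byte[][] hexSplits(int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        BigInteger step = BigInteger.ONE.shiftLeft(64)      // 2^64 possible prefixes
                .divide(BigInteger.valueOf(numRegions));
        for (int i = 1; i < numRegions; i++) {
            splits[i - 1] = String.format("%016x",
                    step.multiply(BigInteger.valueOf(i))).getBytes();
        }
        return splits;
    }

    public static void main(String[] args) {
        for (byte[] s : hexSplits(16)) {
            System.out.println(new String(s));
        }
        // With the 0.94 client this would feed something like
        //   admin.createTable(tableDescriptor, hexSplits(16));
        // so the table starts life with 16 regions instead of one.
    }
}
```

For 16 regions this produces "1000000000000000" through "f000000000000000", the zero-padded form of the splits shown in the shell example elsewhere in this thread.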
Re: Memory leak in HBase replication ?
Yeah, WARN won't give us anything; please try to get us a fat log. Post it on pastebin or such. Thx, J-D On Wed, Jul 17, 2013 at 11:03 AM, Anusauskas, Laimonas lanusaus...@corp.untd.com wrote: J-D, I have log level org.apache=WARN and there is only the following in the logs before GC happens: 2013-07-17 10:56:45,830 ERROR org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics: Inconsistent configuration. Previous configuration for using table name in metrics: true, new configuration: false 2013-07-17 10:56:47,395 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library is available I'll try upping the log level to DEBUG to see if that shows anything and will run jstack. Thanks, Limus
Re: Memory leak in HBase replication ?
1GB is a pretty small heap and it could be that the default size for logs to replicate is set too high. The default for replication.source.size.capacity is 64MB. Can you set it much lower on your master cluster (on each RS), like 2MB, and see if it makes a difference? The logs and the jstack seem to correlate in that sense. Thx, J-D On Wed, Jul 17, 2013 at 1:40 PM, Anusauskas, Laimonas lanusaus...@corp.untd.com wrote: And here is the jstack output. http://pastebin.com/JKnQYqRg
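The tuning J-D suggests, as an hbase-site.xml fragment on each source-cluster region server (property name and the 2MB value are the ones from this thread):

```xml
<property>
  <name>replication.source.size.capacity</name>
  <!-- max batch of WAL entries shipped per replication call; default 64MB -->
  <value>2097152</value>
</property>
```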
Re: Memory leak in HBase replication ?
Yes... your master cluster must have a helluva backlog to replicate :) Seems to make a good argument to lower the default setting. What do you think? J-D On Wed, Jul 17, 2013 at 3:37 PM, Anusauskas, Laimonas lanusaus...@corp.untd.com wrote: Thanks, setting replication.source.size.capacity to 2MB resolved this. I see heap growing to about 700MB but then going down, and full GC is only triggered occasionally. And while the primary cluster has very little load (<100 requests/sec) the standby cluster is now pretty loaded at 5K requests/sec, presumably because it has to replicate all the pending changes. So perhaps this is the issue that happens when the standby cluster goes away for a while and then has to catch up. Really appreciate the help. Limus
Re: HBase Standalone against multiple drives
The local filesystem implementation doesn't support multiple drives AFAIK, so your best bet is to RAID your disks if that's really something you want to do. Else, you have to use HDFS. J-D On Tue, Jul 16, 2013 at 8:55 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi, In standalone mode, HBase does not use HDFS -- it uses the local filesystem instead but is there a way to give it more than one directory to write on? Or is pseudo-distributed mode the only way to do that? Thanks, JM
Re: Replication - some timestamps off by 1 ms
Are those incremented cells? J-D On Thu, Jul 11, 2013 at 10:23 AM, Patrick Schless patrick.schl...@gmail.com wrote: I have had replication running for about a week now, and have had a lot of data flowing to our slave cluster over that time. Now, I'm running the verifyrep MR job over a 1-hour period a couple days ago (which should be fully replicated), and I'm seeing a small number of BADROWS. Spot-checking a few of them, the issue seems to be that the rows are present, and have the same values, but a single cell in the row will be off by 1ms. For instance, the log reports this error: java.lang.Exception: This result was different: keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:s\xC0\x01/1373470923084/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8} compared to keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:s\xC0\x01/1373470923084/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8} Some diffing reduces the issue down to: 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8 compared to 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8. I'm assuming that the value before /Put is the cell's timestamp, which means that the copies are off by 1ms. Any idea what could cause this? So far (the job is still running), the problem seems rare (about 0.05% of rows). Thanks, Patrick
Re: Replication - some timestamps off by 1 ms
Yeah increments won't work. I guess the warning isn't really visible but one place you can see it is: $ ./bin/hadoop jar ../hbase/hbase.jar An example program must be given as the first argument. Valid program names are: CellCounter: Count cells in HBase table completebulkload: Complete a bulk data load. copytable: Export a table from local cluster to peer cluster export: Write table data to HDFS. import: Import data written by Export. importtsv: Import data in TSV format. rowcounter: Count rows in HBase table verifyrep: Compare the data from tables in two different clusters. WARNING: It doesn't work for incrementColumnValues'd cells since the timestamp is changed after being appended to the log. The problem is that increments' timestamps are different in the WAL and in the final KV that's stored in HBase. J-D On Thu, Jul 11, 2013 at 12:19 PM, Patrick Schless patrick.schl...@gmail.com wrote: It's possible, but I'm not sure. This is a live system, and we do use increment, and it's a smaller portion of our writes into HBase. I can try to duplicate it, but I can't say how these specific cells got written. Would incremented cells not get replicated correctly? On Thu, Jul 11, 2013 at 12:53 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote: Are those incremented cells? J-D On Thu, Jul 11, 2013 at 10:23 AM, Patrick Schless patrick.schl...@gmail.com wrote: I have had replication running for about a week now, and have had a lot of data flowing to our slave cluster over that time. Now, I'm running the verifyrep MR job over a 1-hour period a couple days ago (which should be fully replicated), and I'm seeing a small number of BADROWS. Spot-checking a few of them, the issue seems to be that the rows are present, and have the same values, but a single cell in the row will be off by 1ms. 
For instance, the log reports this error: java.lang.Exception: This result was different: keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:s\xC0\x01/1373470923084/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8} compared to keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:s\xC0\x01/1373470923084/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8} Some diffing reduces the issue down to: 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8 compared to 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8. I'm assuming that the value before /Put is the cell's timestamp, which means that the copies are off by 1ms. Any idea what could cause this? So far (the job is still running), the problem seems rare (about 0.05% of rows). Thanks, Patrick
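J-D's warning in the thread explains the 1 ms drift: an increment's timestamp is rewritten after the edit is appended to the WAL, so the replicated copy can disagree with the local one. A minimal sketch (hypothetical, not the actual verifyrep code) of comparing two rows' cells while ignoring timestamps, which is one way such increment drift could be kept from counting as a BADROW:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not HBase code: a timestamp-insensitive row
// comparison. Cell stands in for HBase's KeyValue; two cell lists are
// "equal" if row, qualifier and value match, even when the timestamps
// differ (e.g. by the 1 ms increment drift seen in the thread).
public class LenientRowCompare {
    public static final class Cell {
        final String row;
        final String qualifier;
        final long ts;
        final byte[] value;
        public Cell(String row, String qualifier, long ts, byte[] value) {
            this.row = row;
            this.qualifier = qualifier;
            this.ts = ts;
            this.value = value;
        }
    }

    // Compare cell lists positionally, ignoring the timestamp field.
    public static boolean sameIgnoringTimestamps(List<Cell> a, List<Cell> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (int i = 0; i < a.size(); i++) {
            Cell x = a.get(i);
            Cell y = b.get(i);
            if (!x.row.equals(y.row)
                    || !x.qualifier.equals(y.qualifier)
                    || !Arrays.equals(x.value, y.value)) {
                return false;
            }
        }
        return true;
    }
}
```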
Re: Replication - some timestamps off by 1 ms
Yeah verifyrep is a pretty basic tool, there's tons of room for improvement. For the moment I guess you can ignore the 8 bytes cells that aren't printable strings. Feel free to hack around that MR job and maybe contribute back? The use case for which I built it had loads of tables and the ones that had ICVs pretty much only had that, so it was easy to verify just a couple of tables to have a good idea of how it was doing. J-D On Thu, Jul 11, 2013 at 2:36 PM, Patrick Schless patrick.schl...@gmail.com wrote: Interesting (thanks for the info). I don't suppose there's an easy way to filter those incremented cells out, so the response from verifyRep is meaningful? :) On Thu, Jul 11, 2013 at 3:44 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote: Yeah increments won't work. I guess the warning isn't really visible but one place you can see it is: $ ./bin/hadoop jar ../hbase/hbase.jar An example program must be given as the first argument. Valid program names are: CellCounter: Count cells in HBase table completebulkload: Complete a bulk data load. copytable: Export a table from local cluster to peer cluster export: Write table data to HDFS. import: Import data written by Export. importtsv: Import data in TSV format. rowcounter: Count rows in HBase table verifyrep: Compare the data from tables in two different clusters. WARNING: It doesn't work for incrementColumnValues'd cells since the timestamp is changed after being appended to the log. The problem is that increments' timestamps are different in the WAL and in the final KV that's stored in HBase. J-D On Thu, Jul 11, 2013 at 12:19 PM, Patrick Schless patrick.schl...@gmail.com wrote: It's possible, but I'm not sure. This is a live system, and we do use increment, and it's a smaller portion of our writes into HBase. I can try to duplicate it, but I can't say how these specific cells got written. Would incremented cells not get replicated correctly? 
On Thu, Jul 11, 2013 at 12:53 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote: Are those incremented cells? J-D On Thu, Jul 11, 2013 at 10:23 AM, Patrick Schless patrick.schl...@gmail.com wrote: I have had replication running for about a week now, and have had a lot of data flowing to our slave cluster over that time. Now, I'm running the verifyrep MR job over a 1-hour period a couple days ago (which should be fully replicated), and I'm seeing a small number of BADROWS. Spot-checking a few of them, the issue seems to be that the rows are present, and have the same values, but a single cell in the row will be off by 1ms. For instance, the log reports this error: java.lang.Exception: This result was different: keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:s\xC0\x01/1373470923084/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8} compared to keyvalues={01e581745c6a43aba01adf105af4e4a92013071015/data:!\xDF\xE0\x01/1373470622986/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:s\xC0\x01/1373470923084/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:/\x9B\x80\x01/1373471523316/Put/vlen=8, 01e581745c6a43aba01adf105af4e4a92013071015/data:4/`\x01/1373471822913/Put/vlen=8} Some diffing reduces the issue down to: 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223717/Put/vlen=8 compared to 01e581745c6a43aba01adf105af4e4a92013071015/data:+\x07\xA0\x01/1373471223716/Put/vlen=8. I'm assuming that the value before /Put is the cell's timestamp, which means that the copies are off by 1ms. Any idea what could cause this? 
So far (the job is still running), the problem seems rare (about 0.05% of rows). Thanks, Patrick
Re: optimizing block cache requests + eviction
Do you know if it's a data or meta block? J-D On Mon, Jul 8, 2013 at 4:28 PM, Viral Bajaria viral.baja...@gmail.com wrote: I was able to reproduce the same regionserver asking for the same local block over 300 times within the same 2 minute window by running one of my heavy workloads. Let me try and gather some stack dumps. I agree that jstack crashing the jvm is concerning but there is nothing in the errors to know why it happened. I will keep that conversation out of here. As an addendum, I am using asynchbase as my client. Not sure if the arrival of multiple requests for rowkeys that could be in the same non-cached block causes hbase to queue up a non-cached block read via SCR and since the box is under load, it queues up multiple of these and makes the problem worse. Thanks, Viral On Mon, Jul 8, 2013 at 3:53 PM, Andrew Purtell apurt...@apache.org wrote: but unless the behavior you see is the _same_ regionserver asking for the _same_ block many times consecutively, it's probably workload related.
Re: optimizing block cache requests + eviction
meta blocks are at the end: http://hbase.apache.org/book.html#d2617e12979, a way to tell would be by logging from the HBase side but then I guess it's hard to reconcile with which file we're actually reading from... Regarding your second question, you are asking if we cache HDFS blocks? We don't, since we don't even know about HDFS blocks. The BlockReader seeks into the file and returns whatever data is asked for. J-D On Mon, Jul 8, 2013 at 4:45 PM, Viral Bajaria viral.baja...@gmail.com wrote: Good question. When I looked at the logs, it's not clear from it whether it's reading a meta or data block. Is there any kind of log line that indicates that ? Given that it's saying that it's reading from a startOffset I would assume this is a data block. A question that comes to mind, is this read doing a seek to that position directly or is it going to cache the block ? Looks like it is not caching the block if it's reading directly from a given offset. Or am I wrong ? Following is a sample line that I used while debugging: 2013-07-08 22:58:55,221 DEBUG org.apache.hadoop.hdfs.DFSClient: New BlockReaderLocal for file /mnt/data/current/subdir34/subdir26/blk_-448970697931783518 of size 67108864 startOffset 13006577 length 54102287 short circuit checksum true On Mon, Jul 8, 2013 at 4:37 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Do you know if it's a data or meta block?
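A small illustration of why a non-zero startOffset implies a seek rather than a whole-block read: locating a byte position inside fixed-size HDFS blocks is simple arithmetic. This is illustrative only, not DFSClient internals:

```java
// Illustrative arithmetic only, not DFSClient code: with a fixed HDFS
// block size, an absolute file offset maps to a block index and an
// offset within that block. A read like the DEBUG line above starts
// mid-block (offset 13006577 of a 67108864-byte block) rather than
// streaming the block from byte 0.
public class BlockOffset {
    public static final long BLOCK_SIZE = 64L * 1024 * 1024; // 67108864

    public static long blockIndex(long fileOffset) {
        return fileOffset / BLOCK_SIZE;
    }

    public static long offsetInBlock(long fileOffset) {
        return fileOffset % BLOCK_SIZE;
    }
}
```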
Re: stop_replication dangerous?
Yeah that package documentation ought to be changed. Mind opening a jira? Thx, J-D On Mon, Jul 1, 2013 at 1:51 PM, Patrick Schless patrick.schl...@gmail.com wrote: The first two tutorials for enabling replication that google gives me [1], [2] take very different tones with regard to stop_replication. The HBase docs [1] make it sound fine to start and stop replication as desired. The Cloudera docs [2] say it may cause data loss. Which is true? If data loss is possible, are we talking about data loss in the primary cluster, or data loss in the standby cluster (presumably would require reinitializing the sync with a new CopyTable). Thanks, Patrick [1] http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/replication/package-summary.html#requirements [2] http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Installation-Guide/cdh4ig_topic_20_11.html
Re: 答复: flushing + compactions after config change
On Thu, Jun 27, 2013 at 4:27 PM, Viral Bajaria viral.baja...@gmail.com wrote: Hey JD, Thanks for the clarification. I also came across a previous thread which sort of talks about a similar problem. http://mail-archives.apache.org/mod_mbox/hbase-user/201204.mbox/%3ccagptdnfwnrsnqv7n3wgje-ichzpx-cxn1tbchgwrpohgcos...@mail.gmail.com%3E I guess my problem is also similar to the fact that my writes are well distributed and at a given time I could be writing to a lot of regions. Some of the regions receive very little data but since the flush algorithm chooses at random what to flush when too many hlogs is hit, it will flush

It's not random, it picks the region with the most data in its memstores.

a region with less than 10mb of data causing too many small files. This in turn causes compaction storms where even though major compactions is disabled, some of the minor get upgraded to major and that's when things start getting worse.

I doubt that it's the fact that it's a major compaction that it's making everything worse. When a minor gets promoted into a major it's because we're already going to compact all the files, so we might as well get rid of some deletes at the same time. They are all getting selected because the files are within the selection ratio. I would not focus on this to resolve your problem.

My compaction queues are still the same and so I doubt I will be coming out of this storm without bumping up max hlogs for now. Reducing regions per server is one option but then I will be wasting my resources since the servers at current load are at 30% CPU and 25% RAM. Maybe I can bump up heap space and give more memory to the memstore. Sorry, I am just thinking out loud.

I haven't been closely following this thread, but have you posted a log snippet somewhere? It's usually much more telling and we eliminate a few levels of interpretation. Make sure it's at DEBUG, and that you grab a few hours of activity. Get the GC log for the same time as well. 
Drop this on a web server or pastebin if it fits. Thx, J-D
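J-D's point about flush selection can be sketched as follows. This is a hypothetical illustration, not the actual MemStoreFlusher code: when the "too many hlogs" limit forces a flush, the region holding the most memstore data is the candidate, not a random one.

```java
import java.util.Map;

// Hypothetical illustration of the selection J-D describes, not the
// real MemStoreFlusher: given per-region memstore sizes, the flush
// candidate is the region holding the most data.
public class FlushPick {
    public static String regionWithBiggestMemstore(Map<String, Long> memstoreBytes) {
        String best = null;
        long bestSize = -1;
        for (Map.Entry<String, Long> e : memstoreBytes.entrySet()) {
            if (e.getValue() > bestSize) {
                bestSize = e.getValue();
                best = e.getKey();
            }
        }
        return best; // null if the map is empty
    }
}
```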
Re: 答复: flushing + compactions after config change
On Fri, Jun 28, 2013 at 2:39 PM, Viral Bajaria viral.baja...@gmail.com wrote: On Fri, Jun 28, 2013 at 9:31 AM, Jean-Daniel Cryans jdcry...@apache.orgwrote: On Thu, Jun 27, 2013 at 4:27 PM, Viral Bajaria viral.baja...@gmail.com wrote: It's not random, it picks the region with the most data in its memstores. That's weird, because I see some of my regions which receive the least amount of data in a given time period flushing before the regions that are receiving data continuously. The reason I know this is because of the write pattern. Some of my tables are in catch-up mode i.e. I am ingesting data from the past and they always have something to do. While some tables are not in catch-up mode and are just sitting idle for most of the time. Yet I see high number of flushes for those regions too. I doubt that it's the fact that it's a major compaction that it's making everything worse. When a minor gets promoted into a major it's because we're already going to compact all the files, so we might as well get rid of some deletes at the same time. They are all getting selected because the files are within the selection ratio. I would not focus on this to resolve your problem. I meant worse for my writes not for HBase as a whole. I haven't been closely following this thread, but have you posted a log snippet somewhere? It's usually much more telling and we eliminate a few levels of interpretation. Make sure it's at DEBUG, and that you grab a few hours of activity. Get the GC log for the same time as well. Drop this on a web server or pastebin if it fits. The only log snippet that I posted was the flushing action. Also that log was not everything, I had grep'd a few lines out. Let me collect some more stats here and post it again. I just enabled GC logging on this server, deployed the wrong config out initially which had no GC logging. I am not sure how GC logs will help here given that I am at less than 50% heap space used and so I would doubt a stop the world GC happening. 
Are you trying to look for some other information ? Just trying to cover all the bases. J-D
Re: 答复: flushing + compactions after config change
No, all your data eventually makes it into the log, just potentially not as quickly :) J-D On Thu, Jun 27, 2013 at 2:06 PM, Viral Bajaria viral.baja...@gmail.com wrote: Thanks Azuryy. Look forward to it. Does DEFERRED_LOG_FLUSH impact the number of WAL files that will be created ? Tried looking around but could not find the details. On Thu, Jun 27, 2013 at 7:53 AM, Azuryy Yu azury...@gmail.com wrote: your JVM options are not enough. I will give you some detail when I go back to the office tomorrow. --Send from my Sony mobile.
Re: HBase Replication is talking to the wrong peer
Did you find what the issue was? From your other thread it looks like you got it working. Thx, J-D On Mon, Jun 17, 2013 at 11:48 PM, Asaf Mesika asaf.mes...@gmail.com wrote: Hi, I have two clusters set up in a lab, each has 1 Master and 3 RS. I'm inserting roughly 15GB into the master cluster, but I see between 5 - 10 minutes delay between the master and slave clusters (ageOfLastShippedOp). On my Graphite I see that replicateLogEntries_num_ops is increasing in one region server (IP 85) of the slave cluster, out of 3 (IPs 83,84,85). I ran a grep on the logs of each region server of the master, and saw a Chosen peer message saying the following: RS ip 74: Chosen peer 83 RS ip 75: Chosen peer 85 RS ip 76: Chosen peer 85 So first problem: Why are only two slave RS (83,85) receiving replicated log entries instead of 3? Second and biggest problem: I ran netstat -tnp and grepped for 83,84,85 on the RS ip 74, and saw that it is in fact talking with RS 85! This was correlated with the Graphite graph of replicateLogEntries_num_ops which showed that only RS 85 was receiving replicated log entries. To me it looks like a bug. Anyone have any ideas how to solve these two issues?
Re: Replication not suited for intensive write applications?
Given that the region server writes to a single WAL at a time, doing it with multiple threads might be hard. You also have to manage the correct position up in ZK. It might be easier with multiple WALs. In any case, inserting at such a rate might not be doable over long periods of time. How long were your benchmarks running for exactly? (can't find it in your first email) You could also fancy doing regular bulk loads (say, every 30 minutes) and consider shipping the same files to the other cluster. Do you have a real use case in mind? Thanks, J-D On Sat, Jun 22, 2013 at 11:33 PM, Asaf Mesika asaf.mes...@gmail.com wrote: bq. I'm not sure if it's really a problem tho. Let's say the maximum throughput achieved by writing with k client threads is 30 MB/sec, where k = the number of region servers. If you are consistently writing to HBase more than 30 MB/sec - let's say 40 MB/sec with 2k threads - then you can't use HBase replication and must write your own solution. One way I started thinking about is to somehow declare that for a specific table, order of Puts is not important (say each write is unique), thus you can spawn multiple threads for replicating a WAL file. On Sat, Jun 22, 2013 at 12:18 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: I think that the same way writing with more clients helped throughput, writing with only 1 replication thread will hurt it. The clients in both cases have to read something (a file from HDFS or the WAL) then ship it, meaning that you can utilize the cluster better since a single client isn't consistently writing. I agree with Asaf's assessment that it's possible that you can write faster into HBase than you can replicate from it if your clients are using the write buffers and have a bigger aggregate throughput than replication's. I'm not sure if it's really a problem tho. J-D On Fri, Jun 21, 2013 at 3:05 PM, lars hofhansl la...@apache.org wrote: Hmm... Yes. Was worth a try :) Should've checked and I even wrote that part of the code. 
I have no good explanation then, and also no good suggestion about how to improve this. From: Asaf Mesika asaf.mes...@gmail.com To: user@hbase.apache.org user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Friday, June 21, 2013 5:50 AM Subject: Re: Replication not suited for intensive write applications? On Fri, Jun 21, 2013 at 2:38 PM, lars hofhansl la...@apache.org wrote: Another thought... I assume you only write to a single table, right? How large are your rows on average? I'm writing to 2 tables: Avg row size for 1st table is 1500 bytes, and the second is around 800 bytes Replication will send 64mb blocks by default (or 25000 edits, whatever is smaller). The default HTable buffer is 2mb only, so the slave RS receiving a block of edits (assuming it is a full block), has to do 32 rounds of splitting the edits per region in order to apply them. In the ReplicationSink.java (0.94.6) I see that HTable.batch() is used, which writes directly to RS without buffers?

private void batch(byte[] tableName, List<Row> rows) throws IOException {
  if (rows.isEmpty()) {
    return;
  }
  HTableInterface table = null;
  try {
    table = new HTable(tableName, this.sharedHtableCon, this.sharedThreadPool);
    table.batch(rows);
    this.metrics.appliedOpsRate.inc(rows.size());
  } catch (InterruptedException ix) {
    throw new IOException(ix);
  } finally {
    if (table != null) {
      table.close();
    }
  }
}

There is no setting specifically targeted at the buffer size for replication, but maybe you could increase hbase.client.write.buffer to 64mb (67108864) on the slave cluster and see whether that makes a difference. If it does we can (1) add a setting to control the ReplicationSink HTable's buffer size, or (2) just have it match the replication buffer size replication.source.size.capacity. -- Lars From: lars hofhansl la...@apache.org To: user@hbase.apache.org user@hbase.apache.org Sent: Friday, June 21, 2013 1:48 AM Subject: Re: Replication not suited for intensive write applications? 
Thanks for checking... Interesting. So talking to 3RSs as opposed to only 1 before had no effect on the throughput? Would be good to explore this a bit more. Since our RPC is not streaming, latency will affect throughput. In this case there is latency while all edits are shipped to the RS in the slave cluster and then extra latency when applying the edits there (which are likely not local to that RS). A true streaming API should be better. If that is the case compression *could* help (but that is a big if). The single thread shipping the edits to the slave should not be an issue as the edits are actually applied
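Lars's buffer suggestion from the thread could be tried with an hbase-site.xml fragment like the one below on the slave cluster. hbase.client.write.buffer is a standard client setting, but whether the ReplicationSink's HTable honors it is exactly the open question above, so treat this as an experiment rather than a verified fix:

```xml
<!-- Experimental, per the thread: raise the client write buffer on the
     slave cluster toward replication's 64 MB shipment size. -->
<property>
  <name>hbase.client.write.buffer</name>
  <value>67108864</value>
</property>
```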
Re: removing ttl
TTL is enforced when compactions are running so there's no need to rewrite the data. The alter is sufficient. J-D On Mon, Jun 24, 2013 at 4:15 PM, Kireet kir...@feedly.com wrote: I need to remove the TTL setting from an existing HBase table and remove the TTL from all existing rows. I think this is the proper command for removing the TTL setting: alter 't', {NAME => 'cf', TTL => '2147483647'} After doing this, do I need to rewrite all the existing data to remove the TTL for each cell, perhaps using the IdentityMapper M/R job? Thanks Kireet
Re: Replication not suited for intensive write applications?
I think that the same way writing with more clients helped throughput, writing with only 1 replication thread will hurt it. The clients in both cases have to read something (a file from HDFS or the WAL) then ship it, meaning that you can utilize the cluster better since a single client isn't consistently writing. I agree with Asaf's assessment that it's possible that you can write faster into HBase than you can replicate from it if your clients are using the write buffers and have a bigger aggregate throughput than replication's. I'm not sure if it's really a problem tho. J-D On Fri, Jun 21, 2013 at 3:05 PM, lars hofhansl la...@apache.org wrote: Hmm... Yes. Was worth a try :) Should've checked and I even wrote that part of the code. I have no good explanation then, and also no good suggestion about how to improve this. From: Asaf Mesika asaf.mes...@gmail.com To: user@hbase.apache.org user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Friday, June 21, 2013 5:50 AM Subject: Re: Replication not suited for intensive write applications? On Fri, Jun 21, 2013 at 2:38 PM, lars hofhansl la...@apache.org wrote: Another thought... I assume you only write to a single table, right? How large are your rows on average? I'm writing to 2 tables: Avg row size for 1st table is 1500 bytes, and the second is around 800 bytes Replication will send 64mb blocks by default (or 25000 edits, whatever is smaller). The default HTable buffer is 2mb only, so the slave RS receiving a block of edits (assuming it is a full block), has to do 32 rounds of splitting the edits per region in order to apply them. In the ReplicationSink.java (0.94.6) I see that HTable.batch() is used, which writes directly to RS without buffers?

private void batch(byte[] tableName, List<Row> rows) throws IOException {
  if (rows.isEmpty()) {
    return;
  }
  HTableInterface table = null;
  try {
    table = new HTable(tableName, this.sharedHtableCon, this.sharedThreadPool);
    table.batch(rows);
    this.metrics.appliedOpsRate.inc(rows.size());
  } catch (InterruptedException ix) {
    throw new IOException(ix);
  } finally {
    if (table != null) {
      table.close();
    }
  }
}

There is no setting specifically targeted at the buffer size for replication, but maybe you could increase hbase.client.write.buffer to 64mb (67108864) on the slave cluster and see whether that makes a difference. If it does we can (1) add a setting to control the ReplicationSink HTable's buffer size, or (2) just have it match the replication buffer size replication.source.size.capacity. -- Lars From: lars hofhansl la...@apache.org To: user@hbase.apache.org user@hbase.apache.org Sent: Friday, June 21, 2013 1:48 AM Subject: Re: Replication not suited for intensive write applications? Thanks for checking... Interesting. So talking to 3RSs as opposed to only 1 before had no effect on the throughput? Would be good to explore this a bit more. Since our RPC is not streaming, latency will affect throughput. In this case there is latency while all edits are shipped to the RS in the slave cluster and then extra latency when applying the edits there (which are likely not local to that RS). A true streaming API should be better. If that is the case compression *could* help (but that is a big if). The single thread shipping the edits to the slave should not be an issue as the edits are actually applied by the slave RS, which will use multiple threads to apply the edits in the local cluster. Also my first reply - upon re-reading it - sounded a bit rough, that was not intended. -- Lars - Original Message - From: Asaf Mesika asaf.mes...@gmail.com To: user@hbase.apache.org user@hbase.apache.org; lars hofhansl la...@apache.org Cc: Sent: Thursday, June 20, 2013 10:16 PM Subject: Re: Replication not suited for intensive write applications? Thanks for taking the time to answer! My answers are inline. On Fri, Jun 21, 2013 at 1:47 AM, lars hofhansl la...@apache.org wrote: I see. 
In HBase you have machines for both CPU (to serve requests) and storage (to hold the data). If you only grow your cluster for CPU and you keep all RegionServers 100% busy at all times, you are correct. Maybe you need to increase replication.source.size.capacity and/or replication.source.nb.capacity (although I doubt that this will help here). I was thinking of giving it a shot, but theoretically it should not have an effect, since I'm not doing anything in parallel, right? Also a replication source will pick region servers from the target at random (10% of them at default). That has two effects: 1. Each source will pick exactly one RS at the target: ceil (3*0.1)=1 2. With such a small cluster setup the likelihood is high that two or more RSs in the
Re: heap memory running
24GB is often cited as an upper limit, but YMMV. It also depends if you need memory for MapReduce, if you are using it. J-D On Wed, Jun 19, 2013 at 3:17 PM, prakash kadel prakash.ka...@gmail.com wrote: hi every one, i am quite new to hbase and java. I have a few questions. 1. on the web ui for hbase i have the following entry in the region server mining,60020,13711358624 Fri Jun 14 00:04:22 GMT 2013 requestsPerSecond=0, numberOfOnlineRegions=106, usedHeapMB=5577, maxHeapMB=7933 when the hbase is idle with no requests the usedHeapMB hovers around 5000MB, shouldn't it go down after some idle time? what is occupying the heap when no requests are being made? 2. I have assigned 8GB for heap on a 48GB machine, i dont mind assigning more of it to hbase. What is the recommended size for the heap? Sincerely, Prakash
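For question 2, the region server heap is set in conf/hbase-env.sh. A hedged example, assuming you decide to grow the 8 GB heap while staying well under the ~24 GB ceiling J-D mentions and leaving room for MapReduce and the OS on a 48 GB box (value in MB; size to your own workload):

```sh
# conf/hbase-env.sh -- example only; tune for your workload.
# 16 GB heap: larger than the current 8 GB, well under the ~24 GB upper
# limit cited above, leaving memory for MapReduce and the OS page cache.
export HBASE_HEAPSIZE=16384
```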
Re: client bursts followed by drops
None of your attachments made it across, this mailing list often (but not always) strips them. Are you able to jstack when the drops happen and the queue time is high? This could be https://issues.apache.org/jira/browse/HBASE-5898 but it seems a long stretch without more info. You could also try to see if a more recent version exhibits the same behavior. J-D On Wed, Jun 19, 2013 at 1:09 AM, Amit Mor amit.mor.m...@gmail.com wrote: Attachment of JMX requests metric showing bursts on RS On Wed, Jun 19, 2013 at 12:48 AM, Amit Mor amit.mor.m...@gmail.com wrote: Hello, We use hbase 0.94.2 and we are seeing (mostly reading) latency issues, and with an interesting twist: the client (100 threads) gets stuck waiting on HBase, then stops sending RPCs and then seems to be freed. When freed, while trying to flush all requests it had queued, it causes another vicious cycle of burst - drop - burst. I am seeing about 40K requests per second per RS. The cluster is mostly read, no compaction and split storms. Just bursts and bursts. We set 30 rpc handlers per RS (avg scan size is 5K) and they most of the time seem to be WAITING. IPC at DEBUG revealed (see below) that sometimes the queueTime is very big and sometimes the responseTime is very big, but at a very low percentage. The bursts seem to be periodic and uncorrelated to GC activity or CPU (the bursts appear the moment the RS is onlined and the heap is free) The row keys are murmur3 hashed, and I don't really see any hotspotting Any idea what might cause those bursts ? Thanks, Amit
Re: RPC Replication Compression
Replication doesn't need to know about compression at the RPC level so it won't refer to it and as far as I can tell you need to set compression only on the master cluster and the slave will figure it out. Looking at the code tho, I'm not sure it works the same way it used to work before everything went protobuf. I would give 2 internets to whoever tests 0.95.1 with RPC compression turned on and compares results with non-compressed RPC. See http://hbase.apache.org/book.html#rpc.configs J-D On Tue, Jun 4, 2013 at 5:22 AM, Asaf Mesika asaf.mes...@gmail.com wrote: If RPC has compression abilities, how come Replication, which also works in RPC does not get it automatically? On Tue, Jun 4, 2013 at 12:34 PM, Anoop John anoop.hb...@gmail.com wrote: 0.96 will support HBase RPC compression Yes Replication between master and slave will enjoy it as well (important since bandwidth between geographically distant data centers is scarce and more expensive) But I can not see it is being utilized in replication. May be we can do improvements in this area. I can see possibilities. -Anoop- On Tue, Jun 4, 2013 at 1:51 PM, Asaf Mesika asaf.mes...@gmail.com wrote: Hi, Just wanted to make sure if I read in the internet correctly: 0.96 will support HBase RPC compression thus Replication between master and slave will enjoy it as well (important since bandwidth between geographically distant data centers is scarce and more expensive)
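For anyone taking J-D up on the 0.95.1 experiment, the rpc.configs section of the book he links describes a client-side property along these lines. Shown as a sketch; verify the exact property name and codec class against your version's documentation:

```xml
<!-- Sketch of enabling RPC compression for the test J-D proposes;
     check the property name and codec against the book's rpc.configs
     section for your exact version. -->
<property>
  <name>hbase.client.rpc.compressor</name>
  <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
```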
Re: Best practices for loading data into hbase
You cannot use the local job tracker (that is, the one that gets started if you don't have one running) with the TotalOrderPartitioner. You'll need to fully install hadoop on that vmware node. Google that error to find other relevant comments. J-D On Fri, May 31, 2013 at 1:19 PM, David Poisson david.pois...@ca.fujitsu.com wrote: Hi, We are still very new at all of this hbase/hadoop/mapreduce stuff. We are looking for the best practices that will fit our requirements. We are currently using the latest cloudera vmware's (single node) for our development tests. The problem is as follows: We have multiple sources in different format (xml, csv, etc), which are dumps of existing systems. As one might think, there will be an initial import of the data into hbase and afterwards, the systems would most likely dump whatever data they have accumulated since the initial import into hbase or since the last data dump. Another thing, we would require to have an intermediary step, so that we can ensure all of a source's data can be successfully processed, something which would look like: XML data file --(MR JOB)-- Intermediate (hbase table or hfile?) --(MR JOB)-- production tables in hbase We're guessing we can't use something like a transaction in hbase, so we thought about using a intermediate step: Is that how things are normally done? As we import data into hbase, we will be populating several tables that links data parts together (account X in System 1 == account Y in System 2) as tuples in 3 tables. Currently, this is being done by a mapreduce job which reads the XML source and uses multiTableOutputFormat to put data into those 3 hbase tables. This method isn't that fast using our test sample (2 minutes for 5Mb), so we are looking at optimizing the loading of data. 
We have been researching bulk loading but we are unsure of a couple of things: Once we process an xml file and we populate our 3 production hbase tables, could we bulk load another xml file and append this new data to our 3 tables or would it write over what was written before? In order to bulk load, we need to output a file using HFileOutputFormat. Since MultiHFileOutputFormat doesn't seem to officially exist yet (still in the works, right?), should we process our input xml file with 3 MapReduce jobs instead of 1 and output an hfile for each, which could then become our intermediate step (if all 3 hfiles were created without errors, then the process was successful: bulk load in hbase)? Can you experiment with bulk loading on a vmware? We're experiencing problems with the partition file not being found, with the following exception:

java.lang.Exception: java.lang.IllegalArgumentException: Can't read partitions file
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:404)
Caused by: java.lang.IllegalArgumentException: Can't read partitions file
    at org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:108)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:70)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:588)

We also tried another idea on how to speed things up: What if instead of doing individual puts, we passed a list of puts to put() (eg: htable.put(putList) ). Internally in hbase, would there be less overhead vs multiple calls to put()? It seems to be faster, however since we're not using context.write, I'm guessing this will lead to problems later on, right? Turning off WAL on puts to speed things up isn't an option, since data loss would be unacceptable, even if the chances of a failure occurring are slim. Thanks, David
Re: HBase is not running.
Ah yeah the master advertised itself as: Attempting connect to Master server at ip72-215-225-9.at.at.cox.net,46122,1369408257140 So the region server cannot find it since that's the public address and nothing's reachable through that. Now you really need to fix your networking :) J-D On Fri, May 24, 2013 at 8:21 AM, Yves S. Garret yoursurrogate...@gmail.com wrote: Ok, weird, it still seems to be looking towards Cox. Here is my hbase-site.xml file: http://bin.cakephp.org/view/628322266 On Thu, May 23, 2013 at 7:35 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: No, I meant hbase.master.ipc.address and hbase.regionserver.ipc.address. See https://issues.apache.org/jira/browse/HBASE-8148. J-D On Thu, May 23, 2013 at 4:34 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: Do you mean hbase.master.info.bindAddress and hbase.regionserver.info.bindAddress? I couldn't find anything else in the docs. But having said that, both are set to 0.0.0.0 by default. Also, I checked out 127.0.0.1:60010 and 0.0.0.0:60010, no web gui. On Thu, May 23, 2013 at 7:19 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: It should only be a matter of network configuration and not a matter of whether you are a Hadoop expert or not. HBase is just trying to get the machine's hostname and bind to it and in your case it's given something it cannot use. It's unfortunate. IIUC your machine is hosted on cox.net? And it seems that while providing that machine they at some point set it up so that its hostname would resolve to a public address. Sounds like a misconfiguration. Anyways, you can edit your /etc/hosts so that your hostname points to 127.0.0.1 or, since you are using 0.94.7, set both hbase.master.ipc.address and hbase.regionserver.ipc.address to 0.0.0.0 in your hbase-site.xml so that it binds on the wildcard address instead. J-D On Thu, May 23, 2013 at 4:07 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: How weird.
Admittedly I'm not terribly knowledgeable about Hadoop and all of its sub-projects, but I don't recall ever setting any networking info to something other than localhost. What would cause this? On Thu, May 23, 2013 at 6:26 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: That's your problem: Caused by: java.net.BindException: Problem binding to ip72-215-225-9.at.at.cox.net/72.215.225.9:0 : Cannot assign requested address Either it's a public address and you can't bind to it or someone else is using it. J-D On Thu, May 23, 2013 at 3:24 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: Here is my dump of the sole log file in the logs directory: http://bin.cakephp.org/view/2116332048 On Thu, May 23, 2013 at 6:20 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: On Thu, May 23, 2013 at 2:50 PM, Jay Vyas jayunit...@gmail.com wrote: 1) Should hbase-master be changed to localhost? Maybe Try changing /etc/hosts to match the actual non loopback ip of your machine... (i.e. just run Ifconfig | grep 1 and see what ip comes out :)) and make sure your /etc/hosts matches the file in my blog post, (you need hbase-master to be defined in your /etc/hosts...). hbase.master was dropped around 2009 now that we have zookeeper. So you can set it to whatever you want, it won't change anything :) 2) zookeeper parent seems bad.. Change hbase-rootdir to hbase (in hbase.rootdir) so that it's consistent with what you defined in zookeeper parent node. Those two are really unrelated, /hbase is the default so no need to override it, and I'm guessing that hbase.rootdir is somewhere writable so that's all good. Now, regarding the Check the value configured in 'zookeeper.znode.parent, it's triggered when the client wants to read the /hbase znode in ZooKeeper but it's unable to. If it doesn't exist, it might be because your HBase is homed elsewhere. It could also be that HBase isn't running at all so the Master never got to create it. 
BTW you can start the shell with -d and it's gonna give more info and dump all the stack traces. Going by this thread I would guess that HBase isn't running so the shell won't help. Another way to check is pointing your browser to localhost:60010 and see if the master is responding. If not, time to open up the log and see what's up. J-D
Re: Not able to connect to Hbase remotly
It says your event_data table isn't assigned anywhere on the cluster. Was it disabled? J-D On Fri, May 24, 2013 at 6:06 AM, Vimal Jain vkj...@gmail.com wrote: Hi Tariq/Jyothi, Sorry to trouble you again. I think this problem is solved, but I am not able to figure out why I need to put an entry for zookeeper's location in the client's /etc/hosts file. I have configured everything as IP addresses on the Hbase server, so why does this /etc/hosts come into the picture, as I understand it's only required for name resolution. Appreciate your help in this case. On Wed, May 22, 2013 at 2:56 PM, Vimal Jain vkj...@gmail.com wrote: Hi, I have Hbase configured in pseudo-distributed mode on Machine A. I would like to connect to it through a Java program running on Machine B. But I am unable to do so. What configurations are required in Java for this? Please help. -- Thanks and Regards, Vimal Jain -- Thanks and Regards, Vimal Jain
Re: HBase is not running.
This is a machine identity problem. HBase simply uses the normal Java APIs and asks "who am I?". The answer it gets is ip72-215-225-9.at.at.cox.net. Changing this should only be a matter of DNS configs, starting with /etc/hosts. What is your machine's hostname exactly (run hostname)? When you ping it, what does it return? That should get you started. Does your machine even have a local IP when you run ifconfig? If not, all you can do is force everything to localhost in your network configs. It also means you cannot use HBase in a distributed fashion. Changing the code seems like a waste of time; HBase is inherently distributed and it relies on machines having their network correctly configured. Your time might be better spent using a VM on your own machine. J-D On Fri, May 24, 2013 at 12:38 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: That seems to be the case. The thing that I don't get is whether I missed any global setting in order to make everything turn towards localhost. What am I missing? I'll scour the HBase docs again. On Fri, May 24, 2013 at 1:17 PM, Jay Vyas jayunit...@gmail.com wrote: Yes ... get hostname and /etc/hosts synced up properly and I bet that will fix it On Fri, May 24, 2013 at 12:41 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Ah yeah the master advertised itself as: Attempting connect to Master server at ip72-215-225-9.at.at.cox.net,46122,1369408257140 So the region server cannot find it since that's the public address and nothing's reachable through that. Now you really need to fix your networking :) J-D On Fri, May 24, 2013 at 8:21 AM, Yves S. Garret yoursurrogate...@gmail.com wrote: Ok, weird, it still seems to be looking towards Cox. Here is my hbase-site.xml file: http://bin.cakephp.org/view/628322266 On Thu, May 23, 2013 at 7:35 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: No, I meant hbase.master.ipc.address and hbase.regionserver.ipc.address. See https://issues.apache.org/jira/browse/HBASE-8148.
J-D On Thu, May 23, 2013 at 4:34 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: Do you mean hbase.master.info.bindAddress and hbase.regionserver.info.bindAddress? I couldn't find anything else in the docs. But having said that, both are set to 0.0.0.0 by default. Also, I checked out 127.0.0.1:60010 and 0.0.0.0:60010, no web gui. On Thu, May 23, 2013 at 7:19 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: It should only be a matter of network configuration and not a matter of whether you are a Hadoop expert or not. HBase is just trying to get the machine's hostname and bind to it and in your case it's given something it cannot use. It's unfortunate. IIUC your machine is hosted on cox.net? And it seems that while providing that machine they at some point set it up so that its hostname would resolve to a public address. Sounds like a misconfiguration. Anyways, you can edit your /etc/hosts so that your hostname points to 127.0.0.1 or, since you are using 0.94.7, set both hbase.master.ipc.address and hbase.regionserver.ipc.address to 0.0.0.0 in your hbase-site.xml so that it binds on the wildcard address instead. J-D On Thu, May 23, 2013 at 4:07 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: How weird. Admittedly I'm not terribly knowledgeable about Hadoop and all of its sub-projects, but I don't recall ever setting any networking info to something other than localhost. What would cause this? On Thu, May 23, 2013 at 6:26 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: That's your problem: Caused by: java.net.BindException: Problem binding to ip72-215-225-9.at.at.cox.net/72.215.225.9:0 : Cannot assign requested address Either it's a public address and you can't bind to it or someone else is using it. J-D On Thu, May 23, 2013 at 3:24 PM, Yves S. 
Garret yoursurrogate...@gmail.com wrote: Here is my dump of the sole log file in the logs directory: http://bin.cakephp.org/view/2116332048 On Thu, May 23, 2013 at 6:20 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: On Thu, May 23, 2013 at 2:50 PM, Jay Vyas jayunit...@gmail.com wrote: 1) Should hbase-master be changed to localhost? Maybe Try changing /etc/hosts to match the actual non loopback ip of your machine... (i.e. just run Ifconfig | grep 1 and see what ip comes out :)) and make sure your /etc/hosts matches the file in my blog post, (you need hbase-master to be defined in your /etc/hosts...). hbase.master was dropped around 2009 now that we have zookeeper. So you can set it to whatever you want, it won't change
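J-D's point above is that HBase just asks the JVM for the machine's own hostname. To see concretely what it gets back, here is a tiny stdlib-only sketch (the class name WhoAmI is made up) that performs the same lookup; run on the affected machine, it shows the name and address HBase will advertise:

```java
// Sketch of what HBase effectively asks at startup: the machine's own
// hostname via the standard Java API. No HBase dependency; if the printed
// name resolves to a public address (like ip72-215-225-9.at.at.cox.net),
// HBase will try to bind there.
import java.net.InetAddress;
import java.net.UnknownHostException;

public class WhoAmI {
    public static void main(String[] args) {
        try {
            InetAddress me = InetAddress.getLocalHost();
            System.out.println("hostname = " + me.getHostName());
            System.out.println("address = " + me.getHostAddress());
        } catch (UnknownHostException e) {
            // The failure mode discussed in this thread: the local
            // hostname does not resolve at all.
            System.out.println("hostname lookup failed: " + e.getMessage());
        }
    }
}
```

If the printed address is not one that `ifconfig` shows on the machine, the fix is in /etc/hosts (or DNS), exactly as suggested above.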
Re: RS crash upon replication
fwiw, stop_replication is a kill switch, not a general way to start and stop replicating, and start_replication may put you in an inconsistent state: hbase(main):001:0 help 'stop_replication' Stops all the replication features. The state in which each stream stops in is undetermined. WARNING: start/stop replication is only meant to be used in critical load situations. On Thu, May 23, 2013 at 1:17 AM, Amit Mor amit.mor.m...@gmail.com wrote: No, the server came out fine just because after the crash (RS's - the masters were still running), I immediately pulled the brakes with stop_replication. Then I started the RS's and they came back fine (not replicating). Once I hit 'start_replication' again they crashed for the second time. Eventually I deleted the heavily nested replication znodes and the 'start_replication' succeeded. I didn't patch 8207 because I'm on CDH with the Cloudera Manager Parcels thing and I'm still trying to figure out how to replace their jars with mine in a clean and non-intrusive way On Thu, May 23, 2013 at 10:33 AM, Varun Sharma va...@pinterest.com wrote: Actually, it seems like something else was wrong here - the servers came up just fine on trying again - so could not really reproduce the issue. Amit: Did you try patching 8207 ? Varun On Wed, May 22, 2013 at 5:40 PM, Himanshu Vashishtha hv.cs...@gmail.com wrote: That sounds like a bug for sure. Could you create a jira with logs/znode dump/steps to reproduce it? Thanks, himanshu On Wed, May 22, 2013 at 5:01 PM, Varun Sharma va...@pinterest.com wrote: It seems I can reproduce this - I did a few rolling restarts and got screwed with NoNode exceptions - I am running 0.94.7 which has the fix but my nodes don't contain hyphens - nodes are no longer coming back up... Thanks Varun On Wed, May 22, 2013 at 3:02 PM, Himanshu Vashishtha hv.cs...@gmail.com wrote: I'd suggest to please patch the code with 8207; cdh4.2.1 doesn't have it.
With hyphens in the name, ReplicationSource gets confused and tries to set data in a znode which doesn't exist. Thanks, Himanshu On Wed, May 22, 2013 at 2:42 PM, Amit Mor amit.mor.m...@gmail.com wrote: yes, indeed - hyphens are part of the host name (annoying legacy stuff in my company). It's hbase-0.94.2-cdh4.2.1. I have no idea if 0.94.6 was backported by Cloudera into their flavor of 0.94.2, but the mysterious occurrence of the percent sign in zkcli (ls /hbase/replication/rs/va-p-hbase-02-d,60020,1369249862401/1-va-p-hbase-02-e,60020,1369042377129-va-p-hbase-02-c,60020,1369042377731-va-p-hbase-02-d,60020,1369233252475/va-p-hbase-02-e%2C60020%2C1369042377129.1369227474895) might be a sign of such a problem. How deep should my rmr in zkcli be (an example would be most welcomed :)? I have no serious problem running copyTable with a time period corresponding to the outage and then starting the sync back up again. One question though: how did it cause a crash? On Thu, May 23, 2013 at 12:32 AM, Varun Sharma va...@pinterest.com wrote: I believe there were cascading failures which got these deep nodes containing still-to-be-replicated WAL(s) - I suspect there is either some parsing bug or something which is causing the replication source to not work - also, which version are you using - does it have https://issues.apache.org/jira/browse/HBASE-8207 - since you use hyphens in your paths. One way to get back up is to delete these nodes, but then you lose data in these WAL(s)... On Wed, May 22, 2013 at 2:22 PM, Amit Mor amit.mor.m...@gmail.com wrote: va-p-hbase-02-d,60020,1369249862401 On Thu, May 23, 2013 at 12:20 AM, Varun Sharma va...@pinterest.com wrote: Basically ls /hbase/rs and what do you see for va-p-02-d ? On Wed, May 22, 2013 at 2:19 PM, Varun Sharma va...@pinterest.com wrote: Can you do ls /hbase/rs and see what you get for 02-d - instead of looking in /replication/, could you look in /hbase/replication/rs - I want to see if the timestamps are matching or not ?
Varun On Wed, May 22, 2013 at 2:17 PM, Varun Sharma va...@pinterest.com wrote: I see - so it looks okay - there's just a lot of deep nesting in there - if you look into these nodes by doing ls, you should see a bunch of WAL(s) which still need to be replicated... Varun On Wed, May 22, 2013 at 2:16 PM, Varun Sharma va...@pinterest.com wrote: 2013-05-22
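Since an example of the rmr was asked for above: the delete has to cover the whole failed-over queue znode under the dead region server's node, that is, the node whose name is the long concatenated server list, not just the WAL children inside it. A sketch using the path quoted earlier in the thread; this is destructive and permanently drops any WALs in that queue that were not yet replicated:

```shell
# Inside a zkCli.sh session; the queue path is the one quoted above.
rmr /hbase/replication/rs/va-p-hbase-02-d,60020,1369249862401/1-va-p-hbase-02-e,60020,1369042377129-va-p-hbase-02-c,60020,1369042377731-va-p-hbase-02-d,60020,1369233252475
```

As Varun notes, anything deleted this way is data loss on the replication stream, so a copyTable over the outage window afterwards is the way to reconcile.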
Re: hbase region server shutdown after datanode connection exception
You are looking at it the wrong way. Per http://hbase.apache.org/book.html#trouble.general, always walk up the log to the first exception. In this case it's a session timeout. Whatever happens next is most probably a side effect of that. To help debug your issue, I would suggest reading this section of the reference guide: http://hbase.apache.org/book.html#trouble.rs.runtime J-D On Tue, May 21, 2013 at 7:17 PM, Cheng Su scarcer...@gmail.com wrote: Hi all. I have a small hbase cluster with 3 physical machines. On 192.168.1.80, there are HMaster and a region server. On 81 and 82, there is a region server on each. The region server on 80 can't sync its HLog after a datanode access exception, and started to shut down. The datanode itself was not shut down and responds to other requests normally. I'll paste the logs below. My questions are: 1. Why does this exception cause a region server shutdown? Can I prevent it? 2. Are there any tools (a shell command is best, like hadoop dfsadmin -report) that can monitor an hbase region server, to check whether it is alive or dead? I have done some research showing that nagios/ganglia can do such things. But actually I just want to know whether the region server is alive or dead, so they are a little overqualified. And I'm not using CDH, so I can't use Cloudera Manager, I think. Here are the logs. HBase master: 2013-05-21 17:03:32,675 ERROR org.apache.hadoop.hbase.master.HMaster: Region server hadoop01,60020,1368774173179 reported a fatal error: ABORTING region server hadoop01,60020,1368774173179: regionserver:60020-0x3eb14c67540002 regionserver:60020-0x3eb14c67540002 received expired from ZooKeeper, aborting Cause: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:369) at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:266) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497) Region Server: 2013-05-21 17:00:16,895 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 12ms for sessionid 0x3eb14c67540002, closing socket connection and attempting reconnect 2013-05-21 17:00:35,896 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 12ms for sessionid 0x13eb14ca4bb, closing socket connection and attempting reconnect 2013-05-21 17:03:31,498 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_9188414668950016309_4925046 java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.1.80:57020 remote=/192.168.1.82:50010] at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155) at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128) at java.io.DataInputStream.readFully(DataInputStream.java:178) at java.io.DataInputStream.readLong(DataInputStream.java:399) at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2784) 2013-05-21 17:03:31,520 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_9188414668950016309_4925046 bad datanode[0] 192.168.1.82:50010 2013-05-21 17:03:32,315 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /192.168.1.82:2100 2013-05-21 17:03:32,316 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to hadoop03/192.168.1.82:2100, initiating session 2013-05-21 17:03:32,317 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server hadoop03/192.168.1.82:2100, sessionid = 0x13eb14ca4bb, negotiated timeout = 18 2013-05-21 17:03:32,497 FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not sync. Requesting close of hlog java.io.IOException: Reflection at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:230) at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1091) at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1195) at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:1057) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.reflect.InvocationTargetException
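On question 1 above: the abort on session expiry is by design (a region server whose ZooKeeper session has expired must stop so its regions can be reassigned), but lengthening the session timeout gives the server more room to ride out GC pauses or slow I/O. A hedged hbase-site.xml sketch; the 90-second value is purely illustrative, and the timeout actually negotiated is also capped by the ZooKeeper server's own min/max session timeout settings:

```xml
<!-- hbase-site.xml: illustrative value, not a recommendation.
     The negotiated timeout is bounded by the ZooKeeper ensemble's
     minSessionTimeout/maxSessionTimeout. -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>90000</value>
</property>
```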
Re: HBase is not running.
On Thu, May 23, 2013 at 2:50 PM, Jay Vyas jayunit...@gmail.com wrote: 1) Should hbase-master be changed to localhost? Maybe Try changing /etc/hosts to match the actual non loopback ip of your machine... (i.e. just run Ifconfig | grep 1 and see what ip comes out :)) and make sure your /etc/hosts matches the file in my blog post, (you need hbase-master to be defined in your /etc/hosts...). hbase.master was dropped around 2009 now that we have zookeeper. So you can set it to whatever you want, it won't change anything :) 2) zookeeper parent seems bad.. Change hbase-rootdir to hbase (in hbase.rootdir) so that it's consistent with what you defined in zookeeper parent node. Those two are really unrelated, /hbase is the default so no need to override it, and I'm guessing that hbase.rootdir is somewhere writable so that's all good. Now, regarding the Check the value configured in 'zookeeper.znode.parent, it's triggered when the client wants to read the /hbase znode in ZooKeeper but it's unable to. If it doesn't exist, it might be because your HBase is homed elsewhere. It could also be that HBase isn't running at all so the Master never got to create it. BTW you can start the shell with -d and it's gonna give more info and dump all the stack traces. Going by this thread I would guess that HBase isn't running so the shell won't help. Another way to check is pointing your browser to localhost:60010 and see if the master is responding. If not, time to open up the log and see what's up. J-D
Re: HBase is not running.
That's your problem: Caused by: java.net.BindException: Problem binding to ip72-215-225-9.at.at.cox.net/72.215.225.9:0 : Cannot assign requested address Either it's a public address and you can't bind to it or someone else is using it. J-D On Thu, May 23, 2013 at 3:24 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: Here is my dump of the sole log file in the logs directory: http://bin.cakephp.org/view/2116332048 On Thu, May 23, 2013 at 6:20 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: On Thu, May 23, 2013 at 2:50 PM, Jay Vyas jayunit...@gmail.com wrote: 1) Should hbase-master be changed to localhost? Maybe Try changing /etc/hosts to match the actual non loopback ip of your machine... (i.e. just run Ifconfig | grep 1 and see what ip comes out :)) and make sure your /etc/hosts matches the file in my blog post, (you need hbase-master to be defined in your /etc/hosts...). hbase.master was dropped around 2009 now that we have zookeeper. So you can set it to whatever you want, it won't change anything :) 2) zookeeper parent seems bad.. Change hbase-rootdir to hbase (in hbase.rootdir) so that it's consistent with what you defined in zookeeper parent node. Those two are really unrelated, /hbase is the default so no need to override it, and I'm guessing that hbase.rootdir is somewhere writable so that's all good. Now, regarding the Check the value configured in 'zookeeper.znode.parent, it's triggered when the client wants to read the /hbase znode in ZooKeeper but it's unable to. If it doesn't exist, it might be because your HBase is homed elsewhere. It could also be that HBase isn't running at all so the Master never got to create it. BTW you can start the shell with -d and it's gonna give more info and dump all the stack traces. Going by this thread I would guess that HBase isn't running so the shell won't help. Another way to check is pointing your browser to localhost:60010 and see if the master is responding. If not, time to open up the log and see what's up.
J-D
Re: HBase is not running.
It should only be a matter of network configuration and not a matter of whether you are a Hadoop expert or not. HBase is just trying to get the machine's hostname and bind to it and in your case it's given something it cannot use. It's unfortunate. IIUC your machine is hosted on cox.net? And it seems that while providing that machine they at some point set it up so that its hostname would resolve to a public address. Sounds like a misconfiguration. Anyways, you can edit your /etc/hosts so that your hostname points to 127.0.0.1 or, since you are using 0.94.7, set both hbase.master.ipc.address and hbase.regionserver.ipc.address to 0.0.0.0 in your hbase-site.xml so that it binds on the wildcard address instead. J-D On Thu, May 23, 2013 at 4:07 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: How weird. Admittedly I'm not terribly knowledgeable about Hadoop and all of its sub-projects, but I don't recall ever setting any networking info to something other than localhost. What would cause this? On Thu, May 23, 2013 at 6:26 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: That's your problem: Caused by: java.net.BindException: Problem binding to ip72-215-225-9.at.at.cox.net/72.215.225.9:0 : Cannot assign requested address Either it's a public address and you can't bind to it or someone else is using it. J-D On Thu, May 23, 2013 at 3:24 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: Here is my dump of the sole log file in the logs directory: http://bin.cakephp.org/view/2116332048 On Thu, May 23, 2013 at 6:20 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: On Thu, May 23, 2013 at 2:50 PM, Jay Vyas jayunit...@gmail.com wrote: 1) Should hbase-master be changed to localhost? Maybe Try changing /etc/hosts to match the actual non loopback ip of your machine... (i.e. just run Ifconfig | grep 1 and see what ip comes out :)) and make sure your /etc/hosts matches the file in my blog post, (you need hbase-master to be defined in your /etc/hosts...).
hbase.master was dropped around 2009 now that we have zookeeper. So you can set it to whatever you want, it won't change anything :) 2) zookeeper parent seems bad.. Change hbase-rootdir to hbase (in hbase.rootdir) so that it's consistent with what you defined in zookeeper parent node. Those two are really unrelated, /hbase is the default so no need to override it, and I'm guessing that hbase.rootdir is somewhere writable so that's all good. Now, regarding the Check the value configured in 'zookeeper.znode.parent, it's triggered when the client wants to read the /hbase znode in ZooKeeper but it's unable to. If it doesn't exist, it might be because your HBase is homed elsewhere. It could also be that HBase isn't running at all so the Master never got to create it. BTW you can start the shell with -d and it's gonna give more info and dump all the stack traces. Going by this thread I would guess that HBase isn't running so the shell won't help. Another way to check is pointing your browser to localhost:60010 and see if the master is responding. If not, time to open up the log and see what's up. J-D
Re: HBase is not running.
No, I meant hbase.master.ipc.address and hbase.regionserver.ipc.address. See https://issues.apache.org/jira/browse/HBASE-8148. J-D On Thu, May 23, 2013 at 4:34 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: Do you mean hbase.master.info.bindAddress and hbase.regionserver.info.bindAddress? I couldn't find anything else in the docs. But having said that, both are set to 0.0.0.0 by default. Also, I checked out 127.0.0.1:60010 and 0.0.0.0:60010, no web gui. On Thu, May 23, 2013 at 7:19 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: It should only be a matter of network configuration and not a matter of whether you are a Hadoop expert or not. HBase is just trying to get the machine's hostname and bind to it and in your case it's given something it cannot use. It's unfortunate. IIUC your machine is hosted on cox.net? And it seems that while providing that machine they at some point set it up so that its hostname would resolve to a public address. Sounds like a misconfiguration. Anyways, you can edit your /etc/hosts so that your hostname points to 127.0.0.1 or, since you are using 0.94.7, set both hbase.master.ipc.address and hbase.regionserver.ipc.address to 0.0.0.0 in your hbase-site.xml so that it binds on the wildcard address instead. J-D On Thu, May 23, 2013 at 4:07 PM, Yves S. Garret yoursurrogate...@gmail.com wrote: How weird. Admittedly I'm not terribly knowledgeable about Hadoop and all of its sub-projects, but I don't recall ever setting any networking info to something other than localhost. What would cause this? On Thu, May 23, 2013 at 6:26 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: That's your problem: Caused by: java.net.BindException: Problem binding to ip72-215-225-9.at.at.cox.net/72.215.225.9:0 : Cannot assign requested address Either it's a public address and you can't bind to it or someone else is using it. J-D On Thu, May 23, 2013 at 3:24 PM, Yves S.
Garret yoursurrogate...@gmail.com wrote: Here is my dump of the sole log file in the logs directory: http://bin.cakephp.org/view/2116332048 On Thu, May 23, 2013 at 6:20 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: On Thu, May 23, 2013 at 2:50 PM, Jay Vyas jayunit...@gmail.com wrote: 1) Should hbase-master be changed to localhost? Maybe Try changing /etc/hosts to match the actual non loopback ip of your machine... (i.e. just run Ifconfig | grep 1 and see what ip comes out :)) and make sure your /etc/hosts matches the file in my blog post, (you need hbase-master to be defined in your /etc/hosts...). hbase.master was dropped around 2009 now that we have zookeeper. So you can set it to whatever you want, it won't change anything :) 2) zookeeper parent seems bad.. Change hbase-rootdir to hbase (in hbase.rootdir) so that it's consistent with what you defined in zookeeper parent node. Those two are really unrelated, /hbase is the default so no need to override it, and I'm guessing that hbase.rootdir is somewhere writable so that's all good. Now, regarding the Check the value configured in 'zookeeper.znode.parent, it's triggered when the client wants to read the /hbase znode in ZooKeeper but it's unable to. If it doesn't exist, it might be because your HBase is homed elsewhere. It could also be that HBase isn't running at all so the Master never got to create it. BTW you can start the shell with -d and it's gonna give more info and dump all the stack traces. Going by this thread I would guess that HBase isn't running so the shell won't help. Another way to check is pointing your browser to localhost:60010 and see if the master is responding. If not, time to open up the log and see what's up. J-D
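For reference, the two properties from HBASE-8148 recommended above go into hbase-site.xml like this. This is a sketch for a single-machine setup; binding to the wildcard address is not something you would normally want on a multi-homed production box:

```xml
<!-- hbase-site.xml: bind the master and region server RPC endpoints
     to the wildcard address instead of the resolved hostname
     (workaround for a hostname that resolves to an unusable address). -->
<property>
  <name>hbase.master.ipc.address</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>hbase.regionserver.ipc.address</name>
  <value>0.0.0.0</value>
</property>
```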
Re: How does client connects to Hbase server
I guess you are referring to http://hbase.apache.org/book.html#client_dependencies ? The thing is, by default hbase.zookeeper.quorum is localhost, so your client will look at your local machine to find HBase if you don't configure anything. J-D On Tue, May 21, 2013 at 10:50 AM, Vimal Jain vkj...@gmail.com wrote: Hi, I am a newbie to both the Hadoop and Hbase technologies. I have set up Hbase properly in standalone mode. I am unable to understand the workflow when a client (a Java program accessing Hbase) connects to the Hbase server. Documentation and books say that the client should have hbase-site.xml in its classpath, or that I should provide zookeeper information explicitly in the configuration object. I am doing neither, but still my program correctly connects to the Hbase server and saves data. How is that happening? -- Thanks and Regards, Vimal Jain
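To make the dependency explicit instead of relying on the localhost default, the client can name the ZooKeeper quorum itself. A minimal sketch of an hbase-site.xml placed on the client's classpath; "zkhost" is a made-up placeholder hostname, and 2181 is the standard ZooKeeper client port:

```xml
<!-- hbase-site.xml on the client classpath; "zkhost" is a placeholder. -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zkhost</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```

With nothing configured, the client falls back to localhost for the quorum, which is exactly why a client running on the same machine as a standalone HBase "just works".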
Re: What is in BlockCache?
The reference guide has a pretty good section about this: http://hbase.apache.org/book.html#block.cache What do you think is missing in order to fully answer your question? Thx, J-D On Mon, May 20, 2013 at 5:07 AM, yun peng pengyunm...@gmail.com wrote: Hi, All, I am wondering what exactly is stored in the BlockCache: Is it the same raw blocks as in the HFile? Or does HBase merge several raw blocks and store the merged block in the cache to serve future queries? To be more specific: a get operation entails loading block b1 from hfile f1 and block b2 from hfile f2, and the overlapping parts of the two blocks are merged to produce the final result. So in the BlockCache, does HBase store b1 and b2 separately, or does it store the merged form? Thanks, Yun
Re: PleaseHoldException when Master is clearly running as JPS
I see: 2013-05-21 17:15:07,914 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_FAILED_OPEN, server=hbase-regionserver1,60020,1369170595340, region=70236052/-ROOT- Over and over. Look in the region server logs, you should see fat stack traces on why it's failing to open the -ROOT- region. Maybe related to your gluster setup. J-D On Tue, May 21, 2013 at 2:18 PM, Jay Vyas jayunit...@gmail.com wrote: https://gist.github.com/anonymous/5623327 -- all logs for starting up hbase and master On Tue, May 21, 2013 at 5:13 PM, Mohammad Tariq donta...@gmail.com wrote: No prob. I was referring to this : 127.0.0.1 hbase-master 192.168.122.200 hbase-master I was thinking that this is your HBase master. Correct me if I'm wrong. Could you please show me your logs? Warm Regards, Tariq cloudfront.blogspot.com On Wed, May 22, 2013 at 2:36 AM, Jay Vyas jayunit...@gmail.com wrote: Hmmm... what do you mean, have your hostname in there? Sorry -- just curious about which hostname you are referring to...? I'm now getting a new exception: 13/05/21 17:02:44 INFO client.HConnectionManager$HConnectionImplementation: getMaster attempt 2 of 7 failed; retrying after sleep of 1002 On Tue, May 21, 2013 at 4:57 PM, Mohammad Tariq donta...@gmail.com wrote: OK. You already have your hostname in there. But it is appearing twice. Comment out 127.0.0.1 hbase-master. This might be a reason. I did not notice that you are on a distributed setup. RS IPs and hostnames are fine. Warm Regards, Tariq cloudfront.blogspot.com On Wed, May 22, 2013 at 2:21 AM, Jay Vyas jayunit...@gmail.com wrote: Hi Kevin : So you don't have any region servers defined in your /etc/hosts ? On Tue, May 21, 2013 at 4:46 PM, Mohammad Tariq donta...@gmail.com wrote: Sorry, my bad.
By that I meant 127.0.0.1 hostname. To me it seems like HBase is not able to connect to localhost using 127.0.0.1 Warm Regards, Tariq cloudfront.blogspot.com On Wed, May 22, 2013 at 2:12 AM, Jay Vyas jayunit...@gmail.com wrote: Thanks, but adding 127.0.0.1 localhost to the top seems redundant... right? I did so but still no luck :(. 1) OS? This is fedora 16. 2) any thoughts on why the PleaseHoldException is being triggered? On Tue, May 21, 2013 at 4:32 PM, Mohammad Tariq donta...@gmail.com wrote: OS? Add 127.0.0.1 localhost and see if it makes any difference. Warm Regards, Tariq cloudfront.blogspot.com On Wed, May 22, 2013 at 1:57 AM, Jay Vyas jayunit...@gmail.com wrote: #This is my /etc/hosts file --- 127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 #::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 127.0.0.1 hbase-master 192.168.122.200 hbase-master 192.168.122.201 hbase-regionserver1 192.168.122.202 hbase-regionserver2 192.168.122.203 hbase-regionserver3 On Tue, May 21, 2013 at 4:25 PM, Mohammad Tariq donta...@gmail.com wrote: Hello Jay, Please change the line containing 127.0.1.1 in your /etc/hosts to 127.0.0.1 and see if it works. Warm Regards, Tariq cloudfront.blogspot.com On Wed, May 22, 2013 at 1:47 AM, Jay Vyas jayunit...@gmail.com wrote: Hi folks: Hope someone can shed some light on this - I cannot run hbase shell create commands because of the PleaseHoldException on a fresh install of hbase. I'm not finding much in the error logs, and all nodes appear to be up and running, including hbase master. Version: hbase-0.94.7 Error: I am getting the Please hold exception... on my hbase shell. When running create 't1','f1'... But... But wait :) there's more! ...
Clearly, the hbase master is running:

[root@hbase-master ~]# jps
11896 HQuorumPeer
12914 Jps
9894 Main
5879 Main
** 12279 HMaster **
5779 Main
11714 ZKServerTool
12058 HRegionServer
12860 Main
8369 Main

And finally - here is a dump of the output from the shell --- any thoughts?

[root@hbase-master ~]# hbaseinstall/hbase-0.94.7/bin/hbase shell -d <<EOF
create 't1','f1'
EOF
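The thread above boils down to hbase-master resolving to both 127.0.0.1 and 192.168.122.200 in /etc/hosts, which can make the master advertise a loopback address to the rest of the cluster. A quick hypothetical check (the function name is made up) that flags hostnames mapped to more than one address:

```python
def find_conflicting_hosts(hosts_text):
    """Flag hostnames that /etc/hosts maps to more than one address --
    e.g. hbase-master bound to both 127.0.0.1 and a LAN IP."""
    mappings = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        parts = line.split()
        ip, names = parts[0], parts[1:]
        for name in names:
            mappings.setdefault(name, set()).add(ip)
    return {name: ips for name, ips in mappings.items() if len(ips) > 1}

hosts = """
127.0.0.1 localhost localhost.localdomain
127.0.0.1 hbase-master
192.168.122.200 hbase-master
192.168.122.201 hbase-regionserver1
"""
assert find_conflicting_hosts(hosts) == {
    "hbase-master": {"127.0.0.1", "192.168.122.200"}}
```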
Re: Questions about HBase replication
On Mon, May 20, 2013 at 3:48 PM, Varun Sharma va...@pinterest.com wrote: Thanks JD for the response... I was just wondering if issues have ever been seen with regard to moving over a large number of WALs entirely from one region server to another, since that would double the replication-related load on the one server which takes over. We only move the znodes; no data is actually being re-written. Another side question: after the WAL has been replicated, is it purged immediately or soonish from ZooKeeper? The WAL's znode reference is deleted immediately. The actual WAL will be deleted according to the chain of log cleaners. J-D
Re: Questions about HBase replication
Yes, but the region server now has 2X the number of WALs to replicate and could suffer higher replication lag as a result... In my experience this hasn't been an issue. Keep in mind that the RS will only replicate what was in the queue when it was recovered and nothing more. It means you have one more thread reading from a likely remote disk (low penalty), then it has to build its own set of edits to replicate (unless you are already severely CPU contended, that won't be an issue), then it has to send those edits to the other cluster (unless you are already filling that machine's pipe, it won't be an issue). Was there anything else you were thinking about? Would you rather spread those logs to a bunch of machines? J-D
Re: Cached an already cached block (HBASE-5285)
It would be nice if you could isolate the use case that triggers the issue so that we can reproduce it. You could also be hitting HBASE-6479 if you still have HFileV1 files around. J-D On Sun, May 5, 2013 at 10:49 PM, Viral Bajaria viral.baja...@gmail.com wrote: On Sun, May 5, 2013 at 10:45 PM, ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com wrote: Just to confirm, you are getting this with LruBlockCache? If with LruBlockCache then the issue is critical, because we have faced a similar issue with OffHeapCache, but that is not yet stable as far as I know. Regards Ram Yes, it's with the LRU cache. My bad, I should have copy/pasted the stack trace too. Here you go:

java.io.IOException: java.lang.RuntimeException: Cached an already cached block
	at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1192)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1181)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2041)
	at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
Caused by: java.lang.RuntimeException: Cached an already cached block
	at org.apache.hadoop.hbase.io.hfile.LruBlockCache.cacheBlock(LruBlockCache.java:279)
	at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:353)
	at org.apache.hadoop.hbase.util.CompoundBloomFilter.contains(CompoundBloomFilter.java:98)
	at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.passesGeneralBloomFilter(StoreFile.java:1511)
	at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.passesBloomFilter(StoreFile.java:1383)
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.shouldUseScanner(StoreFileScanner.java:373)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.selectScannersFrom(StoreScanner.java:257)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.getScannersNoCompaction(StoreScanner.java:221)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:119)
	at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1963)
	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:3517)
	at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1700)
	at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1692)
	at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1668)
	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4406)
	at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4380)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:2039)
Re: Extract a whole table for a given time(stamp)
You can use the Export MR job provided with HBase; it lets you set a time range: http://hbase.apache.org/book.html#export J-D On Mon, May 6, 2013 at 10:27 AM, Gaurav Pandit pandit.gau...@gmail.com wrote: Hi Hbase users, We have a use case where we need to know how data looked at a given time in the past. The data is stored in HBase of course, with multiple versions. And the goal is to be able to extract all records (rowkey, columns) as of a given timestamp, to a file. I am trying to figure out the best way to achieve this. The options I know are: 1. Write a *Java* client using the HBase Java API, and scan the hbase table. 2. Do the same, but over the *Thrift* HBase API using Perl (since our environment is mostly Perl). 3. Use *Hive* to point to the HBase table, and use Sqoop to extract data from the Hive table and onto the client / RDBMS. 4. Use *Pig* to extract data from the HBase table, dump it on HDFS, and move the file over to the client. So far, I have successfully implemented option (2). I am still running some tests to see how it performs, but it works fine as such. My questions are: 1. Is option (3) or (4) even possible? I am not sure if we can access the table for a given timestamp over Pig or Hive. 2. Is there any other better way of achieving this? Thanks! Gaurav
Re: Extract a whole table for a given time(stamp)
Obviously I don't know much about your use case, so hopefully this won't turn into a game of yes but I also need X ;) It sounds like you don't have a lot of data to retrieve? Since your first and second options are to scan the whole table, it might be that the table itself is small. If it's small then any option is good and it's just a matter of writing some code. Then, options 3 and 4 will write multiple files unless you use only 1 reducer, so, since you already need to merge files, you could consider having a post step that converts the multiple sequence files into one tsv file. Or you could have your own version of Export that has a single reducer that writes in the tsv format. The possibilities are endless. Hope this helps, J-D On Mon, May 6, 2013 at 10:40 AM, Gaurav Pandit pandit.gau...@gmail.com wrote: Thanks J-D. Wouldn't the export utility export the data in sequence file format? My goal is to generate data in some sort of delimited plain text file and hand it over to the caller. - Gaurav On Mon, May 6, 2013 at 1:33 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: You can use the Export MR job provided with HBase, it lets you set a time range: http://hbase.apache.org/book.html#export J-D On Mon, May 6, 2013 at 10:27 AM, Gaurav Pandit pandit.gau...@gmail.com wrote: Hi Hbase users, We have a use case where we need to know how data looked at a given time in the past. The data is stored in HBase of course, with multiple versions. And the goal is to be able to extract all records (rowkey, columns) as of a given timestamp, to a file. I am trying to figure out the best way to achieve this. The options I know are: 1. Write a *Java* client using the HBase Java API, and scan the hbase table. 2. Do the same, but over the *Thrift* HBase API using Perl (since our environment is mostly Perl). 3. Use *Hive* to point to the HBase table, and use Sqoop to extract data from the Hive table and onto the client / RDBMS. 4.
Use *Pig* to extract data from the HBase table, dump it on HDFS, and move the file over to the client. So far, I have successfully implemented option (2). I am still running some tests to see how it performs, but it works fine as such. My questions are: 1. Is option (3) or (4) even possible? I am not sure if we can access the table for a given timestamp over Pig or Hive. 2. Is there any other better way of achieving this? Thanks! Gaurav
Re: Extract a whole table for a given time(stamp)
You could save some time by using http://hbase.apache.org/book.html#copytable J-D On Mon, May 6, 2013 at 11:19 AM, Gaurav Pandit pandit.gau...@gmail.com wrote: Thanks for your inputs, J-D, Shahab. Sorry if I was ambiguous in stating what I wanted to do. Just to restate the goal in one line: Extract all rows (with rowkey, columns) from an HBase table as of a given time using HBase timestamps/versions, in a plain text file format. J-D, we have about 5 million rows (but each could have multiple versions) for now. So I think scanning the whole table is okay for now, but it seems it may not be the best option for a big table. Also, as I mentioned earlier, I think Hive/Pig does not let you access HBase for a timestamp. If they could do that, it's the approach I wanted to take. But your suggestion of using *export* got me thinking, and the following may work out well: 1. Export the HBase table for a given timestamp using the *export* utility. 2. Import the file into another temp HBase table. 3. Use Pig/Hive to extract the table and put it on an HDFS file in plain text (or onto an RDBMS). 4. Let the client retrieve the file. Shahab, in my case, I was talking about using the internal timestamp. But thanks for your input - I was unaware of the Pig DBStorage loader! It may come in handy in some other scenario. Thanks, Gaurav On Mon, May 6, 2013 at 1:50 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Gaurav, when you say that you want older versions of the data, are you talking about filtering on the internal timestamps (and hence the internal versioning mechanism), or does your data have a separate column for versioning (basically custom versioning)? If the latter, then you can use Pig. It can dump your data directly into an RDBMS like MySQL too, since a DBStorage loader/store is available. Might not be totally applicable to your issue but just wanted to share a thought. Regards, Shahab On Mon, May 6, 2013 at 1:40 PM, Gaurav Pandit pandit.gau...@gmail.com wrote: Thanks J-D.
Wouldn't the export utility export the data in sequence file format? My goal is to generate data in some sort of delimited plain text file and hand it over to the caller. - Gaurav On Mon, May 6, 2013 at 1:33 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: You can use the Export MR job provided with HBase, it lets you set a time range: http://hbase.apache.org/book.html#export J-D On Mon, May 6, 2013 at 10:27 AM, Gaurav Pandit pandit.gau...@gmail.com wrote: Hi Hbase users, We have a use case where we need to know how data looked at a given time in the past. The data is stored in HBase of course, with multiple versions. And the goal is to be able to extract all records (rowkey, columns) as of a given timestamp, to a file. I am trying to figure out the best way to achieve this. The options I know are: 1. Write a *Java* client using the HBase Java API, and scan the hbase table. 2. Do the same, but over the *Thrift* HBase API using Perl (since our environment is mostly Perl). 3. Use *Hive* to point to the HBase table, and use Sqoop to extract data from the Hive table and onto the client / RDBMS. 4. Use *Pig* to extract data from the HBase table, dump it on HDFS, and move the file over to the client. So far, I have successfully implemented option (2). I am still running some tests to see how it performs, but it works fine as such. My questions are: 1. Is option (3) or (4) even possible? I am not sure if we can access the table for a given timestamp over Pig or Hive. 2. Is there any other better way of achieving this? Thanks! Gaurav
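For readers implementing option (1) or (2) themselves: the core of an "as of timestamp" extract is, per cell, picking the newest version whose timestamp is not newer than the cutoff - which is what a scan with a time range of [0, cutoff+1) and one version gives you. A small self-contained sketch of that selection rule (plain Python, not the HBase API):

```python
def as_of(versions, cutoff_ts):
    """Given {timestamp: value} for one cell, return the value that was
    current at cutoff_ts (the newest version not newer than the cutoff)."""
    eligible = [ts for ts in versions if ts <= cutoff_ts]
    if not eligible:
        return None  # the cell did not exist yet at that time
    return versions[max(eligible)]

cell = {100: "a", 200: "b", 300: "c"}  # three versions of one column
assert as_of(cell, 250) == "b"   # version at ts=200 was current at ts=250
assert as_of(cell, 50) is None   # cell not yet written at ts=50
assert as_of(cell, 300) == "c"
```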
Re: HBase - prioritizing writes over reads?
Short answer is no, there's no knob or configuration to do that. Longer answer is it depends. Are the reads and writes going to different regions/tables? If so, disable the balancer and take charge of it yourself by segregating the offending regions on their own RSs. I also see you have the requirement to take incoming data no matter what. Well, this currently cannot be guaranteed in HBase, since an RS failure will incur some limited unavailability while the ZK session times out, the logs are replayed and the regions are reassigned. I don't know what kind of SLA you have, but it sounds like even without your reads problem you need to do something client-side to take care of this. Local buffers maybe? It would work as long as you don't need to serve that new data right away (unless you also start serving from the local buffer, but it's getting complicated). Hope this helps, J-D On Wed, Apr 24, 2013 at 3:25 AM, kzurek kzu...@proximetry.pl wrote: Is it possible to prioritize writes over reads in HBase? I'm facing some I/O read related issues that influence my write clients and the cluster in general (constantly growing store files on some RSs). Due to the fact that I cannot let myself lose/skip incoming data, I would like to guarantee that in case of extensive reads I will be able to limit incoming read requests, so that write requests won't be influenced. Is it possible? If so, what would be the best way to do that, and where should it be placed - on the client or the cluster side? -- View this message in context: http://apache-hbase.679495.n3.nabble.com/HBase-prioritizing-writes-over-reads-tp4042838.html Sent from the HBase User mailing list archive at Nabble.com.
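The "local buffers" idea from the reply can be sketched as a client-side queue that only discards data once a flush succeeds, retrying with backoff across a short RS outage. A hypothetical sketch (the names and retry policy are made up, and `send` stands in for whatever batch-put call the client library provides):

```python
import time

class BufferedWriter:
    """Hypothetical client-side buffer: queue puts locally and retry the
    flush, so short server unavailability does not lose incoming data."""
    def __init__(self, send, max_retries=3, backoff_s=0.01):
        self.send = send          # callable that writes a batch; may raise
        self.buffer = []
        self.max_retries = max_retries
        self.backoff_s = backoff_s

    def put(self, row):
        self.buffer.append(row)  # accept the data unconditionally

    def flush(self):
        for attempt in range(self.max_retries):
            try:
                self.send(list(self.buffer))
                self.buffer.clear()  # only drop data after a successful send
                return True
            except IOError:
                time.sleep(self.backoff_s * (2 ** attempt))  # back off, retry
        return False  # data stays buffered; the caller keeps it

failures = [IOError(), IOError()]  # simulate a region being reassigned
stored = []
def flaky_send(batch):
    if failures:
        raise failures.pop()
    stored.extend(batch)

w = BufferedWriter(flaky_send)
w.put("row1"); w.put("row2")
assert w.flush() is True
assert stored == ["row1", "row2"]  # nothing lost across the two failures
```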
Re: writing and reading from a region at once
Inline. J-D On Thu, Apr 25, 2013 at 1:09 PM, Aaron Zimmerman azimmer...@sproutsocial.com wrote: Hi, If a region is being written to, and a scanner takes a lease out on the region, what will happen to the writes? Is there a concept of Transaction Isolation Levels? There's MVCC, so reads can happen while someone else is writing. What you should expect from HBase is read committed. I don't see errors in Puts while the tables are being scanned? But it seems that I'm losing writes somewhere, is it possible the writes could fail silently? Is it temporary while you're scanning or there's really data missing at the end of the day? The former might happen on some older HBase versions while the latter should never happen unless you lower the durability level yourself and have machine failures. J-D
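The MVCC behavior described above - readers see committed writes only, and a scanner's view is fixed by the read point it takes when it opens, so concurrent writes are invisible to it but never lost - can be illustrated with a toy model (plain Python, not HBase's actual MultiVersionConsistencyControl):

```python
class MvccStore:
    """Toy MVCC: each write gets a sequence number and only becomes
    visible once committed; a scanner fixes its read point at open time."""
    def __init__(self):
        self.seq = 0
        self.writes = []  # entries: {"seq", "key", "value", "committed"}

    def begin_write(self, key, value):
        self.seq += 1
        entry = {"seq": self.seq, "key": key, "value": value, "committed": False}
        self.writes.append(entry)
        return entry

    def commit(self, entry):
        entry["committed"] = True

    def read_point(self):
        # highest sequence number up to which everything is committed
        committed = [w["seq"] for w in self.writes if w["committed"]]
        return max(committed, default=0)

    def scan(self, read_point):
        visible = {}
        for w in self.writes:
            if w["committed"] and w["seq"] <= read_point:
                visible[w["key"]] = w["value"]
        return visible

store = MvccStore()
w1 = store.begin_write("r1", "a"); store.commit(w1)
rp = store.read_point()             # a scanner opens here
w2 = store.begin_write("r2", "b")   # write in flight during the scan
assert store.scan(rp) == {"r1": "a"}  # the in-flight write is invisible...
store.commit(w2)
assert store.scan(store.read_point()) == {"r1": "a", "r2": "b"}  # ...not lost
```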
Re: HBaseStorage. Inconsistent result.
Can you run a RowCounter a bunch of times to see if it exhibits the same issue? It would tell us whether it's HBase or Pig that causes the issue. http://hbase.apache.org/book.html#rowcounter J-D On Tue, Apr 9, 2013 at 3:58 AM, Eugene Morozov emoro...@griddynamics.com wrote: Hello everyone. I have the following script:

pages = LOAD 'hbase://mmpages' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('t:d', '-loadKey');
pages2 = FOREACH pages GENERATE $0;
pages3 = DISTINCT pages2;
g_pages = GROUP pages3 all PARALLEL 1;
s_pages = FOREACH g_pages GENERATE 'count', COUNT(pages3);
DUMP s_pages;

It just calculates the number of keys in the table. The issue is that it gives me different results. I did two runs: * first one - 7 tasks in parallel (I launched the same script 7 times trying to imitate a heavy workload) * second one - 9 tasks in parallel. All 7 runs in the first batch and 8 of the 9 in the second give me the correct result, which is: Input(s): Successfully read 246419854 records (102194 bytes) from: hbase://mmpages ... (count,246419854) But the last one of the second run gives a different result: Input(s): Successfully read 246419853 records (102194 bytes) from: hbase://mmpages ... (count,246419853) The number of bytes read is the same, but the number of rows is different. There was definitely no change in mmpages. We do not use standard Put/Delete - only bulkImport - and there was no major compaction run on this table. Even if one had run, it wouldn't have deleted anything, because the TTL of this table is '2147483647'. Moreover, this table is for debug purposes - nobody uses it but me. The original issue I hit was actually the same, but with my own HBaseStorage, which gives much less consistent results.
For example, for the 7 parallel runs it gives me:
--(count,246419854)
--(count,246419173) : Successfully read 246419173 records (2333164 bytes) from: hbase://mmpages
--(count,246419854) : Successfully read 246419854 records (2333164 bytes) from: hbase://mmpages
--(count,246419854) : Successfully read 246419854 records (2333164 bytes) from: hbase://mmpages
--(count,246419173) : Successfully read 246419173 records (2333164 bytes) from: hbase://mmpages
--(count,246418816) : Successfully read 246418816 records (2333164 bytes) from: hbase://mmpages
--(count,246418690)
-- and one job failed due to a lease exception.
During the run with my own HBaseStorage I see many map tasks killed with a "lease does not exist" exception, though the job usually finishes successfully. As you can see, the number of bytes read is exactly the same every time, but the numbers of rows read are different. I got exactly the same with the native HBaseStorage, though the difference is really small. But anyway, I didn't expect to see that the original HBaseStorage could also do the trick. And now my question is more about org.apache...HBaseStorage than about my own HBaseStorage. Any advice on how to prove anything regarding the native org.apache...HBaseStorage, to fix it, or to do more experiments on the matter would be really appreciated. -- Eugene Morozov Developer at Grid Dynamics Skype: morozov.evgeny www.griddynamics.com emoro...@griddynamics.com
Re: hbase-0.94.6.1 balancer issue
Samir, When you say And at what point balancer will start redistribute regions to second server, do you mean that when you look at the master's web UI you see that one region server has 0 regions? That would be a problem. Otherwise, that line you posted in your original message should be repeated for each table, and globally the regions should all be correctly distributed... unless there's an edge case where, when you have only tables with 1 region, it puts them all on the same server :) Thx, J-D On Fri, Apr 12, 2013 at 12:37 PM, Samir Ahmic ahmic.sa...@gmail.com wrote: Thanks for explaining, Jean-Marc. We have been using 0.90.4 for a very long time and balancing was based on the total number of regions. That is why I was surprised by the balancer log on 0.94. Well, I'm more an ops guy than a dev - I handle what others develop :) Regards On Fri, Apr 12, 2013 at 6:24 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Samir, Since regions are balanced per table, as soon as you have more than one region in your table, the balancer will start to balance the regions over the servers. You can split some of those tables and you will start to see HBase balance them. This is normal behavior for 0.94. I don't know for versions before that. Also, are you sure you need 48 tables? And not fewer tables with more CFs? JM 2013/4/12 Samir Ahmic ahmic.sa...@gmail.com Hi, JM I have 48 tables and as you said it is 1 region per table since I did not reach the splitting limit yet. So is this normal behavior in the 0.94.6.1 version? And at what point will the balancer start to redistribute regions to the second server? Thanks Samir On Fri, Apr 12, 2013 at 6:06 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Samir, Regions are balanced per table. So if you have 48 regions within the same table, they should be split about 24 on each server. But if you have 48 tables with 1 region each, then for each table the balancer will see only 1 region and will display the message you saw. Have you looked at the UI?
What do you have in it? Can you please confirm whether you have 48 tables or 1 table? Thanks, JM 2013/4/12 Samir Ahmic ahmic.sa...@gmail.com Hi, all I'm evaluating hbase-0.94.6.1 and I have 48 regions on a 2 node cluster. I was restarting one of the RSs and after that tried to balance the cluster by running balancer from the shell. After running the command, regions were not distributed to the second RS, and I found this line in the master log: 2013-04-12 16:45:15,589 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=2 *regions=1 *average=0.5 mostloaded=1 leastloaded=0 It looks to me like the wrong number of regions is reported by the balancer, and that causes the skipping of load balancing. In the hbase shell I see all 48 tables that I have, and everything else looks fine. Did someone else see this type of behavior? Did something change around the balancer in hbase-0.94.6.1? Regards Samir
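The per-table balancing behavior discussed in this thread can be sketched as follows: a table is considered balanced when no server holds more than ceil(regions/servers) of that table's regions, so 48 single-region tables are each trivially "balanced" no matter where they sit. A simplified model (an illustration of the reported behavior, not the actual LoadBalancer code):

```python
import math

def balance_per_table(tables, servers):
    """Per-table balancing sketch: a table is 'balanced' when no server
    holds more than ceil(regions/servers) of ITS regions -- so tables of
    one region each are trivially balanced wherever they sit.
    `tables` maps table name -> region count; returns the tables skipped."""
    skipped = []
    for name, region_count in tables.items():
        most_loaded = region_count  # worst case: all regions on one server
        ceiling = math.ceil(region_count / servers)
        if most_loaded <= ceiling:
            skipped.append(name)  # "Skipping load balancing because balanced"
    return skipped

tables = {"t%d" % i: 1 for i in range(48)}  # 48 tables, 1 region each
assert len(balance_per_table(tables, servers=2)) == 48  # every table skipped

# one table with 48 regions would NOT be skipped if they all sit on one server
assert balance_per_table({"big": 48}, servers=2) == []
```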
Re: HBase and Hadoop version
On Tue, Mar 26, 2013 at 6:56 AM, Robert Hamilton rhamil...@whalesharkmedia.com wrote: I am evaluating HBase 0.94.5 on a test cluster that happens to be running Hadoop 0.20.2-cdh3u5 I've seen the compatibility warnings but I'm just doing a first look at the features and not even thinking about production for the moment. So nothing disastrous will happen even in the worst case. My question is, what should I expect to go wrong with this particular version mismatch? Nothing, in fact I've deployed pretty much that setup in production before. J-D
Re: How to prevent major compaction when doing bulk load provisioning?
On Fri, Mar 22, 2013 at 12:12 AM, Nicolas Seyvet nicolas.sey...@gmail.com wrote: @J-D: Thanks, this sounds very likely. One more thing, from the logs of one slave, I can see the following: 2013-03-21 22:27:15,041 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 9 file(s) in f of rc_nise,$,1363860406830.5689430f7a27cc511f99dcb62001edc6. into 5418126f3d154ef3aca8027e04512279, size=8.3g; total size for store is 8.3g [...] 2013-03-21 23:34:31,836 INFO org.apache.hadoop.hbase.regionserver.Store: Completed major compaction of 5 file(s) in f of rc_nise,$,1363860406830.5689430f7a27cc511f99dcb62001edc6. into 3bdeb58c57af4ee1a92d22865e707416, size=8.3g; total size for store is 8.3g Aren't those signs that a major compaction also occurred? And if so, what could have triggered it? If the compaction algo selects all the files for compaction, it gets upgraded into a major compaction because it's essentially the same thing. On Thu, Mar 21, 2013 at 8:06 PM, Nicolas Seyvet nicolas.sey...@gmail.com wrote: @Ram: You are entirely correct, I made the exact same mistake of mixing up major and minor compactions. Looking closely, what I see is that at around 200 HFiles per region it starts minor compacting files in groups of 10 HFiles. The problem is that this minor compacting never stops, even when there are about 20 HFiles left. It just keeps going, taking more and more time (I guess because the files to compact are getting bigger). Of course, in parallel we keep on adding more and more data. @J-D: It seems to me that it would be better if you were able to do a single load for all your files. Yes, I agree... but that is not what we are testing; our use case is to use 1min batch files.
Re: Question about compactions
On Thu, Mar 21, 2013 at 6:46 AM, Brennon Church bren...@getjar.com wrote: Hello all, As I understand it, a common performance tweak is to disable major compactions so that you don't end up with storms taking things out at inconvenient times. I'm thinking that I should just write a quick script to rotate through all of our regions, one at a time, and compact them. Again, if I'm understanding this correctly we should not end up with storms as they'll only happen one at a time, and each one doesn't run for long. Does that seem reasonable, or am I missing something? My hope is to run the script regularly. FWIW major compacting isn't even needed if you don't update or delete cells, so do consider that too. The problem with scheduling major compactions yourself is that, since the command is async, you can still end up with a storm of compactions if you just blindly issue major_compact for all your regions. Things like adding wait time work, but then let's say you want the compactions to run only between 2 and 4 AM: you can run out of time. What I have seen to circumvent this is to only do a subset of the regions at a time. You can also use JMX to monitor the compaction queue on each RS and make sure you are not just piling them up, but this requires some more work. Corollary question... I recently added drives to our nodes and since I did this while they were all still running, basically just restarting the datanode underneath to pick up the new spindles, I'm fairly sure I've thrown data locality out the window, based on the changed pattern of network traffic. Interesting but unlikely. Even restarting HBase shouldn't do that unless it was wrongly restarted. Each RS publishes a locality index (hdfsBlocksLocalityIndex) that you can find via JMX or in their web UI, are they close to 100% or way down? Also which version are you on?
If I'm right, manually running major compactions against all of the regions should resolve that, as the underlying data would all get written locally. Again, does that make sense? Major compacting would do that yes, but first check if you need it at all I think. J-D
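The "subset of regions at a time" advice can be sketched as a batching loop that stops when the maintenance window would be overrun. Everything here is hypothetical (region names, timings, and the `compact` callback standing in for the async major_compact call):

```python
def compact_in_window(regions, batch_size, window_s, est_s_per_region, compact):
    """Sketch of compacting a subset of regions per night: issue at most
    batch_size regions and stop early if the estimated time would overrun
    the maintenance window. `compact` stands in for the async admin call."""
    done, elapsed = [], 0.0
    for region in regions[:batch_size]:
        if elapsed + est_s_per_region > window_s:
            break  # leave the rest for the next window
        compact(region)
        elapsed += est_s_per_region
        done.append(region)
    return done

issued = []
regions = ["r%02d" % i for i in range(20)]
# a 2-hour window with ~20 min per region fits 6 compactions a night
first_night = compact_in_window(regions, batch_size=8, window_s=7200,
                                est_s_per_region=1200, compact=issued.append)
assert first_night == ["r00", "r01", "r02", "r03", "r04", "r05"]
```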
Re: How to prevent major compaction when doing bulk load provisioning?
You are likely just hitting the threshold for a minor compaction, and by picking up all the files (I'm making a guess that it does) it gets upgraded to a major compaction. The threshold is 3 by default. So after loading 3 files you should get a compaction per region, then every other 2 loadings you will trigger another per region. It seems to me that it would be better if you were able to do a single load for all your files. J-D On Thu, Mar 21, 2013 at 6:29 AM, Nicolas Seyvet nicolas.sey...@gmail.com wrote: Hi, We are using code similar to https://github.com/jrkinley/hbase-bulk-import-example/ in order to benchmark our HBase cluster. We are running a CDH4 installation, and HBase is version 0.92.1-cdh4.1.1. The cluster is composed of 12 slaves, 1 master and 1 secondary master. During the bulk load insert, roughly within 3 hours after the start (~200 GB), we notice a large drop in the insert rate. At the same time, there is a spike in IO and CPU usage. Connecting to a Region Server (RS), the Monitored Task section shows that a compaction has started. I have set hbase.hregion.max.filesize to 107374182400 (100 GB), and disabled automatic major compaction (hbase.hregion.majorcompaction is set to 0). What we are doing is that we have 1000 files of synthetic data (csv), where each row in a file is one row to insert into HBase; each file contains 600K rows (or 600K events). Our loader works in the following way:

1. Look for a file
2. When a file is found, prepare a job for that file
3. Launch the job
4. Wait for completion
5. Compute the insert rate (number of rows / time)
6. Repeat from 1 until there are no more files.

What I understand of the bulk load M/R job is that it produces one HFile for each Region. Questions: - How is HStoreFileSize calculated? - What do HStoreFileSize, storeFileSize and hbase.hregion.max.filesize have in common? - Can the number of HFiles trigger a major compaction? Thx for the help. I hope my questions make sense. /Nicolas
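The selection rule described in the reply - need at least the threshold number of files (3 by default), compact a bounded group, and upgrade to a major compaction when the selection happens to cover every file in the store - can be sketched like this (a simplified model, not HBase's actual ratio-based compaction selection):

```python
def select_compaction(file_sizes, min_files=3, max_files=10):
    """Sketch of the rule described above: need at least min_files to
    compact, take at most max_files (smallest first), and if the selection
    covers every file in the store, the compaction is 'major'."""
    if len(file_sizes) < min_files:
        return None, False  # not enough files, no compaction triggered
    selected = sorted(file_sizes)[:max_files]
    is_major = len(selected) == len(file_sizes)  # all files -> upgraded
    return selected, is_major

# after 3 bulk loads each region has 3 files -> all selected -> major
sel, major = select_compaction([10, 12, 11])
assert major is True

# with 200 files only a group of 10 is picked -> minor
sel, major = select_compaction(list(range(200)))
assert len(sel) == 10 and major is False
```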
Re: How to prevent major compaction when doing bulk load provisioning?
On Thu, Mar 21, 2013 at 12:06 PM, Nicolas Seyvet nicolas.sey...@gmail.com wrote: @Ram: You are entirely correct, I made the exact same mistake of mixing up major and minor compactions. Looking closely, what I see is that at around 200 HFiles per region it starts minor compacting files in groups of 10 HFiles. The problem is that this minor compacting never stops, even when there are about 20 HFiles left. It just keeps going, taking more and more time (I guess because the files to compact are getting bigger). Of course, in parallel we keep on adding more and more data. @J-D: It seems to me that it would be better if you were able to do a single load for all your files. Yes, I agree... but that is not what we are testing; our use case is to use 1min batch files. I worked on a very similar use case recently and would recommend against doing bulk loads like this. The way bulk loaded files are treated by the compaction selection algorithm is broken when loads are done in a continuous fashion. The solution to this is in HBASE-7842[1] but it is still being worked on. What you are seeing is that the files picked up for compactions will often include the bigger already-compacted files. As those files get bigger, compactions will take longer and longer, up to a point where the data selected for compaction is greater than your compacting capacity. The workaround would be to use the normal API, as files will be more properly selected for compaction, but it won't be as fast/efficient as the continuous bulk load solution should be if the selection algo weren't broken. J-D 1. https://issues.apache.org/jira/browse/HBASE-7842
Re: Question about compactions
On Thu, Mar 21, 2013 at 1:44 PM, Brennon Church bren...@getjar.com wrote: Hello, Here's the data locality index values for all 8 nodes: hdfsBlocksLocalityIndex=45 hdfsBlocksLocalityIndex=57 hdfsBlocksLocalityIndex=55 hdfsBlocksLocalityIndex=55 hdfsBlocksLocalityIndex=58 hdfsBlocksLocalityIndex=47 hdfsBlocksLocalityIndex=45 hdfsBlocksLocalityIndex=42 Those seem pretty bad to me. Yeah, considering that you have 8 nodes and probably use a replication factor of 3, then I would expect you to be at least 38% local in case of a wrongful restart (but then minor compactions probably ran and that brought you up). I'm running HBase v. 0.92.0 I'd considered the async problem, and was going to add some basic checks into the script to not submit additional compactions to the queue if I saw that it had anything in it already. For the moment, it seems my best bet is to run through the major compactions for everything to regain locality. Going forward, we may or may not need the major compactions on a regular basis. I can tell you it's been several months since we turned them off, and performance has been reasonable. FWIW your data should be cached now so major compacting will do no good (unless you mostly do full table scans, in which case the caching doesn't do anything for you). You shouldn't see a big difference turning major compactions off if you don't delete/update a lot. Thanks. --Brennon On 3/21/13 10:49 AM, Jean-Daniel Cryans wrote: On Thu, Mar 21, 2013 at 6:46 AM, Brennon Church bren...@getjar.com wrote: Hello all, As I understand it, a common performance tweak is to disable major compactions so that you don't end up with storms taking things out at inconvenient times. I'm thinking that I should just write a quick script to rotate through all of our regions, one at a time, and compact them. Again, if I'm understanding this correctly we should not end up with storms as they'll only happen one at a time, and each one doesn't run for long. 
Does that seem reasonable, or am I missing something? My hope is to run the script regularly. FWIW major compacting isn't even needed if you don't update or delete cells so do consider that too. The problem with scheduling major compactions yourself is that, since the command is async, you can still end up with a storm of compactions if you just blindly issue major_compact for all your regions. Things like adding wait time works but then let's say you want the compactions to run only between 2 and 4AM then you can run out of time. What I have seen to circumvent this is to only do a subset of the regions at a time. You can also use JMX to monitor the compaction queue on each RS and make sure you are not just piling them up, but this requires some more work. Corollary question... I recently added drives to our nodes and since I did this while they were all still running, basically just restarting the datanode underneath to pick up the new spindles, I'm fairly sure I've thrown data locality out the window, based on the changed pattern of network traffic. Interesting but unlikely. Even restarting HBase shouldn't do that unless it was wrongly restarted. Each RS publishes a locality index (hdfsBlocksLocalityIndex) that you can find via JMX or in their web UI, are they close to 100% or way down? Also which version are you on? If I'm right, manually running major compactions against all of the regions should resolve that, as the underlying data would all get written locally. Again, does that make sense? Major compacting would do that yes, but first check if you need it at all I think. J-D