Crash when running two jobs at the same time with the same HBase table
Dear all,

When I run two MR jobs at the same time, both reading the same HBase table and writing to another shared HBase table, one job finishes successfully and the other crashes. The following shows the error log. Please help me find out why.

2013-03-25 15:50:34,026 INFO org.apache.hadoop.mapred.JobClient - map 0% reduce 0%(JobClient.java:monitorAndPrintJob:1301)
2013-03-25 15:50:36,096 WARN org.apache.hadoop.mapred.Task - Could not find output size (Task.java:calculateOutputSize:948)
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find output/file.out in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
    at org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:56)
    at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:944)
    at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:924)
    at org.apache.hadoop.mapred.Task.done(Task.java:875)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:374)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
2013-03-25 15:50:36,100 INFO org.apache.hadoop.mapred.LocalJobRunner - (LocalJobRunner.java:statusUpdate:321)
2013-03-25 15:50:36,102 INFO org.apache.hadoop.mapred.Task - Task 'attempt_local_0001_m_00_0' done.(Task.java:sendDone:959)
2013-03-25 15:50:36,111 WARN org.apache.hadoop.mapred.FileOutputCommitter - Output path is null in cleanup(FileOutputCommitter.java:cleanupJob:100)
2013-03-25 15:50:36,111 WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001(LocalJobRunner.java:run:298)
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find output/file.out in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
    at org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:56)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:236)
2013-03-25 15:50:37,029 INFO org.apache.hadoop.mapred.JobClient - map 100% reduce 0%(JobClient.java:monitorAndPrintJob:1301)
2013-03-25 15:50:37,030 INFO org.apache.hadoop.mapred.JobClient - Job complete: job_local_0001(JobClient.java:monitorAndPrintJob:1356)
2013-03-25 15:50:37,031 INFO org.apache.hadoop.mapred.JobClient - Counters: 15(Counters.java:log:585)
2013-03-25 15:50:37,031 INFO org.apache.hadoop.mapred.JobClient - File Input Format Counters (Counters.java:log:587)
2013-03-25 15:50:37,031 INFO org.apache.hadoop.mapred.JobClient - Bytes Read=0(Counters.java:log:589)
2013-03-25 15:50:37,032 INFO org.apache.hadoop.mapred.JobClient - FileSystemCounters(Counters.java:log:587)
2013-03-25 15:50:37,032 INFO org.apache.hadoop.mapred.JobClient - FILE_BYTES_READ=10294950(Counters.java:log:589)
2013-03-25 15:50:37,033 INFO org.apache.hadoop.mapred.JobClient - FILE_BYTES_WRITTEN=10432139(Counters.java:log:589)
2013-03-25 15:50:37,033 INFO org.apache.hadoop.mapred.JobClient - Map-Reduce Framework(Counters.java:log:587)
2013-03-25 15:50:37,033 INFO org.apache.hadoop.mapred.JobClient - Map output materialized bytes=4006(Counters.java:log:589)
2013-03-25 15:50:37,034 INFO org.apache.hadoop.mapred.JobClient - Combine output records=0(Counters.java:log:589)
2013-03-25 15:50:37,034 INFO org.apache.hadoop.mapred.JobClient - Map input records=500(Counters.java:log:589)
2013-03-25 15:50:37,035 INFO org.apache.hadoop.mapred.JobClient - Physical memory (bytes) snapshot=0(Counters.java:log:589)
2013-03-25 15:50:37,035 INFO org.apache.hadoop.mapred.JobClient - Spilled Records=500(Counters.java:log:589)
2013-03-25 15:50:37,035 INFO org.apache.hadoop.mapred.JobClient - Map output bytes=3000(Counters.java:log:589)
2013-03-25 15:50:37,036 INFO org.apache.hadoop.mapred.JobClient - Total committed heap usage (bytes)=202702848(Counters.java:log:589)
2013-03-25 15:50:37,036 INFO org.apache.hadoop.mapred.JobClient - CPU time spent (ms)=0(Counters.java:log:589)
2013-03-25 15:50:37,037 INFO org.apache.hadoop.mapred.JobClient - Virtual memory (bytes) snapshot=0(Counters.java:log:589)
2013-03-25 15:50:37,037 INFO org.apache.hadoop.mapred.JobClient - SPLIT_RAW_BYTES=105(Counters.java:log:589)
2013-03-25 15:50:37,038 INFO org.apache.hadoop.mapred.JobClient - Map output records=500(Counters.java:log:589)
2013-03-25 15:50:37,038 INFO org.apache.hadoop.mapred.JobClient - Combine input

Thanks a lot.

Best Regards
Weibo: http://weibo.com/guowee
Web: http://www.wbkit.com
Re: HBase Writes With Large Number of Columns
Hi Pankaj,

Is it possible for you to profile the RS when this happens? Either it may be that Thrift adds an overhead, or there is somewhere the code is spending more time. As you said, there may be a slight decrease in the performance of the puts because more values now have to go in, but it should not be this significant. We can work based on the profile output and check what we are doing.

Regards
Ram

On Tue, Mar 26, 2013 at 5:19 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

For a total of 1.5kb with 4 columns = 384 bytes/column:
bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 4:384:100 -num_keys 100
13/03/25 14:54:45 INFO util.MultiThreadedAction: [W:100] Keys=991664, cols=3,8m, time=00:03:55 Overall: [keys/s= 4218, latency=23 ms] Current: [keys/s=4097, latency=24 ms], insertedUpTo=-1

For a total of 1.5kb with 100 columns = 15 bytes/column:
bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 100:15:100 -num_keys 100
13/03/25 16:27:44 INFO util.MultiThreadedAction: [W:100] Keys=999721, cols=95,3m, time=01:27:46 Overall: [keys/s= 189, latency=525 ms] Current: [keys/s=162, latency=616 ms], insertedUpTo=-1

So overall, the speed is the same. A bit faster with 100 columns than with 4. I don't think there is any negative impact on the HBase side because of all those columns. Might be interesting to test the same thing over Thrift...

JM

2013/3/25 Pankaj Misra pankaj.mi...@impetus.co.in:

Yes Ted, we have been observing the Thrift API to clearly outperform the native Java HBase API at higher loads, due to its binary communication protocol.

Tariq, the specs of the machine on which we are performing these tests are given below.
Processor: i7-3770K, 8 logical cores (4 physical, with 2 logical per physical core), 3.5 GHz clock speed
RAM: 32 GB DDR3
HDD: one 2 TB SATA disk and two 250 GB SATA disks - 3 disks in total
HDFS and HBase are deployed in pseudo-distributed mode. We have 4 parallel streams writing to HBase.

We used the same setup for the previous tests as well, and to be very frank, we did expect a bit of a drop in performance when we had to test with 40 columns, but did not expect to get half the performance. When we tested with 20 columns, we were consistently getting a write performance of 200 mbps. But with 40 columns we are getting only 90 mbps of throughput on the same setup.

Thanks and Regards
Pankaj Misra

From: Ted Yu [yuzhih...@gmail.com]
Sent: Tuesday, March 26, 2013 1:09 AM
To: user@hbase.apache.org
Subject: Re: HBase Writes With Large Number of Columns

bq. These records are being written using batch mutation with thrift API

This is important information, I think. Batch mutation through the Java API would incur lower overhead.

On Mon, Mar 25, 2013 at 11:40 AM, Pankaj Misra pankaj.mi...@impetus.co.in wrote:

Firstly, thanks a lot Jean and Ted for your extended help, very much appreciated.

Yes Ted, I am writing to all the 40 columns, and the 1.5 KB of record data is distributed across these columns.

Jean, some columns store as little as a single byte, while a few of the columns store as much as 80-125 bytes of data. The overall record size is 1.5 KB. These records are being written using batch mutation with the Thrift API, wherein we are writing 100 records per batch mutation.

Thanks and Regards
Pankaj Misra

From: Jean-Marc Spaggiari [jean-m...@spaggiari.org]
Sent: Monday, March 25, 2013 11:57 PM
To: user@hbase.apache.org
Subject: Re: HBase Writes With Large Number of Columns

I just ran some LoadTest to see if I can reproduce that.

bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 4:512:100 -num_keys 100
13/03/25 14:18:25 INFO util.MultiThreadedAction: [W:100] Keys=997172, cols=3,8m, time=00:03:55 Overall: [keys/s= 4242, latency=23 ms] Current: [keys/s=4413, latency=22 ms], insertedUpTo=-1

bin/hbase org.apache.hadoop.hbase.util.LoadTestTool -write 100:512:100 -num_keys 100
This one crashed because I don't have enough disk space, so I'm re-running it, but just before it crashed it was showing about 24.5x slower, which is coherent since it's writing 25x more columns.

What size of data do you have? Big cells? Small cells? I will retry the test above with more lines and keep you posted.

2013/3/25 Pankaj Misra pankaj.mi...@impetus.co.in:

Yes Ted, you are right, we have the table regions pre-split, and we see that both regions are almost evenly filled in both tests. This does not seem to be a regression though, since we were getting good write rates when we had a smaller number of columns.

Thanks and Regards
Pankaj Misra

From: Ted Yu [yuzhih...@gmail.com]
Sent: Monday, March 25, 2013 11:15 PM
To: user@hbase.apache.org
Cc:
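JM's conclusion that "the speed is the same" follows from converting the two LoadTestTool results quoted above from rows per second to cells per second. A quick check (numbers taken directly from the run output):

```python
# keys/s reported by the two LoadTestTool runs, times columns per row
four_col_rate = 4218 * 4        # 4-column rows  -> cells written per second
hundred_col_rate = 189 * 100    # 100-column rows -> cells written per second
print(four_col_rate, hundred_col_rate)  # 16872 vs 18900
```

The per-cell write rate is comparable, and in fact slightly higher with 100 columns, so the drop in rows per second is explained almost entirely by each row carrying 25x more cells.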
RE: Getting less write throughput due to larger number of columns
When the number of columns (qualifiers) is larger, yes, it can impact performance. In HBase the storage is everywhere in terms of KVs. The key will be something like rowkey+cfname+columnname+TS... So when you have 26 cells in a Put, there will be repetition of many bytes in the keys (one KV per column), so you will end up transferring more data. Within the memstore more data (actual KV data size) gets written, and so more frequent flushes, etc. Have a look at the Intel Panthera Document Store impl.

-Anoop-

From: Ankit Jain [ankitjainc...@gmail.com]
Sent: Monday, March 25, 2013 10:19 PM
To: user@hbase.apache.org
Subject: Getting less write throughput due to larger number of columns

Hi All,

I am writing records into HBase. I ran a performance test on the following two cases:

Set 1: the input record contains 26 columns and the record size is 2 KB.
Set 2: the input record contains 1 column and the record size is 2 KB.

In the second case I am getting 8 MBps more throughput than in the first. Does a large number of columns have an impact on write performance, and if yes, how can we overcome it?

--
Thanks,
Ankit Jain
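Anoop's point about per-KV key repetition can be sketched with a toy byte count. This is illustrative only: the real KeyValue layout also carries length prefixes and a type byte, and the row key, family, and qualifier names below are made up:

```python
def cell_bytes(rowkey, family, qualifier, value_len, ts_len=8):
    # Every cell (KV) carries its full key: row + family + qualifier + timestamp.
    key_len = len(rowkey) + len(family) + len(qualifier) + ts_len
    return key_len + value_len

RECORD = 2048  # 2 KB of user data per record, as in the test above

# Set 2: one column holding the whole record -> the key cost is paid once
one_col = cell_bytes("row-00000001", "f", "q", RECORD)

# Set 1: the same 2 KB spread over 26 columns -> the key cost is paid 26 times
many_cols = sum(cell_bytes("row-00000001", "f", "q%02d" % i, RECORD // 26)
                for i in range(26))

print(one_col, many_cols)  # 2070 vs 2652
```

Even in this toy model the 26-column layout ships roughly a quarter more bytes for the same payload, which is exactly the extra transfer and memstore pressure described above; short row keys and one-letter family/qualifier names keep the overhead down but cannot remove it.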
Re: Compaction problem
Hi,

I tried the following parameters also:

export HBASE_REGIONSERVER_OPTS="-Xmx2g -Xms2g -Xmn256m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"

hbase.regionserver.global.memstore.upperLimit = .50
hbase.regionserver.global.memstore.lowerLimit = .50
hbase.regionserver.handler.count = 30

but still not much effect. Any suggestions on how to improve the ingestion speed?

On Fri, Mar 22, 2013 at 9:04 PM, tarang dawer tarang.da...@gmail.com wrote:

3 region servers: 2 region servers having 5 regions each, 1 having 6 + 2 (meta and root). 1 CF. Set HBASE_HEAPSIZE in hbase-env.sh as 4gb.

Is the flush size okay, or do I need to reduce/increase it? I'll look into the flushQ and compactionQ sizes and get back to you. Do these parameters seem okay to you? If something seems odd / not in order, please do tell.

Thanks
Tarang Dawer

On Fri, Mar 22, 2013 at 8:21 PM, Anoop John anoop.hb...@gmail.com wrote:

How many regions per RS? And CFs in the table? What is the -Xmx for the RS process? You will get 35% of that memory for all the memstores in the RS. hbase.hregion.memstore.flush.size = 1GB!! Can you closely observe the flushQ size and compactionQ size? You may be getting many small file flushes (due to global heap pressure) and subsequently many minor compactions.

-Anoop-

On Fri, Mar 22, 2013 at 8:14 PM, tarang dawer tarang.da...@gmail.com wrote:

Hi

As per my use case, I have to write around 100gb of data with an ingestion speed of around 200 mbps. While writing, I am getting a performance hit from compaction, which adds to the delay. I am using an 8-core machine with 16 gb of RAM available and a 2 Tb 7200 RPM hdd.

Got some ideas from the archives and tried pre-splitting the regions, and configured HBase with the following parameters (configured in haste, so please guide me if anything is out of order):

<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value>
</property>
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>1073741824</value>
</property>
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>1073741824</value>
</property>
<property>
  <name>hbase.hstore.compactionThreshold</name>
  <value>5</value>
</property>
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>
<property>
  <name>hbase.hstore.blockingWaitTime</name>
  <value>3</value>
</property>
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>200</value>
</property>
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>300</value>
</property>

but still I am not able to achieve the optimal rate, getting around 110 mbps. Need some optimizations, so please could you help out?

Thanks
Tarang Dawer

On Fri, Mar 22, 2013 at 6:05 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote:

Hi Tarang,

I would recommend you take a look at the list archives first to see all the discussions related to compaction. You will find many interesting hints and tips.
http://search-hadoop.com/?q=compactions&fc_project=HBase&fc_type=mail+_hash_+user

After that, you will need to provide more details regarding how you are using HBase and how compaction is impacting you.

JM

2013/3/22 tarang dawer tarang.da...@gmail.com:

Hi

I am using HBase 0.94.2 currently. Its write performance is being affected by compaction. Please could you suggest some quick tips on how to deal with it?

Thanks
Tarang Dawer
Re: Compaction problem
What is the rate at which you are flushing? Frequent flushes will cause more files, and compaction may happen frequently but take less time each. If the flush size is increased to a bigger value, then you will end up spending more time in the compaction because the entire file has to be read and rewritten. After you check the flush Q, see what effect you get from increasing the memstore size.

On Tue, Mar 26, 2013 at 12:02 PM, tarang dawer tarang.da...@gmail.com wrote:

(quoted thread snipped; see the "Compaction problem" thread above)
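The flush-size tradeoff described above (many small flushes with cheap compactions vs. few large flushes with expensive ones) can be sketched with a toy count. Assumptions: 100 GB of total ingest as in Tarang's use case, a minor compaction fires every `hbase.hstore.compactionThreshold` = 5 flushed files, and only first-level compactions are counted:

```python
TOTAL_MB = 100 * 1024   # 100 GB to ingest
THRESHOLD = 5           # hbase.hstore.compactionThreshold from the thread

for flush_mb in (128, 1024):
    flushes = TOTAL_MB // flush_mb            # store files written by flushes
    compactions = flushes // THRESHOLD        # how often a minor compaction fires
    mb_per_compaction = flush_mb * THRESHOLD  # data each compaction rewrites
    print(flush_mb, flushes, compactions, mb_per_compaction)
```

With 128 MB flushes there are 160 compactions of 640 MB each; with the configured 1 GB flush size there are only 20, but each one rewrites 5 GB at a stretch, which is the kind of long pause that stalls ingestion.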
Please help me understand how and when to catch java.net.ConnectException when HBase is not running, as the client just keeps trying to reconnect
13/03/26 12:46:14 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
13/03/26 12:46:16 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
13/03/26 12:46:16 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
[the same INFO/WARN pair, with an identical ConnectException stack trace, repeats every second from 12:46:17 through 12:46:22]
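The ZooKeeper client used by HBase retries the connection internally, which is why the ConnectException above keeps appearing in the log instead of propagating to application code. One pragmatic workaround, sketched here as a plain TCP probe rather than any HBase API, is to check that something is listening on the ZooKeeper port before constructing the HBase client:

```python
import socket

def zk_reachable(host="localhost", port=2181, timeout=2.0):
    """Return True if something is listening on the given host:port.

    This is only a reachability probe; it does not verify that the
    listener actually speaks the ZooKeeper protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers ConnectionRefusedError, timeouts, DNS failures
        return False

# Example: fail fast instead of letting the client retry forever.
# if not zk_reachable():
#     raise SystemExit("ZooKeeper/HBase does not appear to be running")
```

This fails fast with a clear message instead of looping on reconnect attempts; the host and port would need to match your hbase.zookeeper.quorum setting.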
Re: Truncate hbase table based on column family
Yes. If there is a table having data in column families F1 and F2, I want to truncate the data of column family F1 alone. Is it possible?

Thanks & Regards,
Varaprasada Reddy

-----Ted Yu yuzhih...@gmail.com wrote: -----
To: user@hbase.apache.org
From: Ted Yu yuzhih...@gmail.com
Date: 03/20/2013 08:12PM
Subject: Re: Truncate hbase table based on column family

Can you clarify your question? Did you mean that you only want to drop certain column families?

Thanks

On Wed, Mar 20, 2013 at 7:15 AM, varaprasad.bh...@polarisft.com wrote:

Hi All,

Can we truncate a table in HBase based on the column family? Please give your comments.

Thanks & Regards,
Varaprasada Reddy

This e-Mail may contain proprietary and confidential information and is sent for the intended recipient(s) only. If by an addressing or transmission error this mail has been misdirected to you, you are requested to delete this mail immediately. You are also hereby notified that any use, any form of reproduction, dissemination, copying, disclosure, modification, distribution and/or publication of this e-mail message, contents or its attachment other than by its intended recipient/s is strictly prohibited. Visit us at http://www.polarisFT.com
RE: Compaction problem
@tarang

As per the 4G max heap size, you will get by default ~1.4G total memory for all the memstores (5/6 regions). By default you get 35% of the heap size for the memstores. Is your process only write-centric? If reads rarely happen, think of increasing this global heap space setting (hbase.regionserver.global.memstore.lowerLimit / hbase.regionserver.global.memstore.upperLimit). Else, can you increase the 4G heap size? (Still, 1G for a memstore might be too much. You are now getting flushes because of global heap pressure before each memstore reaches 1GB.)

hbase.hregion.max.filesize is given as 1 GB. Try increasing this, and see whether region splits are frequently happening in your case. See all compaction-related params. Also tell us about the status of the Qs.

-Anoop-

From: tarang dawer [tarang.da...@gmail.com]
Sent: Friday, March 22, 2013 9:04 PM
To: user@hbase.apache.org
Subject: Re: Compaction problem

(quoted thread snipped; see the "Compaction problem" thread above)
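Anoop's ~1.4G figure is the default global memstore fraction applied to the heap; spreading it over the regions on one RS shows why a 1 GB flush size can never actually be reached (0.35 is the default hbase.regionserver.global.memstore.upperLimit in 0.94-era HBase):

```python
heap_gb = 4.0                          # HBASE_HEAPSIZE from the thread
global_memstore_gb = heap_gb * 0.35    # default upper limit: 35% of heap
regions = 6                            # regions hosted on one region server
per_region_gb = global_memstore_gb / regions

print(round(global_memstore_gb, 2), round(per_region_gb, 2))
```

Each memstore can only grow to about 0.23 GB on average before global heap pressure forces a flush, far short of the configured 1 GB hbase.hregion.memstore.flush.size, hence the many small flushes and minor compactions.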
RE: Truncate hbase table based on column family
@varaprasad

Pls see HBaseAdmin#deleteColumn(). You should disable the table before making any schema changes and enable it back after that.

-Anoop-

From: varaprasad.bh...@polarisft.com [varaprasad.bh...@polarisft.com]
Sent: Tuesday, March 26, 2013 2:15 PM
To: user@hbase.apache.org
Subject: Re: Truncate hbase table based on column family

(quoted thread snipped; see the "Truncate hbase table based on column family" thread above)
What is the output format of org.apache.hadoop.examples.Join?
I am reading the following mail:
http://www.mail-archive.com/core-user@hadoop.apache.org/msg04066.html

After running the following command (I am using Hadoop 1.0.4):

bin/hadoop jar hadoop-examples-1.0.4.jar join \
  -inFormat org.apache.hadoop.mapred.KeyValueTextInputFormat \
  -outKey org.apache.hadoop.io.Text \
  -joinOp outer \
  join/a.txt join/b.txt join/c.txt joinout

I then run bin/hadoop fs -text joinout/part-0 and see the following result:

a0 [,]
b0 [,]
a1 [,]
b1 [,]
b2 [,]
b3 [,]
a2 [,]
a3 [,]

But Chris said that the result should be:

[a0,b0,c0]
[a1,b1,c1]
[a1,b2,c1]
[a1,b3,c1]
[a2,,]
[a3,,]
[,,c2]
[,,c3]

Has Join's output format changed in Hadoop 1.0.4?

--
Jingguo
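For comparison, the expected output above is ordinary full-outer-join semantics over a shared key. A minimal sketch in plain Python (not the Hadoop join implementation; the keys k0..k5 are made-up stand-ins for whatever keys a.txt/b.txt/c.txt actually contain):

```python
from collections import defaultdict
from itertools import product

def full_outer_join(*datasets):
    """Full outer join of keyed datasets: emit one tuple per key combination,
    with '' where a dataset has no value for that key."""
    by_key = defaultdict(lambda: [[] for _ in datasets])
    for i, dataset in enumerate(datasets):
        for key, value in dataset:
            by_key[key][i].append(value)
    rows = []
    for key in sorted(by_key):
        columns = [values or [""] for values in by_key[key]]
        # cross product when one dataset holds several values for a key (b1/b2/b3)
        rows.extend(list(combo) for combo in product(*columns))
    return rows

a = [("k0", "a0"), ("k1", "a1"), ("k2", "a2"), ("k3", "a3")]
b = [("k0", "b0"), ("k1", "b1"), ("k1", "b2"), ("k1", "b3")]
c = [("k0", "c0"), ("k1", "c1"), ("k4", "c2"), ("k5", "c3")]

for row in full_outer_join(a, b, c):
    print(row)
```

The printed rows have exactly the shape Chris described ([a0,b0,c0], then the a1 cross product, then the one-sided rows), which is what a correct outer join over these inputs should produce.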
Re: Crash when running two jobs at the same time with the same HBase table
Hi,

So basically, you have one job which is reading from A and writing to B, and one which is reading from A and writing to C, and the two jobs are running at the same time. Is that correct? Are you able to reproduce this each time you run the jobs? Which HBase and Hadoop versions are you running?

JM

2013/3/26 GuoWei wei@wbkit.com: Dear, When I run two MR jobs at the same time which read the same HBase table and write to another, shared HBase table, one job finishes successfully and the other crashes. The error log is below. Please help me find out why. [...]
HBase and Hadoop version
I am evaluating HBase 0.94.5 on a test cluster that happens to be running Hadoop 0.20.2-cdh3u5. I've seen the compatibility warnings, but I'm just taking a first look at the features and not even thinking about production for the moment, so nothing disastrous will happen even in the worst case. My question is: what should I expect to go wrong with this particular version mismatch?
Re: HBase M/R with M/R and HBase not on same cluster
Hi Michael,

The reason is that cluster B is a production environment with jobs running on it non-stop; I do not want to take resources away from it. Secondly, the destination cluster A is a much less powerful test environment, so even when running the job on B, the slow HBase sink on cluster A would be a bottleneck. What I did in the end was run a regular job on cluster A with the input path set to a file on cluster B.

/David

On Mon, Mar 25, 2013 at 5:12 PM, Michael Segel michael_se...@hotmail.com wrote: Just out of curiosity... why do you want to run the job on cluster A that reads from cluster B but writes to cluster A? Wouldn't it be easier to run the job on cluster B and, inside Mapper.setup(), create your own configuration for your second cluster for output?

On Mar 24, 2013, at 7:49 AM, David Koch ogd...@googlemail.com wrote: Hello J-D, Thanks, it was instructive to look at the source. However, I am now stuck getting HBase to honor the hbase.mapred.output.quorum setting. I opened a separate topic for this. Regards, /David

On Mon, Mar 18, 2013 at 11:26 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: Check out how CopyTable does it: https://github.com/apache/hbase/blob/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java J-D

On Mon, Mar 18, 2013 at 3:09 PM, David Koch ogd...@googlemail.com wrote: Hello, Is it possible to run an M/R job on cluster A over a table that resides on cluster B, with output to a table on cluster A? If so, how? I am interested in doing this for the purpose of copying part of a table from B to A. Cluster B is a production environment, cluster A is a slow test platform. I do not want the M/R to run on B since it would block precious slots on that cluster; otherwise I could just run CopyTable on cluster B and specify cluster A as the output quorum. Could this work by pointing the client configuration at the mapred-site.xml of cluster A and the hdfs-site.xml and hbase-site.xml of cluster B?
In this scenario, in order to output to cluster A, I guess I'd have to set TableOutputFormat.QUORUM_ADDRESS to cluster A. I use a client configuration generated by CDH4 and there are some other files floating around, such as core-site.xml; I'm not sure what to do with that. Thank you, /David
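For the single-table case, setting TableOutputFormat.QUORUM_ADDRESS is what the seven-argument TableMapReduceUtil.initTableReducerJob overload does. A hedged fragment of job setup, assuming HBase 0.92/0.94-era APIs; the table name and cluster key are placeholders, not values from the thread:

```java
// Sketch only (throws IOException in real code). The quorumAddress argument
// ("zk-host:port:znode-parent") uses the same cluster-key format that
// CopyTable passes via --peer.adr, and lands in TableOutputFormat.QUORUM_ADDRESS.
Job job = new Job(conf, "copy-part-of-table-to-cluster-A");
TableMapReduceUtil.initTableReducerJob(
    "destTable",                     // output table on cluster A (placeholder)
    IdentityTableReducer.class,      // pass-through reducer shipped with HBase
    job,
    null,                            // default partitioner
    "quorum-hostname-A:2181:/hbase", // cluster key of the output cluster
    null, null);                     // default region server class/impl
```

The job itself still runs on, and reads from, whichever cluster the mapred and input configuration point at; only the sink follows the quorum address.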
Re: hbase.mapred.output.quorum ignored in Mapper job with HDFS source and HBase sink
Hello Ted,

Yes, I'll put in a request and add a baseline example to reproduce the issue. Thank you for helping me get to the bottom of this,

/David

On Sun, Mar 24, 2013 at 3:35 PM, Ted Yu yuzhih...@gmail.com wrote: Looks like MultiTableOutputFormat doesn't support this use case; MultiTableOutputFormat doesn't extend TableOutputFormat:

  public class MultiTableOutputFormat extends OutputFormat<ImmutableBytesWritable, Mutation> {

The relevant configuration is set up in TableOutputFormat#setConf():

  public void setConf(Configuration otherConf) {
    this.conf = HBaseConfiguration.create(otherConf);
    String tableName = this.conf.get(OUTPUT_TABLE);
    if (tableName == null || tableName.length() <= 0) {
      throw new IllegalArgumentException("Must specify table name");
    }
    String address = this.conf.get(QUORUM_ADDRESS);
    int zkClientPort = conf.getInt(QUORUM_PORT, 0);
    String serverClass = this.conf.get(REGION_SERVER_CLASS);
    String serverImpl = this.conf.get(REGION_SERVER_IMPL);
    try {
      if (address != null) {
        ZKUtil.applyClusterKeyToConf(this.conf, address);
      }

Mind filing a JIRA for enhancement?

On Sun, Mar 24, 2013 at 5:46 AM, David Koch ogd...@googlemail.com wrote: Hello, I want to import a file on HDFS from one cluster A (source) into HBase tables on a different cluster B (destination) using a Mapper job with an HBase sink. Both clusters run HBase. This setup works fine:

- Run the Mapper job on cluster B (destination)
- mapred.input.dir -> hdfs://cluster-A/path-to-file (file on source cluster)
- hbase.zookeeper.quorum -> quorum-hostname-B
- hbase.zookeeper.property.clientPort -> quorum-port-B

I thought it should be possible to run the job on cluster A (source) and use hbase.mapred.output.quorum to insert into the tables on cluster B; this is what the CopyTable utility does. However, the following does not work.
HBase looks for the destination table(s) on cluster A and NOT cluster B:

- Run the Mapper job on cluster A (source)
- mapred.input.dir -> hdfs://cluster-A/path-to-file (file is local)
- hbase.zookeeper.quorum -> quorum-hostname-A
- hbase.zookeeper.property.clientPort -> quorum-port-A
- hbase.mapred.output.quorum -> quorum-hostname-B:2181:/hbase (same as the --peer.adr argument for CopyTable)

Job setup inside the class MyJob is as follows; note I am using MultiTableOutputFormat:

  Configuration conf = HBaseConfiguration.addHbaseResources(getConf());
  Job job = new Job(conf);
  job.setJarByClass(MyJob.class);
  job.setMapperClass(JsonImporterMapper.class);
  // Note, several output tables!
  job.setOutputFormatClass(MultiTableOutputFormat.class);
  job.setNumReduceTasks(0);
  TableMapReduceUtil.addDependencyJars(job);
  TableMapReduceUtil.addDependencyJars(job.getConfiguration());

where the Mapper class has the following frame:

  public static class JsonImporterMapper extends
      Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
  }

Is this expected behaviour? How can I get the second scenario, using hbase.mapred.output.quorum, to work? Could the fact that I am using MultiTableOutputFormat instead of TableOutputFormat play a part? I am using HBase 0.92.1. Thank you, /David
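Since the input in this scenario is an HDFS file rather than an HBase table, one hedged workaround (an untested fragment, assuming HBase 0.92-era APIs) is to skip hbase.mapred.output.quorum entirely and apply cluster B's key to the job configuration itself, so that the only HBase client in the job, the MultiTableOutputFormat sink, points at B:

```java
// Untested sketch: nothing in this job reads HBase on cluster A, so point
// the whole HBase client configuration at cluster B before submission.
// ZKUtil.applyClusterKeyToConf throws IOException in real code; the cluster
// key below is a placeholder in the same format as CopyTable's --peer.adr.
Configuration conf = HBaseConfiguration.create();
ZKUtil.applyClusterKeyToConf(conf, "quorum-hostname-B:2181:/hbase");
Job job = new Job(conf, "import-json-to-cluster-B");
job.setOutputFormatClass(MultiTableOutputFormat.class);
```

The trade-off is that every HBase client call in the job now targets B, which is exactly what is wanted here but would not work if the mappers also had to read HBase tables on A.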
Re: HBase and Hadoop version
On Tue, Mar 26, 2013 at 6:56 AM, Robert Hamilton rhamil...@whalesharkmedia.com wrote: I am evaluating HBase 0.94.5 on a test cluster that happens to be running Hadoop 0.20.2-cdh3u5. [...] My question is, what should I expect to go wrong with this particular version mismatch?

Nothing; in fact I've deployed pretty much that setup in production before.

J-D
NPE in log cleaner
We are seeing a lot of these NPEs since we enabled replication. HBase version: 0.92.1-cdh4.0.1

  2013-03-25 15:29:25,027 ERROR org.apache.hadoop.hbase.master.LogCleaner: Caught exception
  java.lang.NullPointerException
    at org.apache.hadoop.hbase.replication.ReplicationZookeeper.getListOfReplicators(ReplicationZookeeper.java:503)
    at org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.refreshHLogsAndSearch(ReplicationLogCleaner.java:96)
    at org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner.isLogDeletable(ReplicationLogCleaner.java:83)
    at org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:133)
    at org.apache.hadoop.hbase.Chore.run(Chore.java:67)
    at org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:155)
    at java.lang.Thread.run(Thread.java:662)

Any ideas as to what is going on? It seems like we should never be getting an NPE.

~Jeff

--
Jeff Whiting
Qualtrics Senior Software Engineer
je...@qualtrics.com
blog on compaction
Hi, Recently there have been some questions and answers about compaction in HBase. I came up with the following blog post, which covers a portion of the discussions: http://zhihongyu.blogspot.com/2013/03/compactions-q.html Special thanks go to J-D, Anoop, Ramkrishna, Sergey, Elliot and Jean-Marc. Cheers
Re: NPE in log cleaner
How did you enable it? Going by the line numbers of the stack trace, it is very unlikely to get an NPE there. Did you see anything suspicious in the RS logs when enabling it? Can you pastebin more RS logs from when you enable it?

Himanshu

On Tue, Mar 26, 2013 at 9:57 AM, Jeff Whiting je...@qualtrics.com wrote: We are seeing a lot of these NPEs since we enabled replication. HBase version: 0.92.1-cdh4.0.1 [...] Any ideas as to what is going on? It seems like we should never be getting an NPE. ~Jeff
Cannot run selected test under 0.94, is OK under trunk/95 though.
Hi,

I can run either the full test suite or a selected test under either trunk or 0.95. But after I check out branch 0.94, I find I cannot run a selected test anymore; the full test suite still runs, though. I am adding a new unit test and don't want to run the whole test suite each time. Does anyone have any idea what went wrong here? I did the following:

  git checkout -b myChange remote/origin/0.94
  mvn -Dtest=TestAdmin test

  [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12-TRUNK-HBASE-2:test (default-test) on project hbase: No tests were executed! (Set -DfailIfNoTests=false to ignore this error.) -> [Help 1]
  org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12-TRUNK-HBASE-2:test (default-test) on project hbase: No tests were executed! (Set -DfailIfNoTests=false to ignore this error.)

Thanks,
Tian-Ying
Re: Cannot run selected test under 0.94, is OK under trunk/95 though.
Use one of the Maven profile switches: -PrunAllTests or -PlocalTests.

On Tue, Mar 26, 2013 at 9:52 PM, Tianying Chang tich...@ebaysf.com wrote: Hi, I can run either the full test suite or a selected test under either trunk or 0.95. But after I check out branch 0.94, I cannot run a selected test anymore. [...] Thanks, Tian-Ying

--
Best regards,

- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
Re: blog on compaction
Thanks to Anoop, who pointed out that hbase.hstore.compactionThreshold is deprecated in 0.95 and beyond. Feel free to comment on the blog; please do not promote non-open-source software or services on my blog. Cheers

On Tue, Mar 26, 2013 at 10:19 AM, Ted Yu yuzhih...@gmail.com wrote: Hi, Recently there have been some questions and answers about compaction in HBase. I came up with the following blog post, which covers a portion of the discussions: http://zhihongyu.blogspot.com/2013/03/compactions-q.html [...]
Re: Compaction problem
The first thing I would do to find the bottleneck is to benchmark solo HDFS performance. Create a 16 GB file (using dd), which is 2x your memory, and run:

  time hadoop fs -copyFromLocal yourFile.txt /tmp/a.txt

Tell us the speed of this file copy in MB/sec.

On Mar 22, 2013, at 4:44 PM, tarang dawer tarang.da...@gmail.com wrote: Hi, As per my use case, I have to write around 100 GB of data at an ingestion speed of around 200 mbps. While writing, I am getting a performance hit from compaction, which adds to the delay. I am using an 8-core machine with 16 GB of RAM and a 2 TB HDD at 7200 RPM. I got some ideas from the archives and tried pre-splitting the regions, and configured HBase with the following parameters (configured in haste, so please tell me if anything's out of order):

  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>4</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>1073741824</value>
  </property>
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>1073741824</value>
  </property>
  <property>
    <name>hbase.hstore.compactionThreshold</name>
    <value>5</value>
  </property>
  <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value>
  </property>
  <property>
    <name>hbase.hstore.blockingWaitTime</name>
    <value>3</value>
  </property>
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>200</value>
  </property>
  <property>
    <name>hbase.regionserver.lease.period</name>
    <value>300</value>
  </property>

but I'm still not able to achieve the optimal rate; I'm getting around 110 mbps. I need some optimizations, so could you please help out?

Thanks,
Tarang Dawer

On Fri, Mar 22, 2013 at 6:05 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Tarang, I recommend you take a look at the list archives first to see all the discussions related to compaction. You will find many interesting hints and tips.
http://search-hadoop.com/?q=compactions&fc_project=HBase&fc_type=mail+_hash_+user

After that, you will need to provide more details about how you are using HBase and how compaction is impacting you.

JM

2013/3/22 tarang dawer tarang.da...@gmail.com: Hi, I am currently using HBase 0.94.2. Its write performance is being affected by compaction. Could you please suggest some quick tips on how to deal with it? Thanks, Tarang Dawer
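The benchmark suggested in this thread can be sketched as a short shell snippet. This is illustrative only: the file size and paths are placeholders, and the 2x-RAM sizing is there so the OS page cache cannot mask raw disk/HDFS throughput.

```shell
# Create a 16 GB test file of zeros (roughly 2x RAM on the box in question).
dd if=/dev/zero of=yourFile.txt bs=1M count=16384

# Time the copy into HDFS; throughput in MB/s is 16384 divided by the
# elapsed ("real") seconds that `time` reports.
time hadoop fs -copyFromLocal yourFile.txt /tmp/a.txt
```

If the resulting MB/s is already close to the 110 mbps observed, the bottleneck is the disk/HDFS layer rather than HBase compaction settings.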
RE: Cannot run selected test under 0.94, is OK under trunk/95 though.
Andrew, thanks! -PlocalTests works!

Tian-Ying

From: Andrew Purtell [apurt...@apache.org]
Sent: Tuesday, March 26, 2013 1:54 PM
To: user@hbase.apache.org
Subject: Re: Cannot run selected test under 0.94, is OK under trunk/95 though.

Use one of the Maven profile switches: -PrunAllTests or -PlocalTests. [...]
Re: Crash when running two jobs at the same time with the same HBase table
Dear JM,

That's correct. The HBase version is 0.94.2 and the Hadoop version is 0.20.2 / 1.0.4; we tested this on both Hadoop 0.20.2 and 1.0.4, and the error is still there.

Thanks a lot.

Best Regards / 商祺
郭伟 Guo Wei
-
Western Bridge Tech Ltd., Nanjing (南京西桥科技有限公司)
No. 511, Building 1, No. 8, Hua Yuan Road, Xuanwu District, Nanjing, PR China (南京市玄武区花园路8号一号楼511)
Email: wei@wbkit.com  Tel: +86 25 8528 4900 (Operator)  Mobile: +86 138 1589 8257  Fax: +86 25 8528 4980
Weibo: http://weibo.com/guowee  Web: http://www.wbkit.com
-
WesternBridge Tech: Professional software service provider. Professional is MANNER as well as CAPABILITY.

On 2013-3-26, at 9:18 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi, So basically, you have one job which is reading from A and writing to B, and one which is reading from A and writing to C, and the two jobs are running at the same time. Is that correct? Are you able to reproduce this each time you run the jobs? Which HBase and Hadoop versions are you running? JM [...]
Re: Crash when running two jobs at the same time with the same HBase table
Interesting, I need to check this. Maybe we should configure different names for the local output directory for each job. By any chance, are both jobs writing to the same path?

Regards,
Ram

On Wed, Mar 27, 2013 at 6:44 AM, GuoWei wei@wbkit.com wrote: Dear JM, That's correct. The HBase version is 0.94.2 and the Hadoop version is 0.20.2 / 1.0.4; we tested this on both Hadoop 0.20.2 and 1.0.4, and the error is still there. Thanks a lot. [...]
Re: Crash when running two jobs at the same time with the same HBase table
Dear Ram,

How do I set different names for the local output directory for each job? Does the SDK support this? Thanks.

Best Regards,
Guo Wei
Weibo: http://weibo.com/guowee  Web: http://www.wbkit.com

On 2013-3-27, at 10:58 AM, ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com wrote: Interesting, I need to check this. Maybe we should configure different names for the local output directory for each job. By any chance, are both jobs writing to the same path? Regards, Ram [...]
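On the local-output-directory question: I'm not aware of a dedicated API for this, but the failing path (output/file.out) comes from LocalJobRunner's scratch space, which in Hadoop 1.x is governed by the mapred.local.dir (and hadoop.tmp.dir) properties. One hedged workaround is to give each concurrently submitted job its own scratch directory. A minimal sketch; the helper name and paths are hypothetical:

```java
import java.util.UUID;

public class UniqueLocalDir {
    // Hedged sketch: derive a per-job scratch directory so two concurrent
    // LocalJobRunner jobs cannot race on the same output/file.out.
    // Apply it on the job's Configuration before submission, e.g.
    //   conf.set("mapred.local.dir", uniqueLocalDir("/tmp/mapred-local", "jobA"));
    static String uniqueLocalDir(String base, String jobName) {
        // UUID suffix keeps directories distinct even for identically named jobs.
        return base + "/" + jobName + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        System.out.println(uniqueLocalDir("/tmp/mapred-local", "jobA"));
    }
}
```

Note this only sidesteps the collision in local (non-distributed) mode; on a real cluster the TaskTrackers manage per-attempt directories themselves.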