Re: independent scans to same region processed serially
Filed https://issues.apache.org/jira/browse/HBASE-7805 with a test case attached. It occurs only if the table has a region observer coprocessor. James On 02/09/2013 11:04 AM, lars hofhansl wrote: If I execute multiple scans in parallel against different parts of the same region, they appear to be processed serially. It's actually faster from the client side to execute a single serial scan than to execute multiple parallel scans over different segments of the region. I do have region observer coprocessors for the table I'm scanning, but my code is not doing any synchronization.
Re: Region Servers crashing following: File does not exist, Too many open files exceptions
Yes, the limit is at 65535. /David On Sun, Feb 10, 2013 at 4:22 AM, Marcos Ortiz mlor...@uci.cu wrote: Did you increase the number of open files in your /etc/security/limits.conf on your system? On 02/09/2013 09:17 PM, David Koch wrote: Hello, Thank you for your reply. I checked the HDFS log for error messages that are indicative of xceiver problems but could not find any. The settings suggested here: http://blog.cloudera.com/blog/2012/03/hbase-hadoop-xceivers/ have been applied on our cluster. I did a grep "File does not exist: /hbase/table_name/" /var/log/hadoop-hdfs/hadoop-cmf-hdfs1-NAMENODE-big* | wc on the namenode logs and there are millions of such lines for one table only. The count is 0 for all other tables - even though they may be reported as inconsistent by hbck. It seems like this is less of a performance issue and more of a stale "where to find what data" problem - possibly related to ZooKeeper? I remember there being some kind of procedure for clearing ZK, even though I cannot recall the steps involved. Any further help would be appreciated. Thanks, /David On Sun, Feb 10, 2013 at 2:24 AM, Dhaval Shah prince_mithi...@yahoo.co.in wrote: It seems like you need to increase the limit on the number of xceivers in the hdfs config, looking at your error messages. -- On Sun 10 Feb, 2013 6:37 AM IST David Koch wrote: Hello, As of late, we have been having issues with Region Servers crashing in our cluster. This happens while running Map/Reduce jobs over HBase tables in particular, but also spontaneously when the cluster is seemingly idle. Restarting the Region Servers, or even HBase entirely as well as the HDFS and Map/Reduce services, does not fix the problem, and jobs will fail during the next attempt citing "Region not served" exceptions. It is not always the same nodes that crash. The log data during the minutes leading up to the crash contain many File does not exist /hbase/table_name/... 
error messages, which change to Too many open files messages; finally, there are a few Failed to renew lease for DFSClient messages followed by several FATAL messages about HLog not being able to sync, and immediately afterwards a terminal ABORTING region server. You can find an extract of a Region Server log here: http://pastebin.com/G39LQyQT. Running hbase hbck reveals inconsistencies in some tables, but attempting a repair with hbase hbck -repair stalls due to some regions being in transition, see here: http://pastebin.com/JAbcQ4cc. The setup contains 30 machines, 26GB RAM each, and the services are managed using CDH4, so the HBase version is 0.92.x. We did not tweak any of the default configuration settings; however, table scans are done with sensible scan/batch/filter settings. Data intake is about 100GB/day, which is added at a time when no Map/Reduce jobs are running. Tables have between 100 * 10^6 and 2 * 10^9 rows, with an average of 10 KVs per row, about 1kb each. Very few rows exceed 10^6 KVs. What can we do to fix these issues? Are they symptomatic of a mis-configured setup or of some critical threshold being reached? The cluster used to be stable. Thank you, /David -- Marcos Ortiz Valmaseda, Product Manager Data Scientist at UCI Blog: http://marcosluis2186.posterous.com Twitter: @marcosluis2186 http://twitter.com/marcosluis2186
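For reference, the per-user file-descriptor limit Marcos asks about is set in /etc/security/limits.conf. The entries below are an illustrative sketch only: the user names and the 65535 value are the ones discussed in this thread, not a recommendation for every setup:

```text
# /etc/security/limits.conf -- raise the open-file limit for the
# users that run the HBase/HDFS/MapReduce daemons
# ("-" sets both the soft and the hard limit)
hbase   -   nofile  65535
hdfs    -   nofile  65535
mapred  -   nofile  65535
```

The new limit only applies to sessions started after the change (or after restarting the services); running `ulimit -n` as the affected user shows the value currently in effect.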
Re: Region Servers crashing following: File does not exist, Too many open files exceptions
On Sun, Feb 10, 2013 at 6:21 PM, David Koch ogd...@googlemail.com wrote: problems but could not find any. The settings

Increase the ulimit for the user you are using to start the hadoop and hbase services, in the OS. ∞ Shashwat Shriparv
Re: Moving master and namenode...
Hi Jean, Steps to be followed during migration of the namenode:
1. Build the new server with the same hostname.
2. Install Hadoop.
3. Copy the metadata from the old server and place it on the new server.
4. Make sure all the datanodes are down.
5. Stop the old namenode.
6. Start the new namenode with the old metadata.
7. If it comes up properly, start all datanodes.
On Sat, Feb 9, 2013 at 9:32 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi, I have added one new server to my cluster and I want to move the master and the namenode onto this new server, and install the previous master as a datanode/RS. On the existing namenode, I will need to back up the NameNode storage directories and restore them on the new namenode, and then reconfigure all the servers to use this new namenode. Is that the right way to proceed? Any risks for the data? Regarding the master, I simply need to configure all the region servers to point to this new master? Nothing I need to transfer from the existing master to the new one? I just want to make sure I don't miss something and lose everything. Thanks, JM -- Regards, Varun Kumar.P
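Varun's steps might look roughly like the following on the command line. This is an illustrative outline only, not a definitive procedure: the metadata path shown is an assumption, and the real location is whatever dfs.name.dir points to in hdfs-site.xml:

```text
# 1-2. provision the new host with the same hostname, install Hadoop
# 3.   with the old NameNode stopped, copy its metadata directory
scp -r oldnn:/data/dfs/name/ newnn:/data/dfs/name/
# 4-5. make sure all DataNodes and the old NameNode are down
# 6.   start the new NameNode on the copied metadata
hadoop-daemon.sh start namenode
# 7.   if it comes up cleanly, start the DataNodes
```

Keeping the hostname identical (step 1) avoids having to touch every worker's configuration or any stored NN URLs.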
Re: Region Servers crashing following: File does not exist, Too many open files exceptions
Like I said, the maximum permissible number of file handles is set to 65535 for the users hbase (the one that starts HBase), mapred and hdfs. The too many open files warning occurs on the region servers but not on the HDFS namenode. /David On Sun, Feb 10, 2013 at 3:53 PM, shashwat shriparv dwivedishash...@gmail.com wrote: On Sun, Feb 10, 2013 at 6:21 PM, David Koch ogd...@googlemail.com wrote: problems but could not find any. The settings

Increase the ulimit for the user you are using to start the hadoop and hbase services, in the OS. ∞ Shashwat Shriparv
Re: Moving master and namenode...
The master does not have any local storage in a fully distributed setup, so the transfer can be as easy as starting the new master on the new host and failing over to it (by killing the original one). The NameNode move is the part that gets tricky. HBase may store NN URLs in its transient ZK storage for purposes such as distributed log splitting, etc. (correct me if this has been addressed), and if the NN move ends up changing its hostname, then before you start HBase you may want to scour the /hbase znodes in ZK and remove the znodes with a reference to the older NN URL in their data. Most just prefer to erase the whole /hbase znode, but it would work either way. The goal is to not have the new master or restarted slaves run into issues when they pick up a URL they can no longer fetch for processing. On Sat, Feb 9, 2013 at 9:32 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi, I have added one new server to my cluster and I want to move the master and the namenode onto this new server, and install the previous master as a datanode/RS. On the existing namenode, I will need to back up the NameNode storage directories and restore them on the new namenode, and then reconfigure all the servers to use this new namenode. Is that the right way to proceed? Any risks for the data? Regarding the master, I simply need to configure all the region servers to point to this new master? Nothing I need to transfer from the existing master to the new one? I just want to make sure I don't miss something and lose everything. Thanks, JM -- Harsh J
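The znode cleanup Harsh describes can be done from the HBase ZooKeeper shell. An illustrative session (run with HBase stopped; prompts abbreviated) might look like:

```text
$ hbase zkcli
[zk] ls /hbase
[zk] rmr /hbase      # or delete only the znodes referencing the old NN URL
```

Note that rmr is recursive: deleting all of /hbase discards transient state such as pending log-splitting tasks, which is the wholesale shortcut mentioned above; selectively deleting only the stale znodes is the more surgical option.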
Re: Get on a row with multiple columns
Back to BulkDeleteEndpoint, I got it to work, but why are the scanner.next() calls executing on the Priority handler queue? Varun On Sat, Feb 9, 2013 at 8:46 AM, lars hofhansl la...@apache.org wrote: The answer is probably :) It's disabled in 0.96 by default. Check out HBASE-7008 ( https://issues.apache.org/jira/browse/HBASE-7008) and the discussion there. Also check out the discussion in HBASE-5943 and HADOOP-8069 ( https://issues.apache.org/jira/browse/HADOOP-8069) -- Lars From: Jean-Marc Spaggiari jean-m...@spaggiari.org To: user@hbase.apache.org Sent: Saturday, February 9, 2013 5:02 AM Subject: Re: Get on a row with multiple columns Lars, should we always consider disabling Nagle's? What's the downside? JM 2013/2/9, Varun Sharma va...@pinterest.com: Yeah, I meant true... On Sat, Feb 9, 2013 at 12:17 AM, lars hofhansl la...@apache.org wrote: Should be set to true. If tcpnodelay is set to true, Nagle's is disabled. -- Lars From: Varun Sharma va...@pinterest.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Saturday, February 9, 2013 12:11 AM Subject: Re: Get on a row with multiple columns Okay, I did my research - these need to be set to false. I agree. On Sat, Feb 9, 2013 at 12:05 AM, Varun Sharma va...@pinterest.com wrote: I have ipc.client.tcpnodelay, ipc.server.tcpnodelay set to false and the hbase one - [hbase].ipc.client.tcpnodelay set to true. Do these induce network latency? On Fri, Feb 8, 2013 at 11:57 PM, lars hofhansl la...@apache.org wrote: Sorry.. I meant set these two config parameters to true (not false as I state below). - Original Message - From: lars hofhansl la...@apache.org To: user@hbase.apache.org user@hbase.apache.org Cc: Sent: Friday, February 8, 2013 11:41 PM Subject: Re: Get on a row with multiple columns Only somewhat related. Seeing the magic 40ms random read time there. Did you disable Nagle's? (set hbase.ipc.client.tcpnodelay and ipc.server.tcpnodelay to false in hbase-site.xml). 
From: Varun Sharma va...@pinterest.com To: user@hbase.apache.org; lars hofhansl la...@apache.org Sent: Friday, February 8, 2013 10:45 PM Subject: Re: Get on a row with multiple columns The use case is like your twitter feed. Tweets from people you follow. When someone unfollows, you need to delete a bunch of their tweets from the following feed. So it's frequent, and we are essentially running into some extreme corner cases like the one above. We need high write throughput for this, since when someone tweets, we need to fan out the tweet to all the followers. We need the ability to do fast deletes (unfollow) and fast adds (follow) and also be able to do fast random gets - when a real user loads the feed. I doubt we will be able to play much with the schema here since we need to support a bunch of use cases. @lars: It does not take 30 seconds to place 300 delete markers. It takes 30 seconds to first find which of those 300 pins are in the set of columns present - this invokes 300 gets - and then place the appropriate delete markers. Note that we can have tens of thousands of columns in a single row, so a single get is not cheap. If we were to just place delete markers, that would be very fast. But when we started doing that, our random read performance suffered because of too many delete markers. The 90th percentile on random reads shot up from 40 milliseconds to 150 milliseconds, which is not acceptable for our use case. Thanks Varun On Fri, Feb 8, 2013 at 10:33 PM, lars hofhansl la...@apache.org wrote: Can you organize your columns and then delete by column family? deleteColumn without specifying a TS is expensive, since HBase first has to figure out what the latest TS is. Should be better in 0.94.1 or later since deletes are batched like Puts (still need to retrieve the latest version, though). In 0.94.3 or later you can also use the BulkDeleteEndpoint, which basically lets you specify a scan condition and then place a specific delete marker for all KVs encountered. 
If you wanted to get really fancy, you could hook up a coprocessor to the compaction process and simply filter out all KVs you no longer want (without ever placing any delete markers). Are you saying it takes 15 seconds to place 300 version delete markers?! -- Lars From: Varun Sharma va...@pinterest.com To: user@hbase.apache.org Sent: Friday, February 8, 2013 10:05 PM Subject: Re: Get on a row with multiple columns We are given a set of 300 columns to delete. I tested two cases: 1) deleteColumns() - with the 's'. This function simply adds delete markers
Re: Behaviour on Get call with ColumnPaginationFilter
ColumnPaginationFilter wouldn't load the entire row into memory; it stops reading the row once the page is filled:

public ReturnCode filterKeyValue(KeyValue v) {
  if (count >= offset + limit) {
    return ReturnCode.NEXT_ROW;
  }

Cheers On Sat, Feb 9, 2013 at 8:53 PM, Varun Sharma va...@pinterest.com wrote: Hi, If I am running the following call:

Get get = new Get(row);
get.setFilter(new ColumnPaginationFilter(50, 0));

Does this load the entire row into memory? Or will it only try to fetch as much as needed (basically load only the relevant hfile blocks into the block cache instead of loading all the hfile blocks containing the row)? Thanks Varun
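Ted's point - the filter returns NEXT_ROW once count reaches offset + limit, so the server stops reading the row at that point - can be sketched in plain Java without the HBase types. The class and method names here are mine, not HBase's; this just mirrors the offset/limit bookkeeping of the real filter:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnPagination {
    // Mimics ColumnPaginationFilter(limit, offset): skip the first `offset`
    // columns of a row, keep the next `limit`, then stop reading the row
    // entirely (the real filter returns ReturnCode.NEXT_ROW at that point).
    static List<String> paginate(List<String> sortedColumns, int limit, int offset) {
        List<String> out = new ArrayList<>();
        int count = 0;
        for (String col : sortedColumns) {
            if (count >= offset + limit) {
                break; // NEXT_ROW: no further columns of this row are scanned
            }
            if (count >= offset) {
                out.add(col);
            }
            count++;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("c1", "c2", "c3", "c4", "c5");
        System.out.println(paginate(cols, 2, 1)); // [c2, c3]
    }
}
```

Because the early exit happens server-side during the scan, only the HFile blocks covering the requested page need to be touched, which is why the whole row is not pulled into memory.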
Re: HBase Region/Table Hotspotting
Hi Joarder, Welcome to the HBase world. Let me take some time to address your questions the best I can:

1. How often are you facing Region or Table Hotspotting in HBase production systems? --- Hotspotting is not something that just happens. It is usually caused by bad key design and writing to one region more than the others. I would recommend watching some of Lars's YouTube videos on Schema Design in HBase.

2. If a hotspot is created, how quickly is it automatically cleared out (assuming sudden workload change)? --- It will not be automatically cleared out; I think you may be misinformed here. Basically, it is on you to watch your table and your write distribution, determine that you have a hotspot, and take the necessary action. Usually the only action is to split the region. If hotspots become a habitual problem you would most likely want to go back and re-evaluate your current key.

3. How often does this kind of situation happen - is a hotspot detected and gone before you take action, or does it stay for a longer period of time? --- Please see above.

4. If the hotspot stays, how is it handled (in general) in a production system? --- Some people hotspot on purpose early on, because they only write to a subset of regions. You will have to manually watch for hotspots (which is much easier in later releases).

5. How is large data transfer cost minimized or avoided when re-sharding regions within a cluster in a single data center or over a WAN? --- Not quite sure what you are asking here, so I will take a best guess at it. Sharding is handled in HBase by region splitting. The best way to succeed with HBase is to understand your data and your needs BEFORE you create your table and start writing into HBase. This way you can presplit your table to handle the incoming data and you won't have to do a massive amount of splits. Later you can allow HBase to split your tables automatically, or you can set the max file size high and manually control the splits or sharding.

6. Is hotspotting in an HBase cluster really a (big!) issue nowadays for OLAP workloads and real-time analytics? --- Just design your schema correctly and this should not be a problem for you.

Please let me know if this answers your questions. On Sun, Feb 10, 2013 at 9:17 PM, Joarder KAMAL joard...@gmail.com wrote: This is my first email in the group. I am asking a more general and open-ended question but hope to get some reasoning from the HBase user communities. I am a very basic HBase user and still learning. My intention is to use HBase in one of our research projects. Recently I was looking through Lars George's book HBase - The Definitive Guide and two particular topics caught my eye. One is 'Region and Table Hotspotting' and the other is 'Region Auto-Sharding and Merging'. *Scenario:* If a hotspot is created in a particular region or in a table (having multiple regions) due to a sudden workload change, then one may split the region into further small pieces and distribute them to a number of available physical machines in the cluster. This process should require large data transfer between different machines in the cluster and incur a performance cost. One may also change the 'key' definition and manage the regions. But I am not sure how effective or logical it is to change key designs on a production system. *Questions:* 1. How often are you facing Region or Table Hotspotting in HBase production systems? 2. If a hotspot is created, how quickly is it automatically cleared out (assuming sudden workload change)? 3. How often does this kind of situation happen - is a hotspot detected and gone before taking action, or do hotspots stay for a longer period of time? 4. If the hotspot stays, how is it handled (in general) in a production system? 5. How is large data transfer cost minimized or avoided when re-sharding regions within a cluster in a single data center or over a WAN? 6. Is hotspotting in an HBase cluster really a (big!) issue nowadays for OLAP workloads and real-time analytics? Further directions to more information about region/table hotspotting are most welcome. Many thanks in advance. Regards, Joarder Kamal -- Kevin O'Dell Customer Operations Engineer, Cloudera
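Kevin's advice to presplit the table can be sketched as follows. This is a minimal sketch, assuming row keys whose first byte is roughly uniformly distributed (e.g. hash-prefixed keys rendered as lowercase hex); the class name and the idea of hex-string boundaries are mine. The computed boundaries are what you would hand to the table-creation call or the shell's SPLITS clause:

```java
import java.util.ArrayList;
import java.util.List;

public class PresplitPoints {
    // Compute evenly spaced split keys for a key space whose first byte is
    // uniform over 0x00-0xFF. For n regions you need n - 1 boundaries.
    // Boundaries are emitted as two-char hex strings, which sort correctly
    // against keys that themselves start with two lowercase hex characters.
    static List<String> splitPoints(int regions) {
        List<String> points = new ArrayList<>();
        for (int i = 1; i < regions; i++) {
            int boundary = i * 256 / regions;
            points.add(String.format("%02x", boundary));
        }
        return points;
    }

    public static void main(String[] args) {
        // e.g. 4 regions -> boundaries at 0x40, 0x80, 0xc0
        System.out.println(splitPoints(4)); // [40, 80, c0]
    }
}
```

Presplitting this way spreads the initial write load across all region servers from day one, instead of funneling everything through a single region until organic splits catch up.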
Re: HBase Region/Table Hotspotting
Hi Kevin, Thanks a lot for your great answers. Regarding Q5, to clarify: let's say Facebook is using HBase for the integrated messaging/chat/email system in a very large-scale setup. The schema design of such a system can change over the years (even over the months). Workload patterns may also change due to different usage characteristics (for example, the rate of messaging may be higher during a protest or specific event in a particular country). So region/table hotspots have been created at random region servers within the cluster despite careful schema design and pre-planning. The Facebook team rushes to split the hotspotted regions manually and redistribute them over a new set of physical machines which were recently added to the system to increase scalability in the face of high user demand. Now the hotspotted region data can be transferred onto the new physical machines gradually to handle the situation. If the shard (region) size is small enough, then the data transfer cost over the network can be minimal; otherwise a large volume of data needs to be transferred at once. I have found in many places that having a large number of regions in a system is discouraged. However, would it be possible to have a very large number of regions in a system, thus minimizing data transfer cost in case of hotspotting due to workload/design characteristics? Are there any drawbacks or known side-effects? I am rethinking possibilities other than pre-planned schema and row-key designs. Thanks again. Regards, Joarder Kamal On 11 February 2013 13:32, Kevin O'dell kevin.od...@cloudera.com wrote: Hi Joarder, Welcome to the HBase world. Let me take some time to address your questions the best I can.
Re: Region Servers crashing following: File does not exist, Too many open files exceptions
Hi David, Have you changed anything in the configuration related to compactions? If more store files are created and the compactions are not run frequently, we end up in this problem. At least there will be a consistent increase in the file handle count. Could you run compactions manually to see if it helps? Regards Ram On Mon, Feb 11, 2013 at 1:41 AM, David Koch ogd...@googlemail.com wrote: Like I said, the maximum permissible number of file handles is set to 65535 for the users hbase (the one that starts HBase), mapred and hdfs. The too many open files warning occurs on the region servers but not on the HDFS namenode. /David
Re: HBase Region/Table Hotspotting
Matt Corgan summarized the pros and cons of having a large number of regions here: https://issues.apache.org/jira/browse/HBASE-7667?focusedCommentId=13575024&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13575024 Cheers On Sun, Feb 10, 2013 at 7:43 PM, Joarder KAMAL joard...@gmail.com wrote: Hi Kevin, Thanks a lot for your great answers.
Re: HBase Region/Table Hotspotting
The most common cause for hotspotting is inserting rows with monotonically increasing row keys. In that case only the last region will get the writes and no amount of splitting will fix that (only one region server will hold the last region of the table regardless of how small it is). There are ways around this. If you generate keys, make sure they are not monotonically increasing. For example, if you do not care about the sort order of the keys w.r.t. each other, you could reverse the bytes before you use them as the row key. Another option is to prefix the key with a hash of the key (but then you lose the ability to do range scans across keys). If you still need to scan rows according to their sort order, you can salt (as some call it) the key by prefixing it with a limited number of random single digits (maybe 5-10 different numbers). You could also do a mod of the key. Each scan then has to issue multiple scans in parallel, one for each of the possible prefix numbers. (In fact that is a pretty effective way to avoid hotspotting and to parallelize your scans, but it needs some client-side work to reconcile the parallel scans.) Another reason for hotspotting is inserting new versions of a smallish set of row keys. In that case splitting might help, because it will decrease the likelihood of all those keys falling into the same region. -- Lars From: Joarder KAMAL joard...@gmail.com To: user@hbase.apache.org; d...@hbase.apache.org Sent: Sunday, February 10, 2013 6:17 PM Subject: HBase Region/Table Hotspotting This is my first email in the group. I am having a more general and open-ended question but hope to get some reasoning from the HBase user communities. I am a very basic HBase user and still learning. My intention to use HBase in one of our research project. Recently I was looking through Lars George's book HBase - The Definitive Guide and two particular topics caught my eyes. One is 'Region and Table Hotspotting' and the other is 'Region Auto-Sharding and Merging'. 
*Scenario: * If a hotspot is created in a particular region or in a table (having multiple regions) due to sudden workload change, then one may split the region into further small pieces and distributed it to a number of available physical machine in the cluster. This process should require large data transfer between different machines in the cluster and incur a performance cost. One may also change the 'key' definition and manage the regions. But I am not sure how effective or logical to change key designs on a production system. *Questions:* 1. How often you are facing Region or Table Hotspotting in HBase production systems? 2. If a hotspot is created, how quickly it is automatically cleared out (assuming sudden workload change)? 3. How often this kind of situation happens - A hotspot is detected and vanished out before taking an action? or hotspots stays longer period of time? 4. Or if the hotspot is stays, how it is handled (in general) in production system? 5. How large data transfer cost is minimized or avoid for re-sharding regions within a cluster in a single data center or within WAN? 6. Is hotspoting in HBase cluster is really a issue (big!) nowadays for OLAP workloads and real-time analytics? Further directions to more information about region/table hotspotting is most welcome. Many thanks in advance. Regards, Joarder Kamal
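The salting scheme Lars describes can be sketched in plain Java without any HBase dependency. The bucket count, key format, and class name below are assumptions for illustration: the salt is derived from a hash of the key mod N, so the same row key always lands in the same bucket, and a full read fans out N parallel scans, one per prefix:

```java
import java.util.ArrayList;
import java.util.List;

public class SaltUtil {
    // Number of salt buckets; 5-10 is the range suggested above.
    static final int BUCKETS = 8;

    // Prefix the row key with a single-digit salt derived from the key's
    // hash, so identical keys always map to the same bucket (deterministic,
    // unlike a purely random prefix, which would scatter updates to one row).
    static String saltedKey(String rowKey) {
        int bucket = Math.floorMod(rowKey.hashCode(), BUCKETS);
        return bucket + "-" + rowKey;
    }

    // A reader must issue one scan per possible prefix and merge the results
    // client-side; here we just enumerate the prefixes such a client would scan.
    static List<String> scanPrefixes() {
        List<String> prefixes = new ArrayList<>();
        for (int b = 0; b < BUCKETS; b++) {
            prefixes.add(b + "-");
        }
        return prefixes;
    }

    public static void main(String[] args) {
        System.out.println(saltedKey("2013-02-11-event-0001"));
        System.out.println(scanPrefixes());
    }
}
```

Within each bucket the original sort order of the keys is preserved, which is what keeps range scans possible at the cost of the client-side merge Lars mentions.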
Re: Regions in transition
I think this might be the reason for the splits. From 0.94.0, it seems the default split policy has changed: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.html On Sat, Feb 9, 2013 at 3:37 PM, ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com wrote: Okie. So if you don't mind, can you attach your logs? What is the split policy and what is the size of the region? It's better we take a look at it and solve the problem if it is a kernel problem. Regards Ram On Fri, Feb 8, 2013 at 6:09 PM, kiran kiran.sarvabho...@gmail.com wrote: After some searching I found that the main reason for regions with the same keys is splitting, and it is the culprit in our case for the inconsistency. But I set my filesize to a very large size and I am unsure why splitting is still happening. On Fri, Feb 8, 2013 at 1:20 PM, kiran kiran.sarvabho...@gmail.com wrote: Thanks samar and ramakrishna. I did use option 2 and deleted /hbase from the zkcli shell. I restarted the cluster and now there are more regions in transition. Will it take some time to complete since it is a new assignment? My HBase version is 0.94.1, and I have a doubt about the property hbase.hregion.max.filesize. I set it to a very large value of 100GB. But somehow the table regions are not split as per the property. On Fri, Feb 8, 2013 at 12:45 PM, samar kumar samar.opensou...@gmail.com wrote: Maybe ./bin/hbase hbck -repairHoles could be helpful.. On 08/02/13 12:38 PM, Samir Ahmic ahmic.sa...@gmail.com wrote: Hi Kiran, Welcome to the beautiful world of HBase transition states :) . When I face a RIT issue these are the steps I use to resolve it: 1. hbase hbck -fixAssignments (this depends on your version of hbase; it can also be just -fix) If you don't have luck with 1, then you will need manual intervention: remove the hbase znodes from zookeeper (use hbase zkcli) with the rmr (or delete) option on /hbase, and then restart the cluster. This should help you resolve RITs. 
Regards On Fri, Feb 8, 2013 at 7:50 AM, kiran kiran.sarvabho...@gmail.com wrote: PENDING_OPEN On Fri, Feb 8, 2013 at 12:16 PM, samar kumar samar.opensou...@gmail.com wrote: Can you mention the state of the region? You can find the details in your master status page. On 08/02/13 12:09 PM, kiran kiran.sarvabho...@gmail.com wrote: We ran the command unassign 'REGIONNAME', true; the output was completed in 57 sec, but the region is still in transition. On Fri, Feb 8, 2013 at 12:05 PM, samar kumar samar.opensou...@gmail.com wrote: Was the unassigning successful? If not, you can force it by: unassign 'REGIONNAME', true Regards, Samar On 08/02/13 12:01 PM, kiran kiran.sarvabho...@gmail.com wrote: I issued both unassign and close_region, but the region is still in transition. I also deleted the .META. entry for the region. Do I need to restart the master? On Fri, Feb 8, 2013 at 11:18 AM, samar kumar samar.opensou...@gmail.com wrote: Regions should never overlap. In case a region is in transition for a very long time, you could possibly force-unassign the region, if you are OK with losing it. Regards, Samar On 08/02/13 11:14 AM, kiran kiran.sarvabho...@gmail.com wrote: are always in transition? Because of this the cluster is not balancing and region servers are going down one by one, as only some region servers are handling the requests. Another weird thing is that the startkey of the region stuck in transition matches with -- Thank you Kiran Sarvabhotla -Even a correct decision is wrong when it is taken late
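For quick reference, the diagnostic and recovery commands discussed in this thread, roughly in escalating order. This is an ops sketch, not a tested procedure: 't1' and 'REGIONNAME' are placeholders, flag names vary across 0.90/0.92/0.94 releases, the `alter` syntax should be checked against your version's shell help, and step 3 is destructive.

```shell
# 1. Why is the table still splitting? Inspect/raise the per-table threshold
#    from the HBase shell (value is in bytes; ~100 GB shown):
#      describe 't1'
#      alter 't1', METHOD => 'table_att', MAX_FILESIZE => '107374182400'
# 2. Regions stuck in transition: report inconsistencies and try automatic repair.
hbase hbck
hbase hbck -fixAssignments        # on older releases: hbase hbck -fix
#    A single stuck region can be force-unassigned from the HBase shell:
#      unassign 'REGIONNAME', true
# 3. Last resort: clear HBase state in ZooKeeper, then restart the cluster.
hbase zkcli
#      (at the zkcli prompt)  rmr /hbase
```

Note that wiping /hbase discards in-flight assignment state, which is why the thread treats it as the final fallback after hbck.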
Re: HBase Region/Table Hotspotting
Thanks Lars for explaining the reasons for hotspotting and the key design techniques. Just wondering, is it possible to alter the key design (e.g. from sequential keys to salted keys) at run time in a production system? What are the impacts? To Ted: thanks a lot for pointing out [HBASE-7667]. Interesting idea indeed. And Matt Corgan explained the trade-offs between having fewer and more regions. He also pointed out how a large number of regions can impact the compaction process. Although I am not an expert on HBase, what do you think about how to find an optimal number of stripes or sub-regions for each region? Actually, I didn't get the idea of having fixed-boundary stripes. Thanks again. The HBase community is really great!! Regards, Joarder Kamal On 11 February 2013 16:14, lars hofhansl la...@apache.org wrote: The most common cause of hotspotting is inserting rows with monotonically increasing row keys. In that case only the last region will get the writes, and no amount of splitting will fix that (only one region server will hold the last region of the table, regardless of how small it is). There are ways around this. If you generate keys, make sure they are not monotonically increasing. For example, if you do not care about the sort order of the keys w.r.t. each other, you could reverse the bytes before you use them as the row key. Another option is to prefix the key with a hash of the key (but then you lose the ability to do range scans across keys). If you still need to scan rows according to their sort order, you can salt (as some call it) the key by prefixing it with one of a limited number of single digits (maybe 5-10 different numbers). You could also do a mod of the key. Each scan then has to issue multiple scans in parallel, one for each of the possible prefix numbers. (In fact that is a pretty effective way to avoid hotspotting and to parallelize your scans, but it needs some client-side code to reconcile the parallel scans.)
Another reason for hotspotting is inserting new versions of a small-ish set of row keys. In that case splitting might help, because it will decrease the likelihood of all those keys falling into the same region. -- Lars From: Joarder KAMAL joard...@gmail.com To: user@hbase.apache.org; d...@hbase.apache.org Sent: Sunday, February 10, 2013 6:17 PM Subject: HBase Region/Table Hotspotting This is my first email in the group. I am asking a rather general and open-ended question, but I hope to get some reasoning from the HBase user community. I am a very basic HBase user and still learning. I intend to use HBase in one of our research projects. Recently I was looking through Lars George's book HBase: The Definitive Guide, and two particular topics caught my eye. One is 'Region and Table Hotspotting' and the other is 'Region Auto-Sharding and Merging'. [...]
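The salting scheme Lars describes in his reply above can be sketched as follows. This is a minimal illustration in plain Python, not HBase client code; the bucket count (8) and the key format are assumptions for the example.

```python
import hashlib

BUCKETS = 8  # assumption: a small, fixed number of salt prefixes (Lars suggests 5-10)

def salted_key(row_key: str) -> str:
    """Prefix the key with a stable bucket id so that monotonically
    increasing keys spread across BUCKETS regions instead of one."""
    bucket = int(hashlib.md5(row_key.encode()).hexdigest(), 16) % BUCKETS
    return "%d-%s" % (bucket, row_key)

def scan_start_keys() -> list:
    """A range scan must now fan out into one scan per possible prefix;
    the client merges the results to restore the overall sort order."""
    return ["%d-" % b for b in range(BUCKETS)]

# Sequential event keys no longer all land in the last region:
for k in ("event-00001", "event-00002", "event-00003"):
    print(salted_key(k))
print(scan_start_keys())
```

Because the prefix is derived deterministically from the key, point reads still work (recompute the salt, then GET), while full scans pay the cost of issuing BUCKETS parallel scans and reconciling them client-side, exactly the trade-off described in the thread.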