Re: HBase - Performance issue
So you have large RSs and you have large regions. Your regions are huge relative to your RS memory heap. (Not ideal.) You have slow drives (5400rpm) and you have a 1GbE network. You didn't say how many drives per server. Under load, you will saturate your network with just 4 drives. (Give or take. Never tried 5400 RPM drives.) So you hit one bandwidth bottleneck there. The other is the ratio of spindles to CPU. So if you have 4 drives and 8 cores... again under load, you'll start to see an I/O bottleneck...

On average, how many regions do you have per table per server? I'd consider shrinking your regions. Sometimes you need to dial back from 11 to a more reasonable listening level... ;-)

HTH

-Mike
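For rough numbers behind the bandwidth point: a 1GbE link tops out around 125 MB/s, while even a conservative ~40 MB/s of sequential throughput per 5400 RPM spindle puts four drives at ~160 MB/s in aggregate, so the NIC saturates first. On shrinking regions, a minimal sketch with the 0.94 client API - the table name is a placeholder, and note that existing oversized regions are only re-checked for splitting on subsequent flushes/compactions:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class ShrinkRegions {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    byte[] table = Bytes.toBytes("mytable"); // placeholder table name
    admin.disableTable(table);
    HTableDescriptor htd = admin.getTableDescriptor(table);
    // cap regions at ~2 GB instead of the 5-10 GB described in the thread;
    // this overrides hbase.hregion.max.filesize for just this table
    htd.setMaxFileSize(2L * 1024 * 1024 * 1024);
    admin.modifyTable(table, htd);
    admin.enableTable(table);
    admin.close();
  }
}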
Re: HBase - Performance issue
Hi Lars,

Ours is a problem of I/O wait and network bandwidth increase around the same time.

Lars, sorry to say this... ours is a production cluster and we ideally should never want a downtime... Also, Lars, we had a very miserable experience while upgrading from 0.92 to 0.94... There was never a mention of the change in split policy in the release notes... and the policy was not ideal for our cluster, and it took us at least a week to figure that out.

Our cluster runs on commodity hardware with big regions (5-10gb)... Region server mem is 10gb... 2TB SATA hard disks (5400 - 7200 rpm)... Internal network bandwidth is 1 gig. So please suggest us any workaround with 0.94.1.

--
Thank you
Kiran Sarvabhotla

-Even a correct decision is wrong when it is taken late
Re: HBase - Performance issue
We have this setting enabled also...

<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>

--
Thank you
Kiran Sarvabhotla

-Even a correct decision is wrong when it is taken late
Re: HBase - Performance issue
What about providing the jstack as Lars suggested? That doesn't require you to upgrade (yet).

0.94.23 is the same major version as 0.94.1. Upgrading to this version is not the same process as a major upgrade from 0.92 to 0.94. Changes like the split policy difference you mention don't happen in point releases. You should consider upgrading to the latest 0.94.x, if not now then at some point, because a volunteer open source community really can only support the latest release of a major version. You can insist on working with a (now, very old) release, but we might not be able to help you much.
Re: HBase - Performance issue
Lars,

We are facing a similar situation on a similar cluster configuration... We are having high I/O wait percentages on some machines in our cluster... We have short circuit reads enabled but still we are facing a similar problem.. the CPU wait goes up to 50% in some cases while issuing scan commands with multiple threads.. Is there a workaround other than applying the patch for 0.94.4??

Thanks
Kiran

On Thu, Apr 25, 2013 at 12:12 AM, lars hofhansl la...@apache.org wrote:

You may have run into https://issues.apache.org/jira/browse/HBASE-7336 (which is in 0.94.4). (Although I had not observed this effect as much when short circuit reads are enabled.)

----- Original Message -----
From: kzurek kzu...@proximetry.pl
To: user@hbase.apache.org
Sent: Wednesday, April 24, 2013 3:12 AM
Subject: HBase - Performance issue

The problem is that when I'm putting my data (multithreaded client, ~30MB/s traffic outgoing) into the cluster, the load is equally spread over all RegionServers with 3.5% average CPU wait time (average CPU user: 51%). When I've added a similar, multithreaded client that scans for, let's say, the 100 last samples of a randomly generated key from a chosen time range, I'm getting high CPU wait time (20% and up) on two (or more, if there is a higher number of threads; default 10) random RegionServers. Therefore, the machines that hold those RSs are getting very hot - one of the consequences is that the number of store files is constantly increasing, up to the maximum limit. The rest of the RSs have 10-12% CPU wait time and everything seems to be OK (the number of store files varies, so they are being compacted and not increasing over time). Any ideas? Maybe I could prioritize writes over reads somehow? Is it possible? If so, what would be the best way to do that, and where should it be placed - on the client or the cluster side?

Cluster specification:
HBase Version: 0.94.2-cdh4.2.0
Hadoop Version: 2.0.0-cdh4.2.0
There are 6x DataNodes (5x HDD for storing data), 1x MasterNode
Other settings:
- Bloom filters (ROWCOL) set
- Short circuit turned on
- HDFS Block Size: 128MB
- Java Heap Size of Namenode/Secondary Namenode in Bytes: 8 GiB
- Java Heap Size of HBase RegionServer in Bytes: 12 GiB
- Java Heap Size of HBase Master in Bytes: 4 GiB
- Java Heap Size of DataNode in Bytes: 1 GiB (default)
Number of regions per RegionServer: 19 (total 114 regions on 6 RS)
Key design: UUID+TIMESTAMP - UUID: 1-10M, TIMESTAMP: 1-N
Table design: 1 column family with 20 columns of 8 bytes

Get client: Multiple threads. Each thread has its own table instance with its own Scanner. Each thread has its own range of UUIDs and randomly draws the beginning of the time range to build the rowkey properly (see above). Each time, Scan requests the same amount of rows, but with a random rowkey.

--
Thank you
Kiran Sarvabhotla

-Even a correct decision is wrong when it is taken late
Re: HBase - Performance issue
Also, the HBase version is 0.94.1.

--
Thank you
Kiran Sarvabhotla

-Even a correct decision is wrong when it is taken late
Re: HBase - Performance issue
What type of drives, controllers, and network bandwidth do you have? Just curious.
Re: HBase - Performance issue
Thinking about it again: if you ran into HBASE-7336 you'd see high CPU load, but *not* IOWAIT.

0.94 is at 0.94.23; you should upgrade. A lot of fixes, improvements, and performance enhancements went in since 0.94.4. You can do a rolling upgrade straight to 0.94.23.

With that out of the way, can you post a jstack of the processes that experience high wait times?

-- Lars
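For anyone following along, such a dump is usually captured with the JDK's jstack tool against the RegionServer PID; a rough in-process equivalent, sketched here assuming a Java 6+ runtime, looks like:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class StackDump {
  public static void main(String[] args) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    // dump all live threads, including held monitors and synchronizers,
    // similar in spirit to what jstack prints
    for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
      System.out.print(info.toString());
    }
  }
}

Note that ThreadInfo.toString() truncates deep stacks, so the jstack command-line tool is usually preferable for a full dump.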
Re: Hbase Performance Issue
I am thankful to all for taking out time and suggesting solutions. Recently, I have implemented a solution using bulk load. It seems a little faster than the one using the APIs, but it still takes far longer than HDFS. Processing and saving 10 GB of data on HDFS takes only 3 mins with a 10-node cluster, whereas HBase takes about 30 mins. Bulk load has its own issue: I need to partition the table beforehand, otherwise it runs only one reducer. I am working over a platform where I can't anticipate how much data is going to be loaded into the table, and it is difficult to pre-split the table. Is there any way that I can run multiple reducers without pre-splitting the table?

--
Regards
Akhtar Muhammad Din
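One possible answer, sketched against the 0.94 API: HFileOutputFormat.configureIncrementalLoad sets the number of reducers to the table's current region count, so the usual workaround is to create the table pre-split from a sample of the incoming keys. The sampleRowKeys() helper below is hypothetical - its placeholder body splits the single-byte keyspace evenly, standing in for whatever key sampling the 15-minute batches allow:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;

public class PreSplitBulkLoadJob {
  // hypothetical helper: a real job would sample actual row keys
  // from the incoming batch instead of spacing byte prefixes evenly
  static byte[][] sampleRowKeys(int numSplits) {
    byte[][] splits = new byte[numSplits][];
    for (int i = 0; i < numSplits; i++) {
      splits[i] = new byte[] { (byte) (((i + 1) * 256) / (numSplits + 1)) };
    }
    return splits;
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    if (!admin.tableExists("mytable")) { // placeholder table name
      HTableDescriptor htd = new HTableDescriptor("mytable");
      htd.addFamily(new HColumnDescriptor("cf"));
      // pre-split so configureIncrementalLoad wires up one reducer per region
      admin.createTable(htd, sampleRowKeys(9)); // 9 split keys -> 10 regions
    }
    admin.close();

    Job job = new Job(conf, "hfile-prepare");
    // ... mapper and input setup elided ...
    HTable table = new HTable(conf, "mytable");
    // configures TotalOrderPartitioner and reducer count = region count
    HFileOutputFormat.configureIncrementalLoad(job, table);
    job.waitForCompletion(true);
  }
}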
Re: Hbase Performance Issue
Akhtar: There is no manual step for bulk load. You essentially have your script that runs the map reduce job that creates the HFiles. On success of this script/command, you run the completebulkload command... the whole bulk load can be automated, just like your map reduce job.

--Suraj
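And if even the command-line step is unwanted, the same load can be driven from Java - a minimal sketch, assuming a 0.94-era client and placeholder path/table names:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class ProgrammaticBulkLoad {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable"); // placeholder table name
    // moves the HFiles produced by the MR job into the table's regions;
    // equivalent to running the completebulkload tool by hand
    new LoadIncrementalHFiles(conf).doBulkLoad(
        new Path("/tmp/hfile-output"), table);  // placeholder output dir
    table.close();
  }
}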
Re: Hbase Performance Issue
It's very strange that you don't see a perf improvement when you increase the number of nodes. Did none of the changes you made affect performance in the end?

You may want to check:
- the number of regions for this table. Are all the region servers busy? Do you have splits on the table?
- how much data you actually write. Is compression enabled on this table?
- whether you have compactions. You may want to change the max store file settings for infrequent write load (see http://gbif.blogspot.fr/2012/07/optimizing-writes-in-hbase.html and the sketch after this message).

It would be interesting to test the 0.96 release as well.

On Sun, Jan 5, 2014 at 2:12 AM, Vladimir Rodionov vrodio...@carrieriq.com wrote:

I think in this case, writing data to HDFS or HFile directly (for subsequent bulk loading) is the best option. HBase will never compete in write speed with HDFS.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodio...@carrieriq.com

From: Ted Yu [yuzhih...@gmail.com]
Sent: Saturday, January 04, 2014 2:33 PM
To: user@hbase.apache.org
Subject: Re: Hbase Performance Issue

There're 8 items under: http://hbase.apache.org/book.html#perf.writing
I guess you have gone through all of them :-)
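On the max store file settings Nicolas mentions: a sketch of the knobs usually involved, with illustrative values only. These keys exist in 0.94, but the right numbers depend on the write pattern, and they would normally be set in hbase-site.xml on the region servers rather than in client code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class StoreFileTuning {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // block updates only after this many store files pile up in a store
    // (default 7 in 0.94); raising it keeps bursty writes from stalling
    conf.setInt("hbase.hstore.blockingStoreFiles", 20);
    // minimum number of store files before a minor compaction kicks in
    // (default 3); raising it compacts less often under infrequent writes
    conf.setInt("hbase.hstore.compactionThreshold", 6);
    System.out.println(conf.getInt("hbase.hstore.blockingStoreFiles", -1));
  }
}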
Re: Hbase Performance Issue
In addition to what everybody else said, look at *where* the regions are for the target table. There may be 5 regions (for example), but look to see if they are all on the same RS.
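One way to check this from code rather than the master UI - a small sketch against the 0.94 client API, with the table name as a placeholder:

import java.util.Map;
import java.util.NavigableMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HTable;

public class WhereAreMyRegions {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable"); // placeholder table name
    // region -> hosting region server; if every value is the same host,
    // the table's load is not spread at all
    NavigableMap<HRegionInfo, ServerName> locations = table.getRegionLocations();
    for (Map.Entry<HRegionInfo, ServerName> e : locations.entrySet()) {
      System.out.println(e.getKey().getRegionNameAsString()
          + " -> " + e.getValue().getHostname());
    }
    table.close();
  }
}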
Re: Hbase Performance Issue
I suggest you look at hannibal [1] to look at the distribution of the data on your cluster:

1: https://github.com/sentric/hannibal
Hbase Performance Issue
Hi,

I have been running a map reduce job that joins 2 datasets of 1.3 and 4 GB in size. Joining is done at the reduce side. Output is written to either HBase or HDFS depending upon configuration. The problem I am having is that HBase takes about 60-80 minutes to write the processed data; on the other hand, HDFS takes only 3-5 mins to write the same data. I really want to improve the HBase speed and bring it down to 1-2 min. I am using Amazon EC2 instances, launched a cluster of size 3 and later 10, and have tried both c3.4xlarge and c3.8xlarge instances. I can see a significant increase in performance while writing to HDFS as I use a cluster with more nodes having higher specifications, but in the case of HBase there was no significant change in performance.

I have been going through different posts and articles and have read the HBase book to solve the HBase performance issue, but have not been able to succeed so far. Here are the few things I have tried out so far (see the sketch after this message for the client-side items):

*Client Side*
- Turned off writing to WAL
- Experimented with write buffer size
- Turned off auto flush on table
- Used cache, experimented with different sizes

*HBase Server Side*
- Increased region servers heap size to 8 GB
- Experimented with handlers count
- Increased Memstore flush size to 512 MB
- Experimented with hbase.hregion.max.filesize, tried different sizes

There are many other parameters I have tried out following the suggestions from different sources, but nothing worked so far. Your help will be really appreciated.

--
Regards
Akhtar Muhammad Din
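For concreteness, the client-side items above map onto a few 0.94 client API calls - a minimal sketch with illustrative values (table, family, and row names are placeholders, and skipping the WAL trades durability for speed):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteTuning {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");  // placeholder name
    table.setAutoFlush(false);                   // batch puts client-side
    table.setWriteBufferSize(8 * 1024 * 1024);   // e.g. an 8 MB buffer

    Put put = new Put(Bytes.toBytes("row-1"));
    put.setWriteToWAL(false);                    // faster, but data written
                                                 // since the last flush is
                                                 // lost if the RS crashes
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
    table.put(put);

    table.flushCommits();                        // drain the client buffer
    table.close();
  }
}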
Re: Hbase Performance Issue
I'm using CDH 4.5:

Hadoop: 2.0.0-cdh4.5.0
HBase: 0.94.6-cdh4.5.0

Regards

On Sun, Jan 5, 2014 at 1:24 AM, Ted Yu yuzhih...@gmail.com wrote:

What version of HBase / hdfs are you running with?

Cheers

--
Regards
Akhtar Muhammad Din
RE: Hbase Performance Issue
You can try MapReduce over snapshot files (https://issues.apache.org/jira/browse/HBASE-8369), but you will need to patch 0.94.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodio...@carrieriq.com
Re: Hbase Performance Issue
bq. Output is written to either Hbase

Looks like Akhtar wants to boost write performance to HBase. MapReduce over snapshot files targets higher read throughput.

Cheers
Re: Hbase Performance Issue
Have you tried writing out an hfile and then bulk loading the data?
Re: Hbase Performance Issue
Thanks guys for your precious time.

Vladimir, as Ted rightly said, I want to improve write performance currently (of course I want to read data as fast as possible later on).

Kevin, my current understanding of bulk load is that you generate StoreFiles and later load them through a command-line program. I don't want to do any manual step. Our system gets data every 15 minutes, so the requirement is to automate it completely through the client API.

--
Regards
Akhtar Muhammad Din
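The manual step Akhtar wants to avoid can be folded into the job driver: HFileOutputFormat writes region-aligned HFiles and LoadIncrementalHFiles hands them to the region servers from the same program, with no shell command involved. A sketch against the 0.94 MapReduce API; the table name and staging path are placeholders, and the mapper (which must emit ImmutableBytesWritable keys with KeyValue or Put values) is elided:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class BulkLoadSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "mytable");      // placeholder table
            Path staging = new Path("/tmp/bulk-staging");    // placeholder path

            Job job = new Job(conf, "hfile-writer");
            job.setJarByClass(BulkLoadSketch.class);
            // job.setMapperClass(...): the join logic goes here (elided).
            job.setMapOutputKeyClass(ImmutableBytesWritable.class);
            job.setMapOutputValueClass(KeyValue.class);
            // Wires up the partitioner and reducer so the HFiles match
            // the table's current region boundaries.
            HFileOutputFormat.configureIncrementalLoad(job, table);
            FileOutputFormat.setOutputPath(job, staging);

            if (job.waitForCompletion(true)) {
                // Programmatic equivalent of the completebulkload tool.
                new LoadIncrementalHFiles(conf).doBulkLoad(staging, table);
            }
            table.close();
        }
    }

This moves data into HBase by adopting the finished files rather than replaying every Put through the write path, which is why it is usually far faster than the client API.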
Re: Hbase Performance Issue
Could you give us a region server log to look at during a job?
Re: Hbase Performance Issue
There're 8 items under: http://hbase.apache.org/book.html#perf.writing

I guess you have gone through all of them :-)
RE: Hbase Performance Issue
I think in this case, writing data to HDFS or HFile directly (for subsequent bulk loading) is the best option. HBase will never compete in write speed with HDFS.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodio...@carrieriq.com
HBase - Performance issue
The problem is that when I'm putting my data (multithreaded client, ~30MB/s outgoing traffic) into the cluster, the load is spread equally over all RegionServers, with 3.5% average CPU wait time (average CPU user: 51%). When I add a similar multithreaded client that Scans for, let's say, the 100 last samples of a randomly generated key from a chosen time range, I get high CPU wait time (20% and up) on two (or more, if there is a higher number of threads; default 10) random RegionServers. The machines that host those RSs therefore get very hot; one of the consequences is that the number of store files constantly increases, up to the maximum limit. The rest of the RSs have 10-12% CPU wait time and everything seems to be OK (the number of store files varies, so they are being compacted and not increasing over time). Any ideas? Maybe I could prioritize writes over reads somehow? Is it possible? If so, what would be the best way to do that, and where should it be placed (on the client or the cluster side)?

Cluster specification:
HBase Version 0.94.2-cdh4.2.0
Hadoop Version 2.0.0-cdh4.2.0
There are 6x DataNodes (5x HDD for storing data), 1x MasterNode

Other settings:
- Bloom filters (ROWCOL) set
- Short circuit reads turned on
- HDFS block size: 128 MB
- Java heap size of NameNode/Secondary NameNode: 8 GiB
- Java heap size of HBase RegionServer: 12 GiB
- Java heap size of HBase Master: 4 GiB
- Java heap size of DataNode: 1 GiB (default)

Number of regions per RegionServer: 19 (total 114 regions on 6 RS)
Key design: UUIDTIMESTAMP - UUID: 1-10M, TIMESTAMP: 1-N
Table design: 1 column family with 20 columns of 8 bytes

Get client: multiple threads. Each thread has its own table instance with its own Scanner. Each thread has its own range of UUIDs and randomly draws the beginning of a time range to build the rowkey properly (see above). Each Scan requests the same number of rows, but with a random rowkey.
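For concreteness, a sketch of one such scanning thread against the 0.94 client API. The table name and the exact rowkey encoding (8-byte UUID followed by 8-byte timestamp) are assumptions, since the post does not spell out the byte layout:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ScanSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "samples");   // placeholder table

            long uuid = 42L;       // one id from the 1-10M UUID range
            long startTs = 1000L;  // randomly drawn start of the time range
            // Assumed layout: 8-byte UUID concatenated with 8-byte timestamp.
            byte[] startRow = Bytes.add(Bytes.toBytes(uuid), Bytes.toBytes(startTs));
            byte[] stopRow  = Bytes.add(Bytes.toBytes(uuid), Bytes.toBytes(Long.MAX_VALUE));

            Scan scan = new Scan(startRow, stopRow);
            scan.setCaching(100);  // fetch the ~100 wanted rows in one RPC round trip
            ResultScanner scanner = table.getScanner(scan);
            int rows = 0;
            for (Result r : scanner) {
                if (++rows >= 100) break;   // only the first 100 samples are needed
            }
            scanner.close();
            table.close();
        }
    }

With scanner caching left at 1 (the Apache default at the time), a 100-row scan costs roughly 100 round trips per call, which multiplies quickly across threads and may contribute to the hot region servers described above.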
Re: HBase - Performance issue
Hi,

How many request handlers are there in your RS? Can you up this number and see?

-Anoop-
Re: HBase - Performance issue
I have the following settings:
hbase.master.handler.count = 25 (default value in CDH4.2)
hbase.regionserver.handler.count = 20 (default 10)
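For reference, the region server handler count is set in hbase-site.xml on each region server (a restart is needed for it to take effect); a snippet mirroring the value reported above:

    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>20</value>
    </property>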
Re: HBase - Performance issue
You may have run into https://issues.apache.org/jira/browse/HBASE-7336 (which is in 0.94.4).

(Although I had not observed this effect as much when short circuit reads are enabled.)
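Since short-circuit reads come up repeatedly in this thread: on a CDH4/Hadoop 2.0-era cluster they are typically enabled with hdfs-site.xml properties like the following. The socket path is illustrative, and the exact property set varies by release, so treat this as a sketch and check the documentation for your distribution:

    <property>
      <name>dfs.client.read.shortcircuit</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.domain.socket.path</name>
      <value>/var/run/hdfs-sockets/dn</value>
    </property>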
Re: hbase performance issue
Your post is missing the most important configurations, mainly the region server heap size and GC configs. Also, how much of those 300GB do you need to serve? Does the working dataset fit in cache?

J-D

On Sun, Mar 11, 2012 at 12:39 PM, Антон Лыска ant...@wildec.com wrote:

Hi guys!

I have a little HBase cluster with only 2 machines (8-core CPU, 12G mem, 3*1GB hdd on each machine). I use Cloudera's cdh3u1 distro. The cluster serves two tables, and the total data size is about 300 GB with 300 regions.

The average Get time is usually 20-50ms, but sometimes it rises up to 500-800ms, which is unacceptable.

Gets per day: 13*10^6
Puts per day: 11*10^6
Deletes per day: 2*10^6

My conf is:

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
      <property>
        <name>hbase.regionserver.handler.count</name>
        <value>50</value>
      </property>
      <property>
        <name>hbase.hregion.majorcompaction</name>
        <value>864</value>
      </property>
    </configuration>

My schema is:

    {NAME => 'table1', MAX_FILESIZE => '536870912', FAMILIES => [
      {NAME => 'c', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '2147483647', BLOCKSIZE => '16384', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
      {NAME => 'p', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
      {NAME => 's', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}

    {NAME => 'table2', FAMILIES => [
      {NAME => 'n', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}

I disabled major compaction by setting a big value, and run it manually each day at 3:00am (the server is least loaded at that time). Get time usually starts increasing at around 23:00-24:00. Once HBase is restarted, Get time returns to 20ms.

What can it be? What options should I set to avoid this issue? I have also installed Ganglia, but I haven't seen anything strange there.

Thank you in advance!

Best regards,
Anton.
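To make J-D's ask concrete: region server heap and GC settings live in hbase-env.sh. A hypothetical example of the kind of configuration he is asking to see; the values are illustrative, not a recommendation for this cluster:

    # hbase-env.sh -- illustrative values only
    export HBASE_REGIONSERVER_OPTS="-Xms8g -Xmx8g \
      -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
      -XX:CMSInitiatingOccupancyFraction=70 \
      -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/hbase/gc-regionserver.log"

A GC log like this is also the quickest way to check whether the nightly latency climb lines up with long CMS pauses or promotion failures.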
Re: hbase performance issue
If you're using Cloudera, you want to be on CDH3u3 because it has several HDFS performance fixes for low-latency reads. That still doesn't address your 23:00-hour perf issue, but that's something that will help.