Apache Phoenix + Solr integration?
Hi all,

Do you know of any integration approach for streaming documents from Phoenix to Solr, similar to what the Lily HBase Indexer does?

Thanks!
Re: High CPU usage on Hbase region Server with GlobalMemoryManager warnings
Messages like this can appear when queries that use the memory manager (usually joins and GROUP BY) time out or fail for some reason. So the message itself is hardly related to CPU usage or GC.

But it may mean that your region servers cannot properly handle this kind of workload. Since you say the issue started after the Yarn work, I would suggest checking swappiness and huge pages (there are plenty of resources on the Internet about how they affect HBase). You may simply be running out of hardware resources.

Thanks,
Sergey

On Wed, Jan 31, 2018 at 6:40 PM, Jins George wrote:
> Hi,
>
> While analyzing a prod issue of high CPU usage on an HBase region server, I came across warning messages in the region server logs complaining about an orphaned chunk of memory.
>
> 2018-01-30 19:16:31,565 WARN org.apache.phoenix.memory.GlobalMemoryManager: Orphaned chunk of 104000 bytes found during finalize
> 2018-01-30 19:16:31,565 WARN org.apache.phoenix.memory.GlobalMemoryManager: Orphaned chunk of 104000 bytes found during finalize
>
> The high CPU usage appears to be due to garbage collection, and it lasted for almost 6 hours. Throughout those 6 hours, the region server logs kept showing these warning messages.
>
> Cluster details:
> 4 nodes (1 master + 3 slaves), CDH cluster
> HBase version: 1.2
> Phoenix version: 4.7
> Region server heap: 4G
> Total regions: ~135
> Total tables: ~35
>
> Out of 3 region servers, 2 had the warning logs and both suffered high CPU. The third region server had neither high CPU nor the warning logs. Any idea why these messages are logged, and can that trigger continuous GC?
>
> Before this issue started (or around the same time), huge application log files were copied to HDFS by Yarn. But I can't think how that would cause an issue on the HBase region server.
>
> Any help is appreciated.
>
> Thanks,
> Jins George
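As a rough sketch of the swappiness and huge-pages check suggested in the reply above, the following Python script reads both settings on a region server host. The two paths are the usual Linux locations and are an assumption here; some distributions expose transparent huge pages under a different path, and nothing below is specific to HBase or Phoenix.

```python
import re
from pathlib import Path

# Usual Linux locations (assumption: your distribution may use other paths).
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def read_text_or_none(path):
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def parse_swappiness(text):
    """vm.swappiness is exposed as a single integer."""
    return int(text.strip())


def parse_thp_mode(text):
    """The active mode is the bracketed word, e.g. 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def report():
    """Print both settings, or 'unavailable' when a file is missing."""
    swappiness = read_text_or_none(SWAPPINESS_PATH)
    thp = read_text_or_none(THP_PATH)
    print("vm.swappiness:",
          parse_swappiness(swappiness) if swappiness is not None else "unavailable")
    print("transparent huge pages:",
          parse_thp_mode(thp) if thp is not None else "unavailable")


if __name__ == "__main__":
    report()
```

Running it on each of the three region servers and comparing the output would show whether the two affected hosts are configured differently from the healthy one.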
Re: HBase Timeout on queries
I don't think the HBase row_counter job is going to be faster than a count(*) query. Both require a full table scan, so neither will be particularly fast.

A couple of alternatives, if you're OK with an approximate count:

1) enable stats collection (you can leave off its use for parallelizing queries) and then do a SUM over the size column for the table, querying the stats table directly, or
2) do a count(*) using the TABLESAMPLE clause (again with stats enabled as described above) to avoid a full scan.
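To illustrate the TABLESAMPLE alternative mentioned in this reply, here is a minimal Python sketch that builds a sampled count query and scales the result back up to an estimate of the full row count. The table name `MY_TABLE`, the helper names, and the sampled count are hypothetical; the `TABLESAMPLE (percent)` form should be checked against the grammar of your Phoenix version, and running the query needs a Phoenix connection that is not shown.

```python
def sampled_count_query(table, percent):
    """Build a COUNT(*) query that reads roughly `percent` percent of the table.

    `table` is interpolated as-is, so pass only trusted identifiers.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    return f"SELECT COUNT(*) FROM {table} TABLESAMPLE ({percent})"


def estimate_total(sampled_count, percent):
    """Scale a count taken on a `percent` percent sample up to the whole table."""
    return round(sampled_count * 100 / percent)


if __name__ == "__main__":
    # Hypothetical table name and hypothetical sampled result.
    print(sampled_count_query("MY_TABLE", 1))
    print(estimate_total(1_700_000, 1))
```

The estimate is only as good as the sample, so it suits capacity checks better than anything that needs an exact number.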
Re: HBase Timeout on queries
Hi Anil,

Obviously I'm not using HBase just for the count query. Most of the time I do INSERTs and selective queries; I was just trying to figure out whether my HBase + Phoenix installation is robust enough to deal with a huge amount of data.

On Thu, Feb 1, 2018 at 5:07 PM, anil gupta wrote:
> Hey Flavio,
>
> IMHO, if most of your app is just doing full table scans, then I am not really sure HBase (or any other NoSQL store) will be a good fit for your solution (building an OLAP system?). If you have point lookups and short range scans, then HBase/Phoenix will work well.
> Also, if you want to do select count(*), the HBase row_counter job will be much faster than Phoenix queries.
>
> Thanks,
> Anil Gupta
Re: HBase Timeout on queries
I was able to make it work by changing the following params (on both the server and client side, then restarting HBase), and now the query answers in about 6 minutes:

hbase.rpc.timeout (to 60)
phoenix.query.timeoutMs (to 60)
hbase.client.scanner.timeout.period (from 1 m to 10 m)
hbase.regionserver.lease.period (from 1 m to 10 m)

However, I'd like to know whether this performance could be easily improved. Any ideas?
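As a rough illustration of the two settings above whose values are given in minutes, this Python sketch converts 10 minutes to milliseconds and renders the matching `<property>` entries in the style of hbase-site.xml. It assumes both settings take milliseconds, which is worth confirming for your HBase version; the helper names are made up for this example, and the two settings listed with "(to 60)" are left out because their full values are not clear from the message.

```python
from xml.sax.saxutils import escape


def minutes_to_ms(minutes):
    """Convert minutes to milliseconds."""
    return int(minutes * 60 * 1000)


def render_properties(props):
    """Render a name -> value mapping as hbase-site.xml style <property> entries."""
    lines = []
    for name, value in props.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(str(value))}</value>")
        lines.append("  </property>")
    return "\n".join(lines)


# Only the two settings stated as "from 1 m to 10 m" in the message.
settings = {
    "hbase.client.scanner.timeout.period": minutes_to_ms(10),
    "hbase.regionserver.lease.period": minutes_to_ms(10),
}

if __name__ == "__main__":
    print(render_properties(settings))
```

The rendered entries would go inside the `<configuration>` element on both the servers and the client, as the message describes.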
Re: HBase Timeout on queries
I have the same problem: even after I increased hbase.rpc.timeout, the result is the same. The difference is that I use 4.12.
HBase Timeout on queries
Hi to all,

I'm trying to use the brand-new Phoenix 4.13.2-cdh5.11.2 over HBase, and everything was fine while the data was quite small (a few million rows). Now that I have inserted 170 M rows into my table, I can no longer get the row count (using SELECT COUNT) because of org.apache.hbase.ipc.CallTimeoutException (operationTimeout 6 expired).

How can I fix this problem? I could increase the hbase.rpc.timeout parameter, but I suspect I could improve HBase performance a bit first. The problem is that I don't know how.

Thanks in advance,
Flavio