Thanks Vlad for your response. After changing the parameters below from their default values in hbase-site.xml, queries are working fine.
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>1200000</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>1200000</value>
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>1000</value>
</property>

Still, a few queries are taking a long time. A join between two tables takes more than 5 minutes with a filter condition; if I omit the filter condition, the query fails entirely.

table1 - 5.5M records -- 2 GB of compressed data
table2 - 8.5M records -- 2 GB of compressed data

We have 2 GB of heap space on each of the 4 region servers and 2 GB of heap space on the master. No other activity was going on in the cluster while I was running the queries.

Do you recommend any parameters to tune memory and GC for Phoenix and HBase?

Thanks,
Siva.

On Mon, Jun 1, 2015 at 1:14 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:

> >> Is the IO exception because Phoenix is not able to read from multiple
> >> regions, since the error was resolved after the compaction? Or any
> >> other thoughts?
>
> Compaction does not decrease the number of regions - it sorts/merges data
> into a single file (in the case of a major compaction) for every
> region/column_family. The SocketTimeout exception is probably because
> Phoenix must read data from multiple files in every region before
> compaction - this requires more CPU and more RAM, and produces more temp
> garbage. Excessive GC activity, in turn, results in socket timeouts. Check
> the GC logs on the region servers, and check the RS logs for other errors -
> they will probably give you a clue about what is going on during query
> execution.
>
> -Vlad
>
> On Mon, Jun 1, 2015 at 11:10 AM, Siva <sbhavan...@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> We load data into HBase tables through bulk imports.
>>
>> If the data set is small, we can query the imported data from Phoenix
>> with no issues.
>>
>> If the data size is huge (with respect to our cluster - we have a very
>> small cluster), I am encountering the following error
>> (org.apache.phoenix.exception.PhoenixIOException):
>>
>> 0: jdbc:phoenix:172.31.45.176:2181:/hbase> select count(*)
>> . . . . . . . . . . . . . . . . . . . . .> from "ldll_compression" ldll join "ds_compression" ds on (ds."statusid" = ldll."statusid")
>> . . . . . . . . . . . . . . . . . . . . .> where ldll."logdate" >= '2015-02-04'
>> . . . . . . . . . . . . . . . . . . . . .> and ldll."logdate" <= '2015-02-06'
>> . . . . . . . . . . . . . . . . . . . . .> and ldll."dbname" = 'lmguaranteedrate';
>> +------------------------------------------+
>> |                 COUNT(1)                 |
>> +------------------------------------------+
>> java.lang.RuntimeException:
>> org.apache.phoenix.exception.PhoenixIOException:
>> org.apache.phoenix.exception.PhoenixIOException: Failed after attempts=36,
>> exceptions:
>> Mon Jun 01 13:50:57 EDT 2015, null, java.net.SocketTimeoutException:
>> callTimeout=60000, callDuration=62358: row '' on table 'ldll_compression' at
>> region=ldll_compression,,1432851434288.1a8b511def7d0c9e69a5491c6330d715.,
>> hostname=ip-172-31-32-181.us-west-2.compute.internal,60020,1432768597149,
>> seqNum=16566
>>
>>         at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2440)
>>         at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2074)
>>         at sqlline.SqlLine.print(SqlLine.java:1735)
>>         at sqlline.SqlLine$Commands.execute(SqlLine.java:3683)
>>         at sqlline.SqlLine$Commands.sql(SqlLine.java:3584)
>>         at sqlline.SqlLine.dispatch(SqlLine.java:821)
>>         at sqlline.SqlLine.begin(SqlLine.java:699)
>>         at sqlline.SqlLine.mainWithInputRedirection(SqlLine.java:441)
>>         at sqlline.SqlLine.main(SqlLine.java:424)
>>
>> I did a major compaction of "ldll_compression" through the HBase shell
>> (major_compact 'ldll_compression'). The same query ran successfully after
>> the compaction:
>>
>> 0: jdbc:phoenix:172.31.45.176:2181:/hbase> select count(*)
>> . . . . . . . . . . . . . . . . . . . . .> from "ldll_compression" ldll join "ds_compression" ds on (ds."statusid" = ldll."statusid")
>> . . . . . . . . . . . . . . . . . . . . .> where ldll."logdate" >= '2015-02-04'
>> . . . . . . . . . . . . . . . . . . . . .> and ldll."logdate" <= '2015-02-06'
>> . . . . . . . . . . . . . . . . . . . . .> and ldll."dbname" = 'lmguaranteedrate'
>> . . . . . . . . . . . . . . . . . . . . .> ;
>> +------------------------------------------+
>> |                 COUNT(1)                 |
>> +------------------------------------------+
>> | 13480                                    |
>> +------------------------------------------+
>> 1 row selected (72.36 seconds)
>>
>> Did anyone face a similar issue? Is the IO exception because Phoenix is
>> not able to read from multiple regions, since the error was resolved after
>> the compaction? Or any other thoughts?
>>
>> Thanks,
>> Siva.
>>
>
>
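For reference, memory and GC settings for HBase region servers are typically adjusted in conf/hbase-env.sh rather than hbase-site.xml. The sketch below shows the kind of flags commonly tuned for small heaps in this HBase era (CMS collector plus GC logging, which also produces the GC logs Vlad suggests checking). The heap sizes and paths are illustrative assumptions for a ~2 GB heap, not settings taken from this thread:

```shell
# conf/hbase-env.sh -- illustrative GC settings for a small region server.
# All values below are assumptions for a ~2 GB heap; adjust to your hardware.
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -Xms2g -Xmx2g \
  -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:+UseCMSInitiatingOccupancyOnly \
  -Xloggc:/var/log/hbase/gc-regionserver.log \
  -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
```

Setting -Xms equal to -Xmx avoids heap-resize pauses, and starting CMS early (occupancy fraction 70) leaves headroom for the temporary garbage a wide Phoenix scan can generate before a major compaction has merged the store files.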