Hi Krunal,

Can you try a merge join by specifying the hint "USE_SORT_MERGE_JOIN"? And if it still does not work, would you mind posting the exact error message from running the merge join?
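For reference, the hint goes right after the SELECT keyword. A minimal sketch against the count query from your earlier mail (same tables, nothing else changed):

  SELECT /*+ USE_SORT_MERGE_JOIN */ COUNT(1)
  FROM PERFORMANCE_5000000 p1, PERFORMANCE_25000000 p2
  WHERE p1.host = p2.host
  GROUP BY p1.host;

Unlike the default hash join, which ships one side to the region servers as a hash cache (those are the ServerCacheClient calls in your stack trace), the sort-merge join merges two sorted scans and does not need to hold a whole table in memory.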
Thanks,
Maryann

On Tue, Jun 16, 2015 at 6:12 PM, Krunal Varajiya <krunal.varaj...@ask.com> wrote:

> Does anybody have any idea why the join below is throwing an OOM error? I
> would really appreciate any help here. We are stuck, as none of our joins
> works even with 5M rows.
>
> From: Krunal <krunal.varaj...@ask.com>
> Reply-To: "user@phoenix.apache.org" <user@phoenix.apache.org>
> Date: Wednesday, June 10, 2015 at 3:49 PM
> To: Maryann Xue <maryann....@gmail.com>
> Cc: "user@phoenix.apache.org" <user@phoenix.apache.org>
> Subject: Re: Join create OOM with java heap space on phoenix client
>
> Hey Maryann,
>
> Sorry, I just realized that I am using Phoenix 4.3.0.
>
> I am using the sample tables generated by the performance.py script from
> the Phoenix package. I have generated 5M, 25M, and 100M rows with this
> script and am running the joins on those tables.
>
> Here is the table definition:
>
> +------------+--------------+----------------------+--------------+------------+
> | TABLE_CAT  | TABLE_SCHEM  | TABLE_NAME           | COLUMN_NAME  | DATA_TYPE  |
> +------------+--------------+----------------------+--------------+------------+
> |            |              | PERFORMANCE_5000000  | HOST         | 1          |
> |            |              | PERFORMANCE_5000000  | DOMAIN       | 12         |
> |            |              | PERFORMANCE_5000000  | FEATURE      | 12         |
> |            |              | PERFORMANCE_5000000  | DATE         | 91         |
> |            |              | PERFORMANCE_5000000  | CORE         | -5         |
> |            |              | PERFORMANCE_5000000  | DB           | -5         |
> |            |              | PERFORMANCE_5000000  |
>
> Thank you for your time!
> - Krunal
>
> From: Maryann Xue <maryann....@gmail.com>
> Date: Monday, June 1, 2015 at 8:13 AM
> To: Krunal <krunal.varaj...@ask.com>
> Cc: "user@phoenix.apache.org" <user@phoenix.apache.org>
> Subject: Re: Join create OOM with java heap space on phoenix client
>
> Hi Krunal,
>
> Thanks for the explanation! I am back to work now. Could you please also
> post the table definition? I have some guesses here, but need more
> information to confirm. And which version of Phoenix are you using?
>
> Thanks,
> Maryann
>
> On Wed, May 27, 2015 at 4:55 PM, Krunal Varajiya <krunal.varaj...@ask.com>
> wrote:
>
>> Hey Maryann,
>>
>> Thanks for your reply! I understand it is difficult to access some
>> stuff in China! I can wait till you come back; in the meantime I will
>> keep you posted if we make any progress on this!
>>
>> My comments are below in blue:
>>
>> Have a fun time in China!
>> - Krunal
>>
>> From: Maryann Xue <maryann....@gmail.com>
>> Reply-To: "user@phoenix.apache.org" <user@phoenix.apache.org>
>> Date: Tuesday, May 26, 2015 at 5:45 PM
>> To: "user@phoenix.apache.org" <user@phoenix.apache.org>
>> Subject: Re: Join create OOM with java heap space on phoenix client
>>
>> Hi Krunal,
>>
>> Sorry for the late reply. I have been on vacation.
>>
>> 1. Can you make sure that the connection/statement is closed after each
>> run of your query (even on an exception)?
>>
>> *I think this is related to the second issue mentioned below, as it
>> happens for joins. I tried some other queries and they work fine!*
>>
>> 2. You might want to try switching the join tables in your query first
>> by putting the larger table as the LHS, and if it still does not work,
>> you can force a merge join by adding the hint "USE_SORT_MERGE_JOIN".
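>>
>> For example, a minimal sketch with your tables (the same query, with
>> only the join order changed, so the smaller 5M table becomes the RHS
>> that gets built into the server-side hash cache):
>>
>>   SELECT p1.host, COUNT(1)
>>   FROM PERFORMANCE_25000000 p2, PERFORMANCE_5000000 p1
>>   WHERE p1.host = p2.host
>>   GROUP BY p1.host;
>>
>> The merge-join variant just adds /*+ USE_SORT_MERGE_JOIN */ right after
>> the SELECT keyword.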
>>
>> *I have tried changing the LHS and RHS, but no luck. I have also tried
>> "USE_SORT_MERGE_JOIN", but for some reason it crashes the region server
>> while scanning the results back to the client.*
>> *I have also tried changing some tuning parameters, but none of the
>> settings worked!*
>> *One thing I am curious about is why it is dumping data into the local
>> heap; I see millions of instances of
>> org.apache.phoenix.expression.LiteralExpression. Shouldn't it execute
>> joins on the server side?*
>> *I tried increasing memory up to 64GB and it still fails! This is just
>> one single connection with only one query running at a time; I am not
>> sure what will happen with multiple connections and multiple queries!*
>> *And as you can see in the example, the data size for both the LHS and
>> RHS is not that big. These are the sample tables that come with
>> Phoenix.*
>>
>> Thanks,
>> Maryann
>>
>> On Thursday, May 21, 2015, Krunal Varajiya <krunal.varaj...@ask.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have 2 issues with the Phoenix client:
>>>
>>> 1. Heap memory is not cleaned up after each query finishes, so it
>>> keeps increasing every time we submit a new query.
>>> 2. I am trying to do a normal join operation on two tables but am
>>> getting an exception. Below are the details.
>>>
>>> These are some sample queries I tried:
>>>
>>> 1. select p1.host, count(1) from PERFORMANCE_5000000 p1,
>>> PERFORMANCE_25000000 p2 where p1.host = p2.host group by p1.host;
>>> 2. select p1.host from PERFORMANCE_5000000 p1, PERFORMANCE_25000000
>>> p2 where p1.host = p2.host group by p1.host;
>>> 3. select count(1) from PERFORMANCE_5000000 p1, PERFORMANCE_25000000
>>> p2 where p1.host = p2.host group by p1.host;
>>>
>>> Here is the explain plan:
>>>
>>> explain select count(1) from PERFORMANCE_5000000 p1,
>>> PERFORMANCE_25000000 p2 where p1.host = p2.host group by p1.host;
>>>
>>> +----------------------------------------------------------------------------+
>>> | PLAN                                                                       |
>>> +----------------------------------------------------------------------------+
>>> | CLIENT 9-CHUNK PARALLEL 1-WAY FULL SCAN OVER PERFORMANCE_5000000           |
>>> |     SERVER FILTER BY FIRST KEY ONLY                                        |
>>> |     SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [HOST]                  |
>>> | CLIENT MERGE SORT                                                          |
>>> |     PARALLEL INNER-JOIN TABLE 0 (SKIP MERGE)                               |
>>> |         CLIENT 18-CHUNK PARALLEL 1-WAY FULL SCAN OVER PERFORMANCE_25000000 |
>>> |             SERVER FILTER BY FIRST KEY ONLY                                |
>>> |     DYNAMIC SERVER FILTER BY HOST IN (P2.HOST)                             |
>>> +----------------------------------------------------------------------------+
>>> 8 rows selected (0.127 seconds)
>>>
>>> The Phoenix client heap size is 16GB. (I noticed that the queries
>>> above are dumping data into the local heap; I see millions of
>>> instances of org.apache.phoenix.expression.LiteralExpression.)
>>>
>>> And my exceptions are:
>>>
>>> java.sql.SQLException: Encountered exception in sub plan [0] execution.
>>> at org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:156)
>>> at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:235)
>>> at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:226)
>>> at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>>> at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:225)
>>> at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1066)
>>> at sqlline.Commands.execute(Commands.java:822)
>>> at sqlline.Commands.sql(Commands.java:732)
>>> at sqlline.SqlLine.dispatch(SqlLine.java:808)
>>> at sqlline.SqlLine.begin(SqlLine.java:681)
>>> at sqlline.SqlLine.start(SqlLine.java:398)
>>> at sqlline.SqlLine.main(SqlLine.java:292)
>>> Caused by: java.sql.SQLException: java.util.concurrent.ExecutionException: java.lang.Exception: java.lang.OutOfMemoryError: Java heap space
>>> at org.apache.phoenix.cache.ServerCacheClient.addServerCache(ServerCacheClient.java:247)
>>> at org.apache.phoenix.join.HashCacheClient.addHashCache(HashCacheClient.java:83)
>>> at org.apache.phoenix.execute.HashJoinPlan$HashSubPlan.execute(HashJoinPlan.java:338)
>>> at org.apache.phoenix.execute.HashJoinPlan$1.call(HashJoinPlan.java:135)
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.util.concurrent.ExecutionException: java.lang.Exception: java.lang.OutOfMemoryError: Java heap space
>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>> at java.util.concurrent.FutureTask.get(FutureTask.java:206)
>>> at org.apache.phoenix.cache.ServerCacheClient.addServerCache(ServerCacheClient.java:239)
>>> ... 7 more
>>> Caused by: java.lang.Exception: java.lang.OutOfMemoryError: Java heap space
>>> at org.apache.phoenix.cache.ServerCacheClient$1.call(ServerCacheClient.java:212)
>>> at org.apache.phoenix.cache.ServerCacheClient$1.call(ServerCacheClient.java:182)
>>> ... 4 more
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>
>>> May 20, 2015 4:58:01 PM ServerCommunicatorAdmin reqIncoming
>>> WARNING: The server has decided to close this client connection.
>>>
>>> 15/05/20 16:56:43 WARN client.HTable: Error calling coprocessor service
>>> org.apache.phoenix.coprocessor.generated.ServerCachingProtos$ServerCachingService
>>> for row CSGoogle\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>>> java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>>> at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1620)
>>> at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1577)
>>> at org.apache.phoenix.cache.ServerCacheClient$1.call(ServerCacheClient.java:188)
>>> at org.apache.phoenix.cache.ServerCacheClient$1.call(ServerCacheClient.java:182)
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>> at com.google.protobuf.ByteString$CodedBuilder.<init>(ByteString.java:907)
>>> at com.google.protobuf.ByteString$CodedBuilder.<init>(ByteString.java:902)
>>> at com.google.protobuf.ByteString.newCodedBuilder(ByteString.java:898)
>>> at com.google.protobuf.AbstractMessageLite.toByteString(AbstractMessageLite.java:49)
>>> at org.apache.hadoop.hbase.ipc.RegionCoprocessorRpcChannel.callExecService(RegionCoprocessorRpcChannel.java:81)
>>> at org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel.callMethod(CoprocessorRpcChannel.java:57)
>>> at org.apache.phoenix.coprocessor.generated.ServerCachingProtos$ServerCachingService$Stub.addServerCache(ServerCachingProtos.java:3270)
>>> at org.apache.phoenix.cache.ServerCacheClient$1$1.call(ServerCacheClient.java:204)
>>> at org.apache.phoenix.cache.ServerCacheClient$1$1.call(ServerCacheClient.java:189)
>>> at org.apache.hadoop.hbase.client.HTable$17.call(HTable.java:1608)
>>> ... 4 more
>>>
>>> 0: jdbc:phoenix:prbhadoop004iad.io.askjeeves.> 15/05/20 16:56:43 WARN
>>> client.HTable: Error calling coprocessor service
>>> org.apache.phoenix.coprocessor.generated.ServerCachingProtos$ServerCachingService
>>> for row EUGoogle\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
>>> java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>>> at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1620)
>>> at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1577)
>>> at org.apache.phoenix.cache.ServerCacheClient$1.call(ServerCacheClient.java:188)
>>> at org.apache.phoenix.cache.ServerCacheClient$1.call(ServerCacheClient.java:182)
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.lang.OutOfMemoryError: Java heap space
>>>
>>> 15/05/20 16:59:37 WARN client.HConnectionManager$HConnectionImplementation:
>>> This client just lost it's session with ZooKeeper, closing it. It will be
>>> recreated next time someone needs it
>>> org.apache.zookeeper.KeeperException$SessionExpiredException:
>>> KeeperErrorCode = Session expired
>>> at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:403)
>>> at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:321)
>>> at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>>> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
>>>
>>> Error: (state=,code=0)
>>>
>>> java.sql.SQLFeatureNotSupportedException
>>> at org.apache.phoenix.jdbc.PhoenixStatement.cancel(PhoenixStatement.java:958)
>>> at sqlline.DispatchCallback.forceKillSqlQuery(DispatchCallback.java:83)
>>> at sqlline.SqlLine.begin(SqlLine.java:695)
>>> at sqlline.SqlLine.start(SqlLine.java:398)
>>> at sqlline.SqlLine.main(SqlLine.java:292)
>>>
>>> Can someone please help?
>>>
>>> Thanks!