Hi,
I am using hbase 1.1 with phoenix 4.6. 
I have a table with row key as (current_timestamp, id) which is salted and 
index on (id). This table has ~3 million records.
I have a query like given below. 
SELECT  ID, CURRENT_TIMESTAMP, <SOME OTHER COLUMNS> from TBL                    
    as a inner join (                                select 
max(CURRENT_TIMESTAMP) as mct, ID as tid from TBL where ID like 
'42ecf4abd4bd7e7606025dc8eee3de6a3cc04418cbc2619ddc01f54d88d7c3bf%' group by 
ID) as tmp                        on a.ID=tmp.tid and 
a.CURRENT_TIMESTAMP=tmp.mct


The query hangs for long and finally fails with a timeout. I have 12 region 
servers each with 5GB heap and also the total records satisfying the above 
query is 62K whose CSV dump is ~10MB only. 
DoNotRetryIOException: Could not find hash cache for join Id: Ӧ�8�D�. The cache 
might have expired and have been removed

and - 
Caused by: java.sql.SQLException: Encountered exception in sub plan [0] 
execution.
        at 
org.apache.phoenix.execute.HashJoinPlan.iterator(HashJoinPlan.java:175)
        at 
com.infa.products.ldm.ingestion.server.java.hadoop.impl.FixPhoenixIngestInputFormat.getQueryPlan(FixPhoenixIngestInputFormat.java:94)
        ... 22 more
and
Caused by: java.sql.SQLException:
    java.util.concurrent.TimeoutException        at 
org.apache.phoenix.cache.ServerCacheClient.addServerCache(ServerCacheClient.java:264)

I can try playing around with parameters such as 
phoenix.coprocessor.maxServerCacheTimeToLiveMs and switching to sort_merge_join 
actually helped.
But my question is as per Joins | Apache Phoenix in a case such as lhs INNER 
JOIN rhs, it is rhs which will be built as hash table in server cache. So, in 
the above query I assume this gets cached?
 select max(CURRENT_TIMESTAMP) as mct, ID as tid from TBL where ID like 
'42ecf4abd4bd7e7606025dc8eee3de6a3cc04418cbc2619ddc01f54d88d7c3bf%' group by 
ID) as tmp 
Thanks,
Sumit

Reply via email to