Alex,

Do you pool the PhoenixConnection, and if so, can you try it without pooling? Phoenix connections are not meant to be pooled.

Thanks,
James
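A minimal sketch of the non-pooled, tenant-specific pattern suggested above (this assumes the standard Phoenix JDBC driver; the class, method, and parameter names are placeholders, not from this thread):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.Properties;

    public class TenantQueryExample {
        // Open a fresh PhoenixConnection per unit of work instead of pooling.
        // Phoenix connections are cheap to create, so a pool buys little here.
        public static void runTenantQuery(String url, String tenant, String sql)
                throws Exception {
            Properties props = new Properties();
            if (tenant != null) {
                props.setProperty("TenantId", tenant); // tenant-specific connection
            }
            // try-with-resources closes the connection, statement, and result set
            // after each use; no pool, so no stale state carries between uses
            try (Connection conn =
                     DriverManager.getConnection("jdbc:phoenix:" + url, props);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }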
On Wed, Jul 8, 2015 at 12:05 PM, Alex Kamil <[email protected]> wrote:

> Maryann,
>
> - the patch didn't help when applied to the client (we haven't put it on
>   the server yet)
> - starting another client instance in a separate JVM and running the query
>   there after the query fails on the first client returns the same error
> - the counts are: table1: 68834 rows, table2: 2138 rows
> - to support multitenancy we currently set "MULTI_TENANT=true" in the
>   CREATE statement
> - we use a tenant-based connection with an Apache DBCP connection pool,
>   using this code:
>
>     BasicDataSource ds = new BasicDataSource();
>     ds.setDriverClassName("org.apache.phoenix.jdbc.PhoenixDriver");
>     ds.setUrl("jdbc:phoenix:" + url);
>     ds.setInitialSize(50);
>     if (tenant != null) ds.setConnectionProperties("TenantId=" + tenant);
>     return ds;
>
> - when we don't use a tenant-based connection there is no error
> - verified that the tenant_id used in the tenant connection has access to
>   the records (created with the same tenant_id)
> - the problem occurs only on the cluster but works in stand-alone mode
>
> - are there any settings to be set on the server or client side, in code
>   or in hbase-site.xml, to enable multitenancy?
> - were there any bug fixes related to multitenancy or cache management in
>   joins since 3.3.0?
>
> thanks
> Alex
>
> On Tue, Jul 7, 2015 at 2:22 PM, Maryann Xue <[email protected]> wrote:
>
>> It might not be real cache expiration (which would not be considered a
>> bug), since increasing the cache time-to-live didn't solve the problem.
>> So the problem might be that the cache had not been sent over to that
>> server at all, which would be a bug, most likely because the client
>> didn't do it right.
>>
>> So starting a new client after the problem happens should be a good test
>> of the above theory.
>>
>> Anyway, what's the approximate time of running a count(*) on your
>> test.table2?
>>
>> Thanks,
>> Maryann
>>
>> On Tue, Jul 7, 2015 at 1:53 PM, Alex Kamil <[email protected]> wrote:
>>
>>> Maryann,
>>>
>>> is this patch only for the client? we saw the error in the region
>>> server logs, and it seems the server-side cache had expired
>>>
>>> also, by "start a new process doing the same query" do you mean start
>>> two client instances and run the query from one, then from the other
>>> client?
>>>
>>> thanks
>>> Alex
>>>
>>> On Tue, Jul 7, 2015 at 1:20 PM, Maryann Xue <[email protected]> wrote:
>>>
>>>> My question was actually: if the problem appears on your cluster, will
>>>> it go away if you just start a new process doing the same query? I do
>>>> have a patch, but it only fixes the problem I assume is at work here,
>>>> and it might be something else.
>>>>
>>>> Thanks,
>>>> Maryann
>>>>
>>>> On Tue, Jul 7, 2015 at 12:59 PM, Alex Kamil <[email protected]> wrote:
>>>>
>>>>> a patch would be great, we saw that this problem goes away in
>>>>> standalone mode but reappears on the cluster
>>>>>
>>>>> On Tue, Jul 7, 2015 at 12:56 PM, Alex Kamil <[email protected]> wrote:
>>>>>
>>>>>> sure, sounds good
>>>>>>
>>>>>> On Tue, Jul 7, 2015 at 10:57 AM, Maryann Xue <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Alex,
>>>>>>>
>>>>>>> I suspect it's related to using cached region locations that might
>>>>>>> have become invalid. A simple way to verify this is to try starting
>>>>>>> a new Java process doing this query and see if the problem goes
>>>>>>> away.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Maryann
>>>>>>>
>>>>>>> On Mon, Jul 6, 2015 at 10:56 PM, Maryann Xue <[email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks a lot for the details, Alex! That might be a bug if it
>>>>>>>> failed only on the cluster and increasing the cache alive time
>>>>>>>> didn't help. Would you mind testing it out for me if I provide a
>>>>>>>> simple patch tomorrow?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Maryann
>>>>>>>>
>>>>>>>> On Mon, Jul 6, 2015 at 9:09 PM, Alex Kamil <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> one more thing - the same query (via a tenant connection) works
>>>>>>>>> in standalone mode but fails on a cluster.
>>>>>>>>> I've tried modifying phoenix.coprocessor.maxServerCacheTimeToLiveMs
>>>>>>>>> <https://phoenix.apache.org/tuning.html> from the default
>>>>>>>>> 30000 ms to 300000 with no effect
>>>>>>>>>
>>>>>>>>> On Mon, Jul 6, 2015 at 7:35 PM, Alex Kamil <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> also please note that it only fails with tenant-specific
>>>>>>>>>> connections
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 6, 2015 at 7:17 PM, Alex Kamil <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Maryann,
>>>>>>>>>>>
>>>>>>>>>>> here is the query, I don't see warnings:
>>>>>>>>>>>
>>>>>>>>>>>     SELECT '\''||C.ROWKEY||'\'' AS RK, C.VS FROM test.table1 AS C
>>>>>>>>>>>     JOIN (SELECT DISTINCT B.ROWKEY, B.VS FROM test.table2 AS B) B
>>>>>>>>>>>     ON (C.ROWKEY=B.ROWKEY AND C.VS=B.VS) LIMIT 2147483647;
>>>>>>>>>>>
>>>>>>>>>>> thanks
>>>>>>>>>>> Alex
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 3, 2015 at 10:36 PM, Maryann Xue <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Alex,
>>>>>>>>>>>>
>>>>>>>>>>>> Most likely what happened was as suggested by the error
>>>>>>>>>>>> message: the cache might have expired. Could you please check
>>>>>>>>>>>> whether there are any Phoenix warnings in the client log, and
>>>>>>>>>>>> share your query?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Maryann
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 3, 2015 at 4:01 PM, Alex Kamil <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> getting this error with phoenix 3.3.0/hbase 0.94.15, any
>>>>>>>>>>>>> ideas?
>>>>>>>>>>>>>
>>>>>>>>>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>>>>>>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>>>>>>>>>> org.apache.hadoop.hbase.DoNotRetryIOException: Could not find
>>>>>>>>>>>>> hash cache for joinId: ???Z^XI??. The cache might have
>>>>>>>>>>>>> expired and have been removed.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:96)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:511)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.LimitingResultIterator.next(LimitingResultIterator.java:47)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
>>>>>>>>>>>>>   at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739)
>>>>>>>>>>>>>   at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks
>>>>>>>>>>>>> Alex
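For reference, the server-side cache TTL change Alex describes above would be set in hbase-site.xml on the region servers; a sketch, using the 300000 ms value tried in this thread against the 30000 ms default:

    <!-- hbase-site.xml on each region server: how long a hash-join cache
         entry is kept alive before it may be removed -->
    <property>
      <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
      <value>300000</value>
    </property>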
