Hi Alex,

Could you please try this new patch?
Thanks,
Maryann

On Wed, Jul 8, 2015 at 3:53 PM, Maryann Xue <maryann....@gmail.com> wrote:

> Thanks again for all this information! Would you mind checking a couple
> more things for me? For test.table1, does it have its regions on all region
> servers in your cluster? And for the region servers whose logs show that
> error message, do they hold table1's regions, and what are the start keys
> of those regions?
>
> Thanks,
> Maryann
>
> On Wed, Jul 8, 2015 at 3:05 PM, Alex Kamil <alex.ka...@gmail.com> wrote:
>
>> Maryann,
>>
>> - the patch didn't help when applied to the client (we haven't put it on
>> the server yet)
>> - starting another client instance in a separate JVM and running the
>> query there after it fails on the first client returns the same error
>> - the row counts are: table1: 68834 rows, table2: 2138 rows
>> - to support multitenancy we currently set "MULTI_TENANT=true" in the
>> CREATE statement
>> - we use a tenant-specific connection with an Apache DBCP connection
>> pool, created with this code:
>>
>>     BasicDataSource ds = new BasicDataSource();
>>     ds.setDriverClassName("org.apache.phoenix.jdbc.PhoenixDriver");
>>     ds.setUrl("jdbc:phoenix:" + url);
>>     ds.setInitialSize(50);
>>     if (tenant != null) ds.setConnectionProperties("TenantId=" + tenant);
>>     return ds;
>>
>> - when we don't use a tenant-specific connection there is no error
>> - we verified that the tenant_id used in the tenant connection has access
>> to the records (they were created with the same tenant_id)
>> - the problem occurs only on the cluster; it works in stand-alone mode
>>
>> - are there any settings to be set on the server or client side, in code
>> or in hbase-site.xml, to enable multitenancy?
>> - were there any bug fixes related to multitenancy or cache management
>> in joins since 3.3.0?
>>
>> thanks
>> Alex
>>
>> On Tue, Jul 7, 2015 at 2:22 PM, Maryann Xue <maryann....@gmail.com> wrote:
>>
>>> It might not be actual cache expiration (which would not be considered
>>> a bug), since increasing the cache time-to-live didn't solve the
>>> problem. So the problem might be that the cache had never been sent to
>>> that server at all, which would be a bug, most likely because the client
>>> didn't do it right.
>>>
>>> So starting a new client after the problem happens should be a good
>>> test of the above theory.
>>>
>>> Anyway, what's the approximate time of running a count(*) on your
>>> test.table2?
>>>
>>> Thanks,
>>> Maryann
>>>
>>> On Tue, Jul 7, 2015 at 1:53 PM, Alex Kamil <alex.ka...@gmail.com> wrote:
>>>
>>>> Maryann,
>>>>
>>>> is this patch only for the client? We saw the error in the region
>>>> server logs, and it seems the server-side cache had expired.
>>>>
>>>> also, by "start a new process doing the same query" do you mean start
>>>> two client instances and run the query from one, then from the other
>>>> client?
>>>>
>>>> thanks
>>>> Alex
>>>>
>>>> On Tue, Jul 7, 2015 at 1:20 PM, Maryann Xue <maryann....@gmail.com> wrote:
>>>>
>>>>> My question was actually: if the problem appears on your cluster, will
>>>>> it go away if you just start a new process doing the same query? I do
>>>>> have a patch, but it only fixes the problem I assume is the cause here,
>>>>> and it might be something else.
>>>>>
>>>>> Thanks,
>>>>> Maryann
>>>>>
>>>>> On Tue, Jul 7, 2015 at 12:59 PM, Alex Kamil <alex.ka...@gmail.com> wrote:
>>>>>
>>>>>> a patch would be great; we saw that this problem goes away in
>>>>>> standalone mode but reappears on the cluster
>>>>>>
>>>>>> On Tue, Jul 7, 2015 at 12:56 PM, Alex Kamil <alex.ka...@gmail.com> wrote:
>>>>>>
>>>>>>> sure, sounds good
>>>>>>>
>>>>>>> On Tue, Jul 7, 2015 at 10:57 AM, Maryann Xue <maryann....@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Alex,
>>>>>>>>
>>>>>>>> I suspect it's related to using cached region locations that might
>>>>>>>> have become invalid. A simple way to verify this is to try starting
>>>>>>>> a new Java process doing this query and see if the problem goes
>>>>>>>> away.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Maryann
>>>>>>>>
>>>>>>>> On Mon, Jul 6, 2015 at 10:56 PM, Maryann Xue <maryann....@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks a lot for the details, Alex! That might be a bug if it
>>>>>>>>> failed only on the cluster and increasing the cache time-to-live
>>>>>>>>> didn't help. Would you mind testing it out for me if I provide a
>>>>>>>>> simple patch tomorrow?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Maryann
>>>>>>>>>
>>>>>>>>> On Mon, Jul 6, 2015 at 9:09 PM, Alex Kamil <alex.ka...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> one more thing - the same query (via a tenant connection) works
>>>>>>>>>> in standalone mode but fails on a cluster.
>>>>>>>>>> I've tried modifying
>>>>>>>>>> phoenix.coprocessor.maxServerCacheTimeToLiveMs
>>>>>>>>>> <https://phoenix.apache.org/tuning.html> from the default
>>>>>>>>>> 30000 (ms) to 300000, with no effect.
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 6, 2015 at 7:35 PM, Alex Kamil <alex.ka...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> also please note that it only fails with tenant-specific
>>>>>>>>>>> connections
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 6, 2015 at 7:17 PM, Alex Kamil <alex.ka...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Maryann,
>>>>>>>>>>>>
>>>>>>>>>>>> here is the query; I don't see any warnings:
>>>>>>>>>>>>
>>>>>>>>>>>> SELECT '\''||C.ROWKEY||'\'' AS RK, C.VS FROM test.table1 AS C
>>>>>>>>>>>> JOIN (SELECT DISTINCT B.ROWKEY, B.VS FROM test.table2 AS B) B
>>>>>>>>>>>> ON (C.ROWKEY=B.ROWKEY AND C.VS=B.VS) LIMIT 2147483647;
>>>>>>>>>>>>
>>>>>>>>>>>> thanks
>>>>>>>>>>>> Alex
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 3, 2015 at 10:36 PM, Maryann Xue <maryann....@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Alex,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Most likely what happened was as suggested by the error
>>>>>>>>>>>>> message: the cache might have expired. Could you please check
>>>>>>>>>>>>> if there are any Phoenix warnings in the client log and share
>>>>>>>>>>>>> your query?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Maryann
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 3, 2015 at 4:01 PM, Alex Kamil <alex.ka...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> getting this error with phoenix 3.3.0/hbase 0.94.15, any
>>>>>>>>>>>>>> ideas?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>>>>>>>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>>>>>>>>>>> org.apache.hadoop.hbase.DoNotRetryIOException: Could not find
>>>>>>>>>>>>>> hash cache for joinId: ???Z^XI??. The cache might have expired
>>>>>>>>>>>>>> and have been removed.
>>>>>>>>>>>>>>     at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:96)
>>>>>>>>>>>>>>     at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:511)
>>>>>>>>>>>>>>     at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48)
>>>>>>>>>>>>>>     at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84)
>>>>>>>>>>>>>>     at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111)
>>>>>>>>>>>>>>     at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
>>>>>>>>>>>>>>     at org.apache.phoenix.iterate.LimitingResultIterator.next(LimitingResultIterator.java:47)
>>>>>>>>>>>>>>     at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
>>>>>>>>>>>>>>     at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739)
>>>>>>>>>>>>>>     at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>> Alex
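[Editor's note for readers of the archive: the cache TTL that Alex reports tuning above, phoenix.coprocessor.maxServerCacheTimeToLiveMs, is a server-side property read by the Phoenix coprocessors, so it must be set in hbase-site.xml on the region servers (and the region servers restarted) to take effect; a client-only change would explain seeing no effect. A minimal sketch of the fragment, using the 300000 ms value tried in the thread:]

```xml
<!-- hbase-site.xml on each region server; restart region servers after editing.
     phoenix.coprocessor.maxServerCacheTimeToLiveMs controls how long the
     server-side hash-join cache is kept alive (default 30000 ms per the
     Phoenix tuning page linked above). 300000 is the value tried in this thread. -->
<property>
  <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
  <value>300000</value>
</property>
```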
could_not_find_hash_cache.patch
Description: Binary data