Alex,

Do you pool the PhoenixConnection, and if so, can you try it without pooling? Phoenix connections are not meant to be pooled.

Thanks,
James
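A minimal sketch of the non-pooled, tenant-specific pattern suggested above (this assumes the standard Phoenix JDBC driver; the class, method, and parameter names are placeholders, not from this thread):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.Properties;

    public class TenantQueryExample {
        // Open a fresh PhoenixConnection per unit of work instead of pooling.
        // Phoenix connections are cheap to create, so a pool buys little here.
        public static void runTenantQuery(String url, String tenant, String sql)
                throws Exception {
            Properties props = new Properties();
            if (tenant != null) {
                props.setProperty("TenantId", tenant); // tenant-specific connection
            }
            // try-with-resources closes the connection, statement, and result set
            // after each use; no pool, so no stale state carries between uses
            try (Connection conn =
                     DriverManager.getConnection("jdbc:phoenix:" + url, props);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }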
On Wed, Jul 8, 2015 at 12:05 PM, Alex Kamil <[email protected]> wrote:

> Maryann,
>
> - the patch didn't help when applied to the client (we haven't put it on
>   the server yet)
> - starting another client instance in a separate JVM and running the query
>   there after the query fails on the first client returns the same error
> - the counts are: table1: 68834 rows, table2: 2138 rows
> - to support multitenancy we currently set "MULTI_TENANT=true" in the
>   CREATE statement
> - we use a tenant-based connection with an Apache DBCP connection pool,
>   using this code:
>
>     BasicDataSource ds = new BasicDataSource();
>     ds.setDriverClassName("org.apache.phoenix.jdbc.PhoenixDriver");
>     ds.setUrl("jdbc:phoenix:" + url);
>     ds.setInitialSize(50);
>     if (tenant != null) ds.setConnectionProperties("TenantId=" + tenant);
>     return ds;
>
> - when we don't use a tenant-based connection there is no error
> - verified that the tenant_id used in the tenant connection has access to
>   the records (created with the same tenant_id)
> - the problem occurs only on the cluster but works in stand-alone mode
>
> - are there any settings to be set on the server or client side, in code
>   or in hbase-site.xml, to enable multitenancy?
> - were there any bug fixes related to multitenancy or cache management in
>   joins since 3.3.0?
>
> thanks
> Alex
>
> On Tue, Jul 7, 2015 at 2:22 PM, Maryann Xue <[email protected]> wrote:
>
>> It might not be real cache expiration (which would not be considered a
>> bug), since increasing the cache time-to-live didn't solve the problem.
>> So the problem might be that the cache had not been sent over to that
>> server at all, which would be a bug, most likely because the client
>> didn't do it right.
>>
>> So starting a new client after the problem happens should be a good test
>> of the above theory.
>>
>> Anyway, what's the approximate time of running a count(*) on your
>> test.table2?
>>
>> Thanks,
>> Maryann
>>
>> On Tue, Jul 7, 2015 at 1:53 PM, Alex Kamil <[email protected]> wrote:
>>
>>> Maryann,
>>>
>>> is this patch only for the client? we saw the error in the region
>>> server logs, and it seems the server-side cache had expired
>>>
>>> also, by "start a new process doing the same query" do you mean start
>>> two client instances and run the query from one, then from the other
>>> client?
>>>
>>> thanks
>>> Alex
>>>
>>> On Tue, Jul 7, 2015 at 1:20 PM, Maryann Xue <[email protected]> wrote:
>>>
>>>> My question was actually: if the problem appears on your cluster, will
>>>> it go away if you just start a new process doing the same query? I do
>>>> have a patch, but it only fixes the problem I assume is at work here,
>>>> and it might be something else.
>>>>
>>>> Thanks,
>>>> Maryann
>>>>
>>>> On Tue, Jul 7, 2015 at 12:59 PM, Alex Kamil <[email protected]> wrote:
>>>>
>>>>> a patch would be great, we saw that this problem goes away in
>>>>> standalone mode but reappears on the cluster
>>>>>
>>>>> On Tue, Jul 7, 2015 at 12:56 PM, Alex Kamil <[email protected]> wrote:
>>>>>
>>>>>> sure, sounds good
>>>>>>
>>>>>> On Tue, Jul 7, 2015 at 10:57 AM, Maryann Xue <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Alex,
>>>>>>>
>>>>>>> I suspect it's related to using cached region locations that might
>>>>>>> have become invalid. A simple way to verify this is to try starting
>>>>>>> a new Java process doing this query and see if the problem goes
>>>>>>> away.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Maryann
>>>>>>>
>>>>>>> On Mon, Jul 6, 2015 at 10:56 PM, Maryann Xue <[email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks a lot for the details, Alex! That might be a bug if it
>>>>>>>> failed only on the cluster and increasing the cache alive time
>>>>>>>> didn't help. Would you mind testing it out for me if I provide a
>>>>>>>> simple patch tomorrow?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Maryann
>>>>>>>>
>>>>>>>> On Mon, Jul 6, 2015 at 9:09 PM, Alex Kamil <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> one more thing - the same query (via a tenant connection) works
>>>>>>>>> in standalone mode but fails on a cluster.
>>>>>>>>> I've tried modifying phoenix.coprocessor.maxServerCacheTimeToLiveMs
>>>>>>>>> <https://phoenix.apache.org/tuning.html> from the default
>>>>>>>>> 30000 ms to 300000 with no effect
>>>>>>>>>
>>>>>>>>> On Mon, Jul 6, 2015 at 7:35 PM, Alex Kamil <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> also please note that it only fails with tenant-specific
>>>>>>>>>> connections
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 6, 2015 at 7:17 PM, Alex Kamil <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Maryann,
>>>>>>>>>>>
>>>>>>>>>>> here is the query, I don't see warnings:
>>>>>>>>>>>
>>>>>>>>>>>     SELECT '\''||C.ROWKEY||'\'' AS RK, C.VS FROM test.table1 AS C
>>>>>>>>>>>     JOIN (SELECT DISTINCT B.ROWKEY, B.VS FROM test.table2 AS B) B
>>>>>>>>>>>     ON (C.ROWKEY=B.ROWKEY AND C.VS=B.VS) LIMIT 2147483647;
>>>>>>>>>>>
>>>>>>>>>>> thanks
>>>>>>>>>>> Alex
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 3, 2015 at 10:36 PM, Maryann Xue <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Alex,
>>>>>>>>>>>>
>>>>>>>>>>>> Most likely what happened was as suggested by the error
>>>>>>>>>>>> message: the cache might have expired. Could you please check
>>>>>>>>>>>> whether there are any Phoenix warnings in the client log, and
>>>>>>>>>>>> share your query?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Maryann
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 3, 2015 at 4:01 PM, Alex Kamil <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> getting this error with phoenix 3.3.0/hbase 0.94.15, any
>>>>>>>>>>>>> ideas?
>>>>>>>>>>>>>
>>>>>>>>>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>>>>>>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>>>>>>>>>> org.apache.hadoop.hbase.DoNotRetryIOException: Could not find
>>>>>>>>>>>>> hash cache for joinId: ???Z^XI??. The cache might have
>>>>>>>>>>>>> expired and have been removed.
>>>>>>>>>>>>>
>>>>>>>>>>>>>   at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:96)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:511)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.MergeSortResultIterator.getIterators(MergeSortResultIterator.java:48)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.MergeSortResultIterator.minIterator(MergeSortResultIterator.java:84)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.MergeSortResultIterator.next(MergeSortResultIterator.java:111)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.LimitingResultIterator.next(LimitingResultIterator.java:47)
>>>>>>>>>>>>>   at org.apache.phoenix.iterate.DelegateResultIterator.next(DelegateResultIterator.java:44)
>>>>>>>>>>>>>   at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:739)
>>>>>>>>>>>>>   at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks
>>>>>>>>>>>>> Alex
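For reference, the server-side cache TTL change Alex describes above would be set in hbase-site.xml on the region servers; a sketch, using the 300000 ms value tried in this thread against the 30000 ms default:

    <!-- hbase-site.xml on each region server: how long a hash-join cache
         entry is kept alive before it may be removed -->
    <property>
      <name>phoenix.coprocessor.maxServerCacheTimeToLiveMs</name>
      <value>300000</value>
    </property>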
