[
https://issues.apache.org/jira/browse/PHOENIX-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16291313#comment-16291313
]
James Taylor commented on PHOENIX-4460:
---------------------------------------
Something must be holding onto the List<List<KeyRange>>, preventing it from
being GCed. This patch simply clears that list once filtering is done, so that
it should be GCed (and GCed earlier).
Still need to figure out the *why* (only happens with old client/new server)
and the *who* (is something holding on to the filter instance?), but focusing
mainly on a quick solution right now. Let's see if it helps - looks like the
test to repro is pretty straightforward.
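
A minimal sketch of the idea, assuming a simplified filter; this is not the
attached patch and not Phoenix's actual filter class. SlotFilterSketch,
markDone() and holdsSlots() are hypothetical names, and plain strings stand in
for KeyRange. It only shows the reference to the nested list being dropped once
filtering is finished, so the list becomes collectable even if something still
holds the filter instance.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a server-side filter; NOT the real Phoenix class.
final class SlotFilterSketch {
    // Stand-in for List<List<KeyRange>>: each String represents one key range.
    private List<List<String>> slots;
    private boolean done;

    SlotFilterSketch(List<List<String>> slots) {
        this.slots = slots;
    }

    // Simplified matching: true if the row key appears in any slot.
    boolean include(String rowKey) {
        if (done) {
            return false; // nothing matches once filtering has finished
        }
        for (List<String> slot : slots) {
            if (slot.contains(rowKey)) {
                return true;
            }
        }
        return false;
    }

    // Called when filtering is done: drop the reference so the nested list
    // can be garbage collected even if this filter instance stays reachable.
    void markDone() {
        done = true;
        slots = null;
    }

    boolean holdsSlots() {
        return slots != null;
    }

    public static void main(String[] args) {
        SlotFilterSketch filter =
            new SlotFilterSketch(Arrays.asList(Arrays.asList("k1", "k2")));
        System.out.println(filter.include("k1"));
        filter.markDone();
        System.out.println(filter.holdsSlots());
    }
}
```

Note that this only limits how much memory a retained filter can pin; it does
not explain who is retaining the filter in the first place.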
> High GC / RS shutdown when we use select query with "IN" clause using 4.10
> phoenix client on 4.13 phoenix server
> ----------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-4460
> URL: https://issues.apache.org/jira/browse/PHOENIX-4460
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
> Assignee: James Taylor
> Priority: Blocker
> Fix For: 4.14.0, 4.13.2
>
> Attachments: PHOENIX-4460.patch
>
>
> We were able to reproduce the high GC / RS shutdown / Phoenix KeyRange query
> high object count issue on a cluster today.
> The main observation is that this is reproducible when firing lots of queries
> of the form select from xyz where abc in (?, ?, ...) with a 4.10 Phoenix
> client hitting a 4.13 Phoenix server on the HBase side
> (4.10 client/4.10 server works fine, 4.13 client with 4.13 server works fine).
> We wrote a loader client (attached) with the table and query below, upserted
> ~100 million rows, and fired the query in parallel using 4-5 loader clients
> with 16 threads each:
> {code}
> String TABLE = "CREATE TABLE " + TABLE_NAME_TEMPLATE
>     + " (\n"
>     + " TestKey varchar(255) PRIMARY KEY, TestVal1 varchar(200), TestVal2 varchar(200), "
>     + "TestValue varchar(10000))";
> String QUERY = "SELECT * FROM " + TABLE_NAME_TEMPLATE
>     + " WHERE TestKey IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)";
> {code}
> After running this client, within a minute or two we see the
> org.apache.phoenix.query.KeyRange object count go up to several lakhs
> (hundreds of thousands) and keep increasing continuously. This count doesn't
> seem to come down even after shutting down the clients:
> {code}
> -bash-4.1$ ~/current/bigdata-util/tools/Linux/jdk/jdk1.8.0_102_x64/bin/jmap -histo:live 90725 | grep KeyRange
>   47:        274852        6596448  org.apache.phoenix.query.KeyRange
> 1851:             2             48  org.apache.phoenix.query.KeyRange$Bound
> 2434:             1             24  [Lorg.apache.phoenix.query.KeyRange$Bound;
> 3411:             1             16  org.apache.phoenix.query.KeyRange$1
> 3412:             1             16  org.apache.phoenix.query.KeyRange$2
> {code}
> After some time we also started seeing high GC issues and RegionServers
> crashing.
> Experiment Summary:
> - 4.13 client/4.13 server --- Issue not reproducible (we do see the KeyRange
> count increase up to a few hundred)
> - 4.10 client/4.10 server --- Issue not reproducible (we do see the KeyRange
> count increase up to a few hundred)
> - 4.10 client/4.13 server --- Issue reproducible as described above
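
The load pattern described in the quoted issue could be approximated along the
following lines. This is a hedged sketch, not the attached loader client:
InQueryLoaderSketch, buildInQuery and runInParallel are hypothetical names, and
the actual JDBC call against Phoenix is deliberately left as a placeholder
Runnable so that the example runs without a cluster.

```java
import java.util.Collections;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper; not the loader attached to the issue.
final class InQueryLoaderSketch {

    // Builds "SELECT * FROM <table> WHERE <column> IN (?, ?, ...)" with n placeholders.
    static String buildInQuery(String table, String column, int n) {
        String placeholders = String.join(", ", Collections.nCopies(n, "?"));
        return "SELECT * FROM " + table + " WHERE " + column + " IN (" + placeholders + ")";
    }

    // Runs the task `iterations` times on each of `threads` workers and
    // returns the total number of completed runs.
    static long runInParallel(int threads, int iterations, Runnable task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong completed = new AtomicLong();
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < iterations; i++) {
                    task.run();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // Table name is an assumption; the issue uses TABLE_NAME_TEMPLATE.
        String sql = buildInQuery("TEST_TABLE", "TestKey", 10);
        System.out.println(sql);
        // Placeholder task: a real loader would prepare and execute `sql` here.
        long runs = runInParallel(16, 100, () -> { });
        System.out.println("completed runs: " + runs);
    }
}
```

In a real reproduction the placeholder task would bind ten keys to a prepared
statement and execute it through the 4.10 client, while jmap is used on the
RegionServer to watch the KeyRange count.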