[ https://issues.apache.org/jira/browse/PHOENIX-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15488407#comment-15488407 ]
Enis Soztutar commented on PHOENIX-3072:
----------------------------------------

bq. It's difficult to tell what's changed with all the whitespace diffs. Can you generate a patch without that?
Sure. We should, however, clean up the code base. It already has a lot of whitespace and indentation issues.

bq. It looks like you're setting a new "PRIORITY" attribute on table descriptor for indexes? How/where is this used? (never mind on this - I see it's part of an HBase JIRA).
Yep, it is introduced and used via HBASE-16095.

bq. How will you handle local indexes since the table descriptor is the same for the data and index table?
Local indexes will not have this problem since they are in the same table. There is no inter-dependency between index regions and data table regions.

bq. Minor nit: I suppose you're not using the HBase static constant for "PRIORITY" because this doesn't appear until HBase 1.3? Maybe we should define one in QueryConstants with a comment?
Done in v2.

bq. Didn't priority get exposed as an attribute on operations now? If so, would that be an alternate implementation mechanism which is a bit more flexible?
This is not related to RPCs at all. The deadlock happens at region opening. BTW, I think the patch for per-operation priorities is not committed yet on the HBase side.

bq. What about existing tables and indexes - I didn't see any upgrade code that sets this for those. If setting priority on operation is an option, that'd get around this.
I've thought about this, but it seems dangerous to alter existing tables when Phoenix is upgraded. That is why there is no upgrade handling. Altering an existing table might have implications for availability, etc. Do we do this kind of alter for other features on Phoenix upgrade? If so, we can hook into that.

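To make the region-opening deadlock (described in the issue text quoted below) concrete, here is a toy sketch using plain JDK executors. It is not HBase or Phoenix code: the class name, {{openAll}}, the latch standing in for "the index region is open", and the 500 ms timeout standing in for "blocked for a long time" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class RegionOpenSketch {
    // Mirrors the default of 3 region-opening threads mentioned in the issue.
    static final int OPEN_THREADS = 3;

    /** Returns true if every simulated data region finished opening before its timeout. */
    static boolean openAll(int dataRegions, boolean separateIndexPool) throws Exception {
        ExecutorService dataPool = Executors.newFixedThreadPool(OPEN_THREADS);
        // Hypothetical fix: index regions get their own opener pool.
        ExecutorService indexPool = separateIndexPool
                ? Executors.newFixedThreadPool(OPEN_THREADS)
                : dataPool;
        CountDownLatch indexOpen = new CountDownLatch(1);

        // A data region open must "write to the index" during recovery, so it
        // waits until the index region is open. The timeout stands in for a
        // wait that would otherwise block the opener thread indefinitely.
        Callable<Boolean> dataOpen = () -> indexOpen.await(500, TimeUnit.MILLISECONDS);

        List<Future<Boolean>> results = new ArrayList<>();
        try {
            // Data region opens are submitted first and occupy the opener threads.
            for (int i = 0; i < dataRegions; i++) {
                results.add(dataPool.submit(dataOpen));
            }
            // The index region open arrives afterwards; with a shared pool it
            // sits in the queue behind the blocked data region opens.
            indexPool.execute(indexOpen::countDown);

            boolean allOpened = true;
            for (Future<Boolean> f : results) {
                allOpened &= f.get();
            }
            return allOpened;
        } finally {
            dataPool.shutdownNow();
            indexPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("shared pool, all data regions opened:   " + openAll(3, false));
        System.out.println("separate pool, all data regions opened: " + openAll(3, true));
    }
}
```

With a shared pool and at least as many data region opens as opener threads, the simulated index open never gets a thread until the data opens give up. With a separate pool it runs immediately and the data opens complete.
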
> Deadlock on region opening with secondary index recovery
> --------------------------------------------------------
>
>                 Key: PHOENIX-3072
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3072
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 4.9.0, 4.8.1
>
>         Attachments: phoenix-3072_v1.patch, phoenix-3072_v2.patch
>
>
> There is a distributed deadlock happening in clusters with a moderate
> number of regions for the data tables and secondary index tables, when
> there is a cluster restart or some large failure. We have seen this in a
> couple of production cases already.
> Opening of regions in HBase is performed by a thread pool with 3 threads by
> default. Every regionserver can open 3 regions at a time. However, opening
> data table regions has to write to multiple index regions during WAL
> recovery. All other region open requests are queued up in a single queue.
> This causes a deadlock, since the secondary index regions are also opened by
> the same thread pool that does this work. So if there are more data table
> regions than available region opening threads on the regionservers, the
> secondary index region open requests just wait to be processed in the queue.
> Since these index regions are not open, the region opening of data table
> regions blocks the region opening threads for a long time.
> One proposed fix is to use a different thread pool for opening regions of the
> secondary index tables so that we will not deadlock. See HBASE-16095 for the
> HBase-level fix. In Phoenix, we just have to set the priority for secondary
> index tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)