[
https://issues.apache.org/jira/browse/PHOENIX-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196635#comment-14196635
]
Lars Hofhansl commented on PHOENIX-1402:
----------------------------------------
Are the stats recalculated *synchronously* in the split hook? That would
definitely throw off HBase in bad ways! A split in HBase does very little work:
it updates 3 rows in META and writes two reference files that point to the
old/new region dir, respectively. That's it. All the actual work of rewriting
the files is done asynchronously.
I assume we pass the work on to some thread pool and exit the hook immediately,
right? If we don't do that, this is a critical bug IMHO, as it violates HBase's
assumption that a split can finish within a second or two.
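To make the "hand off and return" idea concrete, here is a minimal sketch in
plain Java. It does not use the real HBase coprocessor API: the class name
AsyncSplitHookSketch, the postSplit method, and the pool size are all
hypothetical stand-ins. The only point is that the hook enqueues the work and
returns without waiting for it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a split hook; NOT the real HBase coprocessor API.
class AsyncSplitHookSketch {
    // Small fixed pool (size 2 is an arbitrary assumption). Daemon threads so
    // pending stats work never keeps the JVM alive on shutdown.
    private final ExecutorService statsPool = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r, "stats-update");
        t.setDaemon(true);
        return t;
    });

    // Runs on the split path: only enqueues the work, never waits for it.
    void postSplit(Runnable statsWork) {
        statsPool.submit(statsWork);
    }

    // Stops accepting new work; already queued tasks still run.
    void shutdown() {
        statsPool.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncSplitHookSketch hook = new AsyncSplitHookSketch();
        CountDownLatch done = new CountDownLatch(1);

        long start = System.nanoTime();
        hook.postSplit(() -> {
            try {
                Thread.sleep(500); // simulated slow stats update
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            done.countDown();
        });
        long hookMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("hook returned after ~" + hookMillis + " ms");

        // The slow work completes later, off the split path.
        System.out.println("stats work finished: " + done.await(5, TimeUnit.SECONDS));
        hook.shutdown();
    }
}
```

In a real region server the pool would presumably be shared and bounded, and a
failed stats task would need logging, but neither is shown here.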
For now I think we can just drop the old stats and do nothing else. The stats
would be out of date until the next major compaction, but that is acceptable.
Splitting the stats is nice too, but if we have to read/write another table
synchronously in the split path, that's bad. Even that should be offloaded
to the thread pool, with the split hook returning immediately.
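For what "splitting the stats" could look like without rescanning anything,
here is a rough sketch. It assumes the stats are a list of guidepost row keys
and that keys compare as unsigned bytes, with the split key becoming the start
key of the upper daughter. The names GuidepostSplitSketch, Split, and
splitGuideposts are made up for illustration and are not Phoenix's actual
classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not Phoenix's actual stats code.
class GuidepostSplitSketch {
    // Guideposts assigned to each daughter region after the split.
    static final class Split {
        final List<byte[]> lower = new ArrayList<>();
        final List<byte[]> upper = new ArrayList<>();
    }

    // Lexicographic comparison treating each byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Keys below splitKey go to the lower daughter; keys at or above it go
    // to the upper daughter. Input order is preserved in both lists.
    static Split splitGuideposts(List<byte[]> guideposts, byte[] splitKey) {
        Split out = new Split();
        for (byte[] gp : guideposts) {
            if (compareUnsigned(gp, splitKey) < 0) {
                out.lower.add(gp);
            } else {
                out.upper.add(gp);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> guideposts = new ArrayList<>();
        for (String k : Arrays.asList("b", "f", "m", "t")) {
            guideposts.add(k.getBytes(StandardCharsets.UTF_8));
        }
        Split s = splitGuideposts(guideposts, "m".getBytes(StandardCharsets.UTF_8));
        System.out.println("lower daughter guideposts: " + s.lower.size());
        System.out.println("upper daughter guideposts: " + s.upper.size());
    }
}
```

This only touches data already in memory. Persisting the two lists would still
mean writing to the stats table, which is the part that should go to the
thread pool rather than run in the hook.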
> Don't recalculate stats on split
> --------------------------------
>
> Key: PHOENIX-1402
> URL: https://issues.apache.org/jira/browse/PHOENIX-1402
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
>
> Rather than scan the new regions on a split (which is potentially expensive,
> and might be causing the timeouts you're seeing [~jfernando_sfdc]), we should
> instead just split up the existing guideposts between the two new regions
> based on the split point.