File a blocker please, Lars. I'm pretty sure the boolean indicating whether we
are doing a recovery has been there a long time, so yes, a single-server
recovery could throw us off. But you make a practical point: one server
should not destroy locality across the whole cluster.
St.Ack
On Tue, Mar 14,
Let me check with @Enis in JIRA and get back to you later (it may take days,
due to my schedule).
Best Regards,
Yu
On 15 March 2017 at 05:57, jeff saremi wrote:
> What's involved in getting this change merged into the main branch? These
> 2 counters (fsReadLatency,
Great, and I changed my vote to -0 because Stack made a good argument that
making more changes would invalidate review up to this point, and I trust
this will be resolved before release.
On Tue, Mar 14, 2017 at 4:29 PM, Josh Elser wrote:
> Sorry Andrew, let me clarify as that
Sorry Andrew, let me clarify as that didn't come out right.
I didn't mean that isn't a conversation worth having _now_, just that I
was intentionally avoiding it in my previous email because I didn't
understand the scope of those issues that Vlad had identified. I wanted
to better understand
To be honest, Andrew, it is a blocker because I called it BLOCKER. By
BLOCKER I meant: it MUST be resolved by 2.0 RC1.
How far are we from 2.0 RC1, by the way? I am pretty sure that not only will
Phase 3 be completed by that date, but even the more advanced Phase 4,
with features like snapshot-less
> I'm going to intentionally avoid addressing the discussion of shipping
partial features (related, but not relevant at the moment).
Then we are not having the same conversation, because this is precisely a
vote for this feature to go into 2.0, which is already
overdue, so it should be
What's involved in getting this change merged into the main branch? These 2
counters (fsReadLatency, fsWriteLatency) are super important to us for
understanding what goes on behind every request. They are the minimum we need,
especially in the absence of HTrace.
I just checked the latest
>> How would a user recover from such a state
The user will need to run a full backup for every table. No, we have not
encountered this issue during testing, but of course it is possible,
especially on a large cluster.
-Vlad
On Tue, Mar 14, 2017 at 2:36 PM, Josh Elser wrote:
>
stack created HBASE-17788:
-
Summary: Procedure V2 performance improvements
Key: HBASE-17788
URL: https://issues.apache.org/jira/browse/HBASE-17788
Project: HBase
Issue Type: Sub-task
Thanks for the quick reply, Vlad!
How would a user recover from such a state with the backup table in a
broken state? Have you encountered such a scenario in your testing?
The docs issue seems to be pretty minor too. I remember walking through
the original patch (HBASE-16574) and it was pretty
[
https://issues.apache.org/jira/browse/HBASE-17768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Enis Soztutar resolved HBASE-17768.
---
Resolution: Fixed
Hadoop Flags: Reviewed
Fix Version/s: HBASE-14850
Thanks
Josh,
On the docs side we have a very good manual:
https://issues.apache.org/jira/secure/attachment/12829269/Backup-and-Restore-Apache_19Sep2016.pdf
The only thing that changed is the command-line tools' args format, but
you can start from there: just type a command without args.
In the current
I took a moment to read through the "blockers" as originally identified
by Vlad, and (to echo Enis' take) I read the majority of them as being
blockers not for the next release, but for a "full-fledged feature". I'm
going to intentionally avoid addressing the discussion of shipping partial
stack created HBASE-17787:
-
Summary: Drop rollback for all procedures; not useful afterall
Key: HBASE-17787
URL: https://issues.apache.org/jira/browse/HBASE-17787
Project: HBase
Issue Type: Sub-task
stack created HBASE-17786:
-
Summary: Create LoadBalancer perf-tests (test balancer algorithm
decoupled from workload)
Key: HBASE-17786
URL: https://issues.apache.org/jira/browse/HBASE-17786
Project: HBase
Andrew Purtell created HBASE-17785:
--
Summary: RSGroupBasedLoadBalancer has no concept of default group
Key: HBASE-17785
URL: https://issues.apache.org/jira/browse/HBASE-17785
Project: HBase
[
https://issues.apache.org/jira/browse/HBASE-17781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anoop Sam John resolved HBASE-17781.
Resolution: Invalid
HBASE-17723 is not yet committed. So if locally applying that patch
Wait, HBASE-15251 is not enough, methinks. The checks added help, but
they do not cover all the possible edge cases. In particular, say a
node really fails: why not just reassign the few regions it did hold
and leave all the others where they are? Seems insane as it is.
On Tue, Mar 14, 2017 at 2:24
[
https://issues.apache.org/jira/browse/HBASE-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lars George resolved HBASE-14129.
-
Resolution: Won't Fix
Closing as "won't fix" as the hardcoded flag is too intrusive. The cluster
Looking at the code more... it seems the issue is here.
In AssignmentManager.processDeadServersAndRegionsInTransition():

    ...
    failoverCleanupDone();
    if (!failover) {
      // Fresh cluster startup.
      LOG.info("Clean cluster startup. Assigning user regions");
      assignAllUserRegions(allRegions);
    }
    ...
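To make the locality concern concrete, here is a minimal, hypothetical sketch (not HBase code; the names `retainAssignment`, `lastKnown`, and `liveServers` are simplified stand-ins for HBase's Region/ServerName machinery) of the difference between shuffling everything and a retained assignment that sends each region back to its last-known server when that server is still live, in the spirit of HBASE-15251:

```java
import java.util.*;

public class RetainedAssignmentSketch {
  // Assign each region back to its last-known server if that server is
  // still live; fall back to round-robin over the live servers only for
  // regions whose previous host is gone. A dead server thus moves only
  // its own regions instead of destroying locality cluster-wide.
  static Map<String, String> retainAssignment(Map<String, String> lastKnown,
                                              List<String> liveServers) {
    Map<String, String> plan = new LinkedHashMap<>();
    Set<String> live = new HashSet<>(liveServers);
    int rr = 0; // round-robin cursor for orphaned regions
    for (Map.Entry<String, String> e : lastKnown.entrySet()) {
      if (live.contains(e.getValue())) {
        plan.put(e.getKey(), e.getValue()); // locality preserved
      } else {
        plan.put(e.getKey(), liveServers.get(rr++ % liveServers.size()));
      }
    }
    return plan;
  }

  public static void main(String[] args) {
    Map<String, String> lastKnown = new LinkedHashMap<>();
    lastKnown.put("region-1", "serverA");
    lastKnown.put("region-2", "serverB");
    lastKnown.put("region-3", "serverC"); // serverC died
    // Only the dead server's region moves; the rest stay put.
    System.out.println(
        retainAssignment(lastKnown, Arrays.asList("serverA", "serverB")));
  }
}
```

With this plan, region-1 and region-2 keep their locality and only region-3 is reassigned, which is the behavior the thread argues a restart (or a single failed node) should produce.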
Hi,
I had this happen on multiple clusters recently, where after a restart
the locality dropped from close to (or exactly) 100% down to single
digits. The reason is that all regions were completely shuffled
and reassigned to random servers. Upon reading the (yet again
non-trivial) assignment
Doh, https://issues.apache.org/jira/browse/HBASE-15251 addresses this
(though I am not sure exactly how, see below). This should be
backported to all 1.x branches!
As for the patch, I see this
if (!failover) {
// Fresh cluster startup.
- LOG.info("Clean cluster startup.
Guangxu Cheng created HBASE-17784:
-
Summary: Check if the regions are deployed in the correct group,
and move it to the correct group when starting master
Key: HBASE-17784
URL:
Guangxu Cheng created HBASE-17783:
-
Summary: Fix bugs that is still inconsistent after synchronizing
zk and htable data at master startup
Key: HBASE-17783
URL: https://issues.apache.org/jira/browse/HBASE-17783
Yu Li created HBASE-17782:
-
Summary: Introduce new configuration to decide reference type used
in IdReadWriteLock
Key: HBASE-17782
URL: https://issues.apache.org/jira/browse/HBASE-17782
Project: HBase
Anup Halarnkar created HBASE-17781:
--
Summary: TestAcidGuarantees
Key: HBASE-17781
URL: https://issues.apache.org/jira/browse/HBASE-17781
Project: HBase
Issue Type: Bug
Components: