[
https://issues.apache.org/jira/browse/HBASE-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593930#comment-13593930
]
nkeywal commented on HBASE-7948:
--------------------------------
bq. Can you please elaborate about more dangerous parts?
I was thinking about the code that we're slowly removing with HBASE-8002. It
has 3 side effects:
1) It was degrading performance. This has been fixed in numerous patches,
but there are still scary comments and issues (HBASE-7247).
2) It was hiding issues. In the tests we had a very low timeout, so master
failover scenarios seemed to be working. In production, we were depending on a
10-minute timeout without knowing it.
3) It was causing double assignment issues, i.e. data corruption.
This was exactly the same logic (don't trust the RS), with more dramatic
consequences.
bq. In my experience it's not safe to trust anything forever as a general
principle, not because I think RS code is unreliable.
I'm not against this, but in this case we need to tackle this the standard way:
watchdog the process, and exclude the fuzzy ones from the group. Before doing
this, I would like to see the chaos monkey test working with kill -9 for a
while (I doubt it does today :-( ).
But I agree with your point, and we will have this sooner or later (BTW, it's
exactly why there are checksums in HDFS: because you can't trust the storage).
bq. But for client there's no data loss potential from flushing the cache, but
there's potential to be stuck forever in case of abnormal RS behavior. With
remote things I prefer to be defensive on all sides if practical
Yeah. I'm likely biased. So, imho
- the patch is an improvement. HBase is better with this patch than without.
- it would be simpler without the RS-trust part
- to me, at the margin, under degraded conditions, it would be more efficient
without the RS-trust part as well.
As we need to make progress :-), I propose:
1) Well, if you're not against the idea of removing the RS-trust part, we're
done
2) If you really want to keep it, let's wait a few days in case someone wants
to chime in. If no one does, let's commit on Friday.
What do you think?
> client doesn't need to refresh meta while the region is opening
> ---------------------------------------------------------------
>
> Key: HBASE-7948
> URL: https://issues.apache.org/jira/browse/HBASE-7948
> Project: HBase
> Issue Type: Improvement
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HBASE-7948-v0.patch, HBASE-7948-v1.patch,
> HBASE-7948-v1.patch, HBASE-7948-v2.patch
>
>