[
https://issues.apache.org/jira/browse/ACCUMULO-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280369#comment-13280369
]
Keith Turner commented on ACCUMULO-578:
---------------------------------------
Eric,
I was looking at the proposed GC algorithm and trying to think of situations
where either of the following might occur.
# An in-use walog is deleted
# An unused walog is never deleted
It looks pretty solid, but the following interleaving of events could be
problematic. It is possible because there is a window between when a lock is
deleted and when the tablet server kills itself.
# TserverA creates Walog1
# User deletes lock for TserverA
# GC does not see TserverA in zookeeper
# GC does not see any references to Walog1 in !METADATA
# TserverA writes that TabletX is using Walog1
# TserverA notices its lock went away and kills itself
# GC deletes Walog1
# TabletX fails to load because Walog1 does not exist
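To make the ordering concrete, here is a toy replay of the eight steps above
in Python. It is only a sketch under stated assumptions: the Cluster class and
the gc_scan, gc_delete, and tablet_can_load helpers are made up for
illustration and are not Accumulo APIs, and the model assumes the GC picks its
delete candidates in one scan and removes them later without re-checking.

```python
# Toy model of the interleaving; every name here is hypothetical,
# not part of Accumulo.

class Cluster:
    def __init__(self):
        self.locks = set()    # tserver locks currently present in zookeeper
        self.walogs = {}      # walog name -> tserver that created it
        self.metadata = {}    # tablet -> set of walogs it references


def gc_scan(cluster):
    """Return the walogs that look unused at the moment of the scan."""
    referenced = set()
    for logs in cluster.metadata.values():
        referenced |= logs
    return {
        log
        for log, owner in cluster.walogs.items()
        if owner not in cluster.locks and log not in referenced
    }


def gc_delete(cluster, candidates):
    """Delete the candidates chosen earlier, without re-checking them."""
    for log in candidates:
        cluster.walogs.pop(log, None)


def tablet_can_load(cluster, tablet):
    """A tablet loads only if every walog it references still exists."""
    return all(log in cluster.walogs for log in cluster.metadata.get(tablet, ()))


def replay():
    c = Cluster()
    c.locks.add("TserverA")
    c.walogs["Walog1"] = "TserverA"       # 1. TserverA creates Walog1
    c.locks.discard("TserverA")           # 2. user deletes lock for TserverA
    candidates = gc_scan(c)               # 3-4. GC sees no lock, no reference
    c.metadata.setdefault("TabletX", set()).add("Walog1")  # 5. late write
    # 6. TserverA notices its lock is gone and exits (no state change here)
    gc_delete(c, candidates)              # 7. GC deletes Walog1
    return candidates, tablet_can_load(c, "TabletX")       # 8. load check


if __name__ == "__main__":
    candidates, loads = replay()
    print("GC candidates:", candidates)
    print("TabletX loads:", loads)
```

In this model the GC's view from steps 3-4 is already stale by step 7, because
the tablet server can still write to !METADATA after its lock is gone.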
> consider using hdfs for the walog
> ---------------------------------
>
> Key: ACCUMULO-578
> URL: https://issues.apache.org/jira/browse/ACCUMULO-578
> Project: Accumulo
> Issue Type: Improvement
> Components: logger, tserver
> Affects Versions: 1.5.0-SNAPSHOT
> Reporter: Eric Newton
> Assignee: Eric Newton
> Attachments: HDFS_WAL_states.pdf, comparison.png
>
>
> Using HDFS for walogs would fix:
> * ACCUMULO-84: any node can read the replicated files
> * ACCUMULO-558: wouldn't need to monitor loggers
> * ACCUMULO-544: log references wouldn't include hostnames
> * ACCUMULO-423: wouldn't need to monitor loggers
> * ACCUMULO-258: hdfs has load balancing already
> To implement it, we would need the ability to distribute log sorts.
> Continuing to use loggers helps us avoid:
> * hdfs pipeline strategy
> * we don't have fine-grained insight when a single node makes dfs slow
> * additional namenode pressure
> * flexibility: for example, we can add fadvise() calls to the logger before
> HDFS supports it