[
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667936#comment-16667936
]
Josh Elser commented on HBASE-21246:
------------------------------------
My take overall: seems like the changes are largely equivalent to
s/Path/WALIdentity/ which is good. I'm not yet convinced that there still isn't
more unwinding of {{Path}} to be done. I feel like I'm missing a critical piece
of the picture, but I just need to look more closely.
Consolidating all of the "dirt" on implementations of WALProvider is really
nice (separate HDFS code out of the guts of everything else), but we have lots
of code like {{provider.createWALIdentity(p.toString())}} now. I wonder if
there's something to be had for asking for WALIdentities from {{WALProvider}}
via {{ServerName}} or {{RegionInfo}} rather than creating them by hand? Need to
find the time to look more closely...
Another over-arching is theme is that I'd like WALFactory to move away from the
static methods and just use the non-static equivalents on WALProvider. I
understand that's probably a WIP (not the immediate goal here).
{code}
@@ -4699,7 +4708,8 @@ public class HRegion implements HeapSize,
PropagatingConfigurationObserver, Regi
status.setStatus("Opening recovered edits");
WAL.Reader reader = null;
try {
- reader = WALFactory.createReader(fs, edits, conf);
+ reader =
walProvider.createReader(walProvider.createWALIdentity(edits.toString()), null,
+ true);
{code}
Is this fully thought out: how does HRegion get the {{Path}} in the first
place? This still banks on the assumption that recovered.edits are always on
the storefile FS? (see below caveat on this one too).
{{hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java}},
why not pass in WALProvider instead of WALFactory? Doesn't look like
WALFactory is used by SplitLogWorker in any other fashion than to obtain a
WALProvider instance.
{code}
- public static class LogsComparator implements Comparator<Path> {
-
+ public static class LogsComparator implements Comparator<WALIdentity> {
+ private static final Pattern WAL_FILE_NAME_PATTERN =
+ Pattern.compile("(.+)\\.(\\d+)(\\.[0-9A-Za-z]+)?");
@Override
- public int compare(Path o1, Path o2) {
+ public int compare(WALIdentity o1, WALIdentity o2) {
return Long.compare(getTS(o1), getTS(o2));
}
+ private long getTS(WALIdentity id) {
+ Matcher matcher = WAL_FILE_NAME_PATTERN.matcher(id.getName());
+ if (matcher.matches()) {
+ return Long.parseLong(matcher.group(2), 2);
+ } else {
+ throw new IllegalArgumentException(id.getName() + " is not a valid wal
file name");
+ }
+ }
{code}
This depends on the WALs having a specific name which is a smell. This is
extracting the creation time, right? I would think a method on WALProvider to
get this information would be better than relying on a specific naming.
{code}
---
a/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceFactory.java
+++
b/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceFactory.java
@@ -47,7 +47,7 @@ public class ReplicationSourceFactory {
LOG.warn("Passed replication source implementation throws errors, "
+ "defaulting to ReplicationSource",
e);
- src = isQueueRecovered ? new RecoveredReplicationSource() : new
ReplicationSource();
+ src = isQueueRecovered ? new FSRecoveredReplicationSource() : new
ReplicationSource();
{code}
IIRC, the plan was to have recovered edits on the filesystem (even if the WALs
are elsewhere). Two thoughts: first, we will have to look at Stephen Wu's
latest changes around here to make sure we aren't breaking what he and Zach are
doing; second, I could see a day where we need indirection around
recovered.edits. I don't think we have to tackle that now, but something to
keep in mind...
{code}
@@ -485,7 +492,8 @@ public class ReplicationSourceManager implements
ReplicationListener {
toRemove.terminate(terminateMessage);
}
for (NavigableSet<String> walsByGroup : walsById.get(peerId).values()) {
- walsByGroup.forEach(wal -> src.enqueueLog(new Path(this.logDir, wal)));
+ walsByGroup.forEach(wal -> src.enqueueLog(
+
((SyncReplicationWALProvider)this.walProvider).getFullPath(serverName, wal)));
}
{code}
{code}
- src.enqueueLog(new Path(oldLogDir, wal));
+ WALIdentity archivedWal =
((SyncReplicationWALProvider)walProvider)
+ .getWalFromArchivePath(wal);
+ if (archivedWal != null) {
+ src.enqueueLog(archivedWal);
+ }
{code}
Is this somethign that needs to be addressed? You couldn't possibly know that
this is a {{SyncReplicationWALProvider}}? (could you?)
{code}
+ @Override
+ public WALIdentity createWALIdentity(String wal) {
+ return new FSWALIdentity(wal);
{code}
So, RegionGroupingProvider will only work with HDFS-based implementations?
Couldn't you use the delegate WALProvider some-how?
Looks like the same for {{SyncReplicationWALProvider}}.
{code}
- public static Reader createReader(final FileSystem fs, final Path path,
+ public static Reader createReader(final WALIdentity walId,
final Configuration configuration) throws IOException {
...
+ WALProvider provider = getInstance(configuration).getWALProvider();
+ return provider.createReader(walId, null, true);
{code}
Is there a reason to keep the static "helper" methods around? Or, just not
something you took on?
{code}
+ public static Writer createWALWriter(final WALIdentity walId,
+ final Configuration configuration) throws IOException {
+ return FSHLogProvider.createWriter(configuration,
((FSWALIdentity)walId).getPath(), false);
}
{code}
Missed push-down into WALProvider.
> Introduce WALIdentity interface
> -------------------------------
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
> Issue Type: Sub-task
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Major
> Fix For: HBASE-20952
>
> Attachments: 21246.003.patch, 21246.20.txt,
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch,
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch,
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent
> filename in distributed filesystem environment or, the name of the stream
> when the WAL is backed by log stream.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)