[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667936#comment-16667936
 ] 

Josh Elser commented on HBASE-21246:
------------------------------------

My take overall: seems like the changes are largely equivalent to 
s/Path/WALIdentity/ which is good. I'm not yet convinced that there still isn't 
more unwinding of {{Path}} to be done. I feel like I'm missing a critical piece 
of the picture, but I just need to look more closely.

Consolidating all of the "dirt" on implementations of WALProvider is really 
nice (separate HDFS code out of the guts of everything else), but we have lots 
of code like {{provider.createWALIdentity(p.toString())}} now. I wonder if 
there's something to be had for asking for WALIdentities from {{WALProvider}} 
via {{ServerName}} or {{RegionInfo}} rather than creating them by hand? Need to 
find the time to look more closely...

Another over-arching is theme is that I'd like WALFactory to move away from the 
static methods and just use the non-static equivalents on WALProvider. I 
understand that's probably a WIP (not the immediate goal here).

{code}
@@ -4699,7 +4708,8 @@ public class HRegion implements HeapSize, 
PropagatingConfigurationObserver, Regi
     status.setStatus("Opening recovered edits");
     WAL.Reader reader = null;
     try {
-      reader = WALFactory.createReader(fs, edits, conf);
+      reader = 
walProvider.createReader(walProvider.createWALIdentity(edits.toString()), null,
+          true);
{code}

Is this fully thought out: how does HRegion get the {{Path}} in the first 
place? This still banks on the assumption that recovered.edits are always on 
the storefile FS? (see below caveat on this one too).

{{hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitLogWorker.java}},
 why not pass in WALProvider instead of WALFactory? Doesn't look like 
WALFactory is used by SplitLogWorker in any other fashion than to obtain a 
WALProvider instance.

{code}
-  public static class LogsComparator implements Comparator<Path> {
-
+  public static class LogsComparator implements Comparator<WALIdentity> {
+    private static final Pattern WAL_FILE_NAME_PATTERN =
+        Pattern.compile("(.+)\\.(\\d+)(\\.[0-9A-Za-z]+)?");
     @Override
-    public int compare(Path o1, Path o2) {
+    public int compare(WALIdentity o1, WALIdentity o2) {
       return Long.compare(getTS(o1), getTS(o2));
     }
 
+    private long getTS(WALIdentity id) {
+      Matcher matcher = WAL_FILE_NAME_PATTERN.matcher(id.getName());
+      if (matcher.matches()) {
+        return Long.parseLong(matcher.group(2), 2);
+      } else {
+        throw new IllegalArgumentException(id.getName() + " is not a valid wal 
file name");
+      }
+    }
{code}

This depends on the WALs having a specific name which is a smell. This is 
extracting the creation time, right? I would think a method on WALProvider to 
get this information would be better than relying on a specific naming.

{code}
--- 
a/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceFactory.java
+++ 
b/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceFactory.java
@@ -47,7 +47,7 @@ public class ReplicationSourceFactory {
       LOG.warn("Passed replication source implementation throws errors, "
           + "defaulting to ReplicationSource",
         e);
-      src = isQueueRecovered ? new RecoveredReplicationSource() : new 
ReplicationSource();
+      src = isQueueRecovered ? new FSRecoveredReplicationSource() : new 
ReplicationSource();
{code}

IIRC, the plan was to have recovered edits on the filesystem (even if the WALs 
are elsewhere). Two thoughts: first, we will have to look at Stephen Wu's 
latest changes around here to make sure we aren't breaking what he and Zach are 
doing; second, I could see a day where we need indirection around 
recovered.edits. I don't think we have to tackle that now, but something to 
keep in mind...

{code}
@@ -485,7 +492,8 @@ public class ReplicationSourceManager implements 
ReplicationListener {
         toRemove.terminate(terminateMessage);
       }
       for (NavigableSet<String> walsByGroup : walsById.get(peerId).values()) {
-        walsByGroup.forEach(wal -> src.enqueueLog(new Path(this.logDir, wal)));
+        walsByGroup.forEach(wal -> src.enqueueLog(
+            
((SyncReplicationWALProvider)this.walProvider).getFullPath(serverName, wal)));
       }
{code}
{code}
-              src.enqueueLog(new Path(oldLogDir, wal));
+              WALIdentity archivedWal = 
((SyncReplicationWALProvider)walProvider)
+                  .getWalFromArchivePath(wal);
+              if (archivedWal != null) {
+                src.enqueueLog(archivedWal);
+              }
{code}

Is this somethign that needs to be addressed? You couldn't possibly know that 
this is a {{SyncReplicationWALProvider}}? (could you?)

{code}
+  @Override
+  public WALIdentity createWALIdentity(String wal) {
+    return new FSWALIdentity(wal);
{code}

So, RegionGroupingProvider will only work with HDFS-based implementations? 
Couldn't you use the delegate WALProvider some-how?

Looks like the same for {{SyncReplicationWALProvider}}.

{code}
-  public static Reader createReader(final FileSystem fs, final Path path,
+  public static Reader createReader(final WALIdentity walId,
       final Configuration configuration) throws IOException {
...
+    WALProvider provider = getInstance(configuration).getWALProvider();
+    return provider.createReader(walId, null, true);
{code}

Is there a reason to keep the static "helper" methods around? Or, just not 
something you took on?

{code}
+  public static Writer createWALWriter(final WALIdentity walId,
+      final Configuration configuration) throws IOException {
+    return FSHLogProvider.createWriter(configuration, 
((FSWALIdentity)walId).getPath(), false);
   }
{code}

Missed push-down into WALProvider.

> Introduce WALIdentity interface
> -------------------------------
>
>                 Key: HBASE-21246
>                 URL: https://issues.apache.org/jira/browse/HBASE-21246
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Major
>             Fix For: HBASE-20952
>
>         Attachments: 21246.003.patch, 21246.20.txt, 
> 21246.HBASE-20952.001.patch, 21246.HBASE-20952.002.patch, 
> 21246.HBASE-20952.004.patch, 21246.HBASE-20952.005.patch, 
> 21246.HBASE-20952.007.patch, 21246.HBASE-20952.008.patch
>
>
> We are introducing WALIdentity interface so that the WAL representation can 
> be decoupled from distributed filesystem.
> The interface provides getName method whose return value can represent 
> filename in distributed filesystem environment or, the name of the stream 
> when the WAL is backed by log stream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to