[ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14103048#comment-14103048
 ] 

Colin Patrick McCabe commented on HDFS-6634:
--------------------------------------------

{code}
+  @Idempotent
+  public long getCurrentTxid() throws IOException;
+
+  @Idempotent
+  public EventsList getEditsFromTxid(long txid) throws IOException;
{code}
Some JavaDoc would be nice here.  Also, what about renaming "getCurrentTxid" to 
"getCurrentEditLogTxId"?

It seems like you're now returning a {{JournalNodeEditLogManifest}} object in 
most places that formerly returned a {{RemoteEditLogManifest}}.

{code}
+    if (state == State.CLOSED) {
+      throw new IllegalStateException("Cannot call selectInputStreams() on " +
+      "closed FSEditLog");
+    }
{code}
{{Preconditions.checkState}} might look nicer here

{code}
+  private static FSEditLogOp readOp(EditLogInputStream elis)
+      throws IOException {
+    try {
+      return elis.readOp();
+      // we can get the below two exceptions if a segment is deleted
+      // (because we have accumulated too many edits) or (for the local 
journal/
+      // no-QJM case only) if a in-progress segment is finalized under us ...
+      // no need to throw an exception back to the client in this case
+    } catch (FileNotFoundException e) {
+      return null;
+    } catch (TransferFsImage.HttpGetFailedException e) {
+      return null;
+    }
+  }
{code}
Let's add some {{LOG.debug}} messages here in the exception cases.

{{inotify.proto}}: can you add a constant, INVALID_TXID or something to this 
file?  That way you could use this instead of a raw -1 everywhere.  Also, 
you're sometimes treating txid as uint64, sometimes as int64... since Java 
doesn't have unsigned, and we're never gonna get to 2**63 txids anyway, let's 
just make it signed everywhere to avoid confusion.

It seems like HDFS-6777 is rolled into this patch.  Are you planning on doing 
HDFS-6777 first, or as part of this patch?  It might be nice to do it 
separately.

{code}
-        long accFirstTxId = acc.get(0).getFirstTxId();
+        EditLogInputStream accFirst = acc.get(0);
+        long accFirstTxId = accFirst.getFirstTxId();
         if (accFirstTxId == elis.getFirstTxId()) {
-          acc.add(elis);
+          // if we have a finalized log segment available at this txid,
+          // we should throw out all in-progress segments at this txid
+          if (elis.isInProgress()) {
+            if (accFirst.isInProgress()) {
+              acc.add(elis);
+            }
+          } else {
+            if (accFirst.isInProgress()) {
+              acc.clear();
+            }
+            acc.add(elis);
+          }
{code}
Why is this necessary?  It seems like your caller code could filter out 
in-progress edit log segments if that's what you want.  Isn't this also 
problematic from an HA perspective (i.e. if directory X has a finalized log 
segment, but it's unreadable, you *want* to start reading from the unfinalized 
one.)

{code}
+  public static Event[] translate(FSEditLogOp op) {
+    if (op.opCode == FSEditLogOpCodes.OP_ADD) {
...
{code}
Should use a switch statement here

{code}
+  public static enum EventType {
+    CREATE, CLOSE, REOPEN, RENAME, METADATA, UNLINK
+  }
{code}
Maybe "REOPEN" would be better as "APPEND".  Or at least "OPEN_FOR_APPEND".  
It's not my favorite terminology, but it's what we've been using for a while.  
Also, we should document somewhere that open-for-read operations don't show up 
here.

> inotify in HDFS
> ---------------
>
>                 Key: HDFS-6634
>                 URL: https://issues.apache.org/jira/browse/HDFS-6634
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs-client, namenode, qjm
>            Reporter: James Thomas
>            Assignee: James Thomas
>         Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, 
> HDFS-6634.5.patch, HDFS-6634.patch, inotify-design.2.pdf, 
> inotify-design.3.pdf, inotify-design.pdf, inotify-intro.2.pdf, 
> inotify-intro.pdf
>
>
> Design a mechanism for applications like search engines to access the HDFS 
> edit stream.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to