[ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085051#comment-14085051
 ] 

Andrew Wang commented on HDFS-6634:
-----------------------------------

Hey James, thanks for posting this. I took a quick look, have some surface 
level comments:

Nits:
- Need license headers on new files
- Class javadoc for new classes would be nice
- InterfaceAudience and InterfaceStability annotations on new classes
- Could we pack each SubEvent as a nested class in Event? Then we can javadoc 
just Event. Could also squeeze EventList in there if you're feeling it.
- Javadoc has things like @link for referring to other classes and methods, 
@return and @param too. Might clarify things, especially for user-facing classes
- It'd be good to have an EventsList PB definition so it matches up with the 
Java classes.

Meta:
- Need a public class where we can get the new EventStream. My advice is 
putting it in DistributedFileSystem and then exposing to users in HdfsAdmin.
- Slapping a big fat javadoc on the new user method in HdfsAdmin would be good, 
since it'll need to explain its intended purpose and how to properly use the 
API.
- If you included an "event code" or "event type" enum for each event, you 
could switch over that instead of instaceof in PBHelper. It'd be kind of like 
how FSEditLogOpCodes work.

DFSInotifyEventInputStream
- I think it'd be better if we had an API that did a get-with-timeout like 
{{Future#get}}, seems like what an app would want
- I also don't follow why we need resync. If we get an IOException in next(), 
what can the app do besides resync and retry?
- At a high-level, it's be good to think how people will be using this API and 
dealing with failures. This is a good thing to experiment with in the test 
cases, forcing certain error conditions and seeing what the user code has to do 
to handle them.

NNRpcServer
- We very likely need to do getEditsFromTxid with the FSNamesystem read lock 
held. However, we also don't want to do any kind of RPC with the FSN lock held.
- That while loop condition looks hairy, could you split it up and add comments?
- What if a client is way behind when this is called? It looks like it ends up 
getting all the edits logs, which could be GBs. This API should be batched to 
only send some configurable # of events.

IFSELOTranslator:
- Can we do a switch/case on the op's opcode? It's an enum. Faster/better than 
doing instanceof.


> inotify in HDFS
> ---------------
>
>                 Key: HDFS-6634
>                 URL: https://issues.apache.org/jira/browse/HDFS-6634
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs-client, namenode, qjm
>            Reporter: James Thomas
>            Assignee: James Thomas
>         Attachments: HDFS-6634.2.patch, HDFS-6634.patch, 
> inotify-design.2.pdf, inotify-design.pdf, inotify-intro.2.pdf, 
> inotify-intro.pdf
>
>
> Design a mechanism for applications like search engines to access the HDFS 
> edit stream.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to