[ 
https://issues.apache.org/jira/browse/HDFS-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028114#comment-13028114
 ] 

Ivan Kelly commented on HDFS-1799:
----------------------------------

{quote}
I agree that your design avoids introducing a new class by collapsing the file 
management (ie "log lifecycle") code into EditLogOutputStream. However, I don't 
feel like "fewer code changes" or "fewer classes" are particularly appropriate 
design goals. 
{quote}

More code changes and more classes indicate that more complexity is being added 
which I think is what is happening here. Instead of FSEditLog being a manager 
of log objects, it becomes a manager of managers of log objects. More 
complexity is fine as long as it adds clarity to the design, or the ability to 
do things previously not possible. I don't think JournalManager adds either.

{quote}
The nice thing about this design is that EditLogOutputStream needs no 
conception of StorageDirectories, for example. In fact, it is barely coupled to 
HDFS at all with the exception of using a few static constants.

In contrast, your design collapses layout management into the output stream, 
thus making EditLogFileOutputStream depend on StorageDirectory, NNStorage, etc.
{quote}

I don't see any harm in EditLogFileOutputStream knowing about StorageDirectory 
and NNStorage as long as this doesn't pollute EditLogOutputStream. It is the 
implementation of EditLogOutputStream for files after all, and files in HDFS 
should be dealt with through StorageDirectory.

My design only collapses the API into the output stream, not the actual 
implement. #nextSegment() could make a call to a 
JournalManager/Journal/JournalFactory which would actually do the file 
management. Collapsing the API isn't that different to what happens with 
JournalManager anyhow, as all accesses to the stream go through the manager.

{quote}
Lastly, I want point out that HDFS-1580 will require a class like this anyway 
(called Journal in your design doc}}. Though this current patch doesn't address 
it, it will be a clear extension of JournalManager to add the input-side calls, 
the purging calls, etc.
{quote}
The difference between JournalManager and Journal/JournalFactory in HDFS-1580 
is one of persistence. 

For output, JournalManager objects must be created and held by FSEditLog for 
the duration of it's existence. Journal/JournalFactory objects only exists for 
the creation of the stream after which the object can go out of scope/be GC'd 
(unless the creates streams keep a reference).

{quote}
Clearly we have a difference of opinion on this design, but could you please 
indicate how strong your objections are? i.e. are you -1ing this design or just 
proposing another option? Given that I already have a bunch of work lined up 
(and blocked) behind this, I'd really like to close this out in the next day or 
two.
{quote}
I'm -1. I really don't like the call to getCurrentStream() having to be called 
every time you want an output stream. However if the concensus goes the other 
way I'll acquiesce. 

> Refactor log rolling and filename management out of FSEditLog
> -------------------------------------------------------------
>
>                 Key: HDFS-1799
>                 URL: https://issues.apache.org/jira/browse/HDFS-1799
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: 0001-Added-state-management-to-FSEditLog.patch, 
> 0002-Standardised-error-pattern.patch, 
> 0003-Add-JournalFactory-and-move-divert-revert-out-of-FSE.patch, 
> HDFS-1799-all.diff, hdfs-1799.txt, hdfs-1799.txt, hdfs-1799.txt, hdfs-1799.txt
>
>
> This is somewhat similar to HDFS-1580, but less ambitious. While that JIRA 
> focuses on pluggability, this task is simply the minimum needed for HDFS-1073:
> - Refactor the filename-specific code for rolling, diverting, and reverting 
> log streams out of FSEditLog into a new class
> - Clean up the related code in FSEditLog a bit
> Notably, this JIRA is going to temporarily break the BackupNode. I plan to 
> circle back on the BackupNode later on this branch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to