[
https://issues.apache.org/jira/browse/HADOOP-12702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118451#comment-15118451
]
Karthik Kambatla commented on HADOOP-12702:
-------------------------------------------
Thanks for filing and working on this, [~templedf]. Comments on the latest
patch:
# FileSystemSink:
## Given that the sink we are adding here has some quirks in its behavior (a new
directory every hour, etc.), the class name FileSystemSink seems too simple. Can
we capture more of the behavior in the name?
## Rename currentPath to currentDirPath, currentFile to currentFilePath,
currentOut to currentOutStream for clarity?
## While reading {{BASEPATH_KEY}} from conf, there is no default value?
## {{checkAppend}}: If appending throws an IOException for some reason other
than append not being supported, should we still allow appending? I would think
not.
## {{rollLogDirIfNeeded}}: For readability, should we split it into two ifs,
the first handling the case where the directories don't match? Also, the comment
in the method is wrongly indented and slightly confusing.
{code}
if (!path.equals(currentPath)) {
  if (currentOut != null) {
    currentOut.close();
    currentOut = null;
  }
  currentPath = path;
}
if (currentOut == null) {
  // rest of the code
}
{code}
## Typo in the javadoc for createLogFile - nonExistant
## {{putMetrics}}: When throwing MetricsException, there is no need for a blank
line between setting the message and actually throwing the exception. Also, we
should just have a method that takes a message (String) and throws the exception
if ignore error is not turned on. The only downside would be the interned String
objects for the messages here.
## Should {{flush}} also be invoking {{currentFSOut.hflush}}?
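To illustrate the {{putMetrics}} suggestion, here is a minimal sketch of the kind of helper I have in mind. The class name, the {{ignoreError}} field, and the method name {{throwMetricsException}} are all hypothetical, and the nested {{MetricsException}} is only a stand-in for {{org.apache.hadoop.metrics2.MetricsException}} so that the snippet compiles on its own; the patch may well want a different shape.

```java
final class ThrowOrIgnoreSketch {
    /** Stand-in for org.apache.hadoop.metrics2.MetricsException (assumption). */
    static class MetricsException extends RuntimeException {
        MetricsException(String message) {
            super(message);
        }
    }

    // Hypothetical flag mirroring the sink's "ignore error" setting.
    private final boolean ignoreError;

    public ThrowOrIgnoreSketch(boolean ignoreError) {
        this.ignoreError = ignoreError;
    }

    /**
     * Throws a MetricsException carrying the given message, unless the
     * sink has been configured to ignore errors, in which case it returns.
     */
    public void throwMetricsException(String message) {
        if (!ignoreError) {
            throw new MetricsException(message);
        }
    }
}
```

Each failure site in {{putMetrics}} would then reduce to a single call such as {{throwMetricsException("Failed to write to file")}}, where the message text is again only an example.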
The tests look good. Should we play around with the allowed configs also? I am
fine with not doing that or following up in another JIRA.
> Add an HDFS metrics sink
> ------------------------
>
> Key: HADOOP-12702
> URL: https://issues.apache.org/jira/browse/HADOOP-12702
> Project: Hadoop Common
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 2.7.1
> Reporter: Daniel Templeton
> Assignee: Daniel Templeton
> Attachments: HADOOP-12702.001.patch, HADOOP-12702.002.patch,
> HADOOP-12702.003.patch
>
>
> We need a metrics2 sink that can write metrics to HDFS. The sink should
> accept as configuration a "directory prefix" and do the following in
> {{putMetrics()}}
> * Get yyyyMMddHH from current timestamp.
> * If HDFS dir "dir prefix" + yyyyMMddHH doesn't exist, create it. Close any
> currently open file and create a new file called <hostname>.log in the new
> directory.
> * Write metrics to the current log file.
> * If a write fails, it should be fatal to the process running the sink.
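The path computation described in the steps above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, UTC is assumed because the description does not name a time zone, and the prefix and timestamp are concatenated directly as the description writes them. It uses plain strings rather than the Hadoop {{Path}} and {{FileSystem}} APIs so that it stands alone.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class HourlyLogPathSketch {
    // yyyyMMddHH as in the description; the UTC zone is an assumption.
    private static final DateTimeFormatter HOUR_FORMAT =
        DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);

    /** Returns "dir prefix" + yyyyMMddHH for the given instant. */
    static String dirFor(String dirPrefix, Instant now) {
        return dirPrefix + HOUR_FORMAT.format(now);
    }

    /** Returns the path of <hostname>.log inside the hourly directory. */
    static String logFileFor(String dirPrefix, Instant now, String hostname) {
        return dirFor(dirPrefix, now) + "/" + hostname + ".log";
    }
}
```

A sink would compare the result of {{dirFor}} with the directory it currently has open and roll to a new file only when the two differ.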