[ https://issues.apache.org/jira/browse/METRON-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Justin Leet updated METRON-2005:
--------------------------------
Fix Version/s: 0.7.1
> Batch Writer writes 0-byte files to HDFS on rotation
> ----------------------------------------------------
>
> Key: METRON-2005
> URL: https://issues.apache.org/jira/browse/METRON-2005
> Project: Metron
> Issue Type: Bug
> Reporter: Justin Leet
> Assignee: Justin Leet
> Priority: Major
> Fix For: 0.7.1
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> This results from https://github.com/apache/metron/pull/505
> That PR breaks the standard convention of just choosing a file name and rotating
> the file repeatedly, because now any message could get routed to a different
> file based on a Stellar statement. This break was noted in the PR and accepted,
> because we didn't care about the rotation number anyway.
> This works fine for the 0th rotation (a new file is opened, data is written,
> file is closed), but on the first rotation we signal to the HdfsWriter that
> the file has been closed in order to limit the maximum number of open files,
> but still create a new file with rotation 1. This file never receives any
> data (because we no longer maintain an open file reference to it), and the
> SourceHandler for it stays open with the Timer still attempting further
> (pointless) rotations. Note that no data is lost; any data that would go into
> this file instead goes into a new 0 rotation file.
> This becomes more obvious the longer the cluster is running or the shorter
> the timeout on a file is. As each open file attempts rotations, eventually
> large numbers of 0-byte files are created.
> An easy fix for this is to remove the creation of new files during rotations
> (but still perform RotationActions). This means that every file will have a 0
> rotation (which we don't actually use for anything anyway). More complicated
> things could be done (e.g. evict oldest file from a cache), but it seems
> heavy-handed for maintaining a rotation count we don't care about anyway.
> Additionally, the Timer should be cancelled when the reference is removed
> from HdfsWriter.
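
A minimal sketch of the proposed fix, under stated assumptions. The class and
method names below (RotationSketch, Handler, Writer, handlerFor, rotate) are
hypothetical stand-ins and are not Metron's actual HdfsWriter or SourceHandler
code. It only illustrates the two points above: a rotation closes the handler
and runs its rotation actions without opening a new file, and the Timer is
cancelled when the handler's reference is removed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;

class RotationSketch {

    /** Hypothetical per-file handler that owns its own rotation Timer. */
    static class Handler {
        final String path;
        private final Timer rotationTimer = new Timer(true); // daemon thread
        private boolean closed = false;

        Handler(String path, long rotateEveryMs, Runnable onRotate) {
            this.path = path;
            rotationTimer.schedule(new TimerTask() {
                @Override public void run() { onRotate.run(); }
            }, rotateEveryMs, rotateEveryMs);
        }

        /** Closing the handler also cancels its Timer, so no further rotations fire. */
        synchronized void close() {
            closed = true;
            rotationTimer.cancel();
        }

        synchronized boolean isClosed() { return closed; }
    }

    /** Hypothetical writer that keeps at most one open handler per routing key. */
    static class Writer {
        private final Map<String, Handler> handlers = new HashMap<>();
        final List<String> createdFiles = new ArrayList<>();
        final List<String> rotationActionsRun = new ArrayList<>();
        private final long rotateEveryMs;
        private int nextId = 0;

        Writer(long rotateEveryMs) { this.rotateEveryMs = rotateEveryMs; }

        /** A file is only ever created here, on demand, and always as rotation 0. */
        synchronized Handler handlerFor(String key) {
            Handler h = handlers.get(key);
            if (h == null) {
                String path = key + "-" + (nextId++) + "-rot0";
                createdFiles.add(path);
                h = new Handler(path, rotateEveryMs, () -> rotate(key));
                handlers.put(key, h);
            }
            return h;
        }

        /** Rotation: drop the reference, close (cancelling the Timer), run actions. */
        synchronized void rotate(String key) {
            Handler h = handlers.remove(key);
            if (h == null) {
                return; // already rotated out
            }
            h.close();
            rotationActionsRun.add(h.path); // stand-in for real RotationActions
            // Deliberately no new file here: that is what produced the 0-byte files.
        }

        synchronized int openHandlers() { return handlers.size(); }
    }

    public static void main(String[] args) {
        Writer writer = new Writer(60_000L);
        writer.handlerFor("squid");   // opens a rotation-0 file
        writer.rotate("squid");       // closes it, creates nothing new
        writer.handlerFor("squid");   // later data opens a fresh rotation-0 file
        System.out.println("files created: " + writer.createdFiles);
        System.out.println("rotation actions run for: " + writer.rotationActionsRun);
    }
}
```

In this sketch a file only comes into existence when a message actually needs
it, so a rotation can never leave an empty file behind, and a closed handler
has no live Timer to attempt further rotations.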
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)