[
https://issues.apache.org/jira/browse/CHUKWA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eric Yang updated CHUKWA-744:
-----------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this.
> Refactor ETL process for HBaseWriter
> ------------------------------------
>
> Key: CHUKWA-744
> URL: https://issues.apache.org/jira/browse/CHUKWA-744
> Project: Chukwa
> Issue Type: Task
> Components: Data Processors
> Affects Versions: 0.6.0
> Reporter: Eric Yang
> Assignee: Eric Yang
> Attachments: CHUKWA-744.patch
>
>
> The current ETL classes are based on the Demux MapProcessor and ReduceProcessor.
> The processors were designed to pass in an archive key embedded in the
> processor, as well as a ChunkSaver to preserve chunks that cannot be parsed.
> This is fine when running a MapReduce-based demux job to process data: the
> short-lived task spills the ChunkSaver out to a separate file for later
> examination. However, when the processors run for a long time inside the
> Chukwa agent, they can leak memory, because chunks are saved in the ChunkSaver
> and never cleaned up.
> It would be better to redesign the parser classes with well-defined behavior:
> if a chunk cannot be parsed, the parser should throw a ParseException to the
> upper layer, which can retry or write to the agent log.
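
The behavior proposed in the description can be sketched roughly as follows. This is a minimal illustration only, not the committed CHUKWA-744 patch: ChunkParser, KeyValueParser and ChunkWriter are hypothetical names, the "key=value" line format is invented for the example, and java.text.ParseException is assumed as a stand-in for whichever ParseException the patch uses. The point is that the parser keeps no reference to a chunk it cannot handle; it throws, and the caller decides whether to retry or log.

```java
import java.nio.charset.StandardCharsets;
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser contract: no ChunkSaver and no retained state.
interface ChunkParser {
    List<String> parse(byte[] chunk) throws ParseException;
}

// Hypothetical parser for chunks made of "key=value" lines (assumed format).
class KeyValueParser implements ChunkParser {
    @Override
    public List<String> parse(byte[] chunk) throws ParseException {
        String text = new String(chunk, StandardCharsets.UTF_8);
        List<String> records = new ArrayList<>();
        int offset = 0; // character offset of the current line in the chunk
        for (String line : text.split("\n")) {
            if (line.indexOf('=') < 0) {
                // Fail loudly instead of stashing the bad chunk in memory.
                throw new ParseException("Unparseable line: " + line, offset);
            }
            records.add(line);
            offset += line.length() + 1; // +1 for the newline
        }
        return records;
    }
}

// Hypothetical caller standing in for the agent-side upper layer.
class ChunkWriter {
    private final ChunkParser parser;
    int failures = 0; // count of chunks that could not be parsed

    ChunkWriter(ChunkParser parser) {
        this.parser = parser;
    }

    // Returns the parsed records, or an empty list after logging a failure.
    List<String> process(byte[] chunk) {
        try {
            return parser.parse(chunk);
        } catch (ParseException e) {
            failures++;
            System.err.println("Skipping chunk, parse error at offset "
                    + e.getErrorOffset() + ": " + e.getMessage());
            return new ArrayList<>();
        }
    }
}
```

Because the failed chunk is referenced only by the caller's argument, it becomes eligible for garbage collection as soon as process() returns, which is the property a long-running agent needs.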
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)