[ 
https://issues.apache.org/jira/browse/CHUKWA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reopened CHUKWA-744:
------------------------------

Missed ETL files for HBase parsing.

> Refactor ETL process for HBaseWriter
> ------------------------------------
>
>                 Key: CHUKWA-744
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-744
>             Project: Chukwa
>          Issue Type: Task
>          Components: Data Processors
>    Affects Versions: 0.6.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>
> The current ETL classes are based on the Demux MapProcessor and 
> ReduceProcessor.  The processors were designed to take an archive key 
> embedded in the processor, as well as a ChunkSaver that preserves chunks 
> that cannot be parsed.  This is fine when running a MapReduce-based demux 
> job to process data: the short-lived task spills the ChunkSaver contents 
> into a separate file for later examination.  However, in the long-running 
> Chukwa agent the processors can leak memory, because chunks accumulate in 
> the ChunkSaver and are never cleaned up.  It would be better to redesign 
> the parser classes with well-defined behavior: if a chunk cannot be 
> parsed, the parser should throw a ParseException to the upper layer, 
> which can then retry or write to the agent log.
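
The proposed behavior could look roughly like the sketch below. It is only an illustration, not the actual Chukwa code: the ChunkParser interface, KeyValueParser, and parseWithRetry are hypothetical names, the "key=value" input format is made up, and java.text.ParseException is used as a stand-in because the issue does not say which ParseException class is meant. The point is that the parser holds no reference to a failed chunk; the caller decides whether to retry or log.

```java
import java.nio.charset.StandardCharsets;
import java.text.ParseException;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class ChunkParserSketch {

    /** Hypothetical stateless parser: returns fields or throws, never stores the chunk. */
    interface ChunkParser {
        Map<String, String> parse(byte[] chunkData) throws ParseException;
    }

    /** Toy parser for "key=value key=value" text; stands in for a real ETL parser. */
    static class KeyValueParser implements ChunkParser {
        @Override
        public Map<String, String> parse(byte[] chunkData) throws ParseException {
            String text = new String(chunkData, StandardCharsets.UTF_8).trim();
            if (text.isEmpty()) {
                throw new ParseException("empty chunk", 0);
            }
            Map<String, String> fields = new LinkedHashMap<>();
            int offset = 0; // approximate position of the current token
            for (String token : text.split("\\s+")) {
                int eq = token.indexOf('=');
                if (eq <= 0) {
                    // Report the failure upward instead of saving the chunk.
                    throw new ParseException("malformed token: " + token, offset);
                }
                fields.put(token.substring(0, eq), token.substring(eq + 1));
                offset += token.length() + 1;
            }
            return fields;
        }
    }

    /** Upper layer: bounded retry, then log and drop so nothing accumulates. */
    static Map<String, String> parseWithRetry(ChunkParser parser, byte[] data, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return parser.parse(data);
            } catch (ParseException e) {
                // Stand-in for writing to the agent log.
                System.err.println("parse attempt " + attempt + " failed at offset "
                        + e.getErrorOffset() + ": " + e.getMessage());
            }
        }
        return Collections.emptyMap();
    }

    public static void main(String[] args) {
        ChunkParser parser = new KeyValueParser();
        byte[] good = "host=node1 status=ok".getBytes(StandardCharsets.UTF_8);
        byte[] bad = "host=node1 garbage".getBytes(StandardCharsets.UTF_8);
        System.out.println(parseWithRetry(parser, good, 2));
        System.out.println(parseWithRetry(parser, bad, 2));
    }
}
```

Because the failed chunk is only referenced by the caller's local variable, it becomes garbage once the retry loop gives up, which avoids the unbounded growth described above.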



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
