Eric Yang created CHUKWA-744:
--------------------------------

             Summary: Refactor ETL process for HBaseWriter
                 Key: CHUKWA-744
                 URL: https://issues.apache.org/jira/browse/CHUKWA-744
             Project: Chukwa
          Issue Type: Task
          Components: Data Processors
    Affects Versions: 0.6.0
            Reporter: Eric Yang
            Assignee: Eric Yang


The current ETL classes are based on the Demux MapProcessor and ReduceProcessor.  
The processors were designed to take an archive key embedded in the processor, 
as well as a ChunkSaver that preserves chunks that cannot be parsed.  This is fine 
when running a MapReduce-based demux job to process data: the short-lived 
task spills the ChunkSaver contents into a separate file for later examination.  
However, in the long-running Chukwa agent the processors can leak memory, 
because chunks accumulate in ChunkSaver and are never cleaned up.

It would be better to redesign the parser classes with well-defined behavior.  
If a chunk cannot be parsed, the parser should throw a ParseException to the 
upper layer, which can then retry or write the failure to the agent log.
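
A minimal sketch of the proposed behavior, to make the contract concrete. 
All names here (ChunkParser, KeyValueParser, ChunkHandler, errorLog) are 
hypothetical and are not existing Chukwa classes. The issue does not say 
which ParseException is meant, so java.text.ParseException is used as a 
stand-in, and an in-memory list stands in for the agent log.

```java
import java.nio.charset.StandardCharsets;
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser contract: return a record or throw, never stash the chunk.
interface ChunkParser {
    String parse(byte[] chunk) throws ParseException;
}

// Hypothetical example parser that accepts "key=value" payloads only.
class KeyValueParser implements ChunkParser {
    @Override
    public String parse(byte[] chunk) throws ParseException {
        String text = new String(chunk, StandardCharsets.UTF_8);
        int eq = text.indexOf('=');
        if (eq <= 0) {
            // No saver object is involved, so nothing accumulates in memory.
            throw new ParseException("expected key=value, got: " + text, 0);
        }
        return text.substring(0, eq).trim() + "\t" + text.substring(eq + 1).trim();
    }
}

// Hypothetical "upper layer": retries a bounded number of times, then logs.
class ChunkHandler {
    private final ChunkParser parser;
    private final int maxAttempts;
    // Stand-in for the agent log (assumption for this sketch).
    final List<String> errorLog = new ArrayList<>();

    ChunkHandler(ChunkParser parser, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.parser = parser;
        this.maxAttempts = maxAttempts;
    }

    // Returns the parsed record, or null if the chunk was dropped after logging.
    String handle(byte[] chunk) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return parser.parse(chunk);
            } catch (ParseException e) {
                if (attempt == maxAttempts) {
                    errorLog.add("unparseable chunk: " + e.getMessage());
                }
            }
        }
        return null;
    }
}
```

In this sketch the parser holds no reference to a failed chunk; whether to 
retry, log, or drop is left to the caller, so a long-running agent retains 
only the log entry it chooses to write.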



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
