Standalone (non-daemon) Chukwa operation
----------------------------------------

                 Key: CHUKWA-306
                 URL: https://issues.apache.org/jira/browse/CHUKWA-306
             Project: Hadoop Chukwa
          Issue Type: Wish
            Reporter: Jiaqi Tan
            Priority: Critical


This issue proposes an alternative use of Chukwa as a standalone log analysis 
pipeline. Users could read existing logs from files, process them (Demux), 
analyze them (e.g. with the current SALSA/Mochi toolchain), and visualize the 
results, without having to set up or run any daemons or database servers. 

This can be presented to the user as an alternative interface to Chukwa in 
which the main architectural parts (Chunks, post-Demux SequenceFiles of 
ChukwaRecords, post-Demux-processing SequenceFiles of ChukwaRecords, and 
finally time-aggregated database entries for fast visualization) remain 
unchanged and Chukwa manifests as a set of files in HDFS. Chukwa then provides 
users with two main benefits: (1) a centralized, one-stop shop for log 
processing, analysis, and anomaly detection, and (2) the ability to process 
logs with MapReduce, regardless of whether Chukwa was used to collect them. 
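
As a rough illustration of the stages listed above, a daemon-free pipeline 
could be sketched as plain functions chained over an in-memory list of log 
lines. Everything here is a hypothetical stand-in: the function names, the 
dict-based chunk and record layouts, the timestamp format, and the sample 
lines are assumptions for illustration, not Chukwa's actual classes, file 
formats, or Demux behavior.

```python
from collections import Counter
from datetime import datetime

# Hypothetical stand-ins for the stages named above; none of these are
# Chukwa classes or APIs.

def read_chunks(lines, source, chunk_size=2):
    """Group raw log lines into chunks tagged with the file they came from."""
    for i in range(0, len(lines), chunk_size):
        yield {"source": source, "lines": lines[i:i + chunk_size]}

def demux(chunks):
    """Parse each line into a record with a timestamp, source, and body."""
    for chunk in chunks:
        for line in chunk["lines"]:
            stamp, _, body = line.partition(" ")
            yield {
                "time": datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S"),
                "source": chunk["source"],
                "body": body,
            }

def aggregate_per_minute(records):
    """Count records per minute, a simple time-aggregated view for plotting."""
    counts = Counter()
    for rec in records:
        counts[rec["time"].replace(second=0)] += 1
    return dict(counts)

def run_pipeline(lines, source="example.log"):
    """Run all stages in-process, with no daemons or database server."""
    return aggregate_per_minute(demux(read_chunks(lines, source)))

# Made-up sample input, assumed to start with an ISO-style timestamp.
SAMPLE = [
    "2009-05-01T10:00:05 INFO task started",
    "2009-05-01T10:00:40 INFO task progress",
    "2009-05-01T10:01:10 WARN task slow",
]

if __name__ == "__main__":
    for minute, count in sorted(run_pipeline(SAMPLE).items()):
        print(minute.isoformat(), count)
```

In a real standalone mode each stage would presumably read and write files in 
HDFS rather than pass Python objects, so that intermediate results stay 
compatible with the existing architecture.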

That way, processing logs and performing analysis or diagnosis does not depend 
on running the entire Chukwa daemon infrastructure. This matters because many 
users of Hadoop clusters may not have superuser access to those machines, e.g. 
users at universities using shared clusters.



