[ 
https://issues.apache.org/jira/browse/EAGLE-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029567#comment-15029567
 ] 

ASF GitHub Bot commented on EAGLE-2:
------------------------------------

GitHub user sunlibin opened a pull request:

    https://github.com/apache/incubator-eagle/pull/8

    [EAGLE-2][EAGLE-24][EAGLE-50][EAGLE-52]Add eagle offline metric collection topology and do online balance partition based on the statistic metric

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sunlibin/incubator-eagle Eagle-Metric-And-Balance-Partition

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-eagle/pull/8.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8
    
----
commit f69d8ab806764a96417ceedac621aaf23f84cc78
Author: sunlibin <[email protected]>
Date:   2015-11-21T10:04:54Z

    add metric topology for offline metric collection

commit 71533df709c4e519c38645162531c52563bdc5c9
Author: sunlibin <[email protected]>
Date:   2015-11-26T10:45:46Z

    balance events partition based on greedy partition algorithm

----
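
For readers unfamiliar with the term used in the second commit, the following is a minimal sketch of one common greedy partitioning heuristic: sort keys by weight, then repeatedly give the heaviest remaining key to the least-loaded partition. It is not taken from the pull request. The function name `greedy_partition`, the `weights` mapping, and the example keys are all hypothetical, and the actual Eagle implementation may differ.

```python
import heapq


def greedy_partition(weights, num_partitions):
    """Assign each key to a partition index, heaviest keys first.

    weights: dict mapping a key (e.g. a user or host name) to its observed
             event count. Hypothetical input shape, not from the PR.
    Returns a dict mapping key -> partition index.
    """
    # Min-heap of (current load, partition index); the root is always
    # the least-loaded partition.
    heap = [(0, i) for i in range(num_partitions)]
    heapq.heapify(heap)

    assignment = {}
    for key, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        load, index = heapq.heappop(heap)
        assignment[key] = index
        heapq.heappush(heap, (load + weight, index))
    return assignment


if __name__ == "__main__":
    # Made-up per-key event counts, standing in for the collected metrics.
    print(greedy_partition({"a": 5, "b": 4, "c": 3, "d": 2}, 2))
```

The heuristic does not guarantee an optimal balance, but it is cheap to recompute whenever fresh statistics arrive.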


> watch message process backlog in Eagle UI
> -----------------------------------------
>
>                 Key: EAGLE-2
>                 URL: https://issues.apache.org/jira/browse/EAGLE-2
>             Project: Eagle
>          Issue Type: Improvement
>         Environment: production
>            Reporter: Edward Zhang
>            Assignee: Hao Chen
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Message latency is a key factor for Eagle to enable realtime security
> monitoring. For HDFS audit log monitoring, Kafka is used as the datasource, so
> there is always some gap between the current max offset in Kafka and the
> processed offset in Eagle. The gap is the backlog, which Eagle should consume
> as quickly as possible. If the gap can be sampled every minute or every 20
> seconds, then we can tell whether Eagle is catching up or falling further
> behind.
> The command to get the current max offset in Kafka is:
> bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list xxxx --topic hdfs_audit_log --time -1
> The storm-kafka spout stores the processed offset in ZooKeeper, in the
> following znode:
> /consumers/hdfs_audit_log/eagle.hdfsaudit.consumer/partition_0
> So technically we can compute the gap and write it to the Eagle service, and
> then watch the backlog in the UI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
