[ https://issues.apache.org/jira/browse/EAGLE-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15043776#comment-15043776 ]

ASF GitHub Bot commented on EAGLE-2:
------------------------------------

Github user haoch commented on a diff in the pull request:

    https://github.com/apache/incubator-eagle/pull/8#discussion_r46768426
  
    --- Diff: eagle-security/eagle-metric-collection/src/main/java/org/apache/eagle/metric/kafka/EagleMetricCollectorMain.java ---
    @@ -0,0 +1,127 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.eagle.metric.kafka;
    +
    +import backtype.storm.spout.SchemeAsMultiScheme;
    +import backtype.storm.topology.base.BaseRichSpout;
    +import com.typesafe.config.Config;
    +import com.typesafe.config.ConfigFactory;
    +import org.apache.eagle.dataproc.impl.storm.kafka.KafkaSourcedSpoutProvider;
    +import org.apache.eagle.dataproc.impl.storm.kafka.KafkaSourcedSpoutScheme;
    +import org.apache.eagle.dataproc.util.ConfigOptionParser;
    +import org.apache.eagle.datastream.ExecutionEnvironmentFactory;
    +import org.apache.eagle.datastream.StormExecutionEnvironment;
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +import storm.kafka.BrokerHosts;
    +import storm.kafka.KafkaSpout;
    +import storm.kafka.SpoutConfig;
    +import storm.kafka.ZkHosts;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.List;
    +import java.util.Map;
    +
    +public class EagleMetricCollectorMain {
    +
    +    private static final Logger LOG = LoggerFactory.getLogger(EagleMetricCollectorMain.class);
    +
    +    public static void main(String[] args) throws Exception {
    +        new ConfigOptionParser().load(args);
    +        //System.setProperty("config.resource", "/application.local.conf");
    +
    +        Config config = ConfigFactory.load();
    +
    +        StormExecutionEnvironment env = ExecutionEnvironmentFactory.getStorm(config);
    +
    +        String deserClsName = config.getString("dataSourceConfig.deserializerClass");
    +        final KafkaSourcedSpoutScheme scheme = new KafkaSourcedSpoutScheme(deserClsName, config) {
    +            @Override
    +            public List<Object> deserialize(byte[] ser) {
    +                Object tmp = deserializer.deserialize(ser);
    +                Map<String, Object> map = (Map<String, Object>)tmp;
    +                if(tmp == null) return null;
    +                return Arrays.asList(map.get("user"), map.get("timestamp"));
    +            }
    +        };
    +
    +        KafkaSourcedSpoutProvider kafkaMessageSpoutProvider = new KafkaSourcedSpoutProvider() {
    --- End diff --
    
    Avoid putting such long anonymous classes in the main entry method, especially
    when the design is not very clean.
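
    One way to act on this comment, sketched under assumptions: move the body of
    the anonymous deserialize override into a small named class that main() only
    references. The class name UserTimestampMapper is hypothetical and does not
    appear in the pull request; Storm and Eagle types are left out so the block
    compiles on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical named replacement for the anonymous deserialize override.
 * It keeps the field names used in the diff ("user", "timestamp").
 */
final class UserTimestampMapper {
    private UserTimestampMapper() {}

    /** Turns a deserialized message into a [user, timestamp] tuple, or null if there is no message. */
    static List<Object> toTuple(Object deserialized) {
        // Check for null before casting, so the guard reads in the natural order.
        if (deserialized == null) {
            return null;
        }
        Map<?, ?> map = (Map<?, ?>) deserialized;
        return Arrays.asList(map.get("user"), map.get("timestamp"));
    }
}
```

    With something like this in place, the scheme's deserialize method could
    delegate to UserTimestampMapper.toTuple(deserializer.deserialize(ser)),
    and the mapping logic could be unit tested without a topology.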


> watch message process backlog in Eagle UI
> -----------------------------------------
>
>                 Key: EAGLE-2
>                 URL: https://issues.apache.org/jira/browse/EAGLE-2
>             Project: Eagle
>          Issue Type: Improvement
>         Environment: production
>            Reporter: Edward Zhang
>            Assignee: Libin, Sun
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Message latency is a key factor for Eagle to enable real-time security
> monitoring. For HDFS audit log monitoring, Kafka is used as the data source, so
> there is always some gap between the current max offset in Kafka and the
> processed offset in Eagle. This gap is the backlog, which Eagle should consume
> as quickly as possible. If the gap is sampled every minute or every 20 seconds,
> we can tell whether Eagle is catching up or falling further behind.
> The command to get the current max offset in Kafka is:
> bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list xxxx --topic hdfs_audit_log --time -1
> The storm-kafka spout stores the processed offset in ZooKeeper, in the
> following znode:
> /consumers/hdfs_audit_log/eagle.hdfsaudit.consumer/partition_0
> So technically we can compute the gap and write it to the Eagle service, and
> then watch the backlog in the UI.
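
The gap described above is a per-partition subtraction. A minimal sketch, assuming the latest Kafka offsets and the spout's processed offsets have already been read into maps keyed by partition id; the class KafkaBacklog and its methods are hypothetical names, and fetching the offsets from Kafka or ZooKeeper is deliberately left out:

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: backlog = latest Kafka offset minus the offset processed by the spout. */
final class KafkaBacklog {
    private KafkaBacklog() {}

    /** Computes the backlog for each partition present in latestOffsets. */
    static Map<Integer, Long> perPartition(Map<Integer, Long> latestOffsets,
                                           Map<Integer, Long> processedOffsets) {
        Map<Integer, Long> backlog = new TreeMap<>();
        for (Map.Entry<Integer, Long> entry : latestOffsets.entrySet()) {
            // Assumption: a partition with no recorded processed offset has consumed nothing yet.
            long processed = processedOffsets.getOrDefault(entry.getKey(), 0L);
            // The two offsets are sampled at slightly different moments, so clamp at zero.
            backlog.put(entry.getKey(), Math.max(0L, entry.getValue() - processed));
        }
        return backlog;
    }

    /** Sums the per-partition backlog into a single number for the topic. */
    static long total(Map<Integer, Long> backlog) {
        long sum = 0L;
        for (long value : backlog.values()) {
            sum += value;
        }
        return sum;
    }
}
```

Sampling total(...) on a fixed interval and comparing consecutive samples would show whether the backlog is shrinking or growing.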



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
