Hi All,

Here is a high-level overview of the main components of the log analysis
tool. I'd appreciate your input on improving the use case.


*Log publishing agent*



   - These agents will connect to the analysis server (which can be
   considered the master). Agents will be installed on the host machine
   with only a minimal configuration, containing information about the
   master and authentication credentials.


   - The master will list the available agents and provide a wizard to
   perform the configuration. The master is the central point for
   distributing configurations to the clients; this is similar to the
   Puppet approach of configuring clients. A group identification
   configuration can be used in a clustered scenario, where all members of
   a groupId will receive the same set of configurations.


   - The configuration wizard will walk the user through two main stages:

   1. Input configurations

      - Initially this will be a file source; configurations related to
      file input will be provided here.

      - Moving forward this will be expanded to a Logstash-like approach
      [2], where users can define different kinds of sources (e.g. syslog,
      Twitter, WebSockets, etc.).



   2. Filtering configurations

      - At this stage the user can configure how log events should be
      filtered, i.e. what sort of filters need to be applied to each event.
      These filters will apply one or more regex patterns to the event and
      extract fields. A pre-configured set of filters will be provided,
      applicable to generic log events such as log4j and Apache httpd logs.

      - The user can also search and pick a sample event from the
      previously configured file input and extract formats by highlighting
      the fields and defining regex patterns.

      - The output of a filtered event will be a JSON document containing
      the user-defined fields along with a field holding the raw event (see
      sample [1]).


   - A single agent can have multiple configurations of the above kind
   (file{}, filter{}, output{}) in order to publish events from more than
   one source; a sketch of such a configuration follows this list.
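
To make this concrete, here is a minimal sketch of what one such agent
configuration might look like, modeled on the Logstash config examples in
[2]. The directive names and values here are assumptions, not a final
syntax; the filter pattern shown is the standard grok pattern that would
produce the extracted fields in sample [1]:

    file {
      # hypothetical input: the log file the agent tails
      path => "/var/log/httpd/access_log"
    }
    filter {
      # regex/pattern applied to each raw line to extract named fields
      # (clientip, verb, request, response, bytes, ... as in sample [1])
      pattern => "%{COMBINEDAPACHELOG}"
    }
    output {
      # hypothetical endpoint: publish the resulting JSON events to the
      # master; groupId ties this agent to its cluster-wide configuration
      master  => "analysis.example.com:9443"
      groupId => "web-cluster-01"
    }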


*Search*


   - Search will primarily be based on Lucene queries; a sample query is
   given after this list.

   - Results will be shown in a tabular view where users can drill down
   further using the extracted fields.

   - There will also be an option to apply time constraints to the search
   query, relative to the current time (e.g. last 12 hours, last 5 days).

   - There will be a side panel with stats about the event fields. Once a
   field is clicked, the user will get a pop-up pane with the top matches
   for that field (a field is basically a regex pattern), with the
   frequency of occurrence both as a percentage and a count.
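
As an example, a Lucene query over the extracted fields of sample [1]
might look as follows; the field names come from that sample, and the
absolute timestamp range stands in for a relative constraint such as
"Last 12 hours", which the UI would translate:

    host:cadenza AND response:500
        AND @timestamp:[2013-12-11T00:00:00Z TO 2013-12-11T12:00:00Z]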


*Dashboards and Reporting*


   - The pop-up described in the search section will also have a section
   for reporting and dashboards, based on metrics like: average over time,
   maximum over time, minimum over time, top values, top values by time,
   rare values, etc.

   - There will be two types of dashboards: Lucene query based and real
   time (Siddhi query based). The idea is to provide a unified language for
   both real-time and historic searches as an improvement; a sample
   real-time query is given after this list.

   - Users will be provided with the option to export reports & dashboards
   in CSV, PDF and Excel formats.
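
As an illustration of a real-time dashboard metric, here is a small
Siddhi query computing a per-host event count and average payload size
over a sliding window; the stream and attribute names are assumptions
based on the fields in sample [1]:

    define stream LogStream (host string, response string, bytes long);

    from LogStream#window.time(5 min)
    select host, count() as eventCount, avg(bytes) as avgBytes
    group by host
    insert into DashboardStream;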


*Alerts*

   - Both real-time and Lucene searches can be used to create alerts
   ("save as alert").

   - Alerts can be configured based on three main categories:

      - Per-result alerts: based on a continuous real-time search (Siddhi).

      - Scheduled alerts: runs a search according to a schedule the user
      specifies when creating the alert (periodic Lucene queries).

      - Rolling window alerts: based on a continuous real-time search over
      a user-defined time window (Siddhi); see the sketch at the end of
      this section.



   - An alert can have multiple trigger actions, based on the following
   trigger conditions:

      - Number of results

      - Number of hosts

      - Custom search

   Trigger actions:

      - Send email / SMS / phone call

      - Webhook: POST to a user-defined web service with the alert payload.

      - Run a custom script
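
To make the rolling-window category concrete, here is a hypothetical
Siddhi query for an alert that fires when a single host produces more
than 10 HTTP 500 responses within a 5-minute window; the stream and
attribute names are assumptions based on sample [1]:

    define stream LogStream (host string, response string);

    from LogStream[response == '500']#window.time(5 min)
    select host, count() as errorCount
    group by host
    having errorCount > 10
    insert into AlertStream;

Events arriving on AlertStream would then fan out to the configured
trigger actions, e.g. the webhook action POSTing the matching events as a
JSON payload to the user-defined endpoint.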


[1] {
       "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET
/xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\"
\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101
Firefox/25.0\"",
    "@timestamp" => "2013-12-11T08:01:45.000Z",
      "@version" => "1",
          "host" => "cadenza",
      "clientip" => "127.0.0.1",
         "ident" => "-",
          "auth" => "-",
     "timestamp" => "11/Dec/2013:00:01:45 -0800",
          "verb" => "GET",
       "request" => "/xampp/status.php",
   "httpversion" => "1.1",
      "response" => "200",
         "bytes" => "3891",
      "referrer" => "\"http://cadenza/xampp/navi.php\"";,
         "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9;
rv:25.0) Gecko/20100101 Firefox/25.0\""
}

[2] https://www.elastic.co/guide/en/logstash/current/config-examples.html



Regards,
-- 
*Anuruddha Premalal*
Software Eng. | WSO2 Inc.
Mobile : +94717213122
Web site : www.anuruddha.org