Hi Srinath,

Please find my comments inline.

On Tue, Nov 10, 2015 at 10:50 AM, Srinath Perera <[email protected]> wrote:

> Had a chat with Malith. I have a few questions.
>
>
>    1. Why are we using Python? If we use a different programming language,
>    we need to discuss it and approve it due to support cost etc. (not as a
>    sentence buried in a long mail). Are we going to rewrite the data bridge in
>    Python? That will take a lot of time. IMO we should keep it simple and go
>    with Java.
>
The main reason for using Python is its light memory footprint, as opposed
to Java, where running the agent costs a JVM. This was discussed at the
initial project meetings as well (+Anjana). We don't need to rewrite the
data-bridge agent in Python; the Stratos team has already implemented that [1],
and we can make use of it and add missing features if/as needed. IMO we should
keep it simple as well as do it the proper way; keeping it simple
doesn't mean using Java.

The motivations behind suggesting Python for the agent implementation are
Amazon CloudWatch [2] and Apache Stratos [3].

Regarding the support cost of Python, has this already been discussed in the
Stratos case? Can we make use of that model (since they have already released)?

>
>    2. Regarding JMS, I think we do not need it.
>
Do you have any suggestions for distributing configurations over a large
agent cluster? Or don't we need to consider that use case?

>
>    3. Log formats do not change often. IMO just point-to-point
>    connections should do.
>
"Log agent configurations (log formats) don't change often" - we cannot
design a system based on this kind of assumption. Log agent
configurations can change, no matter how often that is. IMO it's
better to consider that scenario as well.

Ex: A user wants to collect the syslog for a certain period of time and then,
after observing the logs, decides to disable this log stream. There can
be many other use cases where log configurations change.
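
To make this use case concrete, here is a minimal sketch of how an agent
could apply a configuration change pushed to it. The update format (group
name mapped to a new config, or None to disable the stream) and the function
name apply_config_update are assumptions made for illustration only; they
are not part of the proposed design.

```python
import copy


def apply_config_update(current, update):
    """Return a new agent config with the per-group changes applied.

    `update` maps a group name to its new "config" dict, or to None when
    the log stream should be disabled (hypothetical format).
    """
    # Index the existing groups by name; copy so the input is not mutated.
    groups = {g["name"]: copy.deepcopy(g) for g in current.get("groups", [])}
    for name, change in update.items():
        if change is None:
            groups.pop(name, None)  # stop publishing this stream
        else:
            groups[name] = {"name": name, "config": copy.deepcopy(change)}
    new_config = dict(current)
    new_config["groups"] = list(groups.values())
    return new_config


if __name__ == "__main__":
    current = {
        "agentid": "awsinstance-23",
        "groups": [
            {"name": "httpd", "config": {"input": {"file": {"path": "/tmp/access_log"}}}},
            {"name": "syslog", "config": {"input": {"file": {"path": "/var/log/syslog"}}}},
        ],
    }
    # The user has finished observing syslog and disables that stream.
    updated = apply_config_update(current, {"syslog": None})
    print([g["name"] for g in updated["groups"]])
```

Whatever transport carries the update, the agent-side step stays this
small: replace or drop the affected group and keep the rest untouched.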

What do you mean by point-to-point connections? Is it the use of Thrift
to distribute configs?

>
>    4. Our current analytics model is splitting at the client. I think we
>    should start with that. Then, the agent first has to send a few hundred raw
>    lines, which are shown to the user and used to configure things. Then actual
>    events are split at the agent.
>
Yes
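
As a rough illustration of splitting at the agent, the sketch below turns
one raw access-log line into a structured event before publishing. The
regular expression, the field names, and the function name split_line are
assumptions for illustration; the timestamp format is the Python equivalent
of the "dd/MMM/yyyy:HH:mm:ss Z" pattern used in the agent.conf example
quoted further down this thread.

```python
import re
from datetime import datetime

# Hypothetical pattern for an Apache-style access log line.
LINE_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

# strptime equivalent of the "dd/MMM/yyyy:HH:mm:ss Z" date pattern.
TIMESTAMP_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def split_line(raw_line):
    """Split one raw log line into a dict of fields, or None if it does not match."""
    match = LINE_PATTERN.match(raw_line)
    if match is None:
        return None
    event = match.groupdict()
    parsed = datetime.strptime(event["timestamp"], TIMESTAMP_FORMAT)
    event["timestamp"] = parsed.isoformat()
    event["status"] = int(event["status"])
    return event


if __name__ == "__main__":
    line = '127.0.0.1 - - [10/Nov/2015:10:50:00 +0530] "GET /index.html HTTP/1.1" 200 2326'
    print(split_line(line))
```

Lines that do not match return None, so the agent can still fall back to
sending them raw, which fits the idea of showing a few hundred raw lines to
the user first.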

>
>    5. If the Logstash configuration files are well done, can we use the
>    same formats?
>
Yes, this has already been discussed in the architecture mail "Component
level description of the log analyzer tool".

> Thanks
> Srinath
>
> P.S. The above are opinions only; please shout if you disagree.
>
>
>
>
> On Fri, Nov 6, 2015 at 6:33 PM, Malith Dhanushka <[email protected]> wrote:
> >
> > Yes I agree with the complication on applying agent configs in large
> clusters. But centralized config management using a message broker is a
> critical decision to take as it weighs maintenance effort. That decision
> depends on how big the cluster is and how frequently the log configs are
> getting changed.
> >
> > On Fri, Nov 6, 2015 at 3:22 PM, Inosh Goonewardena <[email protected]>
> wrote:
> >>
> >> Hi Anuruddha,
> >>
> >>
> >> On Fri, Nov 6, 2015 at 3:06 PM, Anuruddha Premalal <[email protected]>
> wrote:
> >>>
> >>> Hi Inosh,
> >>>
> >>> Can you be specific about the added complexities of managed configuration
> mode? I have explained in the sequence diagram how this will function.
> Managed configuration mode is actually a user choice; if the deployment is
> quite simple, the user can use the default agent-side configurations (as in
> Logstash).
> >>
> >>
> >> As Malith pointed out, my idea was to avoid configuring the log
> agent remotely and publishing the config. But yes, in a larger cluster,
> configuring each of the agents won't be practical and managed config mode is
> the better approach. If the user has the choice he/she can select depending
> on his/her preference.
> >>
> >>>
> >>>
> >>> Managed config mode addresses a major gap in agent config mode: if a
> user needs to change/update configs for a large cluster, configuring each
> agent individually won't be practical.
> >>>
> >>> In terms of the overhead concern of splitting an event at the agent
> side versus the master side: since a single log event usually has a small
> number of characters, it won't cost much to perform the filtering at the
> agent. On the master side there won't be just a single log stream, so
> splitting there adds more overhead to the master. Because of this, we
> should avoid filtering on the master side.
> >>>
> >>> We are writing the agent in Python, which doesn't consume as many
> resources as a JVM, and that should be an advantage for a smooth run.
> >>>
> >>>
> >>> On Fri, Nov 6, 2015 at 2:43 PM, Inosh Goonewardena <[email protected]>
> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> On Fri, Nov 6, 2015 at 1:48 PM, Sachith Withana <[email protected]>
> wrote:
> >>>>>
> >>>>> Hi Malith,
> >>>>>
> >>>>> In terms of the 1st option,
> >>>>> - the overhead of publishing the whole log line might be an issue,
> you are essentially publishing the whole log file split into lines
> >>>>> - the overhead on the analyzer would be high too, since it has to do
> an extra step of splitting.
> >>>>
> >>>>
> >>>> I too agree with the above points, but if we are going with option 2 we
> will have to carefully analyze the overhead it adds to the solution.
> >>>>
> >>>> First of all, managed config mode requires centralized configuration
> management, where the Log Analysis Server has to manage and process the configs
> of all log publishing agents and push these configs back to the agents.
> Configurations also have to be pushed to the agents whenever they are
> updated. IMO, this process will add some complexity to the overall solution.
> >>>>
> >>>> On the other hand, we have to analyze the overhead of adding the extra
> step of splitting to the log agents as well. Since these log agents run on
> the servers where production systems are running, they should be
> able to function smoothly with a minimum amount of resources.
> >>>>
> >>>>>
> >>>>> On Fri, Nov 6, 2015 at 12:30 PM, Malith Dhanushka <[email protected]>
> wrote:
> >>>>>>
> >>>>>> Hi Anuruddha,
> >>>>>>
> >>>>>> Here the log agent creates the channel between the log source and the
> log analyzer. When it comes to publishing logs, we have two options:
> >>>>>>
> >>>>>> 1. The log agent publishes the raw log line (log event) without
> splitting, and the log analyzer then splits the message and indexes it
> >>>>>> - Here we don't need to keep centralized configurations, as the agent
> simply publishes the raw log line
> >>>>>>
> >>>>>> 2. The log analyzer configures the log agent; the agent splits the
> raw log line and publishes it to the log analyzer, which then does the indexing
> >>>>>> - Here we have to keep centralized configurations, but there is less
> processing on the log analyzer side as it doesn't split raw log lines
> >>>>>>
> >>>>>> I believe option 1 is simpler and cleaner than option 2.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Malith
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Nov 5, 2015 at 5:54 PM, Anuruddha Premalal <
> [email protected]> wrote:
> >>>>>>>
> >>>>>>> Hi All,
> >>>>>>>
> >>>>>>> Below are the suggested ways of distributing configurations
> among log publishing agents. Appreciate your feedback on this.
> >>>>>>>
> >>>>>>> There are two modes in which a log agent can be configured. The user has
> to define this mode beforehand; the default is the "client based config mode".
> There can only be a single mode for an agent at a given time.
> >>>>>>>
> >>>>>>> 1.) client config mode - users will configure the log streams from
> client side. (classical logstash way)
> >>>>>>>
> >>>>>>>    user can define the configurations in agent.conf [1]
> >>>>>>>
> >>>>>>> 2.) managed config mode - the user doesn't have to define stream-specific
> configurations at the agent side; instead, the user defines the
> log-groups the agent should be configured with. managedagent.conf [2]
> >>>>>>>
> >>>>>>>    This mode is useful for a large cluster of nodes, as the user can
> perform all the configurations at a central location.
> >>>>>>>
> >>>>>>> The following sequence diagram shows how the managed config mode behaves.
> >>>>>>>
> >>>>>>> The following are a few possible use cases, explained in Q&A form.
> >>>>>>>
> >>>>>>> * What will happen if the user chooses to switch the configuration
> mode?
> >>>>>>>   - This will make the previous configurations obsolete; the agent will
> always honor the latest config mode.
> >>>>>>>
> >>>>>>> * How can we distinguish agents?
> >>>>>>>   - Based on the agentID defined by the user. Users can make use of the
> instance privateip/publicip to generate unique names; the IP will be picked up
> at run time and substituted into the ID (agentid : "esb-${privateip}").
> The final agentID will have the following format:
> >>>>>>> agentid : "<userdefinedID>-<mastergeneratedID>". The master-generated
> ID is used to ensure the uniqueness of the agentID.
> >>>>>>>
> >>>>>>> * What will happen if the defined agent group is not already
> configured?
> >>>>>>>  - A new log-group will be created on the master side with empty
> configurations. No logs will get published since there are no configurations.
> >>>>>>>
> >>>>>>> * Is it possible to add/delete log-groups to an agent from the
> master side?
> >>>>>>>  - Yes, once the agent is registered with the master, all the
> stream-specific configurations can only be done on the master side.
> >>>>>>>
> >>>>>>> managedagent.conf will be read only once in the agent life-cycle;
> once the agent establishes a proper connection with the master, all the
> configurations will be handled from there. If the user changes
> managedagent.conf and restarts the agent, it won't affect the way
> the agent is already configured.
> >>>>>>>
> >>>>>>> Feel free to raise any other use-cases which I have missed here.
> >>>>>>>
> >>>>>>> [1] agent.conf
> >>>>>>> {
> >>>>>>>     "agentid": "awsinstance-23",
> >>>>>>>     "authid": "sDe334#q2",
> >>>>>>>     "authsecret": "defr34w3qq#@Qd",
> >>>>>>>     "groups": [
> >>>>>>>         {
> >>>>>>>             "name": "httpd",
> >>>>>>>             "config": {
> >>>>>>>                 "input": {
> >>>>>>>                     "file": {
> >>>>>>>                         "path": "/tmp/access_log",
> >>>>>>>                         "start_position": "beginning"
> >>>>>>>                     }
> >>>>>>>                 },
> >>>>>>>                 "filter": {
> >>>>>>>                     "date": {
> >>>>>>>                         "match": [
> >>>>>>>                             "timestamp",
> >>>>>>>                             "dd/MMM/yyyy:HH:mm:ss Z"
> >>>>>>>                         ]
> >>>>>>>                     }
> >>>>>>>                 },
> >>>>>>>                 "output": {
> >>>>>>>                     "loganalyzer": {
> >>>>>>>                         "binhosts": "192.168.12.2",
> >>>>>>>                         "bindport": 9200
> >>>>>>>                     }
> >>>>>>>                 }
> >>>>>>>             }
> >>>>>>>         }
> >>>>>>>     ]
> >>>>>>> }
> >>>>>>>
> >>>>>>> [2] managedagent.conf
> >>>>>>>
> >>>>>>> {
> >>>>>>>   "agentid": "awsinstance-23",
> >>>>>>>   "authid"    : "sDe334#q2",
> >>>>>>>   "authsecret": "defr34w3qq#@Qd",
> >>>>>>>   "groups": ["httpd", "esb" ]
> >>>>>>> }
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> --
> >>>>>>> Anuruddha Premalal
> >>>>>>> Software Eng. | WSO2 Inc.
> >>>>>>> Mobile : +94717213122
> >>>>>>> Web site : www.anuruddha.org
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Malith Dhanushka
> >>>>>> Senior Software Engineer - Data Technologies
> >>>>>> WSO2, Inc. : wso2.com
> >>>>>> Mobile          : +94 716 506 693
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Architecture mailing list
> >>>>>> [email protected]
> >>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Sachith Withana
> >>>>> Software Engineer; WSO2 Inc.; http://wso2.com
> >>>>> E-mail: sachith AT wso2.com
> >>>>> M: +94715518127
> >>>>> Linked-In: https://lk.linkedin.com/in/sachithwithana
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Thanks & Regards,
> >>>>
> >>>> Inosh Goonewardena
> >>>> Associate Technical Lead- WSO2 Inc.
> >>>> Mobile: +94779966317
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Anuruddha Premalal
> >>> Software Eng. | WSO2 Inc.
> >>> Mobile : +94717213122
> >>> Web site : www.anuruddha.org
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Thanks & Regards,
> >>
> >> Inosh Goonewardena
> >> Associate Technical Lead- WSO2 Inc.
> >> Mobile: +94779966317
> >>
> >
> >
> >
> > --
> > Malith Dhanushka
> > Senior Software Engineer - Data Technologies
> > WSO2, Inc. : wso2.com
> > Mobile          : +94 716 506 693
> >
>
>
>
> --
> ============================
> Srinath Perera, Ph.D.
>    http://people.apache.org/~hemapani/
>    http://srinathsview.blogspot.com/
>
>
>
[1] https://github.com/apache/stratos/tree/master/components/org.apache.stratos.python.cartridge.agent/src/main/python/cartridge.agent/cartridge.agent/modules/databridge
[2] http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/CWL_GettingStarted.html
[3] https://cwiki.apache.org/confluence/display/STRATOS/4.1.x+Python+Cartridge+Agent+Guide
-- 
*Anuruddha Premalal*
Software Eng. | WSO2 Inc.
Mobile : +94717213122
Web site : www.anuruddha.org