Had a chat with Malith. I have a few questions.

1. Why are we using Python? If we use a different programming language, we need to discuss it and approve it because of support cost etc. (not as a sentence buried in a long mail). Are we going to rewrite the data bridge in Python? That will take a lot of time. IMO we should keep it simple and go with Java.

2. Regarding JMS, I think we do not need it. Log formats do not change often. IMO just point-to-point connections should do.

3. Our current analytics model is splitting at the client. I think we should start with that. Then the agent first has to send a few hundred raw lines, which are shown to the user and used to configure things. After that, the actual events are split at the agent.

4. If the logstash configuration files are well done, can we use the same formats?

Thanks
Srinath

p.s. The above are opinions only; please shout if you disagree.

On Fri, Nov 6, 2015 at 6:33 PM, Malith Dhanushka <[email protected]> wrote:
>
> Yes, I agree that applying agent configs in large clusters is complicated. But centralized config management using a message broker is a critical decision, as it adds maintenance effort. That decision depends on how big the cluster is and how frequently the log configs change.
>
> On Fri, Nov 6, 2015 at 3:22 PM, Inosh Goonewardena <[email protected]> wrote:
>>
>> Hi Anuruddha,
>>
>>
>> On Fri, Nov 6, 2015 at 3:06 PM, Anuruddha Premalal <[email protected]> wrote:
>>>
>>> Hi Inosh,
>>>
>>> Can you be specific about the added complexities of managed configuration mode? I have explained in the sequence diagram how this will function. Managed configuration mode is actually a user choice; if the deployment is quite simple, the user can use the default agent-side configurations (as in logstash).
>>
>>
>> As Malith pointed out, my idea was to avoid configuring the log agent remotely and publishing the config. But yes, in a larger cluster, configuring each of the agents won't be practical and managed config mode is the better approach.
>> If the user has the choice, he/she can select depending on his/her preference.
>>
>>>
>>>
>>> Managed config mode addresses a major gap in agent config mode: if a user needs to change or update configs for a large cluster, configuring each agent individually won't be practical.
>>>
>>> On the overhead concern of splitting an event at the agent side rather than the master side: since a single log event is usually short, it won't cost much to perform the filtering at the agent. On the master side there will be more than one log stream, so splitting there obviously adds more overhead to the master. Because of this we should never do filtering on the master side.
>>>
>>> We are writing the agent in Python, which doesn't consume as many resources as a JVM, and that will definitely be an advantage for a smooth run.
>>>
>>>
>>> On Fri, Nov 6, 2015 at 2:43 PM, Inosh Goonewardena <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On Fri, Nov 6, 2015 at 1:48 PM, Sachith Withana <[email protected]> wrote:
>>>>>
>>>>> Hi Malith,
>>>>>
>>>>> In terms of the 1st option,
>>>>> - the overhead of publishing the whole log line might be an issue; you are essentially publishing the whole log file split into lines
>>>>> - the overhead on the analyzer would be high too, since it has to do an extra step of splitting.
>>>>
>>>>
>>>> I too agree with the above points, but if we are going with option 2 we will have to carefully analyze the overhead it adds to the solution.
>>>>
>>>> First of all, managed config mode requires centralized configuration management, where the Log Analysis Server has to manage and process the configs of all log publishing agents and push these configs back to the agents. Configurations also have to be pushed to the agents whenever they are updated. IMO, this process will add some complexity to the overall solution.
>>>>
>>>> On the other hand, we have to analyze the overhead of adding the extra step of splitting to the log agents as well. Since these log agents run on the servers where production systems are running, they should be able to function smoothly with a minimum amount of resources.
>>>>
>>>>>
>>>>> On Fri, Nov 6, 2015 at 12:30 PM, Malith Dhanushka <[email protected]> wrote:
>>>>>>
>>>>>> Hi Anuruddha,
>>>>>>
>>>>>> Here the log agent creates the channel between the log source and the log analyzer. When it comes to publishing logs, we have two options:
>>>>>>
>>>>>> 1. The log agent publishes the raw log line (log event) without splitting, and then the log analyzer splits the message and indexes it
>>>>>> - Here we don't need to keep centralized configurations, as the agent simply publishes the raw log line
>>>>>>
>>>>>> 2. The log analyzer configures the log agent; the agent splits the raw log line and publishes it to the log analyzer, and then the analyzer does the indexing
>>>>>> - Here we have to keep centralized configurations, but there is less processing on the log analyzer side as it doesn't split raw log lines
>>>>>>
>>>>>> I believe option 1 is simpler and cleaner than option 2.
>>>>>>
>>>>>> Thanks,
>>>>>> Malith
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 5, 2015 at 5:54 PM, Anuruddha Premalal < [email protected]> wrote:
>>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Below are the suggested ways of distributing configurations among log publishing agents. Appreciate your feedback on this.
>>>>>>>
>>>>>>> There are two modes in which a log agent can be configured. The user has to define this mode beforehand; the default is the "client based config mode". There can only be a single mode for an agent at a given time.
>>>>>>>
>>>>>>> 1.) client config mode - users will configure the log streams from the client side (classical logstash way).
>>>>>>>
>>>>>>> The user can define the configurations in agent.conf [1]
>>>>>>>
>>>>>>> 2.) managed config mode - the user doesn't have to define stream-specific configurations at the agent side; instead the user should define the log-groups the agent needs to be configured for, in managedagent.conf [2]
>>>>>>>
>>>>>>> This mode is useful for a large cluster of nodes, as the user can perform all the configurations at a central location.
>>>>>>>
>>>>>>> The following sequence diagram shows how the managed config mode behaves.
>>>>>>>
>>>>>>> The following are a few possible use-cases, explained in a Q&A manner.
>>>>>>>
>>>>>>> * What will happen if the user chooses to switch the configuration mode?
>>>>>>> - This will make the previous configurations obsolete; the agent will always honor the latest config mode.
>>>>>>>
>>>>>>> * How can we distinguish agents?
>>>>>>> - Based on the agentID defined by the users. Users can make use of the instance privateip/publicip to generate unique names; the IP will be picked up at run time and substituted into the id accordingly (agentid : "esb-${privateip}"). The final agentID will have the following format.
>>>>>>> agentid : "<userdefinedID>-<mastergeneratedID>"; the master-generated ID is used to ensure the uniqueness of the agentID.
>>>>>>>
>>>>>>> * What will happen if the defined agent group is not already configured?
>>>>>>> - A new log-group will be created on the master side with empty configurations. No logs will get published since there are no configurations.
>>>>>>>
>>>>>>> * Is it possible to add/delete log-groups for an agent from the master side?
>>>>>>> - Yes. Once the agent is registered with the master, all the stream-specific configurations can only be done at the master side.
>>>>>>>
>>>>>>> managedagent.conf will be read only once in the agent life-cycle; once the agent establishes a proper connection with the master, all the configurations will be handled from there. If the user changes managedagent.conf and restarts, it won't affect the way the agent is already configured.
>>>>>>>
>>>>>>> Feel free to raise any other use-cases which I have missed here.
>>>>>>>
>>>>>>> [1] agent.conf
>>>>>>> {
>>>>>>>   "agentid": "awsinstance-23",
>>>>>>>   "authid": "sDe334#q2",
>>>>>>>   "authsecret": "defr34w3qq#@Qd",
>>>>>>>   "groups": [
>>>>>>>     {
>>>>>>>       "name": "httpd",
>>>>>>>       "config": {
>>>>>>>         "input": {
>>>>>>>           "file": {
>>>>>>>             "path": "/tmp/access_log",
>>>>>>>             "start_position": "beginning"
>>>>>>>           }
>>>>>>>         },
>>>>>>>         "filter": {
>>>>>>>           "date": {
>>>>>>>             "match": [
>>>>>>>               "timestamp",
>>>>>>>               "dd/MMM/yyyy:HH:mm:ss Z"
>>>>>>>             ]
>>>>>>>           }
>>>>>>>         },
>>>>>>>         "output": {
>>>>>>>           "loganalyzer": {
>>>>>>>             "binhosts": "192.168.12.2",
>>>>>>>             "bindport": 9200
>>>>>>>           }
>>>>>>>         }
>>>>>>>       }
>>>>>>>     }
>>>>>>>   ]
>>>>>>> }
>>>>>>>
>>>>>>> [2] managedagent.conf
>>>>>>>
>>>>>>> {
>>>>>>>   "agentid": "awsinstance-23",
>>>>>>>   "authid": "sDe334#q2",
>>>>>>>   "authsecret": "defr34w3qq#@Qd",
>>>>>>>   "groups": ["httpd", "esb"]
>>>>>>> }
>>>>>>>
>>>>>>> Regards,
>>>>>>> --
>>>>>>> Anuruddha Premalal
>>>>>>> Software Eng. | WSO2 Inc.
>>>>>>> Mobile : +94717213122
>>>>>>> Web site : www.anuruddha.org
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Malith Dhanushka
>>>>>> Senior Software Engineer - Data Technologies
>>>>>> WSO2, Inc. : wso2.com
>>>>>> Mobile : +94 716 506 693
>>>>>>
>>>>>> _______________________________________________
>>>>>> Architecture mailing list
>>>>>> [email protected]
>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Sachith Withana
>>>>> Software Engineer; WSO2 Inc.; http://wso2.com
>>>>> E-mail: sachith AT wso2.com
>>>>> M: +94715518127
>>>>> Linked-In: https://lk.linkedin.com/in/sachithwithana
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks & Regards,
>>>>
>>>> Inosh Goonewardena
>>>> Associate Technical Lead- WSO2 Inc.
>>>> Mobile: +94779966317
>>>>

--
============================
Srinath Perera, Ph.D.
http://people.apache.org/~hemapani/
http://srinathsview.blogspot.com/
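
To make point 3 above concrete (the agent first sends a few hundred raw lines, then splits the actual events at the agent), here is a minimal Python sketch. Python is used only because the thread says the agent is being written in it; the same flow would apply in Java. Everything named here is an assumption for illustration and not part of the proposal: the regular expression for an access-log line, the function names split_event and to_messages, the message dictionaries, and the value 200 for "a few hundred". The timestamp format is the Python strptime equivalent of the "dd/MMM/yyyy:HH:mm:ss Z" pattern from agent.conf [1].

```python
import re
from datetime import datetime

# Hypothetical pattern for a common-log-format access line (not from the thread).
ACCESS_LOG = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

# strptime equivalent of the "dd/MMM/yyyy:HH:mm:ss Z" match pattern in agent.conf.
TIMESTAMP_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Assumed value for "a few hundred" raw lines sent before splitting starts.
RAW_SAMPLE_LINES = 200


def split_event(line):
    """Split one raw log line into named fields; return None if it does not match."""
    match = ACCESS_LOG.match(line)
    if match is None:
        return None
    event = match.groupdict()
    parsed = datetime.strptime(event["timestamp"], TIMESTAMP_FORMAT)
    event["timestamp"] = parsed.isoformat()
    return event


def to_messages(lines, raw_sample_lines=RAW_SAMPLE_LINES):
    """Yield raw messages for the first sample of lines, then split events."""
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        event = None if index < raw_sample_lines else split_event(line)
        if event is None:
            # Either still in the raw sample, or the line did not match the pattern.
            yield {"type": "raw", "line": line}
        else:
            yield {"type": "event", "fields": event}
```

In this sketch a line that does not match the pattern is still published as a raw message rather than dropped, so the user can see it and adjust the configuration. Whether unmatched lines should be published or discarded is a design choice the thread does not settle.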
