Had a chat with Malith. I have a few questions.

   1. Why are we using Python? If we use a different programming language,
   we need to discuss and approve it because of support cost etc. (not as a
   sentence buried in a long mail). Are we going to rewrite the data bridge
   in Python? That will take a lot of time. IMO we should keep it simple
   and go with Java.
   2. Regarding JMS, I think we do not need it. Log formats do not change
   often. IMO plain point-to-point connections should do.
   3. Our current analytics model splits at the client. I think we should
   start with that. Then the agent first has to send a few hundred raw
   lines, which are shown to the user and used to configure things. After
   that, the actual events are split at the agent.
   4. If the Logstash log configuration files are well done, can we use the
   same formats?
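
To make point 3 concrete, here is a minimal sketch of that flow, not a
proposal for the real agent. The class name LogAgent, the message
dictionaries, and the choice to hold back lines once the sample quota is
used up are all assumptions made for illustration.

```python
import re


class LogAgent:
    """Hypothetical agent: sends raw samples first, splits events once configured."""

    def __init__(self, sample_size=300):
        self.sample_size = sample_size  # the "few hundred raw lines" from point 3
        self.samples_sent = 0
        self.pattern = None  # set after the user configures the stream

    def configure(self, regex):
        """Apply the split pattern the user defined after looking at the samples."""
        self.pattern = re.compile(regex)

    def handle(self, line):
        """Return the message to publish for one log line, or None to hold it back."""
        line = line.rstrip("\n")
        if self.pattern is None:
            if self.samples_sent < self.sample_size:
                self.samples_sent += 1
                return {"type": "raw", "line": line}
            return None  # assumption: stop sending until a config arrives
        match = self.pattern.match(line)
        if match is None:
            return {"type": "unparsed", "line": line}
        return {"type": "event", "fields": match.groupdict()}
```

Until configure() is called the agent only emits raw samples; afterwards
every line is split at the agent, and lines that do not match are flagged
rather than dropped.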

Thanks
Srinath

P.S. The above are opinions only; please shout if you disagree.




On Fri, Nov 6, 2015 at 6:33 PM, Malith Dhanushka <[email protected]> wrote:
>
> Yes, I agree that applying agent configs across a large cluster is
complicated. But centralized config management using a message broker is a
critical decision, as it adds maintenance effort. That decision depends on
how big the cluster is and how frequently the log configs change.
>
> On Fri, Nov 6, 2015 at 3:22 PM, Inosh Goonewardena <[email protected]> wrote:
>>
>> Hi Anuruddha,
>>
>>
>> On Fri, Nov 6, 2015 at 3:06 PM, Anuruddha Premalal <[email protected]>
wrote:
>>>
>>> Hi Inosh,
>>>
>>> Can you be specific about the added complexities of managed
configuration mode? I have explained in the sequence diagram how this will
function. Managed configuration mode is actually a user choice; if the
deployment is quite simple, the user can use the default agent-side
configuration (as in Logstash).
>>
>>
>> As Malith pointed out, my idea was to avoid configuring the log agent
remotely and publishing the config. But yes, in a larger cluster,
configuring each agent individually won't be practical, and managed config
mode is the better approach. If the user has the choice, he/she can select
a mode depending on his/her preference.
>>
>>>
>>>
>>> Managed config mode addresses a major gap in agent config mode: if a
user needs to change or update configs for a large cluster, configuring
each agent individually won't be practical.
>>>
>>> Regarding the overhead of splitting an event on the agent side rather
than the master side: a single log event usually has few characters, so
the filtering won't cost much at the agent. The master, on the other hand,
handles more than a single log stream, so splitting there adds more
overhead. Because of this, we shouldn't do the filtering on the master
side.
>>>
>>> We are writing the agent in Python, which doesn't consume as many
resources as a JVM, and that should help the agent run smoothly.
>>>
>>>
>>> On Fri, Nov 6, 2015 at 2:43 PM, Inosh Goonewardena <[email protected]>
wrote:
>>>>
>>>> Hi,
>>>>
>>>> On Fri, Nov 6, 2015 at 1:48 PM, Sachith Withana <[email protected]>
wrote:
>>>>>
>>>>> Hi Malith,
>>>>>
>>>>> In terms of the 1st option,
>>>>> - the overhead of publishing the whole log line might be an issue;
you are essentially publishing the whole log file, split into lines
>>>>> - the overhead on the analyzer would be high too, since it has to do
the extra step of splitting.
>>>>
>>>>
>>>> I too agree with the above points, but if we are going with option 2
we will have to carefully analyze the overhead it adds to the solution.
>>>>
>>>> First of all, managed config mode requires centralized configuration
management, where the Log Analysis Server has to manage and process the
configs of all log publishing agents and push these configs back to the
agents. Configurations also have to be pushed to the agents whenever they
are updated. IMO, this process will add some complexity to the overall
solution.
>>>>
>>>> On the other hand, we have to analyze the overhead of adding the extra
step of splitting to the log agents as well. Since these log agents run on
the servers where production systems are running, they should be able to
function smoothly with a minimum amount of resources.
>>>>
>>>>>
>>>>> On Fri, Nov 6, 2015 at 12:30 PM, Malith Dhanushka <[email protected]>
wrote:
>>>>>>
>>>>>> Hi Anuruddha,
>>>>>>
>>>>>> Here the log agent creates the channel between the log source and
the log analyzer. When it comes to the log publishing part, we have two
options:
>>>>>>
>>>>>> 1. The log agent publishes the raw log line (log event) without
splitting it, and the log analyzer then splits the message and indexes it.
>>>>>> - Here we don't need to keep centralized configurations, as the agent
simply publishes the raw log line.
>>>>>>
>>>>>> 2. The log analyzer configures the log agent; the agent splits the
raw log line and publishes it to the log analyzer, which then does the
indexing.
>>>>>> - Here we have to keep centralized configurations, but there is less
processing on the log analyzer side as it doesn't split raw log lines.
>>>>>>
>>>>>> I believe option 1 is simpler and cleaner than option 2.
>>>>>>
>>>>>> Thanks,
>>>>>> Malith
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 5, 2015 at 5:54 PM, Anuruddha Premalal <
[email protected]> wrote:
>>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Below are the suggested ways of distributing configurations among
log publishing agents. I would appreciate your feedback on this.
>>>>>>>
>>>>>>> There are two modes in which a log agent can be configured. The user
has to define the mode beforehand; the default is "client config mode".
An agent can only be in a single mode at a given time.
>>>>>>>
>>>>>>> 1.) client config mode - the user configures the log streams on the
client side (the classical Logstash way).
>>>>>>>
>>>>>>>    The user defines the configurations in agent.conf [1].
>>>>>>>
>>>>>>> 2.) managed config mode - the user doesn't have to define
stream-specific configurations on the agent side; instead, the user defines
the log-groups that the agent should be configured for, in
managedagent.conf [2].
>>>>>>>
>>>>>>>    This mode is useful for a large cluster of nodes, as the user can
perform all the configuration at a central location.
>>>>>>>
>>>>>>> The following sequence diagram shows how managed config mode behaves.
>>>>>>>
>>>>>>> The following are a few possible use cases, explained in Q&A form.
>>>>>>>
>>>>>>> * What will happen if the user chooses to switch the configuration
mode?
>>>>>>>   - This makes the previous configurations obsolete; the agent
always honors the latest config mode.
>>>>>>>
>>>>>>> * How can we distinguish agents?
>>>>>>>   - Based on the agentID defined by the user. Users can make use of
the instance private/public IP to generate unique names; the IP is picked up
at run time and substituted into the ID (agentid : "esb-${privateip}").
The final agentID has the following format:
>>>>>>> agentid : "<userdefinedID>-<mastergeneratedID>". The
master-generated ID ensures the uniqueness of the agentID.
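
The ID scheme above could be sketched roughly as follows. This is only an
illustration: the function names, the use of string.Template for the
"${privateip}" placeholder, and the short random suffix standing in for
the master-generated ID are assumptions, not part of the proposal.

```python
import uuid
from string import Template
from typing import Optional


def resolve_agent_id(user_template: str, private_ip: str) -> str:
    """Agent side: substitute the IP picked up at run time into the user-defined ID."""
    return Template(user_template).substitute(privateip=private_ip)


def assign_final_id(user_defined_id: str, master_generated_id: Optional[str] = None) -> str:
    """Master side: append a generated part so the final agentID is unique."""
    if master_generated_id is None:
        # Assumption: any collision-resistant token would do here.
        master_generated_id = uuid.uuid4().hex[:8]
    return f"{user_defined_id}-{master_generated_id}"


# Example: resolve the placeholder, then let the master finish the ID.
user_id = resolve_agent_id("esb-${privateip}", "10.0.0.5")
final_id = assign_final_id(user_id, "a1b2")
```

With the values in the example, user_id is "esb-10.0.0.5" and final_id is
"esb-10.0.0.5-a1b2".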
>>>>>>>
>>>>>>> * What will happen if the defined agent group is not already
configured?
>>>>>>>  - A new log-group will be created on the master side with an empty
configuration. No logs will be published since there are no configurations.
>>>>>>>
>>>>>>> * Is it possible to add/delete log-groups for an agent from the
master side?
>>>>>>>  - Yes. Once the agent is registered with the master, all the
stream-specific configuration can only be done on the master side.
>>>>>>>
>>>>>>> managedagent.conf is read only once in the agent life-cycle. Once
the agent establishes a proper connection with the master, all the
configurations are handled from there. If the user changes
managedagent.conf and restarts the agent, this does not affect the way the
agent is already configured.
>>>>>>>
>>>>>>> Feel free to raise any other use-cases which I have missed here.
>>>>>>>
>>>>>>> [1] agent.conf
>>>>>>> {
>>>>>>>     "agentid": "awsinstance-23",
>>>>>>>     "authid": "sDe334#q2",
>>>>>>>     "authsecret": "defr34w3qq#@Qd",
>>>>>>>     "groups": [
>>>>>>>         {
>>>>>>>             "name": "httpd",
>>>>>>>             "config": {
>>>>>>>                 "input": {
>>>>>>>                     "file": {
>>>>>>>                         "path": "/tmp/access_log",
>>>>>>>                         "start_position": "beginning"
>>>>>>>                     }
>>>>>>>                 },
>>>>>>>                 "filter": {
>>>>>>>                     "date": {
>>>>>>>                         "match": [
>>>>>>>                             "timestamp",
>>>>>>>                             "dd/MMM/yyyy:HH:mm:ss Z"
>>>>>>>                         ]
>>>>>>>                     }
>>>>>>>                 },
>>>>>>>                 "output": {
>>>>>>>                     "loganalyzer": {
>>>>>>>                         "binhosts": "192.168.12.2",
>>>>>>>                         "bindport": 9200
>>>>>>>                     }
>>>>>>>                 }
>>>>>>>             }
>>>>>>>         }
>>>>>>>     ]
>>>>>>> }
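
As an illustration of what the "date" filter in agent.conf would do in a
Python agent, here is a minimal sketch. The pattern
"dd/MMM/yyyy:HH:mm:ss Z" looks like the Joda-style pattern used by
Logstash's date filter; the strptime format below is my assumed
equivalent, and the regex and sample log line are hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumed strptime equivalent of "dd/MMM/yyyy:HH:mm:ss Z" from agent.conf.
# Note: %b (month abbreviation) depends on the process locale.
STRPTIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Hypothetical regex: the bracketed timestamp of an access_log line.
TIMESTAMP_RE = re.compile(r"\[(?P<timestamp>[^\]]+)\]")


def extract_timestamp(line: str) -> datetime:
    """Pull the "timestamp" field out of a log line and parse it."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        raise ValueError("no timestamp field found in line")
    return datetime.strptime(match.group("timestamp"), STRPTIME_FORMAT)


def to_utc(moment: datetime) -> datetime:
    """Normalize a parsed, offset-aware timestamp to UTC before publishing."""
    return moment.astimezone(timezone.utc)


sample = '127.0.0.1 - - [06/Nov/2015:12:30:00 +0530] "GET / HTTP/1.1" 200 512'
parsed = extract_timestamp(sample)
```

Keeping the offset from the log line and converting to UTC at the agent is
one possible design choice; the master could equally do the conversion.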
>>>>>>>
>>>>>>> [2] managedagent.conf
>>>>>>>
>>>>>>> {
>>>>>>>   "agentid": "awsinstance-23",
>>>>>>>   "authid": "sDe334#q2",
>>>>>>>   "authsecret": "defr34w3qq#@Qd",
>>>>>>>   "groups": ["httpd", "esb"]
>>>>>>> }
>>>>>>>
>>>>>>> Regards,
>>>>>>> --
>>>>>>> Anuruddha Premalal
>>>>>>> Software Eng. | WSO2 Inc.
>>>>>>> Mobile : +94717213122
>>>>>>> Web site : www.anuruddha.org
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Malith Dhanushka
>>>>>> Senior Software Engineer - Data Technologies
>>>>>> WSO2, Inc. : wso2.com
>>>>>> Mobile          : +94 716 506 693
>>>>>>
>>>>>> _______________________________________________
>>>>>> Architecture mailing list
>>>>>> [email protected]
>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Sachith Withana
>>>>> Software Engineer; WSO2 Inc.; http://wso2.com
>>>>> E-mail: sachith AT wso2.com
>>>>> M: +94715518127
>>>>> Linked-In: https://lk.linkedin.com/in/sachithwithana
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks & Regards,
>>>>
>>>> Inosh Goonewardena
>>>> Associate Technical Lead- WSO2 Inc.
>>>> Mobile: +94779966317
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
>
>
>



--
============================
Srinath Perera, Ph.D.
   http://people.apache.org/~hemapani/
   http://srinathsview.blogspot.com/