Hi all,

As agreed, the complete User-Agent header will be published separately to
DAS, since it is an important header. Once it is published, a Spark UDF will
extract the necessary information from it. I have written a Spark UDF based
on uap-java [1], the Java implementation of the ua-parser library, that
extracts the user agent family, operating system and device category from
the User-Agent header.
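
To make the discussion concrete, here is a minimal sketch of the shape such a
UDF could take. It is not the actual implementation: the class name, method
names and return values are hypothetical, and a small hard-coded pattern table
stands in for the uap-java parser so that the sketch runs without any
dependencies. It also assumes that DAS exposes the public methods of a
registered UDF class to Spark SQL.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch only. A real UDF would delegate to uap-java instead of
// the tiny pattern tables below, which cover just a few common cases.
class UserAgentUdf {

    // Ordered tables: the first matching pattern wins, so more specific
    // entries are listed before more general ones.
    private static final Map<Pattern, String> FAMILIES = new LinkedHashMap<>();
    private static final Map<Pattern, String> SYSTEMS = new LinkedHashMap<>();
    private static final Map<Pattern, String> DEVICES = new LinkedHashMap<>();

    static {
        FAMILIES.put(Pattern.compile("Edge/"), "Edge");
        FAMILIES.put(Pattern.compile("Firefox/"), "Firefox");
        FAMILIES.put(Pattern.compile("Chrome/"), "Chrome");
        FAMILIES.put(Pattern.compile("Safari/"), "Safari");

        SYSTEMS.put(Pattern.compile("Android"), "Android");
        SYSTEMS.put(Pattern.compile("iPhone|iPad"), "iOS");
        SYSTEMS.put(Pattern.compile("Windows NT"), "Windows");
        SYSTEMS.put(Pattern.compile("Mac OS X"), "Mac OS X");
        SYSTEMS.put(Pattern.compile("Linux"), "Linux");

        DEVICES.put(Pattern.compile("iPad|Tablet"), "Tablet");
        DEVICES.put(Pattern.compile("Mobile|iPhone"), "Mobile");
    }

    // Returns the label of the first pattern found in the header value,
    // "Unknown" for a missing header, or the fallback if nothing matches.
    private static String firstMatch(Map<Pattern, String> table, String ua, String fallback) {
        if (ua == null || ua.isEmpty()) {
            return "Unknown";
        }
        for (Map.Entry<Pattern, String> entry : table.entrySet()) {
            if (entry.getKey().matcher(ua).find()) {
                return entry.getValue();
            }
        }
        return fallback;
    }

    public String getUserAgentFamily(String userAgent) {
        return firstMatch(FAMILIES, userAgent, "Other");
    }

    public String getOperatingSystem(String userAgent) {
        return firstMatch(SYSTEMS, userAgent, "Other");
    }

    // Anything that does not look like a tablet or phone is treated as a desktop.
    public String getDeviceCategory(String userAgent) {
        return firstMatch(DEVICES, userAgent, "Desktop");
    }
}
```

Because each method takes only the raw header string, the publishers would not
need to change if the parsing logic were later fixed or replaced; only the UDF
would be redeployed.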

[1] https://github.com/ua-parser/uap-java

Appreciate your feedback on this matter.

Regards,

On Wed, Mar 9, 2016 at 7:13 PM, Kishanthan Thangarajah <[email protected]>
wrote:

> Yes, we need to minimize such overhead on the data publishing side and do
> this type of processing during summarization, as Janaka suggested.
>
> On Wed, Mar 9, 2016 at 10:58 AM, Manoj Kumara <[email protected]> wrote:
>
>> I too think it's a valid concern. +1 to publishing the complete header as
>> it is.
>>
>> @Lochana,
>> Please note this when you extract the information during the HTTP
>> Monitoring Dashboard task.
>>
>> Regards,
>> Manoj
>>
>> *Manoj Kumara*
>> WSO2 Inc. *| **lean. enterprise. middleware.*
>> *Mobile:* +94 713 448188
>>
>> On Wed, Mar 9, 2016 at 10:42 AM, Nathasha Naranpanawa <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> The user-agent information was extracted at event publishing time
>>> mainly so that analyzing the data with scripts would be easier at
>>> the Dashboard Server.
>>>
>>> We are going to change the current implementation to publish the
>>> whole user-agent string, considering the performance issues and the
>>> other concerns raised.
>>>
>>> Thanks,
>>>
>>> On Tue, Mar 8, 2016 at 10:53 PM, Janaka Ranabahu <[email protected]>
>>> wrote:
>>>
>>>> Hi App Server team,
>>>>
>>>> According to the code in [1], the user-agent string is parsed and some
>>>> of the information is extracted from it at event publishing time.
>>>> Could you please clarify why you did not publish the whole
>>>> user-agent string to DAS and use a UDF to extract the corresponding
>>>> data at data summarization time?
>>>>
>>>> There are several concerns I see in the current approach.
>>>> 1. This adds overhead to the server when processing each request, as
>>>> it has to process the user-agent string to extract this data.
>>>> 2. We are currently limiting the information that can be extracted from
>>>> the user-agent at the data publishing time. If we publish the whole
>>>> user-agent string, then the users have the option of coming up with a new
>>>> analytics script to extract any data from the user-agent.
>>>> 3. If we encounter a bug or limitation in the user-agent processing
>>>> library, or need to upgrade or replace it, then we have to change the
>>>> event publisher code. Having a user-defined function in DAS to extract
>>>> the information from the user-agent would address this scenario, as we
>>>> would not have to make any changes to the data publishers.
>>>> 4. We need to parse the user-agent in every place where we publish
>>>> HTTP data. Based on the current plans, if we are going to integrate the
>>>> HTTP Monitoring dashboard into API Manager, then on the API Manager
>>>> side we would also have to parse the user-agent and extract the data on
>>>> the gateway nodes before publishing the data.
>>>>
>>>> Therefore, I think the better approach would be to publish the whole
>>>> user-agent string and extract the data at DAS data summarization time.
>>>>
>>>> WDYT?
>>>>
>>>> Thanks,
>>>> Janaka
>>>>
>>>> [1]
>>>> https://github.com/wso2/product-as/blob/wso2as-6.0.0/modules/http-statistics-monitoring/src/main/java/org/wso2/appserver/monitoring/utils/EventBuilder.java
>>>>
>>>> --
>>>> *Janaka Ranabahu*
>>>> Associate Technical Lead, WSO2 Inc.
>>>> http://wso2.com
>>>>
>>>>
>>>> E-mail: [email protected]  M: +94 718370861
>>>>
>>>> Lean . Enterprise . Middleware
>>>>
>>>> _______________________________________________
>>>> Architecture mailing list
>>>> [email protected]
>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>
>>>>
>>>
>>>
>>> --
>>> Nathasha Naranpanawa
>>> Software Engineering Intern
>>> WSO2 Inc.
>>>
>>> Email: [email protected]
>>> Mobile: +94775496142
>>> LinkedIn: https://lk.linkedin.com/in/nathashanaranpanawa
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
> --
> *Kishanthan Thangarajah*
> Associate Technical Lead,
> Platform Technologies Team,
> WSO2, Inc.
> lean.enterprise.middleware
>
> Mobile - +94773426635
> Blog - http://kishanthan.wordpress.com
> Twitter - http://twitter.com/kishanthan
>
>
>


-- 
Lochana Ranaweera
Intern Software Engineer
WSO2 Inc: http://wso2.com
Blog: https://lochanaranaweera.wordpress.com/
Mobile: +94716487055
