Hi all,

It was agreed that the User-Agent header will be published separately to DAS, as is, since it is an important header. Once it is published, a Spark UDF will be used to extract the necessary information from it. I have written a Spark UDF based on uap-java [1], the Java implementation of the ua-parser library, to extract the user agent family, operating system and device category from the User-Agent header.
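To make the shape of such a UDF concrete, here is a minimal sketch in Java. It is not the UDF described above: the class name `UserAgentUdf`, its method names and the hand-written matching rules are all assumptions made for illustration, and the rules stand in for uap-java so that the sketch runs without any dependencies. A real implementation would delegate to the library's parser, which covers far more user agents.

```java
// Hypothetical stand-in for a user-agent UDF. The substring rules below are
// illustrative only; they replace the uap-java parser so the sketch has no
// external dependencies.
class UserAgentUdf {

    // Value returned when nothing matches or the header is missing (assumed).
    private static final String UNKNOWN = "Other";

    /** Browser family. Order matters: Chrome user agents also contain "Safari/". */
    public String userAgentFamily(String ua) {
        if (ua == null || ua.isEmpty()) return UNKNOWN;
        if (ua.contains("Firefox/")) return "Firefox";
        if (ua.contains("Edge/")) return "Edge";
        if (ua.contains("Chrome/")) return "Chrome";
        if (ua.contains("Safari/")) return "Safari";
        if (ua.contains("MSIE ") || ua.contains("Trident/")) return "IE";
        return UNKNOWN;
    }

    /** Operating system. Android is checked before Linux, iOS before Mac OS X. */
    public String operatingSystem(String ua) {
        if (ua == null || ua.isEmpty()) return UNKNOWN;
        if (ua.contains("Windows")) return "Windows";
        if (ua.contains("Android")) return "Android";
        if (ua.contains("iPhone") || ua.contains("iPad")) return "iOS";
        if (ua.contains("Mac OS X")) return "Mac OS X";
        if (ua.contains("Linux")) return "Linux";
        return UNKNOWN;
    }

    /** Crude device category: anything not recognised as mobile or tablet is "Desktop". */
    public String deviceCategory(String ua) {
        if (ua == null || ua.isEmpty()) return UNKNOWN;
        boolean androidTablet = ua.contains("Android") && !ua.contains("Mobile");
        if (ua.contains("iPad") || androidTablet) return "Tablet";
        if (ua.contains("Mobile") || ua.contains("iPhone")) return "Mobile";
        return "Desktop";
    }
}
```

Each method takes the raw User-Agent string and returns one derived value, so the publisher only has to send the header as is, and any change to the extraction rules stays on the analytics side.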
[1] https://github.com/ua-parser/uap-java

Appreciate your feedback on this matter.

Regards,

On Wed, Mar 9, 2016 at 7:13 PM, Kishanthan Thangarajah <[email protected]> wrote:
> Yes, we need to minimize such overhead on the data publishing side and
> do this type of processing during summarization, as Janaka suggested.
>
> On Wed, Mar 9, 2016 at 10:58 AM, Manoj Kumara <[email protected]> wrote:
>
>> I too think it's a valid concern. +1 to publish the complete header as
>> it is.
>>
>> @Lochana,
>> Please note this during the HTTP Monitoring Dashboard task when you are
>> extracting the information.
>>
>> Regards,
>> Manoj
>>
>> *Manoj Kumara*
>> WSO2 Inc. | lean. enterprise. middleware.
>> Mobile: +94 713 448188
>>
>> On Wed, Mar 9, 2016 at 10:42 AM, Nathasha Naranpanawa <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> The user-agent information was extracted at event publishing time,
>>> mainly on the basis that it would make analyzing the data with scripts
>>> easier at the Dashboard Server.
>>>
>>> We are going to change the current implementation to publish the whole
>>> user-agent string, considering the performance issues and the other
>>> concerns raised.
>>>
>>> Thanks,
>>>
>>> On Tue, Mar 8, 2016 at 10:53 PM, Janaka Ranabahu <[email protected]>
>>> wrote:
>>>
>>>> Hi App Server team,
>>>>
>>>> According to the code in [1], the user-agent string is parsed and
>>>> some of the information is extracted from the user-agent at event
>>>> publishing time. Could you guys please clarify why you haven't
>>>> published the whole user-agent string to DAS and used a UDF to
>>>> extract the corresponding data at data summarization time?
>>>>
>>>> There are several concerns I see in the current approach.
>>>> 1. This will add additional overhead to the server when processing
>>>> each request, as it has to process the user-agent string to filter
>>>> out these data.
>>>> 2. We are currently limiting the information that can be extracted
>>>> from the user-agent at data publishing time. If we publish the whole
>>>> user-agent string, then the users have the option of coming up with
>>>> a new analytics script to extract any data from the user-agent.
>>>> 3. If we encounter a bug/limitation in the user-agent processing
>>>> library, or upgrade/replace it, then we have to change/update the
>>>> event publisher code. Having a user defined function in DAS to
>>>> extract the information from the user-agent would address this
>>>> scenario, as we do not have to make any changes to the data
>>>> publishers.
>>>> 4. We need to parse the user-agent in all the places where we
>>>> publish the HTTP data. Based on the current plans, if we are going
>>>> to integrate the HTTP Monitoring dashboard into API Manager, then on
>>>> the API Manager side we also have to parse the user-agent and
>>>> extract the data on the gateway nodes before publishing the data.
>>>>
>>>> Therefore I see that the better approach would be to publish the
>>>> whole user-agent string and extract the data at DAS data
>>>> summarization time.
>>>>
>>>> WDYT?
>>>>
>>>> Thanks,
>>>> Janaka
>>>>
>>>> [1]
>>>> https://github.com/wso2/product-as/blob/wso2as-6.0.0/modules/http-statistics-monitoring/src/main/java/org/wso2/appserver/monitoring/utils/EventBuilder.java
>>>>
>>>> --
>>>> *Janaka Ranabahu*
>>>> Associate Technical Lead, WSO2 Inc.
>>>> http://wso2.com
>>>>
>>>> E-mail: [email protected]
>>>> M: +94 718370861
>>>>
>>>> Lean . Enterprise . Middleware
>>>>
>>>> _______________________________________________
>>>> Architecture mailing list
>>>> [email protected]
>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>
>>> --
>>> Nathasha Naranpanawa
>>> Software Engineering Intern
>>> WSO2 Inc.
>>>
>>> Email: [email protected]
>>> Mobile: +94775496142
>>> LinkedIn: https://lk.linkedin.com/in/nathashanaranpanawa
>
> --
> *Kishanthan Thangarajah*
> Associate Technical Lead,
> Platform Technologies Team,
> WSO2, Inc.
> lean.enterprise.middleware
>
> Mobile - +94773426635
> Blog - http://kishanthan.wordpress.com
> Twitter - http://twitter.com/kishanthan

--
Lochana Ranaweera
Intern Software Engineer
WSO2 Inc: http://wso2.com
Blog: https://lochanaranaweera.wordpress.com/
Mobile: +94716487055
