Publishing a single event after collecting all possible data records from the server would be good for the scalability of the DAS/Analytics platform. However, I see that it introduces new challenges for which we would need solutions.

1. How do we guarantee that an event is always published to DAS? In the case of API Manager, a request has multiple exit points, such as auth failures, throttling out, back-end failures, message processing failures, etc. So we need a way to guarantee that an event is always sent out, whatever the state.

2. With this model, I'm assuming we only have one stream definition. Is this correct? If so, would this not make the analytics part complicated? For example, say I have a Spark query to summarize the throttled-out events from an App. Since I can only see a single stream, the query would have to deal with null fields and with the whole bulk of data, even if in reality it might only need a small part of it. The same complexity would arise for the CEP-based throttling engine and the new alerts we're building as well.

Thanks,
NuwanD.

On Sat, Mar 26, 2016 at 1:22 AM, Inosh Goonewardena <[email protected]> wrote:

> +1. With the combined event approach we can also avoid sending duplicate
> information to some extent. For example, in the API analytics scenario
> both the request and response streams have consumerKey, context, api_version,
> api, resourcePath, etc. properties, whose values will be the same for a
> request event and the corresponding response event. With the single event
> approach we can avoid such duplication.
>
> On Fri, Mar 25, 2016 at 1:23 AM, Gihan Anuruddha <[email protected]> wrote:
>
>> Hi Janaka,
>>
>> We do have event batching at the moment as well. You can configure that
>> in data-agent-config.xml [1]. AFAIU, what we are trying to do here is to
>> combine several events into a single event. Apart from that, wouldn't it be a
>> good idea to compress the event after we merge and before we send to DAS?
>>
>> [1] -
>> https://github.com/wso2/carbon-analytics-common/blob/master/features/data-bridge/org.wso2.carbon.databridge.agent.server.feature/src/main/resources/conf/data-agent-config.xml
>>
>> On Fri, Mar 25, 2016 at 11:39 AM, Janaka Ranabahu <[email protected]>
>> wrote:
>>
>>> Hi Srinath,
>>>
>>> On Fri, Mar 25, 2016 at 11:26 AM, Srinath Perera <[email protected]>
>>> wrote:
>>>
>>>> As per meeting (Participants: Sanjiva, Shankar, Sumedha, Anjana,
>>>> Miyuru, Seshika, Suho, Nirmal, Nuwan)
>>>>
>>>> Currently we generate several events per message from our products. For
>>>> example, when a message hits APIM, the following events will be generated.
>>>>
>>>>    1. One from HTTP level
>>>>    2. 1-2 from authentication and authorization logic
>>>>    3. 1 from Throttling
>>>>    4. 1 for ESB level stats
>>>>    5. 2 for request and response
>>>>
>>>> If APIM is handling 10K TPS, that means DAS is receiving events at
>>>> about 80K TPS. Although the data bridge that transfers events is fast, writing
>>>> to disk (via RDBMS or HBase) is a problem. We can scale HBase. However,
>>>> that will lead to a scenario where an APIM deployment will need a very large
>>>> deployment of DAS.
>>>>
>>>> We decided to figure out a way to collect all the events and send a
>>>> single event to DAS. Basically, the idea is to extend the data publisher library
>>>> such that the user can keep adding readings to the library, and it will collect
>>>> the readings and send them over as a single event to the server.
>>>>
>>>> However, some flows might be terminated in the middle due to failures.
>>>> There are two solutions.
>>>>
>>>>    1. Get the product to call a flush from a finally block
>>>>    2. Get the library to auto-flush collected readings every few seconds
>>>>
>>>> I feel #2 is simpler.
>>>>
>>>> Do we have any concerns about going to this model?
>>>>
>>>> Suho, Anjana, we need to think about how to do this with our stream definitions,
>>>> as we force you to define the streams beforehand.
>>>>
>>> Can't we write something similar to JDBC batch processing, where the
>>> code would only do a publisher.addBatch() or something similar? The data
>>> publisher can be configured to flush the batched requests to DAS when they
>>> hit a certain threshold.
>>>
>>> Ex: We define the batch size as 10 (using code or config XML). Then, if
>>> we have 5 streams, the publisher would send 5 requests to DAS (one for each
>>> stream) instead of 50.
>>>
>>> IMO, this would allow us to keep the existing stream definitions and
>>> reduce the number of calls from a server to DAS.
>>>
>>> WDYT?
>>>
>>> Thanks,
>>> Janaka
>>>
>>>>
>>>> --Srinath
>>>>
>>>> --
>>>> ============================
>>>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>>>> Site: http://home.apache.org/~hemapani/
>>>> Photos: http://www.flickr.com/photos/hemapani/
>>>> Phone: 0772360902
>>>>
>>>> _______________________________________________
>>>> Architecture mailing list
>>>> [email protected]
>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>
>>>
>>> --
>>> *Janaka Ranabahu*
>>> Associate Technical Lead, WSO2 Inc.
>>> http://wso2.com
>>>
>>> E-mail: [email protected]  M: +94 718370861
>>>
>>> Lean . Enterprise . Middleware
>>>
>>
>> --
>> W.G. Gihan Anuruddha
>> Senior Software Engineer | WSO2, Inc.
>> M: +94772272595
>>
>
> --
> Thanks & Regards,
>
> Inosh Goonewardena
> Associate Technical Lead- WSO2 Inc.
> Mobile: +94779966317
>

--
Nuwan Dias

Technical Lead - WSO2, Inc. http://wso2.com
email : [email protected]
Phone : +94 777 775 729
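
P.S. To make the two concerns concrete, here is a rough sketch of the "flush from a finally block" option combined with a single stream definition. It is written in Python only for brevity and is not the data publisher API: CombinedEventCollector, COMBINED_FIELDS, handle_request and all field names other than consumerKey, context and api are hypothetical names invented for this illustration. The context manager plays the role of the finally block, so the event goes out on every exit path, and the throttled-out event shows the null fields a summarization query would have to handle.

```python
from typing import Any, Callable, Dict, List

# Hypothetical field list for the single combined stream (an assumption,
# not the real APIM stream definition).
COMBINED_FIELDS = [
    "consumerKey", "context", "api",
    "authStatus", "throttledOut", "backendLatencyMs", "responseCode",
]


class CombinedEventCollector:
    """Collects readings for one request and publishes them as one event."""

    def __init__(self, publish: Callable[[Dict[str, Any]], None]) -> None:
        self._publish = publish
        self._readings: Dict[str, Any] = {}
        self._flushed = False

    def add(self, **readings: Any) -> None:
        # Each stage of the request flow adds whatever it knows.
        self._readings.update(readings)

    def flush(self) -> None:
        # Publish at most once; fields never set by the flow become None.
        if self._flushed:
            return
        self._flushed = True
        self._publish({name: self._readings.get(name) for name in COMBINED_FIELDS})

    def __enter__(self) -> "CombinedEventCollector":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Equivalent of calling flush() from a finally block.
        self.flush()
        return False  # never swallow the original failure


class ThrottledOut(Exception):
    """Hypothetical early exit point of the request flow."""


def handle_request(collector: CombinedEventCollector, throttled: bool) -> None:
    collector.add(consumerKey="app-1", context="/pizza", api="PizzaAPI")
    collector.add(authStatus="OK")
    if throttled:
        collector.add(throttledOut=True)
        raise ThrottledOut()  # the flow ends before the back end is called
    collector.add(throttledOut=False, backendLatencyMs=42, responseCode=200)


def run(throttled: bool, sink: List[Dict[str, Any]]) -> None:
    try:
        with CombinedEventCollector(sink.append) as collector:
            handle_request(collector, throttled)
    except ThrottledOut:
        pass  # the request failed, but its event was still published


published: List[Dict[str, Any]] = []
run(throttled=False, sink=published)
run(throttled=True, sink=published)

# A throttling summary has to scan every combined event and skip over
# fields that are None for the requests it cares about.
throttled_events = [e for e in published if e["throttledOut"]]
```

In this sketch the throttled-out request still produces exactly one event, but its backendLatencyMs and responseCode are None, which is the kind of sparse record I'm worried the Spark and CEP queries would have to deal with.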
