Nuwan, regarding Q1, we can set it up in such a way that the publisher auto-publishes the events after a timeout or after N events have accumulated.
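
To make the Q1 idea concrete, here is a rough sketch of a collector that flushes either when N events have accumulated or when a timer fires. This is only an illustration: AutoFlushingCollector, maxEvents, flushIntervalMillis and the sink callback are made-up names, not part of the existing data publisher library, and the real change would have to fit our stream definitions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper (not an existing WSO2 class): buffers events and hands
// them to a sink when maxEvents is reached or the flush interval elapses.
final class AutoFlushingCollector<T> implements AutoCloseable {
    private final int maxEvents;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();
    private final ScheduledExecutorService timer;

    AutoFlushingCollector(int maxEvents, long flushIntervalMillis, Consumer<List<T>> sink) {
        this.maxEvents = maxEvents;
        this.sink = sink;
        // Daemon thread so the timer never keeps the JVM alive on its own.
        this.timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "auto-flush");
            t.setDaemon(true);
            return t;
        });
        // Timeout-based flush: covers flows that terminate before N events arrive.
        timer.scheduleAtFixedRate(this::flush, flushIntervalMillis,
                flushIntervalMillis, TimeUnit.MILLISECONDS);
    }

    // Called by product code at each point where an event is generated today.
    synchronized void add(T event) {
        buffer.add(event);
        if (buffer.size() >= maxEvents) {
            flush(); // size-based flush
        }
    }

    // Sends whatever has been collected so far; does nothing if the buffer is empty.
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = new ArrayList<>(buffer);
        buffer.clear();
        sink.accept(batch);
    }

    // Stops the timer and sends any remaining events.
    @Override
    public void close() {
        timer.shutdownNow();
        flush();
    }
}
```

With something like this, the product code only calls add(), and an event still goes out on failure paths because the timer flushes whatever was collected. In this sketch the sink runs while the lock is held, which keeps the code short but would need revisiting if the sink is slow.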

Nuwan, Chathura (regarding Q2), we already do event batching. The above numbers are after event batching. There are two bottlenecks: one is sending events over the network and the other is writing them to the DB. Batching helps a lot in moving events over the network, but does not help much when writing to the DB.

Regarding nulls, one option is to group the events generated by a single message together, which will avoid most nulls. I think our main concern is a single message triggering multiple events.

We also need to write queries to copy the values from the single big event to different streams, and then use those streams to write queries. E.g. we can copy values from the big stream to HTTPStream, and write the HTTP analytics queries against HTTPStream.

--Srinath

On Tue, Mar 29, 2016 at 1:29 PM, Chathura Ekanayake <[email protected]> wrote:

> As we can reduce the number of event transfers with event batching, I
> think the advantage of using a single event stream is to reduce number of
> disk writes at DAS side. But as Nuwan mentioned, dealing with null fields
> can be a problem in writing analytics scripts.
>
> Regards,
> Chathura
>
> On Tue, Mar 29, 2016 at 10:40 AM, Nuwan Dias <[email protected]> wrote:
>
>> Having to publish a single event after collecting all possible data
>> records from the server would be good in terms of scalability aspects of
>> the DAS/Analytics platform. However I see that it introduces new challenges
>> for which we would need solutions.
>>
>> 1. How to guarantee an event is always published to DAS? In the case of
>> API Manager, a request has multiple exit points. Such as auth failures,
>> throttling out, back-end failures, message processing failures, etc. So we
>> need a way to guarantee that an event is always sent out whatever the state.
>>
>> 2. With this model, I'm assuming we only have 1 stream definition. Is
>> this correct? If so would this not make the analytics part complicated? For
>> example, say I have a spark query to summarize the throttled out events
>> from an App, since I can only see a single stream the query would have to
>> deal with null fields and have to deal with the whole bulk of data even if
>> in reality it might only have to deal with a few. The same complexity would
>> arise for the CEP based throttling engine and the new alerts we're building
>> as well.
>>
>> Thanks,
>> NuwanD.
>>
>> On Sat, Mar 26, 2016 at 1:22 AM, Inosh Goonewardena <[email protected]> wrote:
>>
>>> +1. With combined event approach we can avoid sending duplicate
>>> information to some level as well. For example, in API analytics scenario
>>> both request and response streams have consumerKey, context, api_version,
>>> api, resourcePath, etc. properties whose values will be the same for both
>>> request event and corresponding response event. With single event approach
>>> we can avoid such.
>>>
>>> On Fri, Mar 25, 2016 at 1:23 AM, Gihan Anuruddha <[email protected]> wrote:
>>>
>>>> Hi Janaka,
>>>>
>>>> We do have event batching at the moment as well. You can configure that
>>>> in data-agent-config.xml [1]. AFAIU, what we are trying to do here is to
>>>> combine several events into a single event. Apart from that, wouldn't it be a
>>>> good idea to compress the event after we merge and before we send to DAS?
>>>>
>>>> [1] -
>>>> https://github.com/wso2/carbon-analytics-common/blob/master/features/data-bridge/org.wso2.carbon.databridge.agent.server.feature/src/main/resources/conf/data-agent-config.xml
>>>>
>>>> On Fri, Mar 25, 2016 at 11:39 AM, Janaka Ranabahu <[email protected]> wrote:
>>>>
>>>>> Hi Srinath,
>>>>>
>>>>> On Fri, Mar 25, 2016 at 11:26 AM, Srinath Perera <[email protected]> wrote:
>>>>>
>>>>>> As per meeting (Participants: Sanjiva, Shankar, Sumedha, Anjana,
>>>>>> Miyuru, Seshika, Suho, Nirmal, Nuwan)
>>>>>>
>>>>>> Currently we generate several events per message from our products.
>>>>>> For example, when a message hits APIM, following events will be
>>>>>> generated.
>>>>>>
>>>>>> 1. One from HTTP level
>>>>>> 2. 1-2 from authentication and authorization logic
>>>>>> 3. 1 from Throttling
>>>>>> 4. 1 for ESB level stats
>>>>>> 5. 2 for request and response
>>>>>>
>>>>>> If APIM is handling 10K TPS, that means DAS is receiving events in
>>>>>> about 80K TPS. Although data bridge that transfers events are fast,
>>>>>> writing to Disk (via RDBMS or Hbase) is a problem. We can scale Hbase.
>>>>>> However, that will run to a scenario where APIM deployment will need a
>>>>>> very large deployment of DAS.
>>>>>>
>>>>>> We decided to figure out a way to collect all the events and send a
>>>>>> single event to DAS. Basically idea is to extend the data publisher
>>>>>> library such that user can keep adding readings to the library, and it
>>>>>> will collect the readings and send them over as a single event to the
>>>>>> server.
>>>>>>
>>>>>> However, some flows might be terminated in the middle due to failures.
>>>>>> There are two solutions.
>>>>>>
>>>>>> 1. Get the product to call a flush from a finally block
>>>>>> 2. Get the library to auto flush collected reading every few seconds
>>>>>>
>>>>>> I feel #2 is simpler.
>>>>>>
>>>>>> Do we have any concerns about going to this model?
>>>>>>
>>>>>> Suho, Anjana we need to think how to do this with our stream
>>>>>> definition as we force you to define the streams before hand.
>>>>>>
>>>>> Can't we write something similar to JDBC batch processing where the
>>>>> code would only do a publisher.addBatch() or something similar. The data
>>>>> publisher can be configured to flush the batched requests to DAS when they
>>>>> hit a certain threshold.
>>>>>
>>>>> Ex:- We define the batch size as 10 (using code or config xml). Then if
>>>>> we have 5 streams, the publisher would send 5 requests to DAS (for each
>>>>> stream) instead of 50.
>>>>>
>>>>> IMO, this would allow us to keep the existing stream definitions and
>>>>> reduce the number of calls from a server to DAS.
>>>>>
>>>>> WDYT?
>>>>>
>>>>> Thanks,
>>>>> Janaka
>>>>>
>>>>>>
>>>>>> --Srinath
>>>>>>
>>>>>
>>>>> --
>>>>> Janaka Ranabahu
>>>>> Associate Technical Lead, WSO2 Inc.
>>>>> http://wso2.com
>>>>>
>>>>> E-mail: [email protected]  M: +94 718370861
>>>>>
>>>>> Lean . Enterprise . Middleware
>>>>>
>>>>
>>>> --
>>>> W.G. Gihan Anuruddha
>>>> Senior Software Engineer | WSO2, Inc.
>>>> M: +94772272595
>>>>
>>>
>>> --
>>> Thanks & Regards,
>>>
>>> Inosh Goonewardena
>>> Associate Technical Lead- WSO2 Inc.
>>> Mobile: +94779966317
>>>
>>
>> --
>> Nuwan Dias
>>
>> Technical Lead - WSO2, Inc. http://wso2.com
>> email : [email protected]
>> Phone : +94 777 775 729
>>
>

--
============================
Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
Site: http://home.apache.org/~hemapani/
Photos: http://www.flickr.com/photos/hemapani/
Phone: 0772360902
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
