Hi Sanjiva,

That might work, but we need to try it out with a workload. (CEP joins are
a bit slower than other operations, so we have to see.)

DAS, when writing, treats data as name-value pairs; it only tries to
interpret the data when processing it. So the storage model should be OK.

My belief is that network load is not the bottleneck (again, we have to
verify).

--Srinath


On Wed, Mar 30, 2016 at 8:19 AM, Sanjiva Weerawarana <[email protected]>
wrote:

> Srinath, what if we come up with a way on the event receiver side to
> aggregate a set of events into one based on some correlation field? We can
> do this in an embedded Siddhi in the receiver ... basically keep a 5-second
> window to aggregate all events that carry the same correlation field into
> one, combine them, and then send them forward for storage + processing.
> Sometimes we will miss, but most of the time we won't. The storage model
> needs to be sufficiently flexible, but HBase should be fine (?). The
> real-time feed must not have this feature of course.
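The 5-second correlation window described above could be sketched roughly as
follows. This is an illustrative Python sketch of the aggregation logic only,
not actual Siddhi or DAS APIs; the `correlationId` field name and the
merge-non-null rule are assumptions:

```python
import time


class CorrelationAggregator:
    """Collects events that share a correlation field and emits one
    merged event per correlation id once the window expires.
    Illustrative sketch of the 5-second window idea; field names
    and the merge policy are hypothetical."""

    def __init__(self, window_seconds=5.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.pending = {}  # correlation_id -> (first_seen, merged_event)

    def add(self, event):
        cid = event["correlationId"]
        if cid not in self.pending:
            self.pending[cid] = (self.clock(), dict(event))
        else:
            _, merged = self.pending[cid]
            # Later readings fill in fields the earlier ones left empty.
            merged.update({k: v for k, v in event.items() if v is not None})

    def drain_expired(self):
        """Return merged events whose window has elapsed; the caller
        forwards each one to DAS as a single combined event."""
        now = self.clock()
        expired = [cid for cid, (t0, _) in self.pending.items()
                   if now - t0 >= self.window]
        return [self.pending.pop(cid)[1] for cid in expired]
```

A flow that dies mid-way still gets its partial readings emitted when the
window closes, which is what keeps the real-time feed unaffected.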
>
> With multiple servers firing events related to one interaction, it's not
> possible to do this from the source ends without distributed caching, and
> that's not a good model.
>
> It does not address the network load issue of course.
>
> Sanjiva.
>
> On Tue, Mar 29, 2016 at 2:49 PM, Srinath Perera <[email protected]> wrote:
>
>> Nuwan, regarding Q1, we can set it up so that the publisher auto-publishes
>> the events after a timeout or after N events are accumulated.
>>
>> Nuwan, Chathura ( regarding Q2),
>>
>> We already do event batching. The above numbers are after event batching.
>> There are two bottlenecks: one is sending events over the network, and the
>> other is writing them to the DB. Batching helps a lot in moving events over
>> the network, but does not help much when writing to the DB.
>>
>> Regarding nulls, one option is to group the events generated by a single
>> message together, which will avoid most nulls. I think our main concern is
>> a single message triggering multiple events. We also need to write queries
>> to copy the values from the single big events to different streams and use
>> those streams to write queries.
>>
>> e.g., we can copy values from the big stream to HTTPStream, and use
>> HTTPStream to write the HTTP analytics queries.
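The stream-copying step could be sketched as below. This is illustrative
Python, not real DAS stream definitions or Spark queries; the stream names
and field groupings are hypothetical:

```python
# Hypothetical per-domain stream layouts; real definitions would come
# from the DAS stream configuration.
STREAM_FIELDS = {
    "HTTPStream": ["correlationId", "http_status", "http_method"],
    "ThrottleStream": ["correlationId", "throttled", "policy"],
}


def fan_out(big_event):
    """Copy values from one combined event into per-domain streams,
    dropping fields that are absent or null, so downstream analytics
    queries never see unrelated nulls."""
    out = {}
    for stream, fields in STREAM_FIELDS.items():
        row = {f: big_event[f] for f in fields
               if f in big_event and big_event[f] is not None}
        if len(row) > 1:  # more than just the correlation id
            out[stream] = row
    return out
```

Queries over HTTPStream then see only the HTTP fields, regardless of how
many other readings the combined event carried.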
>>
>> --Srinath
>>
>>
>>
>>
>> On Tue, Mar 29, 2016 at 1:29 PM, Chathura Ekanayake <[email protected]>
>> wrote:
>>
>>> As we can reduce the number of event transfers with event batching, I
>>> think the advantage of using a single event stream is to reduce the number
>>> of disk writes on the DAS side. But as Nuwan mentioned, dealing with null
>>> fields can be a problem when writing analytics scripts.
>>>
>>> Regards,
>>> Chathura
>>>
>>> On Tue, Mar 29, 2016 at 10:40 AM, Nuwan Dias <[email protected]> wrote:
>>>
>>>> Having to publish a single event after collecting all possible data
>>>> records from the server would be good in terms of the scalability of the
>>>> DAS/Analytics platform. However, I see that it introduces new challenges
>>>> for which we would need solutions.
>>>>
>>>> 1. How do we guarantee an event is always published to DAS? In the case
>>>> of API Manager, a request has multiple exit points, such as auth
>>>> failures, throttling out, back-end failures, message processing
>>>> failures, etc. So we need a way to guarantee that an event is always
>>>> sent out whatever the state.
>>>>
>>>> 2. With this model, I'm assuming we only have one stream definition. Is
>>>> this correct? If so, wouldn't this make the analytics part complicated?
>>>> For example, say I have a Spark query to summarize the throttled-out
>>>> events from an app. Since I can only see a single stream, the query
>>>> would have to deal with null fields and with the whole bulk of data,
>>>> even if in reality it might only need a few fields. The same complexity
>>>> would arise for the CEP-based throttling engine and the new alerts we're
>>>> building as well.
>>>>
>>>> Thanks,
>>>> NuwanD.
>>>>
>>>> On Sat, Mar 26, 2016 at 1:22 AM, Inosh Goonewardena <[email protected]>
>>>> wrote:
>>>>
>>>>> +1. With the combined event approach we can also avoid sending
>>>>> duplicate information to some extent. For example, in the API analytics
>>>>> scenario, both the request and response streams have consumerKey,
>>>>> context, api_version, api, resourcePath, etc., whose values are the
>>>>> same for a request event and its corresponding response event. With a
>>>>> single event approach we can avoid such duplication.
>>>>>
>>>>> On Fri, Mar 25, 2016 at 1:23 AM, Gihan Anuruddha <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Janaka,
>>>>>>
>>>>>> We do have event batching at the moment as well. You can configure
>>>>>> that in data-agent-config.xml [1]. AFAIU, what we are trying to do
>>>>>> here is to combine several events into a single event. Apart from
>>>>>> that, wouldn't it be a good idea to compress the event after we merge
>>>>>> it and before we send it to DAS?
>>>>>>
>>>>>> [1] -
>>>>>> https://github.com/wso2/carbon-analytics-common/blob/master/features/data-bridge/org.wso2.carbon.databridge.agent.server.feature/src/main/resources/conf/data-agent-config.xml
>>>>>>
>>>>>> On Fri, Mar 25, 2016 at 11:39 AM, Janaka Ranabahu <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Srinath,
>>>>>>>
>>>>>>> On Fri, Mar 25, 2016 at 11:26 AM, Srinath Perera <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> As per the meeting (participants: Sanjiva, Shankar, Sumedha, Anjana,
>>>>>>>> Miyuru, Seshika, Suho, Nirmal, Nuwan):
>>>>>>>>
>>>>>>>> Currently we generate several events per message from our products.
>>>>>>>> For example, when a message hits APIM, the following events will be
>>>>>>>> generated.
>>>>>>>>
>>>>>>>>
>>>>>>>>    1. One from HTTP level
>>>>>>>>    2. 1-2 from authentication and authorization logic
>>>>>>>>    3. 1 from Throttling
>>>>>>>>    4. 1 for ESB level stats
>>>>>>>>    5. 2 for request and response
>>>>>>>>
>>>>>>>> If APIM is handling 10K TPS, that means DAS is receiving events at
>>>>>>>> about 80K TPS. Although the data bridge that transfers events is
>>>>>>>> fast, writing to disk (via RDBMS or HBase) is a problem. We can
>>>>>>>> scale HBase; however, that will run into a scenario where an APIM
>>>>>>>> deployment will need a very large deployment of DAS.
>>>>>>>>
>>>>>>>> We decided to figure out a way to collect all the events and send a
>>>>>>>> single event to DAS. The basic idea is to extend the data publisher
>>>>>>>> library so that the user can keep adding readings to it, and it will
>>>>>>>> collect the readings and send them over as a single event to the
>>>>>>>> server.
>>>>>>>>
>>>>>>>> However, some flows might terminate in the middle due to failures.
>>>>>>>> There are two solutions.
>>>>>>>>
>>>>>>>>
>>>>>>>>    1. Get the product to call a flush from a finally block
>>>>>>>>    2. Get the library to auto flush collected reading every few
>>>>>>>>    seconds
>>>>>>>>
>>>>>>>> I feel #2 is simpler.
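Option #2 (auto-flush every few seconds) could be sketched roughly as
below. This is an illustrative Python sketch, not the actual data publisher
library; `send`, `add_reading`, and `flush` are hypothetical names:

```python
import threading


class CollectingPublisher:
    """Accumulates per-message readings and sends them as one event.
    A background timer flushes whatever has been collected after a few
    seconds, so flows that terminate mid-way still get their partial
    event published. `send` stands in for the real publish call."""

    def __init__(self, send, flush_interval=5.0):
        self._send = send
        self._interval = flush_interval
        self._lock = threading.Lock()
        self._readings = {}  # message_id -> {key: value}
        self._timer = None

    def add_reading(self, message_id, key, value):
        with self._lock:
            self._readings.setdefault(message_id, {})[key] = value
            if self._timer is None:
                # Arm the auto-flush timer on the first pending reading.
                self._timer = threading.Timer(self._interval, self.flush)
                self._timer.daemon = True
                self._timer.start()

    def flush(self):
        """Send everything collected so far, one event per message.
        Can also be called explicitly, e.g. from a finally block (#1)."""
        with self._lock:
            batch, self._readings = self._readings, {}
            self._timer = None
        for message_id, readings in batch.items():
            self._send(message_id, readings)
```

Exposing `flush()` publicly keeps option #1 available as well, at no extra
cost, for products that do have a reliable exit point.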
>>>>>>>>
>>>>>>>> Do we have any concerns about going to this model?
>>>>>>>>
>>>>>>>> Suho, Anjana, we need to think about how to do this with our stream
>>>>>>>> definitions, as we force you to define the streams beforehand.
>>>>>>>>
>>>>>>> ​Can't we write something similar to JDBC batch processing, where
>>>>>>> the code would only do a publisher.addBatch() or something similar?
>>>>>>> The data publisher can be configured to flush the batched requests to
>>>>>>> DAS when they hit a certain threshold.
>>>>>>>
>>>>>>> E.g., we define the batch size as 10 (using code or a config XML).
>>>>>>> Then if we have 5 streams, the publisher would send 5 requests to DAS
>>>>>>> (one for each stream) instead of 50.
>>>>>>>
>>>>>>> IMO, this would allow us to keep the existing stream definitions and
>>>>>>> reduce the number of calls from a server to DAS.
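The addBatch() idea above could be sketched as follows. Again an
illustrative Python sketch; `BatchingPublisher` and its method names are
hypothetical, not the real data-publisher API:

```python
class BatchingPublisher:
    """Buffers events per stream, as JDBC batch processing does, and
    sends one request per stream once the batch-size threshold is
    reached. Existing stream definitions stay untouched; only the
    number of network calls drops. `send_batch` stands in for the
    real per-stream publish call."""

    def __init__(self, send_batch, batch_size=10):
        self._send_batch = send_batch
        self._batch_size = batch_size
        self._batches = {}  # stream_id -> [events]

    def add_batch(self, stream_id, event):
        batch = self._batches.setdefault(stream_id, [])
        batch.append(event)
        if len(batch) >= self._batch_size:
            # Threshold hit: one request carries the whole batch.
            self._send_batch(stream_id, self._batches.pop(stream_id))
```

With a batch size of 10 and 5 streams, 50 events become 5 requests, as in
the example above; unlike the combined-event model, though, it does not
reduce the number of rows written on the DAS side.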
>>>>>>>
>>>>>>> WDYT?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Janaka​
>>>>>>>
>>>>>>>>
>>>>>>>> --Srinath
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ============================
>>>>>>>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>>>>>>>> Site: http://home.apache.org/~hemapani/
>>>>>>>> Photos: http://www.flickr.com/photos/hemapani/
>>>>>>>> Phone: 0772360902
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Architecture mailing list
>>>>>>>> [email protected]
>>>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> *Janaka Ranabahu*
>>>>>>> Associate Technical Lead, WSO2 Inc.
>>>>>>> http://wso2.com
>>>>>>>
>>>>>>>
>>>>>>> E-mail: [email protected]  M: +94 718370861
>>>>>>>
>>>>>>> Lean . Enterprise . Middleware
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> W.G. Gihan Anuruddha
>>>>>> Senior Software Engineer | WSO2, Inc.
>>>>>> M: +94772272595
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks & Regards,
>>>>>
>>>>> Inosh Goonewardena
>>>>> Associate Technical Lead- WSO2 Inc.
>>>>> Mobile: +94779966317
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Nuwan Dias
>>>>
>>>> Technical Lead - WSO2, Inc. http://wso2.com
>>>> email : [email protected]
>>>> Phone : +94 777 775 729
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> ============================
>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>> Site: http://home.apache.org/~hemapani/
>> Photos: http://www.flickr.com/photos/hemapani/
>> Phone: 0772360902
>>
>>
>>
>
>
> --
> Sanjiva Weerawarana, Ph.D.
> Founder, CEO & Chief Architect; WSO2, Inc.;  http://wso2.com/
> email: [email protected]; office: (+1 650 745 4499 | +94  11 214 5345)
> x5700; cell: +94 77 787 6880 | +1 408 466 5099; voip: +1 650 265 8311
> blog: http://sanjiva.weerawarana.org/; twitter: @sanjiva
> Lean . Enterprise . Middleware
>



-- 
============================
Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
Site: http://home.apache.org/~hemapani/
Photos: http://www.flickr.com/photos/hemapani/
Phone: 0772360902