Hi,

Ran some more performance tests to compare publishing aggregated events
vs. multiple single events. Following are the results:

*Results:*

No of concurrent publishers (to DAS): 10
Back-end DB: MySQL

                           Single Events   Aggregated Events*   Single Events   Aggregated Events*
No of events:              160,000         10,000               1,600,000       100,000
Event payload size:        1.9 KB          21.6 KB              1.9 KB          21.6 KB
Time consumed** (mm:ss):   1:55            0:30                 19:46           4:31

*An aggregated event contains payloads of 16 single events.
**Time consumed = time to complete all DB transactions.

Please note that these times were measured while DB trace logs were on, so
that too would have had some effect on the overall performance.
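
To make the comparison easier to read, here is a small sketch that turns the figures in the table above into events per second and KB per second. The run labels ("small run", "large run") and the function names are only illustrative and not part of the test setup; the numbers are copied from the table.

```python
# Figures copied from the results table; labels are illustrative only.
runs = [
    # (label, number of events, payload size in KB, elapsed time "mm:ss")
    ("single, small run", 160_000, 1.9, "1:55"),
    ("aggregated, small run", 10_000, 21.6, "0:30"),
    ("single, large run", 1_600_000, 1.9, "19:46"),
    ("aggregated, large run", 100_000, 21.6, "4:31"),
]


def seconds(mmss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


def throughput(events, payload_kb, mmss):
    """Return (events per second, KB per second) for one run."""
    elapsed = seconds(mmss)
    return events / elapsed, events * payload_kb / elapsed


for label, events, payload_kb, mmss in runs:
    eps, kbps = throughput(events, payload_kb, mmss)
    print(f"{label:24s} {eps:8.0f} events/s {kbps:8.0f} KB/s")
```

Going by elapsed time alone, the aggregated runs finished roughly 3.8 times (small run) and 4.4 times (large run) faster than the single-event runs for the same number of mediator events.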

Regards,
Supun

On Wed, Feb 17, 2016 at 5:37 PM, Viraj Senevirathne <[email protected]> wrote:

> Hi All,
>
> We got a simple sample payload for an actual message flow (attached).
>
> This has about 16 mediators. The payload file size is ~27.4kB. With
> different payload sizes and a larger number of mediators in the flow, a
> single payload can get even bigger. So if ESB is serving 1000 requests per
> second, ESB will transfer payloads to DAS at a data rate of ~27MB/s. With
> large payload sizes and a large number of mediators in the flow, this data
> rate can go up very high.
>
> As the strings are highly repetitive, compression works well with them.
> After compressing the above payload, its size is ~2kB (a 93% reduction
> from the original size).
>
> A large JSON file of 1.3MB was reduced to 14.3kB after compression.
>
> Therefore, will it be possible to send a compressed JSON string to DAS
> instead of the uncompressed one? Then DAS can decompress it and use the
> actual JSON payload.
>
> I think this will reduce the data rate drastically and ease data
> communication.
>
> Will it be possible to define a new type like "compressedJSON" to achieve
> this? WDYT about this idea?
>
> Thank You,
>
> On Wed, Feb 17, 2016 at 9:38 AM, Supun Sethunga <[email protected]> wrote:
>
>> Hi Dushan,
>>
>>> Supun, according to the stream definition ""children": 1," what it
>>> represents ?
>>
>>
>> Here, each event basically represents a mediator/proxy. So "children"
>> represents the child mediator(s) in the message flow. This info is used to
>> draw the message flow diagram.
>>
>> For example, if we consider the first event in the array, "children":1
>> means the event at index 1 is the first mediator after Test Proxy, and so
>> on. Sorry, the values I have put for "children" in the second and third
>> events are misleading. They should be "children":2 and "children":null,
>> respectively. So, null means it's the end of the message flow.
>>
>> Regards,
>> Supun
>>
>> On Wed, Feb 17, 2016 at 2:34 AM, Dushan Abeyruwan <[email protected]>
>> wrote:
>>
>>> Hi
>>>
>>>    - If we publish events from each mediator, then we can certainly
>>>    group the events by unique parentID, can't we? (I mean this would
>>>    allow us to prepare an aggregated view per incoming message and
>>>    visualize the different stages of each message representation and
>>>    other meta information; think of complex mediation.)
>>>    - Can't we record the payload according to its Content-Type, and
>>>    therefore get rid of the SOAP way of representing it?
>>>    - If we have a non-content-aware mediation flow with
>>>    "application/json", can we find a way to get the JSON string rather
>>>    than explicitly building it, i.e.
>>>    "org.apache.synapse.commons.json.Constants.JSON_STRING"?
>>>    - Supun, according to the stream definition, what does
>>>    ""children": 1," represent?
>>>
>>>
>>> On Mon, Feb 15, 2016 at 9:15 PM, Supun Sethunga <[email protected]> wrote:
>>>
>>>> Hi Dunith, Gihan,
>>>>
>>>> As per the offline chat we had with Buddhima and Viraj, following is a
>>>> sample payload to be published from ESB to DAS. Do we need any other
>>>> information for the plots/tables in the dashboard?
>>>>
>>>> Here we added a new field "entryPoint" to indicate inside which
>>>> Proxy/API the mediator got executed, so that it would be easy to drill
>>>> down from the proxy view to the mediator view. Please add any other
>>>> similar fields that would be needed for drill-downs, if we have missed any.
>>>>
>>>> {
>>>> "events": [{
>>>> "componentType": "ProxyService",
>>>> "componentId": "Test Proxy",
>>>> "startTime": 1455531027,
>>>> "endTime": 1455531041,
>>>> "duration": 3.321,
>>>> "beforePayload": null,
>>>> "afterPayload": null,
>>>> "contextPropertyMap":
>>>> "{\"MESSAGE_FLOW_ID\":\"urn_uuid_e4251abb-8ff5-433b-8dcb-24f251c3e30d\"}",
>>>> "transportPropertyMap": "{\"Content-Type\":\"application\/soap+xml;
>>>> charset=UTF-8; action=\"urn:renewLicense\"\",\"Host\":\"localhost\"}",
>>>> "children": 1,
>>>> "entryPoint": "Test Proxy"
>>>> }, {
>>>> "componentType": "Mediator",
>>>> "componentId": "mediator_1",
>>>> "startTime": 1455531041,
>>>> "endTime": 1455531052,
>>>> "duration": 3.321,
>>>> "beforePayload": null,
>>>> "afterPayload": null,
>>>> "contextPropertyMap":
>>>> "{\"MESSAGE_FLOW_ID\":\"urn_uuid_e4251abb-8ff5-433b-8dcb-24f251c3e30d\"}",
>>>> "transportPropertyMap": "{\"Content-Type\":\"application\/soap+xml;
>>>> charset=UTF-8; action=\"urn:renewLicense\"\",\"Host\":\"localhost\"}",
>>>> "children": 0,
>>>> "entryPoint": "Test Proxy"
>>>> }, {
>>>> "componentType": "Mediator",
>>>> "componentId": "mediator_2",
>>>> "startTime": 1455531052,
>>>> "endTime": 1455531074,
>>>> "duration": 3.321,
>>>> "beforePayload": null,
>>>> "afterPayload": null,
>>>> "contextPropertyMap": null,
>>>> "transportPropertyMap": null,
>>>> "children": 0,
>>>> "entryPoint": "Test Proxy"
>>>> }],
>>>>
>>>> "payloads": [{
>>>> "payload": "<?xml version=\"1.0\" encoding=\"utf-8\"?><soapenv:Envelope xmlns:soapenv=\"http://www.w3.org/2003/05/soap-envelope\"><soapenv:Body><sam:getCertificateID xmlns:sam=\"http://sample.esb.org\"><sam:vehicleNumber>123456</sam:vehicleNumber></sam:getCertificateID></soapenv:Body></soapenv:Envelope>",
>>>> "events": [{
>>>> "eventIndex": 0,
>>>> "attributes": "beforePayload"
>>>> }, {
>>>> "eventIndex": 0,
>>>> "attributes": "afterPayload"
>>>> }, {
>>>> "eventIndex": 1,
>>>> "attributes": "beforePayload"
>>>> }]
>>>> }, {
>>>> "payload": "<?xml version=\"1.0\" encoding=\"utf-8\"?><soapenv:Envelope xmlns:soapenv=\"http://www.w3.org/2003/05/soap-envelope\"><soapenv:Body><sam:getCertificateID xmlns:sam=\"http://sample.esb.org\"><sam:vehicleNumber>123123</sam:vehicleNumber><sam:vehicleType>car</sam:vehicleType></sam:getCertificateID></soapenv:Body></soapenv:Envelope>",
>>>> "events": [{
>>>> "eventIndex": 1,
>>>> "attributes": "afterPayload"
>>>> }, {
>>>> "eventIndex": 2,
>>>> "attributes": "beforePayload"
>>>> }, {
>>>> "eventIndex": 2,
>>>> "attributes": "afterPayload"
>>>> }]
>>>> }]
>>>> }
>>>>
>>>> Thanks,
>>>> Supun
>>>>
>>>> On Wed, Feb 10, 2016 at 11:57 AM, Supun Sethunga <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Sinthuja,
>>>>>
>>>>>
>>>>>> IMHO we could solve this issue by having conventions. Basically we
>>>>>> could use $payloads:payload1 to reference the elements as a
>>>>>> convention. If the element starts with '$' then it's the reference,
>>>>>> not the actual payload. In that case if there is a new element
>>>>>> introduced, let's say foo, and you need to access the property
>>>>>> property1, then it will have the reference as $foo:property1.
>>>>>
>>>>>
>>>>> Yes, that's possible as well. But again, if the value of the
>>>>> property, say 'foo', is an actual value starting with the special
>>>>> character (in this case '$'), we may run into ambiguity. (True, the
>>>>> chances are pretty low, but it is still possible.)
>>>>>
>>>>>
>>>>>  Also this json event format is being sent as event payload in wso2
>>>>>> event, and wso2 event is being published by the data publisher right?
>>>>>> Correct me if i'm wrong.
>>>>>
>>>>>
>>>>> Yes.
>>>>>
>>>>> Thanks,
>>>>> Supun
>>>>>
>>>>>
>>>>> On Wed, Feb 10, 2016 at 11:35 AM, Sinthuja Ragendran <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Supun,
>>>>>>
>>>>>> Also this json event format is being sent as event payload in wso2
>>>>>> event, and wso2 event is being published by the data publisher right?
>>>>>> Correct me if i'm wrong.
>>>>>>
>>>>>> Thanks,
>>>>>> Sinthuja.
>>>>>>
>>>>>> On Wed, Feb 10, 2016 at 11:26 AM, Sinthuja Ragendran <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi Supun,
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Feb 10, 2016 at 11:14 AM, Supun Sethunga <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Sinthuja,
>>>>>>>>
>>>>>>>> Agree on the possibility of simplifying the json. We also discussed
>>>>>>>> the same matter yesterday, but the complication that came up was
>>>>>>>> that, for an event in the "events" list, the payload could be either
>>>>>>>> referenced or defined in-line. (It was made that way so that it can
>>>>>>>> be generalized to fields other than payloads as well, if needed.)
>>>>>>>>
>>>>>>>> In such a case, if we had defined it as 'payload': 'payload1', we
>>>>>>>> would not know if it's the actual payload or a reference to the
>>>>>>>> payload in the "payloads" section.
>>>>>>>>
>>>>>>>> With the suggested format, DAS will only go and map the payload if
>>>>>>>> it's null.
>>>>>>>>
>>>>>>>>
>>>>>>> IMHO we could solve this issue by having conventions. Basically we
>>>>>>> could use $payloads:payload1 to reference the elements as a
>>>>>>> convention. If the element starts with '$' then it's the reference,
>>>>>>> not the actual payload. In that case if there is a new element
>>>>>>> introduced, let's say foo, and you need to access the property
>>>>>>> property1, then it will have the reference as $foo:property1.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sinthuja.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Supun
>>>>>>>>
>>>>>>>> On Wed, Feb 10, 2016 at 10:52 AM, Sinthuja Ragendran <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Supun,
>>>>>>>>>
>>>>>>>>> I think we could simplify the json message a bit more. Instead of
>>>>>>>>> 'null' for the payload attributes in the events section, you could
>>>>>>>>> use the actual payload name directly if there is a payload for that
>>>>>>>>> event. And in that case, we could eliminate the 'events' section
>>>>>>>>> from the 'payloads' section. For the given example, it could be
>>>>>>>>> altered as below.
>>>>>>>>>
>>>>>>>>> {
>>>>>>>>> 'events': [{
>>>>>>>>> 'messageId': 'aaa',
>>>>>>>>> 'componentId': '111',
>>>>>>>>> 'payload': 'payload1',
>>>>>>>>> 'componentName': 'Proxy:TestProxy',
>>>>>>>>> 'output-payload':null
>>>>>>>>> }, {
>>>>>>>>> 'messageId': 'bbb',
>>>>>>>>> 'componentId': '222',
>>>>>>>>> 'componentName': 'Proxy:TestProxy',
>>>>>>>>> 'payload': 'payload1',
>>>>>>>>> 'output-payload':null
>>>>>>>>> }, {
>>>>>>>>> 'messageId': 'ccc',
>>>>>>>>> 'componentId': '789',
>>>>>>>>> 'payload': 'payload2',
>>>>>>>>> 'componentName': 'Proxy:TestProxy',
>>>>>>>>> 'output-payload':'payload2'
>>>>>>>>> }],
>>>>>>>>>
>>>>>>>>> 'payloads': {
>>>>>>>>> 'payload1': 'xml-payload-1',
>>>>>>>>> 'payload2': 'xml-payload-2'
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Sinthuja.
>>>>>>>>>
>>>>>>>>> On Wed, Feb 10, 2016 at 10:18 AM, Supun Sethunga <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Budhdhima/Viraj,
>>>>>>>>>>
>>>>>>>>>> As per the discussion we had yesterday, following is the format of
>>>>>>>>>> the json containing the aggregated event details, to be sent to
>>>>>>>>>> DAS. (You may change the attribute names of the events.)
>>>>>>>>>>
>>>>>>>>>> To explain it further, "events" contains the details about each
>>>>>>>>>> event sent by each mediator. The payload may or may not be
>>>>>>>>>> populated. The "payloads" section contains the unique payloads and
>>>>>>>>>> their mapping to the events and their fields (e.g. 'xml-payload-2'
>>>>>>>>>> maps to the 'payload' and 'output-payload' fields of the 3rd event).
>>>>>>>>>>
>>>>>>>>>> {
>>>>>>>>>> 'events': [{
>>>>>>>>>> 'messageId': 'aaa',
>>>>>>>>>> 'componentId': '111',
>>>>>>>>>> 'payload': null,
>>>>>>>>>> 'componentName': 'Proxy:TestProxy',
>>>>>>>>>> 'output-payload':null
>>>>>>>>>> }, {
>>>>>>>>>> 'messageId': 'bbb',
>>>>>>>>>> 'componentId': '222',
>>>>>>>>>> 'componentName': 'Proxy:TestProxy',
>>>>>>>>>> 'payload': null,
>>>>>>>>>> 'output-payload':null
>>>>>>>>>> }, {
>>>>>>>>>> 'messageId': 'ccc',
>>>>>>>>>> 'componentId': '789',
>>>>>>>>>> 'payload': null,
>>>>>>>>>> 'componentName': 'Proxy:TestProxy',
>>>>>>>>>> 'output-payload':null
>>>>>>>>>> }],
>>>>>>>>>>
>>>>>>>>>> 'payloads': [{
>>>>>>>>>> 'payload': 'xml-payload-1',
>>>>>>>>>> 'events': [{
>>>>>>>>>> 'eventIndex': 0,
>>>>>>>>>> 'attributes':['payload']
>>>>>>>>>> }, {
>>>>>>>>>> 'eventIndex': 1,
>>>>>>>>>> 'attributes':['payload']
>>>>>>>>>> }]
>>>>>>>>>> }, {
>>>>>>>>>> 'payload': 'xml-payload-2',
>>>>>>>>>> 'events': [{
>>>>>>>>>> 'eventIndex': 2,
>>>>>>>>>> 'attributes':['payload','output-payload']
>>>>>>>>>> }]
>>>>>>>>>> }]
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Please let us know if any further clarification is needed, or if
>>>>>>>>>> there's anything to be modified/improved.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Supun
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 9, 2016 at 11:05 AM, Isuru Udana <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Kasun,
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 9, 2016 at 10:10 AM, Kasun Indrasiri <[email protected]
>>>>>>>>>>> > wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I think for the tracing use case we need to publish events one
>>>>>>>>>>>> by one from each mediator (we can't aggregate all such events as
>>>>>>>>>>>> they also contain the message payload)
>>>>>>>>>>>>
>>>>>>>>>>> I think we can still do that with some extra effort.
>>>>>>>>>>> Most of the mediators in a sequence flow do not alter the
>>>>>>>>>>> message payload. We can store the payload only for the mediators
>>>>>>>>>>> which alter the message payload. And for the others, we can put a
>>>>>>>>>>> reference to the previous entry. By doing that we can save memory
>>>>>>>>>>> to a great extent.
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>>>>> From: Supun Sethunga <[email protected]>
>>>>>>>>>>>> Date: Mon, Feb 8, 2016 at 2:54 PM
>>>>>>>>>>>> Subject: Re: ESB Analytics Mediation Event Publishing Mechanism
>>>>>>>>>>>> To: Anjana Fernando <[email protected]>
>>>>>>>>>>>> Cc: "[email protected]" <[email protected]>,
>>>>>>>>>>>> Srinath Perera <[email protected]>, Sanjiva Weerawarana <
>>>>>>>>>>>> [email protected]>, Kasun Indrasiri <[email protected]>, Isuru
>>>>>>>>>>>> Udana <[email protected]>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Ran some simple performance tests against the new relation
>>>>>>>>>>>> provider, in comparison with the existing one. Following are the
>>>>>>>>>>>> results:
>>>>>>>>>>>>
>>>>>>>>>>>> *Records in Backend DB Table*: *1,054,057*
>>>>>>>>>>>>
>>>>>>>>>>>> *Conversion:*
>>>>>>>>>>>>
>>>>>>>>>>>> Backend DB Table
>>>>>>>>>>>> id  data
>>>>>>>>>>>> 1   [{'a':'aaa','b':'bbb','c':'ccc'},{'a':'xxx','b':'yyy','c':'zzz'},{'a':'ppp','b':'qqq','c':'rrr'}]
>>>>>>>>>>>> 2   [{'a':'aaa','b':'bbb','c':'ccc'},{'a':'xxx','b':'yyy','c':'zzz'},{'a':'ppp','b':'qqq','c':'rrr'}]
>>>>>>>>>>>>
>>>>>>>>>>>>  -- To -->
>>>>>>>>>>>>
>>>>>>>>>>>> Spark Table
>>>>>>>>>>>> id  a    b    c
>>>>>>>>>>>> 1   aaa  bbb  ccc
>>>>>>>>>>>> 1   xxx  yyy  zzz
>>>>>>>>>>>> 1   ppp  qqq  rrr
>>>>>>>>>>>> 2   aaa  bbb  ccc
>>>>>>>>>>>> 2   xxx  yyy  zzz
>>>>>>>>>>>> 2   ppp  qqq  rrr
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *Avg Time for Query Execution:*
>>>>>>>>>>>>
>>>>>>>>>>>> Execution time (~ sec)
>>>>>>>>>>>>
>>>>>>>>>>>> Query                                     Existing Analytics   New (ESB) Analytics
>>>>>>>>>>>>                                           Relation Provider    Relation Provider*
>>>>>>>>>>>> SELECT COUNT(*) FROM <Table>;             13                   16
>>>>>>>>>>>> SELECT * FROM <Table> ORDER BY id ASC;    13                   16
>>>>>>>>>>>> SELECT * FROM <Table> WHERE id=98435;     13                   16
>>>>>>>>>>>> SELECT id,a,first(b),first(c) FROM
>>>>>>>>>>>> <Table> GROUP BY id,a ORDER BY id ASC;    18                   26
>>>>>>>>>>>>
>>>>>>>>>>>> * The new relation provider splits a single row into multiple
>>>>>>>>>>>> rows. Hence the number of rows in the table is 3 times that of
>>>>>>>>>>>> the original table (as each row is split into 3 rows).
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Supun
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 3, 2016 at 3:36 PM, Supun Sethunga <[email protected]
>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have started working on implementing a new "relation" /
>>>>>>>>>>>>> "relation provider" to serve the above requirement. This is
>>>>>>>>>>>>> basically a modified version of the existing "Carbon Analytics"
>>>>>>>>>>>>> relation provider.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here I have assumed that the encapsulated data for a single
>>>>>>>>>>>>> execution flow is stored in a single row, and that the data
>>>>>>>>>>>>> about the mediators invoked during the flow is stored in a known
>>>>>>>>>>>>> column of each row (say "data"), as an array (say a json array).
>>>>>>>>>>>>> When each row is read in to Spark, this relation provider
>>>>>>>>>>>>> creates a separate row for each of the elements in the array
>>>>>>>>>>>>> stored in the "data" column. I have tested this with some mocked
>>>>>>>>>>>>> data, and it works as expected.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Need to test with the real data/data-formats, and modify the
>>>>>>>>>>>>> mapping accordingly. Will update the thread with the details.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Feb 2, 2016 at 2:36 AM, Anjana Fernando <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In a meeting I had with Kasun and the ESB team, I got to know
>>>>>>>>>>>>>> that, for their tracing mechanism, they were instructed to
>>>>>>>>>>>>>> publish one event for each of the mediator invocations, whereas
>>>>>>>>>>>>>> earlier they had an approach where they publish one event which
>>>>>>>>>>>>>> encapsulates the data of a whole execution flow. I would
>>>>>>>>>>>>>> actually like to support the latter approach, mainly due to
>>>>>>>>>>>>>> performance / resource requirements, and also considering the
>>>>>>>>>>>>>> fact that this is a feature that could be enabled in
>>>>>>>>>>>>>> production. Simply put, if we do one event per mediator, this
>>>>>>>>>>>>>> does not scale that well. For example, if the ESB is doing 1k
>>>>>>>>>>>>>> TPS, for a sequence that has 20 mediators, that is 20k TPS of
>>>>>>>>>>>>>> analytics traffic. Combine that with a possible ESB cluster
>>>>>>>>>>>>>> hitting a DAS cluster with a single backend database, and this
>>>>>>>>>>>>>> may be too many rows per second written to the database. The
>>>>>>>>>>>>>> main problem here is that one event is a single row/record in
>>>>>>>>>>>>>> the backend database in DAS, so it may come to a state where
>>>>>>>>>>>>>> the frequency of row creations by events coming from ESBs
>>>>>>>>>>>>>> cannot be sustained.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we create a single event from the 20 mediators, then it is
>>>>>>>>>>>>>> just 1k TPS for the DAS event receivers and the database too,
>>>>>>>>>>>>>> even though the message size is bigger. Publishing lots of
>>>>>>>>>>>>>> small events does not necessarily perform the same as
>>>>>>>>>>>>>> publishing fewer, bigger events. Throughput wise, comparatively
>>>>>>>>>>>>>> bigger events will win (even if we consider that small
>>>>>>>>>>>>>> operations will be batched at the transport level etc., it is
>>>>>>>>>>>>>> still one event = one database row). So I would suggest we try
>>>>>>>>>>>>>> out a single sequence flow = single event approach, and on the
>>>>>>>>>>>>>> Spark processing side, we consider one of these big rows as
>>>>>>>>>>>>>> multiple rows in Spark.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I was first thinking whether UDFs could help in splitting a
>>>>>>>>>>>>>> single column to multiple rows, but that is not possible, and
>>>>>>>>>>>>>> it is also a bit troublesome, considering we have to delete the
>>>>>>>>>>>>>> original data table after we have converted it using a script,
>>>>>>>>>>>>>> not forgetting that we actually have to schedule and run a
>>>>>>>>>>>>>> separate script to do this post-processing. So a much cleaner
>>>>>>>>>>>>>> way to do this would be to create a new "relation provider" in
>>>>>>>>>>>>>> Spark (which is like a data adapter for their DataFrames), and
>>>>>>>>>>>>>> in our relation provider, when we are reading rows, we convert
>>>>>>>>>>>>>> a single row's column to multiple rows and return that for
>>>>>>>>>>>>>> processing. So Spark will not know that physically it was a
>>>>>>>>>>>>>> single row from the data layer, and it can summarize the data
>>>>>>>>>>>>>> and all as usual and write to the target summary tables. [1] is
>>>>>>>>>>>>>> our existing implementation of the Spark relation provider,
>>>>>>>>>>>>>> which directly maps to our DAS analytics tables; we can create
>>>>>>>>>>>>>> the new one extending / based on it. So I suggest we try out
>>>>>>>>>>>>>> this approach and see, if everyone is okay with it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://github.com/wso2/carbon-analytics/blob/master/components/analytics-processors/org.wso2.carbon.analytics.spark.core/src/main/java/org/wso2/carbon/analytics/spark/core/sources/AnalyticsRelationProvider.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Anjana.
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> *Anjana Fernando*
>>>>>>>>>>>>>> Senior Technical Lead
>>>>>>>>>>>>>> WSO2 Inc. | http://wso2.com
>>>>>>>>>>>>>> lean . enterprise . middleware
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>>>>> Google Groups "WSO2 Engineering Group" group.
>>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>>>>> For more options, visit
>>>>>>>>>>>>>> https://groups.google.com/a/wso2.com/d/optout.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> *Supun Sethunga*
>>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>>> WSO2, Inc.
>>>>>>>>>>>>> http://wso2.com/
>>>>>>>>>>>>> lean | enterprise | middleware
>>>>>>>>>>>>> Mobile : +94 716546324
>>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Kasun Indrasiri
>>>>>>>>>>>> Software Architect
>>>>>>>>>>>> WSO2, Inc.; http://wso2.com
>>>>>>>>>>>> lean.enterprise.middleware
>>>>>>>>>>>>
>>>>>>>>>>>> cell: +94 77 556 5206
>>>>>>>>>>>> Blog : http://kasunpanorama.blogspot.com/
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> *Isuru Udana*
>>>>>>>>>>> Associate Technical Lead
>>>>>>>>>>> WSO2 Inc.; http://wso2.com
>>>>>>>>>>> email: [email protected] cell: +94 77 3791887
>>>>>>>>>>> blog: http://mytecheye.blogspot.com/
>>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Architecture mailing list
>>>>>>>>>> [email protected]
>>>>>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *Sinthuja Rajendran*
>>>>>>>>> Associate Technical Lead
>>>>>>>>> WSO2, Inc.:http://wso2.com
>>>>>>>>>
>>>>>>>>> Blog: http://sinthu-rajan.blogspot.com/
>>>>>>>>> Mobile: +94774273955
>>>>>>>>>
>>> --
>>> Dushan Abeyruwan | Technical Lead
>>>
>>> PMC Member Apache Synpase
>>> WSO2 Inc. http://wso2.com/
>>> Blog:*http://www.dushantech.com/ <http://www.dushantech.com/>*
>>> Mobile:(001)408-791-9312
>>>
> --
> Viraj Senevirathne
> Software Engineer; WSO2, Inc.
>
> Mobile : +94 71 958 0269
> Email : [email protected]
>
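
On the compression idea in Viraj's mail quoted above, here is a minimal sketch of the round trip. It assumes gzip, which the thread does not name, and the payload is a synthetic stand-in (16 events sharing one SOAP body), not the attached sample. The function names are illustrative only.

```python
import gzip
import json


def compress_json(obj):
    """Serialize obj to JSON and gzip it; returns (raw_bytes, compressed_bytes)."""
    raw = json.dumps(obj).encode("utf-8")
    return raw, gzip.compress(raw)


def decompress_json(blob):
    """Reverse of compress_json: gunzip and parse back into Python objects."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Synthetic aggregated event: 16 mediator entries repeating the same SOAP body.
soap_body = (
    '<soapenv:Envelope xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">'
    "<soapenv:Body><sam:vehicleNumber>123456</sam:vehicleNumber></soapenv:Body>"
    "</soapenv:Envelope>"
)
events = [
    {
        "componentType": "Mediator",
        "componentId": f"mediator_{i}",
        "beforePayload": soap_body,
        "afterPayload": soap_body,
        "entryPoint": "Test Proxy",
    }
    for i in range(16)
]

raw, packed = compress_json({"events": events})
print(f"raw: {len(raw)} bytes, gzip: {len(packed)} bytes")
```

The ratio printed for this synthetic data says nothing about real payloads. Also, if the compressed bytes have to travel in a string attribute, they would need an encoding such as Base64, which adds roughly a third to the size.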

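
For the aggregated format discussed further down the thread (an "events" list plus a "payloads" list that points back at events by index), the receiving side would have to map the payloads back onto the events. Below is a rough sketch of that step, assuming the field names from the samples ("eventIndex", "attributes", "payload"); the function name `resolve_payloads` is hypothetical and this is not the DAS implementation. It accepts "attributes" either as a list or as a single string, since both forms appear in the samples.

```python
import copy


def resolve_payloads(message):
    """Return a copy of message['events'] with referenced payloads filled in.

    A payload is only written into an attribute that is currently null,
    so an in-line value is never overwritten.
    """
    events = copy.deepcopy(message["events"])
    for entry in message.get("payloads", []):
        for ref in entry["events"]:
            event = events[ref["eventIndex"]]
            attrs = ref["attributes"]
            if isinstance(attrs, str):  # newer sample uses a single string
                attrs = [attrs]
            for attr in attrs:
                if event.get(attr) is None:
                    event[attr] = entry["payload"]
    return events


message = {
    "events": [
        {"componentId": "111", "payload": None, "output-payload": None},
        {"componentId": "789", "payload": None, "output-payload": None},
    ],
    "payloads": [
        {"payload": "xml-payload-1",
         "events": [{"eventIndex": 0, "attributes": ["payload"]}]},
        {"payload": "xml-payload-2",
         "events": [{"eventIndex": 1, "attributes": ["payload", "output-payload"]}]},
    ],
}

resolved = resolve_payloads(message)
print(resolved)
```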

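
The row splitting done by the new relation provider (one stored row whose "data" column holds a json array becomes one row per array element) can be illustrated outside Spark as below. This is plain Python for illustration only, not the relation provider code; `split_rows` is a hypothetical name, and the sample rows mirror the conversion table in the quoted mail, written as valid JSON.

```python
import json


def split_rows(stored_rows, column="data"):
    """Yield one flat row per element of the JSON array held in `column`."""
    for row in stored_rows:
        for element in json.loads(row[column]):
            # Carry the parent id over and merge in the element's own fields.
            yield {"id": row["id"], **element}


data = json.dumps([
    {"a": "aaa", "b": "bbb", "c": "ccc"},
    {"a": "xxx", "b": "yyy", "c": "zzz"},
    {"a": "ppp", "b": "qqq", "c": "rrr"},
])
stored = [{"id": 1, "data": data}, {"id": 2, "data": data}]

flat = list(split_rows(stored))
for row in flat:
    print(row["id"], row["a"], row["b"], row["c"])
```

Two stored rows with three elements each give six rows on the Spark side, which matches the 3x row count noted with the query timings.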

-- 
*Supun Sethunga*
Software Engineer
WSO2, Inc.
http://wso2.com/
lean | enterprise | middleware
Mobile : +94 716546324
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
