Why are we inventing a new event format for this? Why not use a stream definition and publish using Thrift?
Sorry if I'm missing something here.

On Fri, Feb 19, 2016 at 11:44 AM, Supun Sethunga <[email protected]> wrote:

> Hi,
>
> Ran some more performance tests to contrast publishing aggregated
> events vs. multiple single events, and the following are the results:
>
> *Results:*
>
> No of concurrent publishers (to DAS): 10
> Back-end DB: MySQL
>
>                           Single Events  Aggregated Events*  Single Events  Aggregated Events*
> No of events:             160,000        10,000              1,600,000      100,000
> Event payload size:       1.9 KB         21.6 KB             1.9 KB         21.6 KB
> Time Consumed** (mm:ss):  1:55           0:30                19:46          4:31
>
> *An aggregated event contains payloads of 16 single events.
> **Time consumed = time to complete all DB transactions.
>
> Please note that these times were monitored while DB trace logs were on,
> so that too has some effect on the overall performance.
>
> Regards,
> Supun
>
> On Wed, Feb 17, 2016 at 5:37 PM, Viraj Senevirathne <[email protected]>
> wrote:
>
>> Hi All,
>>
>> We got a simple sample payload for an actual message flow (attached).
>>
>> This has about 16 mediators. The payload file size is ~27.4kB. With
>> different payload sizes and a large number of mediators in the flow, a single
>> payload can get even bigger. So if ESB is serving 1000 requests per
>> second, ESB will transfer payloads to DAS at a data rate of ~27MB/s. With
>> large payload sizes and a large number of mediators in the flow this data
>> rate can go up very high.
>>
>> As strings have high repeatability, compression works well with them. After
>> compressing the above payload its size is ~2kB (93% reduction from original size).
>>
>> A large JSON file of 1.3MB was reduced to 14.3kB after compression.
>>
>> Therefore, will it be possible to send a compressed JSON string to DAS
>> instead of the uncompressed one? Then DAS can decompress the file and use the
>> actual JSON payload.
>>
>> I think this will reduce the data rate drastically and ease data
>> communication.
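A rough sketch of the compression idea in the paragraph above, for illustration only. It assumes gzip (the thread does not say which algorithm was used), and the helper names and the sample events are hypothetical, not ESB or DAS APIs.

```python
import gzip
import json


def compress_json(obj):
    """Serialize obj to JSON text and gzip it.

    gzip is an assumption; the thread does not name the algorithm.
    """
    raw = json.dumps(obj).encode("utf-8")
    return raw, gzip.compress(raw)


def decompress_json(blob):
    """Inverse of compress_json, as the receiving side would run it."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Hypothetical flow: 16 mediator events sharing the same property map.
sample = {
    "events": [
        {
            "componentType": "Mediator",
            "componentId": "mediator_%d" % i,
            "transportPropertyMap": '{"Host":"localhost"}',
        }
        for i in range(16)
    ]
}

raw, packed = compress_json(sample)
reduction = 1 - len(packed) / len(raw)  # fraction of bytes saved
print("raw=%d bytes, gzip=%d bytes, saved=%.0f%%"
      % (len(raw), len(packed), reduction * 100))
```

The saving on real payloads depends on how repetitive they are, so the figures quoted in the thread should be re-measured rather than inferred from this toy input.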
>>
>> Will it be possible to define a new type like "compressedJSON" to achieve
>> this? WDYT about this idea?
>>
>> Thank You,
>>
>> On Wed, Feb 17, 2016 at 9:38 AM, Supun Sethunga <[email protected]> wrote:
>>
>>> Hi Dushan,
>>>
>>>> Supun, according to the stream definition ""children": 1," what it
>>>> represents ?
>>>
>>>
>>> Here, each event basically represents a mediator/proxy. So "children"
>>> represents the child mediator(s) in the message flow. This info is used to
>>> draw the message flow diagram.
>>>
>>> For example, if we consider the first event in the array, "children":1 means
>>> the event at index 1 is the first mediator after Test Proxy, and so on.
>>> Sorry, the values I have put for "children" in the second and third
>>> events are misleading. They should be "children":2 and "children":null,
>>> respectively. So, null means it's the end of the message flow.
>>>
>>> Regards,
>>> Supun
>>>
>>> On Wed, Feb 17, 2016 at 2:34 AM, Dushan Abeyruwan <[email protected]>
>>> wrote:
>>>
>>>> Hi
>>>>
>>>> - If we publish events from each mediator, then we can certainly
>>>> group the events by a unique parentID, can't we? (I mean this would allow us
>>>> to prepare an aggregated view per incoming message and visualize the different
>>>> stages of each message representation and other meta information; think of
>>>> complex mediation.)
>>>> - Can't we record the payload according to its Content-Type, and therefore
>>>> get rid of the SOAP way of representing it?
>>>> - If we have a non-content-aware mediation flow with
>>>> "application/json", can we find a way to get the JSON string rather than
>>>> explicitly building it, i.e. "org.apache.synapse.commons.json.Constants.
>>>> JSON_STRING"?
>>>> - Supun, according to the stream definition ""children": 1," what
>>>> it represents ?
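To make the "children" linkage described above concrete, here is a small sketch that walks a flow from the entry event to the end. It assumes each event holds a single index (or null) in "children", as in the corrected sample values, and it uses the key "componentId" for readability (the sample payload spells it "compotentId"). The function name is made up for illustration.

```python
def flow_order(events, start=0):
    """Follow 'children' indexes from the entry event until a null child."""
    order = []
    seen = set()
    index = start
    while index is not None:
        if index in seen:
            # Guard against malformed data such as a child pointing backwards.
            raise ValueError("cycle in 'children' links at index %d" % index)
        seen.add(index)
        event = events[index]
        order.append(event["componentId"])
        index = event["children"]  # None (JSON null) marks the end of the flow
    return order


# Sample with the corrected values: 1, 2 and null.
events = [
    {"componentId": "Test Proxy", "children": 1},
    {"componentId": "mediator_1", "children": 2},
    {"componentId": "mediator_2", "children": None},
]

print(flow_order(events))
```

With the original (misleading) values of 0 for the second and third events, the same walk would loop back to the proxy, which is why the cycle guard is there.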
>>>> >>>> >>>> On Mon, Feb 15, 2016 at 9:15 PM, Supun Sethunga <[email protected]> >>>> wrote: >>>> >>>>> Hi Dunith, Gihan, >>>>> >>>>> As per the offline chat had with Buddhima and Viraj, follow is a >>>>> sample payload to be published from ESB to DAS. Do we need any other >>>>> information for the plots/tables in dashboard? >>>>> >>>>> Here we added a new field "entryPoint" to indicate inside which >>>>> Proxy/API did the mediator get executed. So that it would be easy to drill >>>>> down from proxy view to mediator view. Please add if there is any other >>>>> similar field that would be needed for drill-downs, if we have missed any. >>>>> >>>>> { >>>>> "events": [{ >>>>> "compotentType": "ProxyService", >>>>> "compotentId": "Test Proxy", >>>>> "startTime": 1455531027, >>>>> "endTime": 1455531041, >>>>> "duration": 3.321, >>>>> "beforePayload": null, >>>>> "afterPayload": null, >>>>> "contextPropertyMap": >>>>> "{\"MESSAGE_FLOW_ID\":\"urn_uuid_e4251abb-8ff5-433b-8dcb-24f251c3e30d\"}", >>>>> "transportPropertyMap": "{\"Content-Type\":\"application\/soap+xml; >>>>> charset=UTF-8; action=\"urn:renewLicense\"\",\"Host\":\"localhost\"}", >>>>> "children": 1, >>>>> "entryPoint": "Test Proxy" >>>>> }, { >>>>> "compotentType": "Mediator", >>>>> "compotentId": "mediator_1", >>>>> "startTime": 1455531041, >>>>> "endTime": 1455531052, >>>>> "duration": 3.321, >>>>> "beforePayload": null, >>>>> "afterPayload": null, >>>>> "contextPropertyMap": >>>>> "{\"MESSAGE_FLOW_ID\":\"urn_uuid_e4251abb-8ff5-433b-8dcb-24f251c3e30d\"}", >>>>> "transportPropertyMap": "{\"Content-Type\":\"application\/soap+xml; >>>>> charset=UTF-8; action=\"urn:renewLicense\"\",\"Host\":\"localhost\"}", >>>>> "children": 0, >>>>> "entryPoint": "Test Proxy" >>>>> }, { >>>>> "compotentType": "Mediator", >>>>> "compotentId": "mediator_2", >>>>> "startTime": 1455531052, >>>>> "endTime": 1455531074, >>>>> "duration": 3.321, >>>>> "beforePayload": null, >>>>> "afterPayload": null, >>>>> "contextPropertyMap": 
null, >>>>> "transportPropertyMap": null, >>>>> "children": 0, >>>>> "entryPoint": "Test Proxy" >>>>> }], >>>>> >>>>> "payloads": [{ >>>>> "payload": "<?xml version=\"1.0\" >>>>> encoding=\"utf-8\"?><soapenv:Envelope xmlns:soapenv=\" >>>>> http://www.w3.org/2003/05/soap-envelope\"><soapenv:Body><sam:getCertificateID >>>>> xmlns:sam=\"http://sample.esb.org >>>>> \"><sam:vehicleNumber>123456</sam:vehicleNumber></sam:getCertificateID></soapenv:Body></soapenv:Envelope>", >>>>> "events": [{ >>>>> "eventIndex": 0, >>>>> "attributes": "beforePayload" >>>>> }, { >>>>> "eventIndex": 0, >>>>> "attributes": "afterPayload" >>>>> }, { >>>>> "eventIndex": 1, >>>>> "attributes": "beforePayload" >>>>> }] >>>>> }, { >>>>> "payload": "<?xml version=\"1.0\" >>>>> encoding=\"utf-8\"?><soapenv:Envelope xmlns:soapenv=\" >>>>> http://www.w3.org/2003/05/soap-envelope\"><soapenv:Body><sam:getCertificateID >>>>> xmlns:sam=\"http://sample.esb.org >>>>> \"><sam:vehicleNumber>123123</sam:vehicleNumber><sam:vehicleType>car</sam:vehicleType></sam:getCertificateID></soapenv:Body></soapenv:Envelope>", >>>>> "events": [{ >>>>> "eventIndex": 1, >>>>> "attributes": "afterPayload" >>>>> }, { >>>>> "eventIndex": 2, >>>>> "attributes": "beforePayload" >>>>> }, { >>>>> "eventIndex": 2, >>>>> "attributes": "afterPayload" >>>>> }] >>>>> }] >>>>> } >>>>> >>>>> Thanks, >>>>> Supun >>>>> >>>>> On Wed, Feb 10, 2016 at 11:57 AM, Supun Sethunga <[email protected]> >>>>> wrote: >>>>> >>>>>> Hi Sinthuja, >>>>>> >>>>>> >>>>>>> IMHO we could solve this issue as having conversions. Basically we >>>>>>> could use $payloads:payload1 to reference the elements as a convention. >>>>>>> If >>>>>>> the element starts with '$' then it's the reference, not the actual >>>>>>> payload. In that case if there is a new element introduced, let's say >>>>>>> foo >>>>>>> and you need to access the property property1, then it will have the >>>>>>> reference as $foo:property1. >>>>>> >>>>>> >>>>>> Yes, that's possible as well. 
But again, if the value for the >>>>>> property, say 'foo', has an actual value starting with some special >>>>>> character.. (in this case '$'), we may run in to ambiguity. (true, the >>>>>> chances are pretty less, but still possible). >>>>>> >>>>>> >>>>>> Also this json event format is being sent as event payload in wso2 >>>>>>> event, and wso2 event is being published by the data publisher right? >>>>>>> Correct me if i'm wrong. >>>>>> >>>>>> >>>>>> Yes. >>>>>> >>>>>> Thanks, >>>>>> Supun >>>>>> >>>>>> >>>>>> On Wed, Feb 10, 2016 at 11:35 AM, Sinthuja Ragendran < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> Hi Supun, >>>>>>> >>>>>>> Also this json event format is being sent as event payload in wso2 >>>>>>> event, and wso2 event is being published by the data publisher right? >>>>>>> Correct me if i'm wrong. >>>>>>> >>>>>>> Thanks, >>>>>>> Sinthuja. >>>>>>> >>>>>>> On Wed, Feb 10, 2016 at 11:26 AM, Sinthuja Ragendran < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> Hi Supun, >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Feb 10, 2016 at 11:14 AM, Supun Sethunga <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Sinthuja, >>>>>>>>> >>>>>>>>> Agree on the possibility of simplifying the json. We also >>>>>>>>> discussed on the same matter yesterday, but the complication came up >>>>>>>>> was, >>>>>>>>> by an event in the "events" list, payload could be >>>>>>>>> either referenced, or defined in-line.(made as it is, so that it can >>>>>>>>> be >>>>>>>>> generalized for other fields as well if needed, other than payloads.). >>>>>>>>> >>>>>>>> In such a case, if we had defined as 'payload': '*payload1**', *we >>>>>>>>> would not know if its the actual payload, or a reference to the >>>>>>>>> payload in >>>>>>>>> the "payloads" section. >>>>>>>>> >>>>>>>>> With the suggested format, DAS will only go and map the payload if >>>>>>>>> its null. >>>>>>>>> >>>>>>>>> >>>>>>>> IMHO we could solve this issue as having conversions. 
Basically we could use $payloads:payload1 to reference the elements as a
>>>>>>>> convention. If the element starts with '$' then it's the reference, not the actual
>>>>>>>> payload. In that case, if there is a new element introduced, let's say foo,
>>>>>>>> and you need to access the property property1, then it will have the
>>>>>>>> reference $foo:property1.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sinthuja.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Supun
>>>>>>>>>
>>>>>>>>> On Wed, Feb 10, 2016 at 10:52 AM, Sinthuja Ragendran <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Supun,
>>>>>>>>>>
>>>>>>>>>> I think we could simplify the JSON message a bit more. Instead of
>>>>>>>>>> 'null' for the payload attributes in the events section, you could use the
>>>>>>>>>> actual payload name directly if there is a payload for that event. And in
>>>>>>>>>> that case, we could eliminate the 'events' section from the 'payloads'
>>>>>>>>>> section. For the given example, it could be altered as below.
>>>>>>>>>>
>>>>>>>>>> {
>>>>>>>>>>   'events': [{
>>>>>>>>>>     'messageId': 'aaa',
>>>>>>>>>>     'componentId': '111',
>>>>>>>>>>     'payload': '*payload1*',
>>>>>>>>>>     'componentName': 'Proxy:TestProxy',
>>>>>>>>>>     'output-payload': null
>>>>>>>>>>   }, {
>>>>>>>>>>     'messageId': 'bbb',
>>>>>>>>>>     'componentId': '222',
>>>>>>>>>>     'componentName': 'Proxy:TestProxy',
>>>>>>>>>>     'payload': '*payload1*',
>>>>>>>>>>     'output-payload': null
>>>>>>>>>>   }, {
>>>>>>>>>>     'messageId': 'ccc',
>>>>>>>>>>     'componentId': '789',
>>>>>>>>>>     'payload': '*payload2*',
>>>>>>>>>>     'componentName': 'Proxy:TestProxy',
>>>>>>>>>>     'output-payload': '*payload2*'
>>>>>>>>>>   }],
>>>>>>>>>>
>>>>>>>>>>   'payloads': {
>>>>>>>>>>     '*payload1*': 'xml-payload-1',
>>>>>>>>>>     '*payload2*': 'xml-payload-2'
>>>>>>>>>>   }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Sinthuja.
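A minimal sketch of how the "$section:key" reference convention suggested above could be resolved on the receiving side. This is only an illustration under assumptions: the function name is hypothetical, the message uses valid double-quoted JSON keys, and there is no escaping rule for literal values that begin with '$' (the ambiguity raised elsewhere in the thread), because none has been defined.

```python
def resolve_refs(message):
    """Return copies of the events with '$section:key' strings dereferenced."""
    resolved = []
    for event in message["events"]:
        out = {}
        for field, value in event.items():
            if isinstance(value, str) and value.startswith("$"):
                # "$payloads:payload1" -> section "payloads", key "payload1"
                section, _, key = value[1:].partition(":")
                out[field] = message[section][key]
            else:
                out[field] = value  # in-line value or null, kept as-is
        resolved.append(out)
    return resolved


message = {
    "events": [
        {"messageId": "aaa", "payload": "$payloads:payload1", "output-payload": None},
        {"messageId": "ccc", "payload": "$payloads:payload2",
         "output-payload": "$payloads:payload2"},
    ],
    "payloads": {"payload1": "xml-payload-1", "payload2": "xml-payload-2"},
}

for event in resolve_refs(message):
    print(event)
```

Each distinct payload is carried once and shared by every field that references it, which is the saving the simplified format is after.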
>>>>>>>>>> >>>>>>>>>> On Wed, Feb 10, 2016 at 10:18 AM, Supun Sethunga <[email protected] >>>>>>>>>> > wrote: >>>>>>>>>> >>>>>>>>>>> Hi Budhdhima/Viraj, >>>>>>>>>>> >>>>>>>>>>> As per the discussion we had yesterday, follow is the format of >>>>>>>>>>> the json contains aggregated event details, to be sent to DAS. (you >>>>>>>>>>> may >>>>>>>>>>> change the attribute names of events). >>>>>>>>>>> >>>>>>>>>>> To explain it further, "events" contains the details about each >>>>>>>>>>> event sent by each mediator. Payload may or may not be populated. >>>>>>>>>>> "Payloads" section contains unique payloads and the mapping to the >>>>>>>>>>> events >>>>>>>>>>> their fields. (eg: 'xml-payload-2' maps to the 'payload' and >>>>>>>>>>> 'output-payload' fields of the 3rd event). >>>>>>>>>>> >>>>>>>>>>> { >>>>>>>>>>> 'events': [{ >>>>>>>>>>> 'messageId': 'aaa', >>>>>>>>>>> 'componentId': '111', >>>>>>>>>>> 'payload': null, >>>>>>>>>>> >>>>>>>>>> 'componentName': 'Proxy:TestProxy', >>>>>>>>>>> 'output-payload':null >>>>>>>>>>> }, { >>>>>>>>>>> 'messageId': 'bbb', >>>>>>>>>>> 'componentId': '222', >>>>>>>>>>> 'componentName': 'Proxy:TestProxy', >>>>>>>>>>> 'payload': null, >>>>>>>>>>> 'output-payload':null >>>>>>>>>>> }, { >>>>>>>>>>> 'messageId': 'ccc', >>>>>>>>>>> 'componentId': '789', >>>>>>>>>>> 'payload': null, >>>>>>>>>>> 'componentName': 'Proxy:TestProxy', >>>>>>>>>>> 'output-payload':null >>>>>>>>>>> }], >>>>>>>>>>> >>>>>>>>>>> 'payloads': [{ >>>>>>>>>>> 'payload': 'xml-payload-1', >>>>>>>>>>> 'events': [{ >>>>>>>>>>> 'eventIndex': 0, >>>>>>>>>>> 'attributes':['payload'] >>>>>>>>>>> }, { >>>>>>>>>>> 'eventIndex': 1, >>>>>>>>>>> 'attributes':['payload'] >>>>>>>>>>> }] >>>>>>>>>>> }, { >>>>>>>>>>> 'payload': 'xml-payload-2', >>>>>>>>>>> 'events': [{ >>>>>>>>>>> 'eventIndex': 2, >>>>>>>>>>> 'attributes':['payload','output-payload'] >>>>>>>>>>> }] >>>>>>>>>>> }] >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> Please let us know any further clarifications is needed, or if >>>>>>>>>>> 
there's anything to be modified/improved. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Supun >>>>>>>>>>> >>>>>>>>>>> On Tue, Feb 9, 2016 at 11:05 AM, Isuru Udana <[email protected]> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Kasun, >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Feb 9, 2016 at 10:10 AM, Kasun Indrasiri < >>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> I think for trancing use case we need to publish events one by >>>>>>>>>>>>> one from each mediator (we can't aggregate all such events as it >>>>>>>>>>>>> also >>>>>>>>>>>>> contains the message payload) >>>>>>>>>>>>> >>>>>>>>>>>> I think we can still do that with some extra effort. >>>>>>>>>>>> Most of the mediators in a sequence flow does not alter the >>>>>>>>>>>> message payload. We can store the payload only for the mediators >>>>>>>>>>>> which >>>>>>>>>>>> alter the message payload. And for others, we can put a reference >>>>>>>>>>>> to the >>>>>>>>>>>> previous entry. By doing that we can save the memory to a great >>>>>>>>>>>> extent. >>>>>>>>>>>> >>>>>>>>>>>> Thanks. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> ---------- Forwarded message ---------- >>>>>>>>>>>>> From: Supun Sethunga <[email protected]> >>>>>>>>>>>>> Date: Mon, Feb 8, 2016 at 2:54 PM >>>>>>>>>>>>> Subject: Re: ESB Analytics Mediation Event Publishing Mechanism >>>>>>>>>>>>> To: Anjana Fernando <[email protected]> >>>>>>>>>>>>> Cc: "[email protected]" <[email protected]>, >>>>>>>>>>>>> Srinath Perera <[email protected]>, Sanjiva Weerawarana < >>>>>>>>>>>>> [email protected]>, Kasun Indrasiri <[email protected]>, Isuru >>>>>>>>>>>>> Udana <[email protected]> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> Ran some simple performance tests against the new relational >>>>>>>>>>>>> provider, in comparison with the existing one. 
Following are the results:
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Records in Backend DB Table*: *1,054,057*
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Conversion:*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Backend DB Table
>>>>>>>>>>>>> id  data
>>>>>>>>>>>>> 1   [{'a':'aaa','b':'bbb','c':'ccc'},{'a':'xxx','b':'yyy','c':'zzz'},{'a':'ppp','b':'qqq','c':'rrr'}]
>>>>>>>>>>>>> 2   [{'a':'aaa','b':'bbb','c':'ccc'},{'a':'xxx','b':'yyy','c':'zzz'},{'a':'ppp','b':'qqq','c':'rrr'}]
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- To -->
>>>>>>>>>>>>>
>>>>>>>>>>>>> Spark Table
>>>>>>>>>>>>> id  a    b    c
>>>>>>>>>>>>> 1   aaa  bbb  ccc
>>>>>>>>>>>>> 1   xxx  yyy  zzz
>>>>>>>>>>>>> 1   ppp  qqq  rrr
>>>>>>>>>>>>> 2   aaa  bbb  ccc
>>>>>>>>>>>>> 2   xxx  yyy  zzz
>>>>>>>>>>>>> 2   ppp  qqq  rrr
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Avg Time for Query Execution (~ sec):*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Query                                                                       Existing Analytics   New (ESB) Analytics
>>>>>>>>>>>>>                                                                             Relation Provider    Relation Provider*
>>>>>>>>>>>>> SELECT COUNT(*) FROM <Table>;                                               13                   16
>>>>>>>>>>>>> SELECT * FROM <Table> ORDER BY id ASC;                                      13                   16
>>>>>>>>>>>>> SELECT * FROM <Table> WHERE id=98435;                                       13                   16
>>>>>>>>>>>>> SELECT id,a,first(b),first(c) FROM <Table> GROUP BY id,a ORDER BY id ASC;   18                   26
>>>>>>>>>>>>>
>>>>>>>>>>>>> * The new relation provider splits a single row into multiple rows. Hence the
>>>>>>>>>>>>> number of rows in the table is 3 times that of the original table (as each row
>>>>>>>>>>>>> is split into 3 rows).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 3, 2016 at 3:36 PM, Supun Sethunga <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have started working on implementing a new "relation" /
>>>>>>>>>>>>>> "relation provider" to serve the above requirement. This is basically a
>>>>>>>>>>>>>> modified version of the existing "Carbon Analytics" relation provider.
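The row-splitting conversion shown in the table above (one backend row whose "data" column holds a JSON array becomes one row per array element) can be sketched in plain Python as follows. This is not the Spark relation provider itself; it is only a hedged illustration of the mapping, with a hypothetical function name and with the array written as valid double-quoted JSON.

```python
import json


def explode_rows(rows, column="data"):
    """Yield one flat row per element of the JSON array stored in `column`."""
    for row in rows:
        # Keep every other column (here just "id") on each produced row.
        base = {k: v for k, v in row.items() if k != column}
        for element in json.loads(row[column]):
            flat = dict(base)
            flat.update(element)
            yield flat


DATA = ('[{"a":"aaa","b":"bbb","c":"ccc"},'
        '{"a":"xxx","b":"yyy","c":"zzz"},'
        '{"a":"ppp","b":"qqq","c":"rrr"}]')

backend_rows = [{"id": 1, "data": DATA}, {"id": 2, "data": DATA}]
spark_rows = list(explode_rows(backend_rows))

for row in spark_rows:
    print(row)
```

Two backend rows with three elements each yield six flat rows, matching the 3x row count noted under the timing table.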
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here I have assumed that the encapsulated data for a single
>>>>>>>>>>>>>> execution flow is stored in a single row, and that the data about the
>>>>>>>>>>>>>> mediators invoked during the flow is stored in a known column of each row
>>>>>>>>>>>>>> (say "data"), as an array (say a JSON array). When each row is read into
>>>>>>>>>>>>>> Spark, this relation provider creates a separate row for each
>>>>>>>>>>>>>> element in the array stored in the "data" column. I have tested this with some
>>>>>>>>>>>>>> mocked data, and it works as expected.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Need to test with the real data/data formats, and modify the
>>>>>>>>>>>>>> mapping accordingly. Will update the thread with the details.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Supun
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Feb 2, 2016 at 2:36 AM, Anjana Fernando <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In a meeting I had with Kasun and the ESB team, I got to know
>>>>>>>>>>>>>>> that, for their tracing mechanism, they were instructed to publish one
>>>>>>>>>>>>>>> event for each mediator invocation, whereas earlier they had an
>>>>>>>>>>>>>>> approach where they published one event that encapsulated the data of a whole
>>>>>>>>>>>>>>> execution flow. I would actually like to support the latter approach,
>>>>>>>>>>>>>>> mainly due to performance / resource requirements, and also considering the
>>>>>>>>>>>>>>> fact that this is a feature that could be enabled in production. Simply put, if
>>>>>>>>>>>>>>> we do one event per mediator, this does not scale that well.
>>>>>>>>>>>>>>> For example,
>>>>>>>>>>>>>>> if the ESB is doing 1k TPS, for a sequence that has 20 mediators, that is
>>>>>>>>>>>>>>> 20k TPS of analytics traffic. Combine that with a possible ESB cluster
>>>>>>>>>>>>>>> hitting a DAS cluster with a single backend database, and this may be too many
>>>>>>>>>>>>>>> rows per second written to the database. The main problem here is that
>>>>>>>>>>>>>>> one event is a single row/record in the backend database in DAS, so it may
>>>>>>>>>>>>>>> come to a state where the frequency of row creations by events coming from
>>>>>>>>>>>>>>> ESBs cannot be sustained.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we create a single event from the 20 mediators, then it
>>>>>>>>>>>>>>> is just 1k TPS for the DAS event receivers and the database too, even though
>>>>>>>>>>>>>>> the message size is bigger. Publishing lots of small events does not necessarily
>>>>>>>>>>>>>>> perform the same as publishing bigger events. Throughput-wise,
>>>>>>>>>>>>>>> comparatively bigger events will win (even if we consider that
>>>>>>>>>>>>>>> small operations will be batched at the transport level etc., it is still one event =
>>>>>>>>>>>>>>> one database row). So I would suggest we try out a "single sequence flow =
>>>>>>>>>>>>>>> single event" approach, and on the Spark processing side, we consider one
>>>>>>>>>>>>>>> of these big rows as multiple rows in Spark.
I was first thinking about whether UDFs
>>>>>>>>>>>>>>> can help in splitting a single column into multiple rows, but that is not
>>>>>>>>>>>>>>> possible, and it would also be a bit troublesome, considering we would have to delete the
>>>>>>>>>>>>>>> original data table after we converted it using a script, not
>>>>>>>>>>>>>>> forgetting that we would actually have to schedule and run a separate script to do
>>>>>>>>>>>>>>> this post-processing. So a much cleaner way to do this would be to create
>>>>>>>>>>>>>>> a new "relation provider" in Spark (which is like a data adapter for their
>>>>>>>>>>>>>>> DataFrames), and in our relation provider, when we are reading rows, we
>>>>>>>>>>>>>>> convert a single row's column to multiple rows and return that for
>>>>>>>>>>>>>>> processing. So Spark will not know that physically it was a single row from the
>>>>>>>>>>>>>>> data layer, and it can summarize the data as usual and write to the
>>>>>>>>>>>>>>> target summary tables. [1] is our existing implementation of the Spark relation
>>>>>>>>>>>>>>> provider, which directly maps to our DAS analytics tables; we can create
>>>>>>>>>>>>>>> the new one extending / based on it. So I suggest we try out this approach
>>>>>>>>>>>>>>> and see if everyone is okay with it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://github.com/wso2/carbon-analytics/blob/master/components/analytics-processors/org.wso2.carbon.analytics.spark.core/src/main/java/org/wso2/carbon/analytics/spark/core/sources/AnalyticsRelationProvider.java
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Anjana.
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> *Anjana Fernando*
>>>>>>>>>>>>>>> Senior Technical Lead
>>>>>>>>>>>>>>> WSO2 Inc. | http://wso2.com
>>>>>>>>>>>>>>> lean . enterprise .
middleware >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> You received this message because you are subscribed to the >>>>>>>>>>>>>>> Google Groups "WSO2 Engineering Group" group. >>>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails >>>>>>>>>>>>>>> from it, send an email to >>>>>>>>>>>>>>> [email protected]. >>>>>>>>>>>>>>> For more options, visit >>>>>>>>>>>>>>> https://groups.google.com/a/wso2.com/d/optout. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> *Supun Sethunga* >>>>>>>>>>>>>> Software Engineer >>>>>>>>>>>>>> WSO2, Inc. >>>>>>>>>>>>>> http://wso2.com/ >>>>>>>>>>>>>> lean | enterprise | middleware >>>>>>>>>>>>>> Mobile : +94 716546324 >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> *Supun Sethunga* >>>>>>>>>>>>> Software Engineer >>>>>>>>>>>>> WSO2, Inc. >>>>>>>>>>>>> http://wso2.com/ >>>>>>>>>>>>> lean | enterprise | middleware >>>>>>>>>>>>> Mobile : +94 716546324 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Kasun Indrasiri >>>>>>>>>>>>> Software Architect >>>>>>>>>>>>> WSO2, Inc.; http://wso2.com >>>>>>>>>>>>> lean.enterprise.middleware >>>>>>>>>>>>> >>>>>>>>>>>>> cell: +94 77 556 5206 >>>>>>>>>>>>> Blog : http://kasunpanorama.blogspot.com/ >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> *Isuru Udana* >>>>>>>>>>>> Associate Technical Lead >>>>>>>>>>>> WSO2 Inc.; http://wso2.com >>>>>>>>>>>> email: [email protected] cell: +94 77 3791887 >>>>>>>>>>>> blog: http://mytecheye.blogspot.com/ >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> *Supun Sethunga* >>>>>>>>>>> Software Engineer >>>>>>>>>>> WSO2, Inc. 
>>>>>>>>>>> http://wso2.com/ >>>>>>>>>>> lean | enterprise | middleware >>>>>>>>>>> Mobile : +94 716546324 >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Architecture mailing list >>>>>>>>>>> [email protected] >>>>>>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> *Sinthuja Rajendran* >>>>>>>>>> Associate Technical Lead >>>>>>>>>> WSO2, Inc.:http://wso2.com >>>>>>>>>> >>>>>>>>>> Blog: http://sinthu-rajan.blogspot.com/ >>>>>>>>>> Mobile: +94774273955 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> Architecture mailing list >>>>>>>>>> [email protected] >>>>>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> *Supun Sethunga* >>>>>>>>> Software Engineer >>>>>>>>> WSO2, Inc. >>>>>>>>> http://wso2.com/ >>>>>>>>> lean | enterprise | middleware >>>>>>>>> Mobile : +94 716546324 >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> *Sinthuja Rajendran* >>>>>>>> Associate Technical Lead >>>>>>>> WSO2, Inc.:http://wso2.com >>>>>>>> >>>>>>>> Blog: http://sinthu-rajan.blogspot.com/ >>>>>>>> Mobile: +94774273955 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Sinthuja Rajendran* >>>>>>> Associate Technical Lead >>>>>>> WSO2, Inc.:http://wso2.com >>>>>>> >>>>>>> Blog: http://sinthu-rajan.blogspot.com/ >>>>>>> Mobile: +94774273955 >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> *Supun Sethunga* >>>>>> Software Engineer >>>>>> WSO2, Inc. >>>>>> http://wso2.com/ >>>>>> lean | enterprise | middleware >>>>>> Mobile : +94 716546324 >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> *Supun Sethunga* >>>>> Software Engineer >>>>> WSO2, Inc. 
>>>>> http://wso2.com/ >>>>> lean | enterprise | middleware >>>>> Mobile : +94 716546324 >>>>> >>>>> _______________________________________________ >>>>> Architecture mailing list >>>>> [email protected] >>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture >>>>> >>>>> >>>> >>>> >>>> -- >>>> Dushan Abeyruwan | Technical Lead >>>> >>>> PMC Member Apache Synpase >>>> WSO2 Inc. http://wso2.com/ >>>> Blog:*http://www.dushantech.com/ <http://www.dushantech.com/>* >>>> Mobile:(001)408-791-9312 >>>> >>>> >>>> _______________________________________________ >>>> Architecture mailing list >>>> [email protected] >>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture >>>> >>>> >>> >>> >>> -- >>> *Supun Sethunga* >>> Software Engineer >>> WSO2, Inc. >>> http://wso2.com/ >>> lean | enterprise | middleware >>> Mobile : +94 716546324 >>> >>> _______________________________________________ >>> Architecture mailing list >>> [email protected] >>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture >>> >>> >> >> >> -- >> Viraj Senevirathne >> Software Engineer; WSO2, Inc. >> >> Mobile : +94 71 958 0269 >> Email : [email protected] >> > > > > -- > *Supun Sethunga* > Software Engineer > WSO2, Inc. > http://wso2.com/ > lean | enterprise | middleware > Mobile : +94 716546324 > > _______________________________________________ > Architecture mailing list > [email protected] > https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture > > -- Sanjiva Weerawarana, Ph.D. Founder, CEO & Chief Architect; WSO2, Inc.; http://wso2.com/ email: [email protected]; office: (+1 650 745 4499 | +94 11 214 5345) x5700; cell: +94 77 787 6880 | +1 408 466 5099; voip: +1 650 265 8311 blog: http://sanjiva.weerawarana.org/; twitter: @sanjiva Lean . Enterprise . Middleware
