Hi,

At the moment DAS supports both MyISAM and InnoDB, but it is configured to
use MyISAM by default.

There are several differences between MyISAM and InnoDB, but the one most
relevant to DAS is concurrency. Basically, MyISAM uses table-level locking
and InnoDB uses row-level locking. So, with MyISAM, if we run Spark queries
while publishing data to DAS, at higher TPS this can lead to issues: the DAL
cannot obtain the table lock to insert data into a table while Spark is
reading from the same table.
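
To make the locking difference above concrete, here is a minimal Python
sketch. It is a deliberately simplified model, not how MySQL is implemented:
real MyISAM distinguishes read and write locks and can allow concurrent
inserts in some cases, and InnoDB readers normally do not take row locks at
all. The class and function names are hypothetical and only illustrate lock
granularity.

```python
import threading


class TableLockStore:
    """MyISAM-like model: a single lock guards the whole table."""

    def __init__(self):
        self._table_lock = threading.Lock()

    def lock_for(self, row_id):
        # Every row maps to the same table-wide lock.
        return self._table_lock


class RowLockStore:
    """InnoDB-like model: every row gets its own lock."""

    def __init__(self):
        self._row_locks = {}
        self._guard = threading.Lock()  # protects the lock dictionary itself

    def lock_for(self, row_id):
        with self._guard:
            return self._row_locks.setdefault(row_id, threading.Lock())


def writer_can_proceed(store, read_row, write_row):
    """Return True if a writer can lock write_row while a reader holds read_row."""
    reader_lock = store.lock_for(read_row)
    reader_lock.acquire()
    try:
        writer_lock = store.lock_for(write_row)
        # Non-blocking attempt: False means the writer would have to wait.
        acquired = writer_lock.acquire(blocking=False)
        if acquired:
            writer_lock.release()
        return acquired
    finally:
        reader_lock.release()
```

In this model, a writer inserting row 2 while a reader holds row 1 has to
wait with TableLockStore but proceeds with RowLockStore, which is the
contention the receiver would hit while Spark is reading.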

On the other hand, with InnoDB, write speed is considerably slower (because
it is designed to support transactions), so it will affect the receiver
performance.

One option we have in DAS is to use two DBs to keep incoming records and
processed records separately, i.e., EVENT_STORE and PROCESSED_DATA_STORE.

For ESB Analytics, we can configure MyISAM for EVENT_STORE and InnoDB for
PROCESSED_DATA_STORE. This works because in ESB Analytics, summarization up
to minute level is done by real-time analytics, and Spark queries read and
process data from the per-minute (and higher) tables, which we can keep in
PROCESSED_DATA_STORE. Since the raw table (to which the data receiver
writes) is not used by Spark queries, the receiver performance will not be
affected.
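
The split above can be sketched as a small Python check. This is only an
illustration under assumed names: the table names RAW_EVENTS and
STATS_PER_MINUTE and the Table fields are hypothetical, and only the store
names and engine choices come from this mail. It flags tables where
receiver writes and Spark reads would meet on a table-locking engine.

```python
from dataclasses import dataclass

# Engine per record store, as proposed for ESB Analytics in this mail.
STORE_ENGINE = {
    "EVENT_STORE": "MyISAM",
    "PROCESSED_DATA_STORE": "InnoDB",
}


@dataclass(frozen=True)
class Table:
    name: str                  # hypothetical table name
    store: str                 # key into STORE_ENGINE
    written_by_receiver: bool  # the data receiver inserts into this table
    read_by_spark: bool        # Spark queries read from this table


def contended_tables(tables):
    """Names of tables where receiver writes and Spark reads share a MyISAM table."""
    return [
        t.name
        for t in tables
        if STORE_ENGINE[t.store] == "MyISAM"
        and t.written_by_receiver
        and t.read_by_spark
    ]


# ESB Analytics layout: Spark only touches the summarized tables.
esb_layout = [
    Table("RAW_EVENTS", "EVENT_STORE", True, False),
    Table("STATS_PER_MINUTE", "PROCESSED_DATA_STORE", False, True),
]

# Layout where a Spark query reads the raw table directly.
direct_read_layout = [
    Table("RAW_EVENTS", "EVENT_STORE", True, True),
]
```

With esb_layout the check reports no contended tables, while
direct_read_layout reports RAW_EVENTS, since Spark and the receiver then
share a MyISAM table.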

However, in most cases, Spark queries may be written to read data directly
from raw tables. As mentioned above, with MyISAM this could lead to
performance issues if data publishing and Spark analytics happen in
parallel. Considering that, I think we should change the default
configuration to use InnoDB. WDYT?

-- 
Thanks & Regards,

Inosh Goonewardena
Associate Technical Lead- WSO2 Inc.
Mobile: +94779966317