Hi Srinath,

On Thu, May 26, 2016 at 12:09 PM, Srinath Perera <[email protected]> wrote:
> Hi Inosh,
>
> Good catch!! I am +1. Can we do this just by configs or do we need a
> patch? If so can we patch before we release?
>

We can do this by a configuration change.

>
> Anjana, cannot we use HDFS for EVENT_STORE and use MySQL only for the
> processed data store? (long term)
>

We can. This is the best approach, since it does not affect receiver performance while Spark jobs run in parallel.

>
> --Srinath
>
> On Wed, May 25, 2016 at 8:10 PM, Inosh Goonewardena <[email protected]>
> wrote:
>
>> Hi,
>>
>> At the moment DAS supports both MyISAM and InnoDB, but is configured to
>> use MyISAM by default.
>>
>> There are several differences between MyISAM and InnoDB, but what is most
>> relevant with regard to DAS is the difference in concurrency. Basically,
>> MyISAM uses table-level locking and InnoDB uses row-level locking. So, with
>> MyISAM, if we are running Spark queries while publishing data to DAS, at
>> higher TPS it can lead to issues because the DAL layer cannot obtain the
>> table lock to insert data into the table while Spark is reading from the
>> same table.
>>
>> However, on the other hand, with InnoDB write speed is considerably slower
>> (because it is designed to support transactions), so it will affect the
>> receiver performance.
>>
>> One option we have in DAS is to use two DBs to keep incoming records and
>> processed records separately, i.e., EVENT_STORE and PROCESSED_DATA_STORE.
>>
>> For ESB Analytics, we can configure MyISAM for EVENT_STORE and InnoDB for
>> PROCESSED_DATA_STORE. This is because in ESB Analytics, summarizing up to
>> minute level is done by real-time analytics, and Spark queries will read
>> and process data using the minutely (and higher) tables, which we can keep
>> in PROCESSED_DATA_STORE. Since the raw table (to which the data receiver
>> writes data) is not being used by Spark queries, the receiver performance
>> will not be affected.
>>
>> However, in most cases, Spark queries may be written to read data directly
>> from raw tables. As mentioned above, with MyISAM this could lead to
>> performance issues if data publishing and Spark analytics happen in
>> parallel. So considering that, I think we should change the default
>> configuration to use InnoDB. WDYT?
>>
>> --
>> Thanks & Regards,
>>
>> Inosh Goonewardena
>> Associate Technical Lead- WSO2 Inc.
>> Mobile: +94779966317
>>
>
>
>
> --
> ============================
> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
> Site: http://home.apache.org/~hemapani/
> Photos: http://www.flickr.com/photos/hemapani/
> Phone: 0772360902
>

--
Thanks & Regards,

Inosh Goonewardena
Associate Technical Lead- WSO2 Inc.
Mobile: +94779966317
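
The locking difference discussed in this thread can be illustrated with a small toy model. The Python sketch below is not how MySQL implements MyISAM or InnoDB locking, and it does not touch DAS or its DAL layer; the names TableLockStore, RowLockStore, try_insert and demo are hypothetical and exist only for this illustration. It assumes a long-running read holds one lock on the whole table in the first case and a lock on a single row in the second, and shows when a concurrent insert can proceed.

```python
import threading


class TableLockStore:
    """Toy model of table-level locking: one lock guards the whole table."""

    def __init__(self):
        self._table_lock = threading.Lock()
        self.rows = {}

    def begin_scan(self):
        # A long-running read (e.g. an analytics query) holds the table lock.
        self._table_lock.acquire()

    def end_scan(self):
        self._table_lock.release()

    def try_insert(self, key, value):
        # Non-blocking attempt: returns False while a scan holds the table.
        if not self._table_lock.acquire(blocking=False):
            return False
        try:
            self.rows[key] = value
            return True
        finally:
            self._table_lock.release()


class RowLockStore:
    """Toy model of row-level locking: each row has its own lock."""

    def __init__(self):
        self._row_locks = {}
        self._guard = threading.Lock()  # protects the lock dictionary itself
        self.rows = {}

    def _lock_for(self, key):
        with self._guard:
            return self._row_locks.setdefault(key, threading.Lock())

    def begin_read(self, key):
        self._lock_for(key).acquire()

    def end_read(self, key):
        self._lock_for(key).release()

    def try_insert(self, key, value):
        # Only contends with operations on the same row.
        lock = self._lock_for(key)
        if not lock.acquire(blocking=False):
            return False
        try:
            self.rows[key] = value
            return True
        finally:
            lock.release()


def demo():
    table_store = TableLockStore()
    table_store.begin_scan()
    blocked_during_scan = table_store.try_insert("event-1", "payload")
    table_store.end_scan()
    ok_after_scan = table_store.try_insert("event-1", "payload")

    row_store = RowLockStore()
    row_store.begin_read("event-0")
    ok_during_read = row_store.try_insert("event-1", "payload")
    row_store.end_read("event-0")

    return blocked_during_scan, ok_after_scan, ok_during_read
```

In this model the insert into the table-locked store is rejected while the scan is active and succeeds once it ends, whereas the row-locked store accepts an insert for a different row during the read. A real receiver would wait for the lock rather than give up, which is where the throughput impact at higher TPS would come from.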
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
