Hi Srinath,

On Thu, May 26, 2016 at 12:09 PM, Srinath Perera <[email protected]> wrote:

> Hi Inosh,
>
> Good catch!! I am +1. Can we do this just by configs or do we need a
> patch? If so can we patch before we release?
>

We can do this with a configuration change.


>
> Anjana, can't we use HDFS for EVENT_STORE and use MySQL only for the
> processed data store (long term)?
>

We can. That is the best way to run Spark jobs in parallel without
affecting receiver performance.


>
> --Srinath
>
> On Wed, May 25, 2016 at 8:10 PM, Inosh Goonewardena <[email protected]>
> wrote:
>
>> Hi,
>>
>> At the moment DAS supports both MyISAM and InnoDB, but it is configured to
>> use MyISAM by default.
>>
>> There are several differences between MyISAM and InnoDB, but the one most
>> relevant to DAS is concurrency. Basically, MyISAM uses table-level locking
>> and InnoDB uses row-level locking. So, with MyISAM, if we run Spark queries
>> while publishing data to DAS, at higher TPS it can lead to issues: the DAL
>> cannot obtain the table lock to insert data while Spark is reading from
>> the same table.
>>
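
To make the locking difference concrete, here is a toy Python sketch. It is
not how MySQL implements locking (it ignores InnoDB's MVCC reads and MyISAM's
concurrent inserts), and the LockManager class and the RAW_EVENTS table name
are made up for illustration. It only shows why a writer stalls behind a
reader under table-level locking but not under row-level locking.

```python
import threading
from collections import defaultdict


class LockManager:
    """Toy lock manager: one lock per table, or one lock per (table, row)."""

    def __init__(self, granularity):
        if granularity not in ("table", "row"):
            raise ValueError("granularity must be 'table' or 'row'")
        self.granularity = granularity
        self._locks = defaultdict(threading.Lock)

    def _key(self, table, row_id):
        # Table-level locking ignores the row; row-level locking keys on it.
        return (table,) if self.granularity == "table" else (table, row_id)

    def try_lock(self, table, row_id):
        """Return True if the lock was obtained without waiting."""
        return self._locks[self._key(table, row_id)].acquire(blocking=False)

    def unlock(self, table, row_id):
        self._locks[self._key(table, row_id)].release()


def insert_succeeds_during_read(granularity):
    """A reader holds row 1; can a writer insert row 2 without waiting?"""
    locks = LockManager(granularity)
    locks.try_lock("RAW_EVENTS", 1)            # reader, e.g. a Spark query
    try:
        got = locks.try_lock("RAW_EVENTS", 2)  # writer, e.g. the receiver
        if got:
            locks.unlock("RAW_EVENTS", 2)
        return got
    finally:
        locks.unlock("RAW_EVENTS", 1)
```

With "table" granularity the insert has to wait for the reader; with "row"
granularity it goes through immediately.
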
>> On the other hand, with InnoDB the write speed is considerably slower
>> (because it is designed to support transactions), so it will affect the
>> receiver performance.
>>
>> One option we have in DAS is to use two DBs to keep incoming records and
>> processed records separately, i.e., EVENT_STORE and PROCESSED_DATA_STORE.
>>
>> For ESB Analytics, we can configure MyISAM for EVENT_STORE and InnoDB for
>> PROCESSED_DATA_STORE. This works because in ESB Analytics, summarization up
>> to the minute level is done by real-time analytics, and Spark queries read
>> and process data from the per-minute (and higher) tables, which we can keep
>> in PROCESSED_DATA_STORE. Since the raw table (which the data receiver
>> writes to) is not used by Spark queries, the receiver performance will not
>> be affected.
>>
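
As a rough illustration of that split, the sketch below picks an engine per
table and emits MySQL DDL with an explicit ENGINE clause. This is not the DAS
configuration format; the function names, table names, and columns are all
hypothetical, and only the ENGINE=... clause is standard MySQL syntax.

```python
def choose_engine(written_by_receiver, read_by_spark):
    """Pick a MySQL storage engine for a table, following the reasoning above."""
    if written_by_receiver and not read_by_spark:
        return "MyISAM"   # write-heavy raw data, no concurrent Spark scans
    return "InnoDB"       # Spark reads it, so avoid table-level locks


def create_table_sql(table, engine):
    """Build a CREATE TABLE statement with an explicit ENGINE clause."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "record_id VARCHAR(50) PRIMARY KEY, "
        "ts BIGINT, "
        "payload LONGBLOB"
        f") ENGINE={engine}"
    )


# Hypothetical tables: one raw table in EVENT_STORE, one summary table in
# PROCESSED_DATA_STORE.
raw_ddl = create_table_sql("ESB_RAW_EVENTS", choose_engine(True, False))
summary_ddl = create_table_sql("ESB_STATS_PER_MINUTE", choose_engine(False, True))
```

If Spark queries also read the raw table, choose_engine(True, True) falls back
to InnoDB, which is the case the proposed default is meant to cover.
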
>> However, in most cases, Spark queries may be written to read data directly
>> from raw tables. As mentioned above, with MyISAM this could lead to
>> performance issues if data publishing and Spark analytics happen in
>> parallel. Considering that, I think we should change the default
>> configuration to use InnoDB. WDYT?
>>
>> --
>> Thanks & Regards,
>>
>> Inosh Goonewardena
>> Associate Technical Lead- WSO2 Inc.
>> Mobile: +94779966317
>>
>
>
>
> --
> ============================
> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
> Site: http://home.apache.org/~hemapani/
> Photos: http://www.flickr.com/photos/hemapani/
> Phone: 0772360902
>



-- 
Thanks & Regards,

Inosh Goonewardena
Associate Technical Lead- WSO2 Inc.
Mobile: +94779966317
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
