+1, we should reuse. I think what ML needs is a subset.

On Mon, Jan 26, 2015 at 11:10 AM, Nirmal Fernando <[email protected]> wrote:

> Hi Anjana,
>
> Isn't this a generic interface to talk to a back-end data store? If so, do
> you think this can be reused in other products?  In ML, we have a similar
> use-case where we need to talk to a generic data layer to store the models
> that are generated.
>
> On Wed, Dec 10, 2014 at 1:37 PM, Anjana Fernando <[email protected]> wrote:
>
>> Hi,
>>
>> I've finished the initial implementation of $subject. This basically
>> contains the standard interfaces we use to plug in different data sources
>> as the back-end record storage, and for indexing purposes. These pluggable
>> data sources are called "Analytics Data Sources" here, where from a
>> configuration file, you can give the implementation class and the
>> properties required for the initialization. The first implementation of
>> this is done, which is the RDBMS implementation. It basically stores all
>> the records and other data in a relational database, and any type of
>> database can be supported via a configuration file, which gives the query
>> templates used to define a standard set of actions. At the moment, H2 and
>> MySQL query templates have been tested, and we will be adding templates
>> for the rest of the popular RDBMSs as well. The RDBMS AnalyticsDataSource
>> implementation detects the query template by looking at the database
>> connection information, retrieved from the data source (e.g. mentioned in
>> master-datasources.xml), and automatically switches to that mode, so the
>> user basically doesn't have to do anything when configuring.
>>
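
To make the template switching above concrete, here is a rough sketch of how a query template could be picked from the database product name. This is only an illustration under assumptions: the class, method names, the `{table}` placeholder and the SQL strings are all made up, not taken from the carbon-analytics code. On a live connection the product name could be read with `java.sql.DatabaseMetaData.getDatabaseProductName()`; a plain string stands in for it here so the sketch runs without a JDBC driver.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: none of these names come from the actual implementation.
class QueryTemplateSketch {

    // Illustrative "create table" templates keyed by lower-case product name.
    static final Map<String, String> CREATE_TABLE_TEMPLATES = new LinkedHashMap<>();
    static {
        CREATE_TABLE_TEMPLATES.put("h2",
            "CREATE TABLE IF NOT EXISTS {table} (record_id VARCHAR(50), data BLOB)");
        CREATE_TABLE_TEMPLATES.put("mysql",
            "CREATE TABLE IF NOT EXISTS {table} (record_id VARCHAR(50), data LONGBLOB)");
    }

    // Returns the template for the given product name, or empty if none is configured.
    static Optional<String> selectTemplate(String productName) {
        if (productName == null) {
            return Optional.empty();
        }
        String key = productName.trim().toLowerCase(Locale.ROOT);
        return Optional.ofNullable(CREATE_TABLE_TEMPLATES.get(key));
    }

    // Substitutes the table name into the assumed {table} placeholder.
    static String render(String template, String table) {
        return template.replace("{table}", table);
    }

    public static void main(String[] args) {
        String sql = selectTemplate("MySQL")
            .map(t -> render(t, "EVENTS"))
            .orElseThrow(() -> new IllegalStateException("no template for this database"));
        System.out.println(sql);
    }
}
```

With this shape, supporting another database would only mean adding one more entry to the template map, which matches the "any type of database can be supported via a configuration file" idea in spirit.
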
>> Also, inside the AnalyticsDataSource interface, there is a FileSystem
>> interface you need to implement for your data source implementation, which
>> is basically used for indexing, which is done by Lucene. We use Lucene
>> indexes as index shards for a distributed index and search. So with the
>> sharding approach, we can add more nodes to our cluster to improve the
>> indexing performance, and to add storage capacity. Basically, provided the
>> backend storage is scalable, the index operations would also scale in the
>> same manner. But the limits we hit first are the processing requirements,
>> and the random data access and locking requirements of each shard, so for
>> a typical database system, I'm hoping the indexing performance will
>> increase almost linearly just by adding new BAM nodes.
>>
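
A small sketch of the sharding idea above, purely as an assumption about how it could work: the mail does not say how records are assigned to shards or shards to nodes, so the hash-modulo routing and round-robin ownership below are illustrative choices, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: hash-based shard routing, not the actual implementation.
class ShardRoutingSketch {

    // Maps a record id to one of shardCount index shards.
    // Math.floorMod keeps the result non-negative for negative hash codes.
    static int shardFor(String recordId, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        return Math.floorMod(recordId.hashCode(), shardCount);
    }

    // Assigns shards to nodes round-robin (an assumed policy).
    static int nodeFor(int shardIndex, int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        return shardIndex % nodeCount;
    }

    // Lists the shards a given node would index under the round-robin policy.
    static List<Integer> shardsOwnedBy(int node, int shardCount, int nodeCount) {
        List<Integer> owned = new ArrayList<>();
        for (int shard = 0; shard < shardCount; shard++) {
            if (nodeFor(shard, nodeCount) == node) {
                owned.add(shard);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        // With 6 shards, going from 2 to 3 nodes lowers the shards per node from 3 to 2.
        System.out.println(shardsOwnedBy(0, 6, 2));
        System.out.println(shardsOwnedBy(0, 6, 3));
    }
}
```

The point of the sketch is only that the number of shards each node has to index drops as nodes are added, which is where the hoped-for near-linear gain would come from, as long as the backend storage keeps up.
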
>> The AnalyticsDataSource implementations are ultimately used by a component
>> called AnalyticsDataService, which is the interface seen by clients. It
>> combines the indexing-related operations with the record store
>> functionality exposed through AnalyticsDataSource. This interface can be
>> looked up as an OSGi service, and we also plan to expose this functionality
>> as a JAX-RS service.
>>
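
To illustrate the layering described above (a client-facing service on top of a pluggable record store plus an index), here is a minimal sketch. It is not the real AnalyticsDataService or AnalyticsDataSource API: every interface, class and method name is hypothetical, and a simple in-memory map stands in for the Lucene index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the pluggable record store side of a data source.
interface RecordStoreSketch {
    void put(String table, String id, Map<String, String> values);
    Map<String, String> get(String table, String id);
}

// In-memory record store, playing the role an RDBMS implementation would play.
class InMemoryRecordStore implements RecordStoreSketch {
    private final Map<String, Map<String, String>> rows = new HashMap<>();

    public void put(String table, String id, Map<String, String> values) {
        rows.put(table + "/" + id, new HashMap<>(values));
    }

    public Map<String, String> get(String table, String id) {
        return rows.get(table + "/" + id);
    }
}

// Hypothetical client-facing service: stores records and keeps them searchable.
class DataServiceSketch {
    private final RecordStoreSketch store;
    // Stand-in for the index: "table/field/value" -> matching record ids.
    private final Map<String, Set<String>> index = new HashMap<>();

    DataServiceSketch(RecordStoreSketch store) {
        this.store = store;
    }

    // Writes the record, then indexes each field value (exact match only).
    // Simplification: re-inserting an id does not remove its old index entries.
    void insert(String table, String id, Map<String, String> values) {
        store.put(table, id, values);
        for (Map.Entry<String, String> e : values.entrySet()) {
            index.computeIfAbsent(key(table, e.getKey(), e.getValue()),
                k -> new TreeSet<>()).add(id);
        }
    }

    // Looks up ids in the index, then fetches the records from the store.
    List<Map<String, String>> search(String table, String field, String value) {
        List<Map<String, String>> results = new ArrayList<>();
        Set<String> ids = index.getOrDefault(key(table, field, value),
            Collections.<String>emptySet());
        for (String id : ids) {
            results.add(store.get(table, id));
        }
        return results;
    }

    private static String key(String table, String field, String value) {
        return table + "/" + field + "/" + value;
    }
}
```

Clients here only see the service, so swapping the in-memory store for another back end would not change the calling code, which is the property the reuse discussion in this thread depends on.
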
>> The general design and the documentation on the test cases can be found
>> at [1] and [2], and the source code at [3]. I will be doing some further
>> performance tests, especially on the distributed search, by integrating
>> this into the product properly, and will provide the results here. For the
>> moment, we have a few performance tests as unit tests in the modules. This
>> implementation will first be used by the log analysis implementation done
>> by Gimantha. We are also planning to write further AnalyticsDataSource
>> implementations for this, such as MongoDB, HBase, etc. There will be
>> separate notes on those.
>>
>> [1]
>> https://docs.google.com/a/wso2.com/spreadsheets/d/10mHRE6FEgF6wDZ-LSBx18zL8ZcIay5ZIhb8MIk7pfeg/edit#gid=0
>> [2]
>> https://docs.google.com/a/wso2.com/spreadsheets/d/1iXoZ8BzaefN3EGOL05y5aUX6SLZH7Bu8YM4bF3xOSvQ/edit#gid=0
>> [3]
>> https://github.com/wso2-dev/carbon-analytics/tree/master/components/xanalytics
>>
>> Cheers,
>> Anjana.
>> --
>> *Anjana Fernando*
>> Senior Technical Lead
>> WSO2 Inc. | http://wso2.com
>> lean . enterprise . middleware
>>
>> _______________________________________________
>> Architecture mailing list
>> [email protected]
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>>
>
>
> --
>
> Thanks & regards,
> Nirmal
>
> Senior Software Engineer- Platform Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/
>
>
>


-- 
============================
Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
Site: http://people.apache.org/~hemapani/
Photos: http://www.flickr.com/photos/hemapani/
Phone: 0772360902