Hi Anjana,

Isn't this a generic interface for talking to a back-end data store? If so, do you think it can be reused in other products? In ML, we have a similar use case: we need to talk to a generic data layer to store the models that are generated.
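To make the reuse question concrete, here is a rough sketch of the kind of pluggable store meant here, where a configuration supplies the implementation class and its initialization properties, as in the design quoted below. All names in it (BlobStore, InMemoryBlobStore, BlobStoreFactory, the "namespace" property) are hypothetical and are not taken from the carbon-analytics code; it only illustrates the pattern under those assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical storage contract; NOT the actual AnalyticsDataSource interface. */
interface BlobStore {
    void init(Map<String, String> properties) throws Exception;
    void put(String id, byte[] data) throws Exception;
    byte[] get(String id) throws Exception;
}

/** Toy in-memory implementation, standing in for an RDBMS or other back end. */
class InMemoryBlobStore implements BlobStore {
    private final Map<String, byte[]> data = new HashMap<>();
    private String namespace = "default";

    @Override
    public void init(Map<String, String> properties) {
        // "namespace" is an assumed property name, used only for this sketch.
        namespace = properties.getOrDefault("namespace", "default");
    }

    @Override
    public void put(String id, byte[] bytes) {
        data.put(namespace + "/" + id, bytes.clone());
    }

    @Override
    public byte[] get(String id) {
        byte[] stored = data.get(namespace + "/" + id);
        return stored == null ? null : stored.clone();
    }
}

/** Creates a store from a class name and properties, as a config file might supply them. */
final class BlobStoreFactory {
    private BlobStoreFactory() { }

    static BlobStore create(String implementationClass,
                            Map<String, String> properties) throws Exception {
        Object instance = Class.forName(implementationClass)
                .getDeclaredConstructor()
                .newInstance();
        if (!(instance instanceof BlobStore)) {
            throw new IllegalArgumentException(
                    implementationClass + " does not implement BlobStore");
        }
        BlobStore store = (BlobStore) instance;
        store.init(properties);
        return store;
    }
}

public class BlobStoreDemo {
    public static void main(String[] args) throws Exception {
        Map<String, String> props = new HashMap<>();
        props.put("namespace", "ml-models");

        // In a real setup the class name would be read from configuration.
        BlobStore store = BlobStoreFactory.create(
                InMemoryBlobStore.class.getName(), props);

        store.put("model-1", "weights".getBytes(StandardCharsets.UTF_8));
        System.out.println(
                new String(store.get("model-1"), StandardCharsets.UTF_8));
    }
}
```

With something along these lines, the ML side would depend only on the interface, and the back end could be swapped by changing the configured class name and properties.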
On Wed, Dec 10, 2014 at 1:37 PM, Anjana Fernando <[email protected]> wrote:

> Hi,
>
> I've finished the initial implementation of $subject. This basically
> contains the standard interfaces we use to plug in different data sources
> as the back-end record storage, and for indexing purposes. These pluggable
> data sources are called "Analytics Data Sources" here, where from a
> configuration file, you can give the implementation class and the
> properties required for the initialization. The first implementation of
> this is done, which is the RDBMS implementation. It basically stores all
> the records and other data in a relational database, and any type of
> database can be supported via a configuration file, which gives the query
> templates used to define a standard set of actions. At the moment, H2 and
> MySQL query templates have been tested, and we will be adding the rest of
> the popular RDBMS templates as well. The RDBMS AnalyticsDataSource
> implementation detects the query template by looking at the database
> connection information, retrieved from the data source (e.g. mentioned in
> master-datasources.xml), and automatically switches to that mode, so the
> user basically doesn't have to do anything when configuring.
>
> Also, inside the AnalyticsDataSource interface, there is a FileSystem
> interface you need to implement for your data source implementation, which
> is basically used for indexing, which is done by Lucene. We use Lucene
> indexes as index shards for a distributed index and search. So with the
> sharding approach, we can add more nodes to our cluster to improve the
> indexing performance, and to add storage. Basically, provided the
> backend storage is scalable, the index operations would also be scalable in
> the same manner. But the limits we hit first are the processing requirements,
> and the random data access and locking requirements for each shard, so for
> a typical database system, just by adding new BAM nodes, I'm hoping the
> indexing performance will increase almost linearly.
>
> The AnalyticsDataSource implementations are finally used by a component
> called AnalyticsDataService, which is the interface seen by clients, and
> has the indexing related operations with the record store functionality
> exposed through AnalyticsDataSource. This interface can be looked up as an
> OSGi service, and we plan on also exposing this functionality as a JAX-RS
> service.
>
> The general design, and documentation on the test cases can be found
> at [1] and [2], and the source code at [3]. I will be doing some further
> performance tests, by integrating this into the product properly, especially
> the distributed search, and will provide the results here. For the moment,
> we have a few performance tests as unit tests in the modules. This
> implementation will be first used by the log analysis implementation done
> by Gimantha. And we are planning on writing further AnalyticsDataSource
> implementations for this, such as MongoDB, HBase etc. There will be
> separate notes on those.
>
> [1]
> https://docs.google.com/a/wso2.com/spreadsheets/d/10mHRE6FEgF6wDZ-LSBx18zL8ZcIay5ZIhb8MIk7pfeg/edit#gid=0
> [2]
> https://docs.google.com/a/wso2.com/spreadsheets/d/1iXoZ8BzaefN3EGOL05y5aUX6SLZH7Bu8YM4bF3xOSvQ/edit#gid=0
> [3]
> https://github.com/wso2-dev/carbon-analytics/tree/master/components/xanalytics
>
> Cheers,
> Anjana.
> --
> *Anjana Fernando*
> Senior Technical Lead
> WSO2 Inc. | http://wso2.com
> lean . enterprise . middleware
>
> _______________________________________________
> Architecture mailing list
> [email protected]
> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>

--
Thanks & regards,
Nirmal
Senior Software Engineer - Platform Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
