+1, we should reuse. I think what ML needs is a subset.

On Mon, Jan 26, 2015 at 11:10 AM, Nirmal Fernando <[email protected]> wrote:
> Hi Anjana,
>
> Isn't this a generic interface to talk to a back-end data store? If so, do
> you think this can be reused in other products? In ML, we have a similar
> use-case where we need to talk to a generic data layer to store the models
> that are generated.
>
> On Wed, Dec 10, 2014 at 1:37 PM, Anjana Fernando <[email protected]> wrote:
>
>> Hi,
>>
>> I've finished the initial implementation of $subject. This basically
>> contains the standard interfaces we use to plug in different data sources
>> as the back-end record storage, and for indexing purposes. These pluggable
>> data sources are called "Analytics Data Sources" here, where from a
>> configuration file, you can give the implementation class and the
>> properties required for the initialization. The first implementation of
>> this is done, which is the RDBMS implementation. It basically stores all
>> the records and other data in a relational database, and any type of
>> database can be supported via a configuration file, which gives the query
>> templates used to define a standard set of actions. At the moment, H2 and
>> MySQL query templates have been tested, and we will be adding the rest of
>> the popular RDBMS templates as well. The RDBMS AnalyticsDataSource
>> implementation detects the query template by looking at the database
>> connection information, retrieved from the data source (e.g. mentioned in
>> master-datasources.xml), and automatically switches to that mode, so the
>> user basically doesn't have to do anything when configuring.
>>
>> Also, inside the AnalyticsDataSource interface, there is a FileSystem
>> interface you need to implement for your data source implementation, which
>> is basically used for indexing, which is done by Lucene. We use Lucene
>> indexes as index shards for a distributed index and search. So with the
>> sharding approach, we can add more nodes to our cluster to improve the
>> indexing performance, and to add storage.
>> Basically, provided the backend storage is scalable, the index operations
>> would also be scalable in the same manner. But the limit we first hit is
>> the processing requirements, and the random data access and locking
>> requirements for each shard, so for a typical database system, just by
>> adding new BAM nodes, I'm hoping the indexing performance will increase
>> almost linearly.
>>
>> The AnalyticsDataSource implementations are finally used by a component
>> called AnalyticsDataService, which is the interface seen by clients, and
>> has the indexing-related operations along with the record store
>> functionality exposed through AnalyticsDataSource. This interface can be
>> looked up as an OSGi service, and we plan on also exposing this
>> functionality as a JAX-RS service.
>>
>> The general design, and documentation on the test cases can be found here
>> at [1] and [2], and the source code at [3]. I will be doing some further
>> performance tests, by integrating this into the product properly,
>> especially the distributed search, and will provide the results here. For
>> the moment, we have a few performance tests as unit tests in the modules.
>> This implementation will be first used by the log analysis implementation
>> done by Gimantha. And we are planning on writing further
>> AnalyticsDataSource implementations for this, such as MongoDB, HBase etc.
>> There will be separate notes on those.
>>
>> [1] https://docs.google.com/a/wso2.com/spreadsheets/d/10mHRE6FEgF6wDZ-LSBx18zL8ZcIay5ZIhb8MIk7pfeg/edit#gid=0
>> [2] https://docs.google.com/a/wso2.com/spreadsheets/d/1iXoZ8BzaefN3EGOL05y5aUX6SLZH7Bu8YM4bF3xOSvQ/edit#gid=0
>> [3] https://github.com/wso2-dev/carbon-analytics/tree/master/components/xanalytics
>>
>> Cheers,
>> Anjana.
>> --
>> *Anjana Fernando*
>> Senior Technical Lead
>> WSO2 Inc. | http://wso2.com
>> lean . enterprise .
>> middleware
>>
>> _______________________________________________
>> Architecture mailing list
>> [email protected]
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>
> --
>
> Thanks & regards,
> Nirmal
>
> Senior Software Engineer- Platform Technologies Team, WSO2 Inc.
> Mobile: +94715779733
> Blog: http://nirmalfdo.blogspot.com/

--
============================
Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
Site: http://people.apache.org/~hemapani/
Photos: http://www.flickr.com/photos/hemapani/ Phone: 0772360902
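
As a rough illustration of the query-template idea in Anjana's note (one template set per database, picked automatically from the connection information), here is a minimal sketch in Java. Everything named in it (`QueryTemplateSketch`, the `recordInsert` key, the `{table}` placeholder and the SQL strings) is made up for illustration and is not taken from the carbon-analytics source, where the templates live in a configuration file. With JDBC, the product name could plausibly be read from `Connection.getMetaData().getDatabaseProductName()`, but that is only an assumption about how the detection might work.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; these names do not come from the carbon-analytics code.
final class QueryTemplateSketch {

    // One template set per database, keyed by lower-cased product name.
    private static final Map<String, Map<String, String>> TEMPLATES = new HashMap<>();

    static {
        // Illustrative SQL only; the real templates are defined in configuration.
        Map<String, String> h2 = new HashMap<>();
        h2.put("recordInsert",
               "MERGE INTO {table} (record_id, data) KEY (record_id) VALUES (?, ?)");
        TEMPLATES.put("h2", h2);

        Map<String, String> mysql = new HashMap<>();
        mysql.put("recordInsert",
                  "INSERT INTO {table} (record_id, data) VALUES (?, ?) "
                  + "ON DUPLICATE KEY UPDATE data = VALUES(data)");
        TEMPLATES.put("mysql", mysql);
    }

    // Picks the template set for a product name such as "H2" or "MySQL".
    static Map<String, String> templatesFor(String databaseProductName) {
        String key = databaseProductName.trim().toLowerCase(Locale.ROOT);
        Map<String, String> templates = TEMPLATES.get(key);
        if (templates == null) {
            throw new IllegalArgumentException(
                "No query templates for database: " + databaseProductName);
        }
        return templates;
    }

    // Substitutes the table name into a template.
    static String render(String template, String table) {
        return template.replace("{table}", table);
    }

    public static void main(String[] args) {
        for (String product : new String[] {"H2", "MySQL"}) {
            String sql = render(templatesFor(product).get("recordInsert"),
                                "ANALYTICS_RECORDS");
            System.out.println(product + ": " + sql);
        }
    }
}
```

The point of the design is that the calling code only ever asks for an action by name (`recordInsert` here), so supporting another database means adding a template set rather than changing code.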
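
On the reuse question, the part that looks most generic is the pluggable back-end: a configuration gives an implementation class plus its properties, and the rest of the code talks only to an interface. Below is a minimal sketch of that pattern in Java, assuming a much smaller contract than the real one. `RecordStore`, `InMemoryStore`, `load` and the `keyPrefix` property are all hypothetical names for illustration; they are not the actual AnalyticsDataSource API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a config-driven pluggable store; not the real API.
final class PluggableStoreSketch {

    // Stand-in for the storage contract that clients would code against.
    interface RecordStore {
        void init(Map<String, String> properties);
        void put(String id, String value);
        String get(String id);
    }

    // Trivial in-memory implementation so the sketch runs without a database.
    static final class InMemoryStore implements RecordStore {
        private final Map<String, String> records = new HashMap<>();
        private String prefix = "";

        public InMemoryStore() { }

        @Override
        public void init(Map<String, String> properties) {
            // "keyPrefix" is an invented property, used only to show init().
            prefix = properties.getOrDefault("keyPrefix", "");
        }

        @Override
        public void put(String id, String value) {
            records.put(prefix + id, value);
        }

        @Override
        public String get(String id) {
            return records.get(prefix + id);
        }
    }

    // Instantiates the configured class by name and passes it its properties.
    static RecordStore load(String implementationClass, Map<String, String> properties) {
        try {
            RecordStore store = Class.forName(implementationClass)
                    .asSubclass(RecordStore.class)
                    .getDeclaredConstructor()
                    .newInstance();
            store.init(properties);
            return store;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Cannot create record store: " + implementationClass, e);
        }
    }

    public static void main(String[] args) {
        // In a product these two values would come from a configuration file.
        Map<String, String> properties = new HashMap<>();
        properties.put("keyPrefix", "ml-models:");
        RecordStore store = load(InMemoryStore.class.getName(), properties);

        store.put("model-1", "serialized model bytes would go here");
        System.out.println(store.get("model-1"));
    }
}
```

A component such as ML would then depend only on the interface (or a subset of it) and choose the concrete store through configuration.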
