Hi Kasper,

Thanks for your informative reply. I have read these pages and understand this.
What I intended to ask is whether the MetaModel "post processor", which
performs operations in memory, might have a future implementation that uses
an engine such as MemSQL for "Big Data" scenarios.

Regards,
Ashish

On Mon, Feb 16, 2015 at 4:33 PM, Kasper Sørensen <[email protected]> wrote:

> Hi Ashish,
>
> It depends on the DataContext implementations to determine which operations
> are done in memory and which are pushed down to the backing database. The
> hierarchy of DataContext [1] is pretty big but can broadly be described as
> having two styles:
>
> 1) The DataContexts that are fully pushing everything down to the database.
> An example of this is the JdbcDataContext [2].
>
> 2) The DataContexts that utilize the built-in query "post processor" (a
> kind of query engine) which allows you to override various methods in order
> to optimize specific types of queries. See the abstract class
> QueryPostprocessDataContext [3] for details. There are many examples of
> this. A simple one would be JsonDataContext which only does the bare
> minimum (because we don't have any "database" to help with anything) and at
> the other end of the scale you have SalesforceDataContext which does a lot
> of optimization based on the SOQL language of Salesforce.com. A "common
> middle-ground" example would be MongoDbDataContext or
> ElasticSearchDataContext which have optimizations for some but not all
> query scenarios.
>
> On this page [4] you can also see a matrix of which implementations have
> which query optimizations applied to them.
>
> [1] http://metamodel.apache.org/apidocs/3.4.1/org/apache/metamodel/DataContext.html
> [2] http://metamodel.apache.org/apidocs/3.4.1/org/apache/metamodel/jdbc/JdbcDataContext.html
> [3] http://metamodel.apache.org/apidocs/3.4.1/org/apache/metamodel/QueryPostprocessDataContext.html
> [4] http://wiki.apache.org/metamodel/QueryExecutionStrategies
>
> 2015-02-16 10:53 GMT+01:00 Ashish Mukherjee <[email protected]>:
>
> > Hi,
> >
> > I was thinking of a specific scenario of Composite Data Context wrt
> > MetaModel.
> >
> > I understand that MetaModel performs a number of functions in-memory
> > after querying the respective data sources. However, if the intermediate
> > data-sets are large, this operation could be memory intensive and slow.
> > Is there a thought about tackling such a scenario through a clustered
> > approach in some future release?
> >
> > If that is not in the roadmap, what classes should one look at to work on
> > this?
> >
> > Regards,
> > Ashish
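
For anyone following the thread, the second style Kasper describes (an in-memory post processor whose methods a subclass may override to push specific queries down) might be sketched roughly as below. This is a toy illustration only and does not use the MetaModel API: the class and method names (ToyPostProcessContext, fetchAllRows, countAll and so on) are hypothetical, and rows are reduced to plain integers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class PostProcessSketch {

    /** Hypothetical base class: answers queries in memory unless a subclass overrides a method. */
    static abstract class ToyPostProcessContext {
        int rowsMaterialized = 0; // number of rows pulled into memory so far

        /** The one thing every source must do: hand over all rows of its table. */
        protected abstract List<Integer> fetchAllRows();

        /** Default strategy: materialize every row, then filter and count in memory. */
        long count(Predicate<Integer> where) {
            List<Integer> rows = fetchAllRows();
            rowsMaterialized += rows.size();
            return rows.stream().filter(where).count();
        }

        /** Unfiltered count; by default it reuses the in-memory path. */
        long countAll() {
            return count(row -> true);
        }
    }

    /** Bare-minimum source (think of a flat file): overrides nothing. */
    static class FlatFileLikeContext extends ToyPostProcessContext {
        private final List<Integer> data;

        FlatFileLikeContext(List<Integer> data) {
            this.data = data;
        }

        @Override
        protected List<Integer> fetchAllRows() {
            return data;
        }
    }

    /** Source whose backend can count on its own: overrides one query type only. */
    static class CountingBackendContext extends FlatFileLikeContext {
        private final long backendCount; // stands in for a count computed by the backend

        CountingBackendContext(List<Integer> data) {
            super(data);
            this.backendCount = data.size();
        }

        @Override
        long countAll() {
            return backendCount; // "pushed down": no rows are materialized
        }
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 12, 7, 30);
        ToyPostProcessContext plain = new FlatFileLikeContext(data);
        ToyPostProcessContext optimized = new CountingBackendContext(data);

        System.out.println(plain.countAll() + " rows, materialized " + plain.rowsMaterialized);
        System.out.println(optimized.countAll() + " rows, materialized " + optimized.rowsMaterialized);
        // Filtered counts are not optimized here, so they fall back to the in-memory path.
        System.out.println(optimized.count(v -> v > 10) + " rows > 10, materialized " + optimized.rowsMaterialized);
    }
}
```

The point of the sketch is the memory concern raised in the original question: every query type that is not overridden materializes the whole table before filtering, which is where large intermediate data sets would hurt.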
