Re: Composite Data Context on Big Data

Ashish Mukherjee Wed, 18 Feb 2015 02:06:13 -0800

Hi Kasper,

Thanks for your informative reply. I have read these pages and understand
this.


What I intended to ask is whether the MetaModel "post processor", which
performs operations in memory, might in a future implementation use an
engine such as MemSQL for "Big Data" scenarios.

Regards,
Ashish

On Mon, Feb 16, 2015 at 4:33 PM, Kasper Sørensen <
[email protected]> wrote:

> Hi Ashish,
>
> It is up to each DataContext implementation to determine which operations
> are done in memory and which are pushed down to the backing database. The
> hierarchy of DataContext [1] is pretty big but can broadly be described as
> having two styles:
>
> 1) The DataContexts that push everything down to the database. An example
> of this is the JdbcDataContext [2].
>
> 2) The DataContexts that utilize the built-in query "post processor" (a
> kind of query engine), which lets implementations override various methods
> in order to optimize specific types of queries. See the abstract class
> QueryPostprocessDataContext [3] for details. There are many examples of
> this. A simple one would be JsonDataContext, which only does the bare
> minimum (because we don't have any "database" to help with anything), and
> at the other end of the scale you have SalesforceDataContext, which does a
> lot of optimization based on the SOQL language of Salesforce.com. A "common
> middle-ground" example would be MongoDbDataContext or
> ElasticSearchDataContext, which have optimizations for some but not all
> query scenarios.
>
> On this page [4] you can also see a matrix of which implementations have
> which query optimizations applied to them.
>
> [1] http://metamodel.apache.org/apidocs/3.4.1/org/apache/metamodel/DataContext.html
> [2] http://metamodel.apache.org/apidocs/3.4.1/org/apache/metamodel/jdbc/JdbcDataContext.html
> [3] http://metamodel.apache.org/apidocs/3.4.1/org/apache/metamodel/QueryPostprocessDataContext.html
> [4] http://wiki.apache.org/metamodel/QueryExecutionStrategies
>
> 2015-02-16 10:53 GMT+01:00 Ashish Mukherjee <[email protected]>:
>
> > Hi,
> >
> > I was thinking of a specific scenario of Composite Data Context wrt
> > MetaModel.
> >
> > I understand that MetaModel performs a number of functions in-memory after
> > querying the respective data sources. However, if the intermediate
> > data-sets are large, this operation could be memory intensive and slow. Is
> > there a thought about tackling such a scenario through a clustered approach
> > in some future release?
> >
> > If that is not on the roadmap, what classes should one look at to work on
> > this?
> >
> > Regards,
> > Ashish
> >
>
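
For readers following the thread, the "post processor" style described in
the quoted reply can be illustrated with a minimal, self-contained sketch.
It does not use the MetaModel API: every class and method name below
(PostProcessingSource, materializeAll, countWhereEquals and so on) is
hypothetical and only mirrors the idea of a base class that does the work
in memory by default while letting subclasses override selected operations.
The sample rows are made up.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical base class, NOT a MetaModel type. It plays the role of the
// generic in-memory query engine.
abstract class PostProcessingSource {
    int rowsMaterialized = 0; // how many rows were pulled into memory

    // The one thing every source must be able to do: hand over all rows.
    abstract List<String[]> materializeAll();

    // Default strategy: fetch everything, then filter and count in memory.
    // Subclasses may override this to let their backend do the work.
    long countWhereEquals(int column, String value) {
        long count = 0;
        for (String[] row : materializeAll()) {
            if (row[column].equals(value)) {
                count++;
            }
        }
        return count;
    }
}

// "Bare minimum" source: no backend help, so every query scans all rows.
class BareMinimumSource extends PostProcessingSource {
    private final List<String[]> rows;

    BareMinimumSource(List<String[]> rows) {
        this.rows = rows;
    }

    @Override
    List<String[]> materializeAll() {
        rowsMaterialized += rows.size();
        return rows;
    }
}

// "Middle-ground" source: optimizes one kind of query, falls back otherwise.
// The prebuilt map stands in for a capability the backend offers natively.
class PartlyOptimizedSource extends BareMinimumSource {
    private final Map<String, Long> countsByFirstColumn = new HashMap<>();

    PartlyOptimizedSource(List<String[]> rows) {
        super(rows);
        for (String[] row : rows) {
            countsByFirstColumn.merge(row[0], 1L, Long::sum);
        }
    }

    @Override
    long countWhereEquals(int column, String value) {
        if (column == 0) {
            // Answered without materializing any rows.
            return countsByFirstColumn.getOrDefault(value, 0L);
        }
        // Not optimized: use the generic in-memory path.
        return super.countWhereEquals(column, value);
    }
}

class PushdownSketch {
    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"dk", "x"},
                new String[] {"in", "y"},
                new String[] {"dk", "y"});

        PostProcessingSource bare = new BareMinimumSource(rows);
        PostProcessingSource optimized = new PartlyOptimizedSource(rows);

        System.out.println(bare.countWhereEquals(0, "dk")
                + " matches, rows materialized: " + bare.rowsMaterialized);
        System.out.println(optimized.countWhereEquals(0, "dk")
                + " matches, rows materialized: " + optimized.rowsMaterialized);
    }
}
```

Both sources return the same count; they differ only in how many rows had
to be brought into memory to get it. Under this reading, handling large
intermediate data sets would come down to which operations an
implementation overrides instead of leaving them to the in-memory default.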
