Hello,

I was looking through some of the DataContext classes (ElasticSearch,
etc.) for my needs and have recently been working on the Solr class. It
seems fairly common for connectors not to push down many query operations,
because implementing executeQuery() is tricky: it has to support every
kind of SQL-like query. As a result, some of the connectors may not scale
very well, since a lot of in-memory processing of large data sets happens
on the application side.

I was wondering if the following design would simplify the implementation
of executeQuery() in each DataContext implementation:

Let QueryPostProcessDataContext expose a few more granular hooks (similar
to the existing executeCountQuery()), such as createFilters(),
createHaving(), etc.

These can have default implementations in the QueryPostProcessDataContext
class, which are called in a pipeline pattern covering the entire query
construction and execution.

Implementing classes that can (or want to) handle joins etc. natively can
override the relevant hook; otherwise the base class method called in the
pipeline provides the functionality.

With such a design, it may be easier to support as many query functions
natively as possible while leaving the rest to the MetaModel framework.
A data connector would then no longer be an all-or-nothing implementation,
and the memory footprint of the application would stay manageable even for
large data sets.
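
To make the idea concrete, here is a minimal sketch in plain Java. It does
not use the real MetaModel API: SimpleQuery, PipelineDataContext, scanAll(),
applyFilter(), applyLimit() and the rowsMaterialized counter are all
hypothetical names I made up for illustration, and the "backend" is just an
in-memory list of the numbers 0 to 999. It only shows the shape of the
pipeline, with one connector overriding a single hook.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, heavily simplified query: "value >= minValue LIMIT maxRows". */
final class SimpleQuery {
    final int minValue;
    final int maxRows;

    SimpleQuery(int minValue, int maxRows) {
        this.minValue = minValue;
        this.maxRows = maxRows;
    }
}

/** Stand-in for a base class such as QueryPostProcessDataContext (hook names are assumptions). */
abstract class PipelineDataContext {
    /** Rows pulled into application memory; tracked only to illustrate the effect. */
    int rowsMaterialized = 0;

    /** The only mandatory hook: a full scan of the backing store. */
    protected abstract List<Integer> scanAll();

    /** Default filter hook: scan everything, then filter in memory. */
    protected List<Integer> applyFilter(SimpleQuery q) {
        List<Integer> all = scanAll();
        rowsMaterialized += all.size();
        List<Integer> out = new ArrayList<>();
        for (int v : all) {
            if (v >= q.minValue) {
                out.add(v);
            }
        }
        return out;
    }

    /** Default limit hook: truncate the already filtered rows in memory. */
    protected List<Integer> applyLimit(List<Integer> rows, SimpleQuery q) {
        return new ArrayList<>(rows.subList(0, Math.min(q.maxRows, rows.size())));
    }

    /** Fixed pipeline: each stage is either a native override or the default. */
    public final List<Integer> executeQuery(SimpleQuery q) {
        return applyLimit(applyFilter(q), q);
    }
}

/** Connector that pushes nothing down and relies on every default hook. */
class NaiveContext extends PipelineDataContext {
    @Override
    protected List<Integer> scanAll() {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add(i);
        }
        return rows;
    }
}

/** Connector that overrides only the filter hook; the limit still uses the default. */
class FilterPushDownContext extends NaiveContext {
    @Override
    protected List<Integer> applyFilter(SimpleQuery q) {
        // Pretend the backend evaluated "value >= minValue" and returned only matches.
        List<Integer> rows = new ArrayList<>();
        for (int i = Math.max(0, q.minValue); i < 1000; i++) {
            rows.add(i);
        }
        rowsMaterialized += rows.size();
        return rows;
    }
}

class PushDownDemo {
    public static void main(String[] args) {
        SimpleQuery q = new SimpleQuery(990, 3);
        NaiveContext naive = new NaiveContext();
        FilterPushDownContext pushed = new FilterPushDownContext();
        System.out.println(naive.executeQuery(q) + " rows in memory: " + naive.rowsMaterialized);
        System.out.println(pushed.executeQuery(q) + " rows in memory: " + pushed.rowsMaterialized);
    }
}
```

Both contexts return the same rows for the same query, but the connector
that overrides the filter hook pulls far fewer rows into application
memory. A connector could adopt further hooks one at a time without having
to rewrite executeQuery() as a whole.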

These are just high-level thoughts for now, which I wanted to bounce off
the group.

Regards,
Ashish
