Re: query planner

Diogo FC Patrao Thu, 29 Aug 2013 07:26:59 -0700

Hi Claude

That is a great insight indeed, thanks for sharing! Were the
endpoints triplestores? I'm using OBDA endpoints (like D2R), so performance
is definitely an issue for me.


In my case, the endpoints contain complementary data, so when looking for all
instances of a class C I had to query every endpoint e1..en. I ended up
with n SERVICE clauses combined with UNION, which is horribly slow and
requires lots of memory on the client.
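
To make the shape of that query concrete, here is a minimal Python sketch
that builds it. The example.org URLs and the helper name
union_over_endpoints are made up for illustration:

```python
def union_over_endpoints(class_iri, endpoints):
    """Build a SPARQL 1.1 query asking every endpoint for instances of class_iri."""
    # One SERVICE block per endpoint, each wrapped in its own group.
    branches = [
        "{ SERVICE <%s> { ?s a <%s> } }" % (ep, class_iri)
        for ep in endpoints
    ]
    # The groups are combined with UNION, so the client merges all results.
    return "SELECT ?s WHERE {\n  " + "\n  UNION\n  ".join(branches) + "\n}"


# Hypothetical endpoints, for illustration only.
endpoints = ["http://example.org/e1/sparql", "http://example.org/e2/sparql"]
print(union_over_endpoints("http://example.org/C", endpoints))
```

Every extra endpoint adds one more branch, which is why the client-side
cost grows with n.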

The planner I was thinking of would benefit from profiling the endpoints
for response time and for the number of instances each one contains. For
instance, when querying for instances of classes C1 AND C2
( OpJoin( C1, C2 ) ), is it faster to start by solving C1 or C2? It depends
both on the response time of each endpoint and on the number of results
that each would yield.
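
A rough sketch of the kind of cost function I have in mind, in Python for
brevity. The linear model (latency plus a per-result cost), the per_row_s
constant and the profile numbers are all assumptions for illustration, not
measurements:

```python
from dataclasses import dataclass


@dataclass
class Profile:
    latency_s: float  # profiled response time of the endpoint, in seconds
    instances: int    # profiled number of instances of the class


def estimated_cost(profile, per_row_s=0.001):
    # Assumed linear model: fixed latency plus a per-result transfer cost.
    return profile.latency_s + profile.instances * per_row_s


def order_join(operands):
    """Return the operand names of a join, cheapest estimated cost first."""
    return sorted(operands, key=lambda name: estimated_cost(operands[name]))


# Made-up profiles for the two sides of OpJoin( C1, C2 ).
profiles = {
    "C1": Profile(latency_s=0.2, instances=50000),
    "C2": Profile(latency_s=1.5, instances=300),
}
print(order_join(profiles))
```

With these made-up numbers C2 is solved first: its endpoint answers more
slowly, but it yields far fewer results.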

Cheers!




--
diogo patrão




On Thu, Aug 29, 2013 at 10:47 AM, Claude Warren <[email protected]> wrote:

> The DERI work is supposed to be open source, however, I don't believe the
> papers have been written yet so I suspect they are not yet releasing the
> code.
>
> In my case we were building queries across multiple endpoints based upon
> the topics in an original query, so I was more concerned about writing a
> properly constructed federated query than about efficiency of the specific
> query.  That being said, there were a couple of issues that we had to solve
> (most notably for optional values shared across the endpoints) to make the
> response time acceptable.
>
> In any case we simply passed the transformed query to the default
> optimiser after we were done.  The only optimisation that we did was to
> look at statements and run the ones with the fewest unknowns before queries
> with multiple unknowns.  We probably could have added another layer of
> efficiency by placing statements with unknown predicates later in the
> query.
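
If I read this right, the ordering could be sketched roughly as below. This
is only my illustration in Python, not the DERI code: triple patterns are
plain tuples, a term starting with "?" counts as an unknown, and the
unknown-predicate tie-breaker is the extra layer you mention rather than
something you actually ran.

```python
def is_var(term):
    # Assumption: variables are written SPARQL-style, with a leading "?".
    return term.startswith("?")


def sort_key(triple):
    subject, predicate, obj = triple
    unknowns = sum(is_var(t) for t in triple)
    # Fewest unknowns first; among equals, unknown predicates go later.
    return (unknowns, is_var(predicate))


def order_patterns(patterns):
    """Return the triple patterns in the order they would be run."""
    return sorted(patterns, key=sort_key)


patterns = [
    ("?x", "?p", "?y"),
    ("?x", "rdf:type", "ex:C"),
    ("?x", "ex:name", "?n"),
    ("ex:a", "?p", "?y"),
]
for triple in order_patterns(patterns):
    print(" ".join(triple))
```
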
>
> We also did some work on response times from endpoints (we had cases where
> multiple SPARQL endpoints contained the same graphs so we had a choice of
> query points) and picking the most efficient for our query.  We did the
> evaluation of responses on a schedule so we always had reasonably fresh
> data.  This also allowed us to determine if an endpoint was down and switch
> to a backup or simply not generate the query for that endpoint.
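
A minimal sketch of how I picture that selection step, again in Python and
again only an assumption on my part: probe_all and pick_endpoint are
hypothetical names, and the probe callable stands in for whatever test
query the scheduler would send.

```python
import time


def probe_all(endpoints, probe):
    """Time one probe per endpoint; None marks an endpoint that is down."""
    timings = {}
    for ep in endpoints:
        start = time.monotonic()
        try:
            probe(ep)  # e.g. a cheap ASK query; supplied by the caller
            timings[ep] = time.monotonic() - start
        except Exception:
            timings[ep] = None  # treat any failure as "endpoint down"
    return timings


def pick_endpoint(timings):
    """Return the fastest live endpoint, or None if every mirror is down."""
    live = {ep: t for ep, t in timings.items() if t is not None}
    return min(live, key=live.get) if live else None
```

Running probe_all on a schedule and reading the latest timings at query
time would give the "reasonably fresh data" behaviour; a None result from
pick_endpoint means the query for that graph is simply not generated.
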
>
> Hope this helps,
> Claude
>
>
> On Thu, Aug 29, 2013 at 1:52 PM, Diogo FC Patrao <[email protected]
> >wrote:
>
> > Hi Claude
> >
> > Actually I did some testing with Transformer, which I'm more used to than
> > OpVisitor, and with it I can evaluate a cost function for each node.
> > However, I would need several different trees to be evaluated in order to
> > find the best plan, and I can't see how Transformer could help with that.
> >
> > I was thinking that maybe ARQ has a default query planner that could be
> > substituted on configuration, or by applying a custom cost function.
> >
> > Was your work at DERI open-sourced? Can you talk about it? I'm curious about
> > which strategies you used in your planner.
> >
> > Cheers,
> >
> >
> >
> > --
> > diogo patrão
> >
> >
> >
> >
> > On Wed, Aug 28, 2013 at 6:24 PM, Claude Warren <[email protected]> wrote:
> >
> > > This discussion reminds me that a friend and I discussed the possibility of
> > > adding pluggable OpVisitors to allow chained rewrites of the query before
> > > execution.
> > >
> > >
> > > On Wed, Aug 28, 2013 at 10:23 PM, Claude Warren <[email protected]>
> > wrote:
> > >
> > > > When I did the work on the query engine described above I used an
> > > > OpVisitor implementation in modifyOp to rewrite the query.  It should be
> > > > possible to implement the optimization there, but I don't know that it is
> > > > the best place.
> > > >
> > > > Claude
> > > >
> > > >
> > > > On Wed, Aug 28, 2013 at 5:19 PM, Diogo FC Patrao <
> > [email protected]
> > > >wrote:
> > > >
> > > >> Hi Claude
> > > >>
> > > >> I meant the first option: optimizing queries to run in federated
> > > >> environments. That said, I'm working on a query "federator" as well, and
> > > >> am also interested in similar projects.
> > > >>
> > > >> Thanks
> > > >>
> > > >> --
> > > >> diogo patrão
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Wed, Aug 28, 2013 at 12:50 PM, Claude Warren <[email protected]>
> > > wrote:
> > > >>
> > > >> > I did some work on query planning for federated queries when I worked for
> > > >> > DERI (http://deri.ie).  However, that work may have been a bit more
> > > >> > complex than you are considering as we had some sources for vocabulary
> > > >> > equivalence and were attempting to handle that at the same time.
> > > >> >
> > > >> > When you say a query planner for federated queries, are you looking to
> > > >> > optimise queries that contain federated queries, or attempting to build a
> > > >> > federated query from a non-federated one?
> > > >> >
> > > >> > Claude
> > > >> >
> > > >> >
> > > >> > On Wed, Aug 28, 2013 at 1:54 PM, Diogo FC Patrao <
> > > [email protected]
> > > >> > >wrote:
> > > >> >
> > > >> > > Hello
> > > >> > >
> > > >> > > I was thinking about writing a query planner for federated queries. AFAIK,
> > > >> > > there's no such thing in ARQ, and I couldn't find any third-party libs for
> > > >> > > that.
> > > >> > >
> > > >> > > So my questions are:
> > > >> > >
> > > >> > > 1) Do you know of any open source projects out there that deal with it?
> > > >> > > I'm willing to contribute.
> > > >> > >
> > > >> > > 2) If there is no such project, is a Transformer the right place to go for
> > > >> > > it? I thought about extending TransformerCopy to reorder OpJoins to solve
> > > >> > > those with smaller costs first.
> > > >> > >
> > > >> > > Thanks!
> > > >> > >
> > > >> > >
> > > >> > > --
> > > >> > > diogo patrão
> > > >> > >
> > > >> >
> > > >> >
> > > >> >
> > > >> > --
> > > >> > I like: Like Like - The likeliest place on the web<
> > > >> > http://like-like.xenei.com>
> > > >> > Identity: https://www.identify.nu/[email protected]
> > > >> > LinkedIn: http://www.linkedin.com/in/claudewarren
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > I like: Like Like - The likeliest place on the web<
> > > http://like-like.xenei.com>
> > > > Identity: https://www.identify.nu/[email protected]
> > > > LinkedIn: http://www.linkedin.com/in/claudewarren
> > > >
> > >
> > >
> > >
> > > --
> > > I like: Like Like - The likeliest place on the web<
> > > http://like-like.xenei.com>
> > > Identity: https://www.identify.nu/[email protected]
> > > LinkedIn: http://www.linkedin.com/in/claudewarren
> > >
> >
>
>
>
> --
> I like: Like Like - The likeliest place on the web<
> http://like-like.xenei.com>
> Identity: https://www.identify.nu/[email protected]
> LinkedIn: http://www.linkedin.com/in/claudewarren
>
