I have been thinking about strategies for optimizing federated queries and
have come to the point where I need to do a merge of potentially very large
result sets where not all result sets contain all the values.

Consider:

{ SERVICE <a> { [] <x:foo> ?foo ;
                   <x:bar> ?bar ;
                   <x:fob> ?fob .
} }
UNION
{ SERVICE <b> { [] <x:foo> ?foo ;
                   <x:baz> ?baz ;
                   <x:fob> ?fob .
} }

which would yield result sets that have the structure:
{?foo ?bar ?fob ?baz}

?bar will always come from <a> and ?baz will always come from <b>, but ?foo
and ?fob may come from either.

I am thinking that the results from <a> could be inserted into a temporary
graph as

_:x <x:foo> ?foo .
_:x <x:bar> ?bar .
_:x <x:fob> ?fob .

Then the results from <b> could be inserted into the graph as updates to
existing records wherever { [] <x:foo> ?foo ; <x:fob> ?fob } matches. Any
results from <b> with no matching record could be inserted as new records.
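
To make the update-or-insert step concrete, here is a minimal in-memory
sketch in Python rather than SPARQL. A dict keyed on (foo, fob) stands in
for the temporary graph. The function name merge_results, the row layout
and the sample values are all made up for illustration, and the sketch
assumes each (foo, fob) pair appears at most once per service.

```python
def merge_results(rows_a, rows_b):
    """Merge rows from <a> and <b> on the shared (foo, fob) pair."""
    # Stand-in for the temporary graph: one record per (foo, fob) key.
    records = {}

    # Step 1: load every row from <a> (these carry ?bar).
    for row in rows_a:
        key = (row["foo"], row["fob"])
        records[key] = {"foo": row["foo"], "bar": row["bar"], "fob": row["fob"]}

    # Step 2: for each row from <b>, update the matching record,
    # or insert a new record if none exists yet.
    for row in rows_b:
        key = (row["foo"], row["fob"])
        record = records.setdefault(key, {"foo": row["foo"], "fob": row["fob"]})
        record["baz"] = row["baz"]

    # Step 3: extract the merged rows; .get() plays the role of OPTIONAL,
    # leaving None where a value is unbound.
    return [
        (r["foo"], r.get("bar"), r["fob"], r.get("baz"))
        for r in records.values()
    ]


# Hypothetical sample data: one row only in <a>, one in both, one only in <b>.
rows_a = [{"foo": 1, "bar": "a1", "fob": 10}, {"foo": 2, "bar": "a2", "fob": 20}]
rows_b = [{"foo": 2, "baz": "b2", "fob": 20}, {"foo": 3, "baz": "b3", "fob": 30}]
merged = merge_results(rows_a, rows_b)
print(merged)
# [(1, 'a1', 10, None), (2, 'a2', 20, 'b2'), (3, None, 30, 'b3')]
```

If a (foo, fob) pair can repeat within one service, a single record per key
is no longer enough and the merge would have to keep a list of records per
key instead.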

The merged result set can then be extracted from the temporary graph as

select ?foo ?bar ?fob ?baz
where { ?dummy <x:foo> ?foo ;
               <x:fob> ?fob .
  OPTIONAL
  { ?dummy <x:bar> ?bar }
  OPTIONAL
  { ?dummy <x:baz> ?baz }
}

Questions:
Has anyone attempted this?
Does anyone see any functional issues with the approach?

I understand that the overhead of the temporary graph would hurt performance
on small datasets, but for large datasets it may make sense.  Thoughts?


Claude

-- 
I like: Like Like - The likeliest place on the web<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
