I have been thinking about strategies for optimizing federated queries and
have come to the point where I need to do a merge of potentially very large
result sets where not all result sets contain all the values.
Consider:
    { SERVICE <a> { [] <x:foo> ?foo ;
                       <x:bar> ?bar ;
                       <x:fob> ?fob .
                  }
    }
    UNION
    { SERVICE <b> { [] <x:foo> ?foo ;
                       <x:baz> ?baz ;
                       <x:fob> ?fob .
                  }
    }
which would yield result sets that have the structure:
    {?foo ?bar ?fob ?baz}
?bar will always come from <a> and ?baz will always come from <b>, but ?foo
and ?fob may come from either.
I am thinking that the results from <a> could be inserted into a temporary
graph, with one blank node per result row, as
    _:x <x:foo> ?foo .
    _:x <x:bar> ?bar .
    _:x <x:fob> ?fob .
Results from <b> could then be inserted into the graph as updates to
existing records, matched with { [] <x:foo> ?foo ; <x:fob> ?fob }. Any
results from <b> that match no existing record could be inserted as new
records.
The merged result set can then be extracted from the temporary graph as
    SELECT ?foo ?bar ?fob ?baz
    WHERE { ?dummy <x:foo> ?foo ;
                   <x:fob> ?fob .
            OPTIONAL { ?dummy <x:bar> ?bar }
            OPTIONAL { ?dummy <x:baz> ?baz }
          }
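
To make the intended merge semantics concrete, here is a minimal Python
sketch of the same idea using plain dictionaries in place of the temporary
graph. It is only an illustration, not an implementation against any SPARQL
engine: the names merge_results and project and the sample rows are
hypothetical, unbound variables are represented as None, and it assumes each
(?foo, ?fob) pair appears at most once per service. Under that assumption the
result behaves like a full outer join on (?foo, ?fob).

```python
def merge_results(rows_a, rows_b, key_vars=("foo", "fob")):
    """Merge two lists of solution rows (dicts) on the shared key variables.

    Hypothetical helper: the dict `merged` stands in for the temporary graph,
    and each key tuple stands in for one blank-node record.
    Assumes each key occurs at most once per input list.
    """
    merged = {}

    # Step 1: insert every row from service <a> as a new record.
    for row in rows_a:
        key = tuple(row[v] for v in key_vars)
        merged[key] = dict(row)

    # Step 2: rows from service <b> update a matching record,
    # or are inserted as a new record when no match exists.
    for row in rows_b:
        key = tuple(row[v] for v in key_vars)
        merged.setdefault(key, {}).update(row)

    return list(merged.values())


def project(rows, variables=("foo", "bar", "fob", "baz")):
    """Extract the final result set; missing values come back as None,
    mirroring the OPTIONAL clauses in the extraction query."""
    return [tuple(row.get(v) for v in variables) for row in rows]


# Made-up sample data for illustration only.
rows_a = [
    {"foo": 1, "bar": "a1", "fob": 10},
    {"foo": 2, "bar": "a2", "fob": 20},
]
rows_b = [
    {"foo": 1, "baz": "b1", "fob": 10},
    {"foo": 3, "baz": "b3", "fob": 30},
]

result = project(merge_results(rows_a, rows_b))
for row in result:
    print(row)
```

With the sample rows, the record for (1, 10) ends up with both ?bar and ?baz
bound, while the other two records each leave one variable unbound. If a
service can return several rows for the same (?foo, ?fob) pair, this sketch
would overwrite earlier values, so the keying would need to change.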
Questions:
Has anyone attempted this?
Does anyone see any functional issues with the approach?
I understand that there would be performance issues with small datasets,
but for large datasets it may make sense. Thoughts?
Claude
--
I like: Like Like - The likeliest place on the web<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren