Quick question:

Would it make sense to have an immutable flag that would tell the optimizer
(or other processes) that a dataset/model/graph is not likely to change?

  More of a hint rather than a rule.






On Thu, Jun 12, 2014 at 2:15 PM, Rob Vesse <[email protected]> wrote:

> You may be interested in the following paper on a technique called RDF
> Characteristic Sets:
> http://www.csd.uoc.gr/~hy561/papers/storageaccess/optimization/Characteristic%20Sets.pdf
>
> It tries to solve the problem Andy alludes to: most stats-based
> optimisers consider triple patterns in isolation from each other rather than
> as complete units.  The downside of the RDF Characteristic Sets approach
> is that they are potentially very expensive to calculate and would be
> awkward to maintain for mutable data sets.
>
> Rob
>
> On 12/06/2014 13:36, "Andy Seaborne" <[email protected]> wrote:
>
> >On 12/06/14 03:35, DongNing(董宁.阿帕比) wrote:
> >> Thanks Andy!
> >> For more detail on question 2:
> >> Suppose the triples in the DB are as below:
> >>      S1 :identifier P1
> >>      S2 :identifier P2
> >>      S3 :identifier P3
> >>      S4 :identifier P4
> >>      S5 :identifier P5
> >>
> >>      The count for (var  :identifier  TERM) is 1
> >>      The count for (var  :identifier  var ) is 5
> >> Is that right?
> >
> >Yes
> >
> >> But if the triples are like these:
> >>      S1 :identifier P1
> >>      S2 :identifier P1
> >>      S3 :identifier P1
> >>      S4 :identifier P1
> >>      S5 :identifier P1
> >>
> >> Is the count for (var  :identifier  TERM) 1 or 5? I think it is 5.
> >
> >5
> >
> >> The count for (var  :identifier  var ) is 5.
> >> Is that right?
> >> One more case: if the triples are like these
> >>      S1 :identifier P1
> >>      S2 :identifier P1
> >>      S3 :identifier P2
> >>      S4 :identifier P2
> >>      S5 :identifier P3
> >> What is the count for (var  :identifier  TERM)?
> >
> >Overall points first:
> >
> >* the optimizer is not trying to find the perfect answer, it's trying to
> >find a reasonable answer, mainly deciding between alternatives.  And to
> >some extent its role in life is avoiding the bad as much as finding the
> >good!
> >
> >* The stats optimizer isn't a perfect scheme (see the RDF3X papers for
> >more discussion) because it only considers triples independently. The
> >stats are an approximation.
> >
> >See also the current fixed optimizer.
> >
> >
> >(var  :identifier  TERM) .. maybe 2.  It's not about exactness; only the
> >first triple pattern gets an exact lookup, where you could have
> >
> >(var  :identifier  P1) 2
> >(var  :identifier  P2) 2
> >(var  :identifier  P3) 1
> >
> >It could reorder after every pattern but that might end up with the
> >optimizer costing more than the execution.
> >
> >       Andy
> >
> >>
> >> Thanks again!
> >>
> >> Tony
> >>
> >> -----Original Message-----
> >> From: Andy Seaborne [mailto:[email protected]]
> >> Sent: June 12, 2014 2:08
> >> To: [email protected]
> >> Subject: Re: TDB OPTIMIZER question:a puzzled of RULE language about " VAR
> >>and TERM "
> >>
> >> On 11/06/14 08:03, DongNing(董宁.阿帕比) wrote:
> >>> Hi all:
> >>>
> >>> I am new to Jena and am studying TDB's optimizer, in particular the
> >>> statistics rules.
> >>>
> >>> 1.       I think the difference between TERM and VAR is that VAR represents
> >>> a variable in SPARQL, while TERM only represents a possible value in the
> >>> DB; it doesn't represent a variable in SPARQL.
> >>>
> >>> Is that right?
> >>
> >> Yes - TERM means "will be bound at this point"
> >>
> >>>
> >>> 2.       For a static graph DB (the triples are fixed and do not change),
> >>>
> >>> should the count for (var  :identifier  TERM) and the count for
> >>> (var  :identifier  var) be the same?
> >>
> >> No.
> >>
> >> (var  :identifier  TERM) should be an estimate of the cardinality
> >>when there is a specific value.  (var :identifier var) would be the count of
> >>all uses of :identifier.
> >>
> >> If :ifp is an inverse functional property,
> >>
> >> (?x :ifp TERM) is one.
> >>
> >>>
> >>> 3.       There are a few explanations and samples on
> >>> http://jena.apache.org . Is there any other tutorial about the statistics
> >>> rules?
> >>
> >> Only the code I'm afraid.
> >>
> >>      Andy
> >>
> >>>
> >>>
> >>>
> >>> Thanks!
> >>>
> >>> Tony.Dong
> >>>
> >>>
> >>
> >>
> >
>
>
>
>
>
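The three counts discussed in the quoted thread above (1, 5, and "maybe 2") are consistent with a simple averaging rule: take the number of triples that use the predicate and divide by the number of distinct object values. The sketch below illustrates that arithmetic only. It assumes the averaging rule, it is not TDB's actual code, and the function name `estimate_counts` is hypothetical.

```python
from collections import Counter


def estimate_counts(triples, predicate):
    """Return (count for (var p var), average count for (var p TERM)).

    Assumption: the TERM estimate is the mean number of triples per
    distinct object value, which is only an approximation.
    """
    objects = Counter(o for _, p, o in triples if p == predicate)
    total = sum(objects.values())      # all uses of the predicate
    distinct = len(objects)            # distinct object values
    per_term = total / distinct if distinct else 0.0
    return total, per_term


def make(objs):
    """Build S1..Sn :identifier <obj> triples from a list of objects."""
    return [(f"S{i}", ":identifier", o) for i, o in enumerate(objs, start=1)]


all_distinct = make(["P1", "P2", "P3", "P4", "P5"])
all_same = make(["P1", "P1", "P1", "P1", "P1"])
mixed = make(["P1", "P1", "P2", "P2", "P3"])

for name, data in [("distinct", all_distinct), ("same", all_same), ("mixed", mixed)]:
    total, per_term = estimate_counts(data, ":identifier")
    print(name, total, round(per_term))
```

For the mixed case the exact per-value counts are 2, 2 and 1, so the single averaged figure rounds to 2: an approximation that is good enough for choosing between alternatives, as Andy describes.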


-- 
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
