There are two points you might find helpful when considering your question.
(a) The first ("generative") approach has a useful property: it defines a
distribution over all parses, where Parses includes both successful parses
and unification failures. Writing Pgen for the probability under the
generative approach,
Sum_{Ps_i in Parses} Pgen(Ps_i) = 1
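To see the generative reading concretely, here is a rough Python sketch (my
own illustration, not code from any particular system) that samples
derivations from the grammar in your message below; the probability it
returns for a derivation is just the product of the rule labels used, and
summed over every derivation it can produce these probabilities add up to 1:

  import random

  # The grammar from your message, as {head: [(body, probability), ...]}.
  GRAMMAR = {
      "S":    [(["NP", "VP"], 1.0)],
      "NP":   [(["ART", "NOUN"], 0.3), (["ART", "NOUN", "RC"], 0.7)],
      "VP":   [(["VERB", "NP"], 1.0)],
      "RC":   [(["that", "VP"], 1.0)],
      "VERB": [(["saw"], 0.4), (["heard"], 0.6)],
      "NOUN": [(["cat"], 0.2), (["dog"], 0.4), (["mouse"], 0.4)],
      "ART":  [(["a"], 0.5), (["the"], 0.5)],
  }

  def sample(symbol="S"):
      # Expand a symbol by choosing among its rules with their probabilities;
      # returns (list of words, probability of the whole derivation).
      if symbol not in GRAMMAR:          # terminal symbol
          return [symbol], 1.0
      bodies, probs = zip(*GRAMMAR[symbol])
      i = random.choices(range(len(bodies)), weights=probs)[0]
      words, p = [], probs[i]
      for child in bodies[i]:
          w, q = sample(child)
          words += w
          p *= q
      return words, p

  words, p = sample()
  print(" ".join(words), p)   # e.g. "the cat saw a dog" ~0.00072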
(b) I would imagine that if you change the probabilities as you suggest
(i.e. a probability < 1 only when rules share the same body but have
different heads) you end up with a very different kind of formalism, for
which you would need to work out the semantics from scratch (not just the
probabilistic element).
However, the probabilities you seem to be looking for are derivable from (a).
Writing Psucc for the probability of a parse of S given that S parses
successfully, PrsS_x for a particular parse of sentence S, and SuccParses_S
for the set of successful parses of S:
Psucc(PrsS_x) = Pgen(PrsS_x | SuccParses_S)
              = Pgen(PrsS_x) / Sum_{i} Pgen(PrsS_i)
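For instance (a throwaway Python sketch with made-up helper names), using
the single successful parse of "the cat saw the mouse" from your message:

  from functools import reduce

  def pgen(rule_probs):
      # Generative probability of one parse: the product of its rule labels.
      return reduce(lambda x, y: x * y, rule_probs, 1.0)

  def psucc(parses_of_S, x):
      # Pgen(PrsS_x) / Sum_i Pgen(PrsS_i): parse x renormalised over all
      # successful parses of the same sentence S.
      weights = [pgen(p) for p in parses_of_S]
      return weights[x] / sum(weights)

  # Rule labels used in the only parse of "the cat saw the mouse":
  only_parse = [1.0, 0.3, 0.5, 0.2, 1.0, 0.4, 0.3, 0.5, 0.4]
  print(pgen(only_parse))        # ~0.00072, the generative probability
  print(psucc([only_parse], 0))  # 1.0, since the parse is unambiguous

With only one successful parse the normalisation gives exactly the 1.0 you
were expecting; with several parses it gives their relative weights.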
Finally, note that one way of altering the behaviour in (a) would be to
"renormalise" as you parse, i.e. at each derivation step consider only the
clauses that unify and normalise over their labels.
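A minimal sketch of that renormalisation step (illustrative only; how you
represent clauses is up to you):

  def renormalise(applicable):
      # applicable: {clause: label} for just the clauses that unify at the
      # current derivation step; rescale the labels so they sum to one.
      total = sum(applicable.values())
      return {clause: p / total for clause, p in applicable.items()}

  # If the relative-clause NP rule cannot unify with the remaining input,
  # the surviving NP clause gets all the probability mass:
  print(renormalise({"NP -> ART NOUN": 0.3}))   # {'NP -> ART NOUN': 1.0}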
Reference: on various aspects of (a) and (b) in a Logic Programming
setting, see
@inproceedings{ Cussens2000,
Title = {Stochastic logic programs: Sampling, inference and
applications},
Author = {James Cussens},
Booktitle = {Sixteenth Annual Conference on Uncertainty in Artificial
Intelligence (UAI-2000)},
Pages = {115-122},
Address = {San Francisco, CA},
Year = {2000},
Publisher = {Morgan Kaufmann},
Url = {ftp://ftp.cs.york.ac.uk/pub/ML_GROUP/Papers/uai00.ps.gz}
}
Nicos.
On 30/10/02 09:04 -0800, George Paliouras wrote:
> Dear all,
>
> I have a question about the use of probabilities in (context-free) grammars.
> According to common use, the probability of a specific parse of a sentence is
> calculated as the product of probabilities of all the rules involved in the
> parse tree. Usually, the probabilities that are assigned to the rules are
> calculated from the frequency by which each rule participates in the
> "correct" parse trees of the training sentences. Thus, using the following
> simple grammar (rule probabilities in brackets):
>
> S -> NP VP (1.0)
> NP -> ART NOUN (0.3)
> NP -> ART NOUN RC (0.7)
> VP -> VERB NP (1.0)
> RC -> that VP (1.0)
> VERB -> saw (0.4)
> VERB -> heard (0.6)
> NOUN -> cat (0.2)
> NOUN -> dog (0.4)
> NOUN -> mouse (0.4)
> ART -> a (0.5)
> ART -> the (0.5)
>
> to parse the sentence "the cat saw the mouse", one gets the probability:
> 1.0*0.3*0.5*0.2*1.0*0.4*0.3*0.5*0.4 = 0.00072, as the sentence has the
> following parse tree:
> (S (NP (ART the) (NOUN cat))
>    (VP (VERB saw) (NP (ART the) (NOUN mouse))))
>
> This approach seems "generative" in the sense that the calculated
> probability corresponds to the probability of the sentence being
> generated by the grammar. However, the significance of this
> number in a "parsing" mode is not clear to me. A bottom-up
> parser would be able to generate the above tree *unambiguously*,
> i.e., there is no other way to parse this sentence with the
> given grammar. Therefore it seems reasonable to arrive at the
> probability estimate of 1.0 for the parse. This could be achieved
> by the use of a different approach to the calculation of rule
> probabilities. Namely, assign a probability <1.0 only when
> two rules share the same body (and of course have different
> heads), that is, only when there is ambiguity about which rule to
> use for parsing.
>
> Given that I have not met this approach in the literature, I assume
> that something is wrong in my reasoning. Any help on this issue and
> references to related work would be greatly appreciated.
>
> Thanks in advance,
>
> George Paliouras
>
>
------- End of Forwarded Message