Dear all,

I have a question about the use of probabilities in (context-free) grammars. According to common practice, the probability of a specific parse of a sentence is calculated as the product of the probabilities of all the rules involved in the parse tree. Usually, the probabilities assigned to the rules are estimated from the frequency with which each rule participates in the "correct" parse trees of the training sentences. Thus, using the following simple grammar (rule probabilities in brackets):
S -> NP VP (1.0)
NP -> ART NOUN (0.3)
NP -> ART NOUN RC (0.7)
VP -> VERB NP (1.0)
RC -> that VP (1.0)
VERB -> saw (0.4)
VERB -> heard (0.6)
NOUN -> cat (0.2)
NOUN -> dog (0.4)
NOUN -> mouse (0.4)
ART -> a (0.5)
ART -> the (0.5)

to parse the sentence "the cat saw the mouse", one gets the probability

1.0 * 0.3 * 0.5 * 0.2 * 1.0 * 0.4 * 0.3 * 0.5 * 0.4 = 0.00072,

since the sentence has the following parse tree:

(S (NP (ART the) (NOUN cat)) (VP (VERB saw) (NP (ART the) (NOUN mouse))))

This approach seems "generative" in the sense that the calculated probability corresponds to the probability of the sentence being generated by the grammar. However, the significance of this number in "parsing" mode is not clear to me. A bottom-up parser would produce the above tree *unambiguously*, i.e., there is no other way to parse this sentence with the given grammar. It therefore seems reasonable to arrive at a probability estimate of 1.0 for the parse.

This could be achieved with a different way of calculating the rule probabilities: assign a probability less than 1.0 only when two rules share the same body (right-hand side) and, of course, have different heads (left-hand sides), that is, only when there is ambiguity about which rule to use for parsing.

Given that I have not come across this approach in the literature, I assume that something is wrong in my reasoning. Any help on this issue, and references to related work, would be greatly appreciated.

Thanks in advance,
George Paliouras
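
P.S. To make the two ways of scoring concrete, here is a minimal sketch in Python (the function names and data layout are just illustrative, not from any standard package). It computes the standard product-of-rules score for the example tree, and then rescores the same tree after renormalising the rule probabilities per body (right-hand side), which is the alternative described above.

    from collections import defaultdict

    # The toy grammar: (head, body) -> probability, as given above.
    RULES = {
        ("S",    ("NP", "VP")):          1.0,
        ("NP",   ("ART", "NOUN")):       0.3,
        ("NP",   ("ART", "NOUN", "RC")): 0.7,
        ("VP",   ("VERB", "NP")):        1.0,
        ("RC",   ("that", "VP")):        1.0,
        ("VERB", ("saw",)):              0.4,
        ("VERB", ("heard",)):            0.6,
        ("NOUN", ("cat",)):              0.2,
        ("NOUN", ("dog",)):              0.4,
        ("NOUN", ("mouse",)):            0.4,
        ("ART",  ("a",)):                0.5,
        ("ART",  ("the",)):              0.5,
    }

    # The parse tree of "the cat saw the mouse" as nested (label, children...) tuples.
    TREE = ("S",
            ("NP", ("ART", "the"), ("NOUN", "cat")),
            ("VP", ("VERB", "saw"),
                   ("NP", ("ART", "the"), ("NOUN", "mouse"))))

    def tree_prob(tree, rules):
        """Standard ('generative') score: the product of the probabilities
        of all rules used in the tree."""
        head, children = tree[0], tree[1:]
        body = tuple(c if isinstance(c, str) else c[0] for c in children)
        p = rules[(head, body)]
        for c in children:
            if not isinstance(c, str):      # recurse into non-terminal children
                p *= tree_prob(c, rules)
        return p

    def condition_on_body(rules):
        """The alternative suggested above: renormalise so that probabilities
        sum to 1 over all rules sharing the same body (right-hand side).
        A rule whose body is unique then gets probability 1.0."""
        totals = defaultdict(float)
        for (head, body), p in rules.items():
            totals[body] += p
        return {(head, body): p / totals[body]
                for (head, body), p in rules.items()}

    print(tree_prob(TREE, RULES))                     # 0.00072
    print(tree_prob(TREE, condition_on_body(RULES)))  # 1.0: no two rules share a body

With real training data one would renormalise rule counts per body rather than the head-conditioned probabilities; for this particular grammar the result is the same, since no two rules share a body, so the second score comes out as 1.0, as claimed above.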
