I am now trying to use the search graph output to produce a word
lattice in PLF format. I'm still a bit confused about how to use the
recombined hypotheses and their pointers to the superior hypotheses.
Do I have to copy the relevant parts of the superior hypotheses into
the lattice, or should I join the hypotheses that point to recombined
hypotheses with the existing graph? To give an example:
     who   is    bill    ?
  (0)-->(1)-->(2)--->(3)-->(4)
   |
   |--->(5)------------->(6)
    how  |    is bill ?
         |
         |---->(7)----->(8)
           is the bill
where (6) is a recombined hypothesis that points to (4) and covers
tokens 1-3, and (8) is a recombined hypothesis that points to (3).
Should I copy the relevant parts of (4) that cover the same tokens into
the graph as a link to (5), or can I safely join (5) and (1)? Probably
not, because joining them would produce "who is the bill", which is not
necessarily a valid option ...
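
To make the question concrete, here is a minimal sketch of one reading I
am considering: a recombined hypothesis keeps its own back pointer, but
its arc ends at the node of the superior hypothesis, so nothing is
copied and (5) is never merged with (1). The record layout, the names
HYPOTHESES, build_arcs and paths, and the way I split "is the bill"
over (7) and (8) are all assumptions made for illustration; this is not
the real search graph format, and I have not confirmed that this is how
recombination is meant to be resolved.

```python
from collections import defaultdict

# Hypothetical, simplified records (NOT the real Moses search-graph format):
# (hypothesis id, back pointer, superior hypothesis id or None, phrase)
HYPOTHESES = [
    (1, 0, None, "who"),
    (2, 1, None, "is"),
    (3, 2, None, "bill"),
    (4, 3, None, "?"),
    (5, 0, None, "how"),
    (6, 5, 4, "is bill ?"),  # recombined, points to (4)
    (7, 5, None, "is the"),  # assumed split of "is the bill"
    (8, 7, 3, "bill"),       # recombined, points to (3)
]


def build_arcs(hypotheses):
    """Map each node to its outgoing (phrase, target node) arcs.

    The arc of a recombined hypothesis is redirected to the node of its
    superior hypothesis; its source node is left untouched.
    """
    arcs = defaultdict(list)
    for hyp_id, back, superior, phrase in hypotheses:
        target = superior if superior is not None else hyp_id
        arcs[back].append((phrase, target))
    return arcs


def paths(arcs, node, final):
    """Yield every word sequence on a path from node to final."""
    if node == final:
        yield ""
        return
    for phrase, target in arcs.get(node, []):
        for rest in paths(arcs, target, final):
            yield (phrase + " " + rest).strip()


if __name__ == "__main__":
    for sentence in sorted(paths(build_arcs(HYPOTHESES), 0, 4)):
        print(sentence)
```

Under these assumptions the lattice contains "how is the bill ?" but not
"who is the bill ?", because only the end points of the recombined arcs
are shared.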
Thanks a lot for clarifying this to me!
Jörg
Chris Dyer wrote:
> As long as you're just splitting, keeping the weights consistent isn't
> too hard- just keep all the weight in one segment and make all the
> rest of the segments have no impact when they multiply (i.e., a
> probability of 1, or a cost of 0). The openFST or AT&T tools can help
> you manipulate lattices if you want to do more interesting things with
> weights, such as pushing them to the start of paths.
>
> Chris
>
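
A minimal sketch of the splitting described above, assuming arcs are
plain (from, to, word, cost) tuples with costs in log space, so that a
cost of 0 corresponds to a probability of 1. The function name
split_arc, the tuple layout and the fresh node number 1000 are
hypothetical, and using c=-3.23392 from the word graph line quoted
further down as the arc cost is only for illustration.

```python
def split_arc(start, end, words, cost, next_free_node):
    """Split one multi-word arc into a chain of single-word arcs.

    The whole cost stays on the first segment; all other segments get
    cost 0.0 so that the total cost of the path does not change.
    Returns (arcs, next_free_node), where arcs is a list of
    (from_node, to_node, word, cost) tuples.
    """
    arcs = []
    current = start
    for i, word in enumerate(words):
        if i == len(words) - 1:
            target = end  # the last word reaches the original end node
        else:
            target = next_free_node  # fresh intermediate node
            next_free_node += 1
        arcs.append((current, target, word, cost if i == 0 else 0.0))
        current = target
    return arcs, next_free_node


if __name__ == "__main__":
    # Illustrative values taken from the quoted line: S=0 E=53 w=bill clinton ,
    arcs, _ = split_arc(0, 53, ["bill", "clinton", ","], -3.23392, 1000)
    for arc in arcs:
        print(arc)
```

Keeping the weight on the first segment is just one choice; any segment
would do as long as the others contribute nothing.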
> On Mon, Mar 1, 2010 at 1:58 PM, Loïc BARRAULT
> <[email protected]> wrote:
>> Indeed, splitting is not hard, but the trickiest part is deciding how
>> much of the probability/score to give to each part of the split.
>> Maybe it has no real impact in the end, or does it?
>> Loïc
>>
>> 2010/3/1 Chris Dyer <[email protected]>
>>> I guess word-graph doesn't split phrases either (I was just guessing).
>>> It appears to be in SLF format, which is used by a number of tools
>>> (like HTK and the SRI tools). SRILM can split lattices with
>>> multi-word arcs into lattices with single-word arcs, or you can
>>> write your own code to do it. It's not terribly hard.
>>>
>>> Chris
>>>
>>> On Mon, Mar 1, 2010 at 12:32 PM, Joerg Tiedemann
>>> <[email protected]> wrote:
>>>> Ok thanks. I will use the output-word-graph option. However, I also get
>>>> phrases with that option (in the w attribute), for example here:
>>>>
>>>> ....
>>>> J=42 S=0 E=53 a=0, 0, 0, -0.693147, 0.999896 l=-13.695 r=-20, 0, -1.60944, 0, 0, 0 w=bill clinton , pC=0.0613498, c=-3.23392
>>>> ...
>>>>
>>>> I'm not sure if I'm using the command line argument correctly:
>>>> echo 'who is bill clinton ?' | \
>>>> moses -f moses.ini -output-word-graph test.graph 0
>>>>
>>>> Jörg
>>>>
>>>>
>>>> On 3/1/10 5:35 PM, Chris Dyer wrote:
>>>>> I don't have such a tool, but it wouldn't be too difficult to write
>>>>> one. I think the difference between word graph and search graph is
>>>>> the search graph has full phrases on the edges, whereas the word graph
>>>>> has single words on the edges. For the input, you need single word
>>>>> edges.
>>>>> -Chris
>>>>>
>>>>> 2010/3/1 Jörg Tiedemann<[email protected]>:
>>>>>> Is there a tool to convert output search graphs to word lattices in
>>>>>> PLF (the Moses lattice input format)? It's the option
>>>>>> -output-search-graph that I should use for getting the relevant
>>>>>> information, right? I'm not really sure I understand the
>>>>>> difference between -output-word-graph and -output-search-graph.
>>>>>> Thanks!
>>>>>>
>>>>>> Jörg
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *******/\/\/\/\/\/\/\/\/\/\/\******************************************
>>>>>> Jörg Tiedemann [email protected]
>>>>>> Visiting Professor http://stp.lingfil.uu.se/~joerg/
>>>>>> Dep. of Linguistics and Philology
>>>>>> Uppsala University tel: +46 (0)18 - 471 1412
>>>>>> Box 635, SE-751 26 Uppsala/SWEDEN fax: +46 (0)18 - 471 1094
>>>>>>
>>>>>> *********************************/\/\/\/\/\/\/\/\/\/\/\****************
>>>>>> _______________________________________________
>>>>>> Moses-support mailing list
>>>>>> [email protected]
>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>
>>
>>
>> --
>> ---
>> Loïc BARRAULT
>> Post-doctoral researcher
>> LIUM - University of Le Mans
>> Tél. +33/0 2 43 83 38 52
>> http://www-lium.univ-lemans.fr/~barrault
>> MANY : Open Source MT System Combination
>> http://www-lium.univ-lemans.fr/~barrault/MANY
>> ---
>>