Hey Marcin,

On 08/24/2012 12:52 PM, Marcin Junczys-Dowmunt wrote:
> You are aligning unsegmented words to segmented phoneme sequences, do I
> understand that correctly? Maybe it's worth to use letter sequences
> instead of words and replace spaces with special characters.
> Like this:
> 
> t h i s _ i s _ a n _ e x a m p l e .

I'm not quite sure what you mean by unsegmented words. My data tends
to look something like this (more complex and in Czech, of course, but
I'll write something by hand):

hello world, this is a test.
h e l o w oe r l d th i s i z a t ae z t
alignment is cool
ae l a i ng m e n t i z k u l

and so on. Now I'm trying to find alignments like:
hello=h e l o
world=w oe r l d
this=th i s
is=i z
a=a
test=t ae z t
alignment=ae l a i ng m e n t
is=i z
cool=k u l
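
To make the target format concrete, here is a minimal Python sketch of
how such word=phoneme pairs could be assembled once every phoneme has
been assigned to a word. The function name and the per-phoneme word
indices are assumptions made up for illustration; this is not the
actual output format of PISA or of any aligner.

```python
def group_phonemes_by_word(words, phonemes, word_index):
    """Format 'word=phonemes' pairs from a per-phoneme word assignment.

    word_index[i] is the position (in `words`) of the word that
    phonemes[i] belongs to. This input format is hypothetical.
    """
    if len(phonemes) != len(word_index):
        raise ValueError("need exactly one word index per phoneme")
    groups = [[] for _ in words]
    for phoneme, idx in zip(phonemes, word_index):
        groups[idx].append(phoneme)
    return [f"{word}={' '.join(group)}" for word, group in zip(words, groups)]


words = "alignment is cool".split()
phonemes = "ae l a i ng m e n t i z k u l".split()
# 9 phonemes for "alignment", 2 for "is", 3 for "cool"
word_index = [0] * 9 + [1] * 2 + [2] * 3

for pair in group_phonemes_by_word(words, phonemes, word_index):
    print(pair)
```

Grouping by index rather than cutting the phoneme string into
consecutive chunks means the sketch does not itself assume a monotone
assignment.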

But I believe that what you suggest is already done by the
postprocessing tool I use (PISA), only in a much more sophisticated
manner. The only real problem for me is non-monotonicity ;(. But thanks
for your suggestion, I think I'll try it out nonetheless (I tend to
clutch at every straw at this stage ;P).
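
For reference, the letter-sequence format from your example could be
produced with a minimal Python sketch like the one below. The function
name is made up for illustration, and I'm assuming that punctuation
simply stays in the sequence as ordinary characters, as the final "."
does in your example.

```python
def to_letter_sequence(sentence, space_marker="_"):
    """Split a sentence into space-separated characters.

    Runs of whitespace between words are replaced by a single
    `space_marker` so that word boundaries survive the split.
    """
    joined = space_marker.join(sentence.split())
    return " ".join(joined)


print(to_letter_sequence("this is an example."))
```

The marker is a parameter so that it can be swapped for another symbol
if "_" already occurs in the data.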

Thanks and Best Regards
- Dario
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
