On 02/11/2015 19:12, Arefeh Kazemi wrote:
Hi Hieu

On 2 November 2015 at 21:37, Hieu Hoang <[email protected] <mailto:[email protected]>> wrote:

    ok. I can't find the input sentence you gave me as an example

    however, I think I know the issue. It's not a bug, but we haven't
    made clear that input and output sentences in the phrase-based and
    chart decoders are slightly different.


    In the chart decoder, there are implied <s> and </s> at the
    beginning and end of each input and output sentence. They are
    not displayed, but the alignment still refers to them. So the
    input sentence
       "darya alexandrovna went alone to her room ."
    has 8 words, but in the decoder, it's actually
      "<s> darya alexandrovna went alone to her room . </s>"

    does that explain your problem?


I think so! Thanks!
I noticed that for every input sentence the alignment covers the sentence length + 2 positions, so the highest index is the sentence length + 1. If I edit the alignment file and remove every alignment point that has 0 on either side, the source sentence's length + 1 on the left, or the target sentence's length + 1 on the right, the problem will be solved, right?
yep
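
A minimal sketch of the clean-up described above, in Python. It assumes the alignment file holds one line of "source-target" index pairs per sentence, as in the example from this thread, and it also shifts the remaining indices down by one so that they count visible words from 0. The shift is an assumption about what downstream tools expect, not something the decoder does for you. The function name is made up for illustration, and the target length of 7 is inferred from the highest target index (8) in the example, since the translated sentence is not shown in the thread.

```python
def strip_boundary_points(line, src_len, tgt_len):
    """Drop alignment points that touch the implied <s> or </s> and
    shift the remaining points to 0-based indices over visible words.

    src_len and tgt_len are word counts WITHOUT the boundary markers.
    """
    kept = []
    for point in line.split():
        src, tgt = (int(i) for i in point.split("-"))
        # Index 0 is <s>; index length + 1 is </s> on either side.
        if src in (0, src_len + 1) or tgt in (0, tgt_len + 1):
            continue
        kept.append(f"{src - 1}-{tgt - 1}")
    return " ".join(kept)


# Example from this thread: 8 source words, 7 target words (inferred).
chart_alignment = "0-0 1-1 2-1 3-6 4-3 5-2 6-5 7-4 8-7 9-8"
cleaned = strip_boundary_points(chart_alignment, src_len=8, tgt_len=7)
print(cleaned)  # 0-0 1-0 2-5 3-2 4-1 5-4 6-3 7-6
```

After this step the highest source index is 7, which matches the 8 visible words of "darya alexandrovna went alone to her room ."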

and another question:
To use reordering evaluation metrics such as Kendall or Hamming, we need an aligned dev file. Have you considered this type of alignment for that too? Should I change the input alignment file for tuning with these metrics?
I don't think the metrics have been developed with hiero models in mind. You may need to add an extra argument to let them know the alignments need to be shifted
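
One way to picture the shift, as a hedged Python sketch rather than anything that exists in Moses or in the metric scripts: a hypothetical helper that moves every index in an alignment line by a fixed offset. An offset of +1 would move a word-indexed dev alignment into the chart decoder's numbering (where position 0 is <s>), and -1 would go the other way. The helper name and the sample dev line are invented for illustration.

```python
def shift_alignment(line, offset):
    """Shift every source and target index in an alignment line by offset.

    Raises ValueError if a shifted index would become negative, which
    usually means the line still contains a boundary (<s>) point.
    """
    shifted = []
    for point in line.split():
        src, tgt = (int(i) for i in point.split("-"))
        if src + offset < 0 or tgt + offset < 0:
            raise ValueError(f"point {point} would become negative")
        shifted.append(f"{src + offset}-{tgt + offset}")
    return " ".join(shifted)


# Hypothetical word-indexed dev alignment moved into chart-decoder numbering.
dev_line = "0-0 1-0 2-5"
print(shift_alignment(dev_line, 1))  # 1-1 2-1 3-6
```

Whether to shift the dev alignments up or the decoder output down depends on which numbering the metric implementation assumes, so that is worth checking before tuning.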


Thanks
Arefeh



    On 02/11/2015 17:19, Arefeh Kazemi wrote:

    Hi Hieu


    This is the command:

    (I've not tuned the system, so I use the initial moses.ini file)

    the moses.ini is attached.


     nohup nice ~/mosesdecoder/bin/moses_chart \
       -f ~/moses.ini -alignment-output-file ~/align.txt \
       < ~/toyCorpus/mizan-test-toy.en \
       > ~/mizan-translated.fa \
       2> ~/mizan-test.out

     ~/mosesdecoder/scripts/generic/multi-bleu.perl \
       -lc ~/toyCorpus/mizan-test-toy.fa \
       < ~/mizan-translated.fa


    Thanks again

    Arefeh


    On 2 November 2015 at 20:07, Hieu Hoang <[email protected]
    <mailto:[email protected]>> wrote:

        err. What exactly do I run to reproduce the problem? What is
        the input? Which ini file? I don't need the extract file or
        the corpus


        On 01/11/2015 11:41, Arefeh Kazemi wrote:
        extract.inv.sorted.gz
        <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjLVp6X2ZQNm5TUHM/view?usp=drive_web>
        extract.sorted.gz
        <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjaHF6Nms5dEtJWFE/view?usp=drive_web>
        other.zip
        <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjdFh2NkhIaVFEQVU/view?usp=drive_web>

        Hi Hieu
        Thanks for the reply.
        My original files are huge, so I have attached a toy model
        for which the mismatch also happens.

        Thanks again.
        Arefeh

        toyCorpus.zip
        <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjS0dSSnA2NjVBTnc/view?usp=drive_web>
        toyLM.zip
        <https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjM2xiVkVYaDY3ZFk/view?usp=drive_web>

        On 1 November 2015 at 01:10, Hieu Hoang <[email protected]
        <mailto:[email protected]>> wrote:

            that should never happen. Can you please make available
            the model and input files for download so I can check it


            On 31/10/2015 10:30, Arefeh Kazemi wrote:
            Hi

            I needed the word alignment between the source and the
            output translation, so I used the -alignment-output-file
            parameter. It gives me an alignment file, but the
            alignment does not match the source sentences' lengths:
            the highest index in the alignment is greater than the
            sentence length.
            For example, for the source sentence
            "darya alexandrovna went alone to her room ."
            the alignment is:
            0-0 1-1 2-1 3-6 4-3 5-2 6-5 7-4 8-7 9-8

            I checked the sentences but there is no strange string
            in them.

            Does anyone know why this happens?!

            Regards
            Arefeh

            /

            *Email Disclaimer*

            /"This e-mail and any files transmitted with it are
            confidential and are intended solely for use by the
            addressee. Any unauthorised dissemination, distribution
            or copying of this message and any attachments is
            strictly prohibited. If you have received this e-mail
            in error, please notify the sender and delete the
            message. Any views or opinions presented in this e-mail
            may solely be the views of the author and cannot be
            relied upon as being those of Dublin City University.
            E-mail communications such as this cannot be guaranteed
            to be virus-free, timely, secure or error-free and
            Dublin City University does not accept liability for
            any such matters or their consequences. Please
            consider the environment before printing this e-mail."/

            *Séanadh Ríomhphoist*

            /"Tá an ríomhphost seo agus aon chomhad a sheoltar leis
            faoi rún agus is lena úsáid ag an seolaí agus sin
            amháin é. Tá cosc iomlán ar scaipeadh, dháileadh nó
            chóipeáil neamhúdaraithe ar an teachtaireacht seo agus
            ar aon cheangaltán atá ag dul leis. Má tá an ríomhphost
            seo faighte agat trí dhearmad cuir sin in iúl le do
            thoil don seoltóir agus scrios an teachtaireacht.
            D’fhéadfadh sé gurb iad tuairimí an údair agus sin
            amháin atá in aon tuairimí no dearcthaí atá curtha i
            láthair sa ríomhphost seo agus níor chóir glacadh leo
            mar thuairimí nó dhearcthaí Ollscoil Chathair Bhaile
            Átha Cliath. Ní ghlactar leis go bhfuil cumarsáid
            ríomhphoist den sórt seo saor ó víreas, in am, slán, nó
            saor ó earráid agus ní ghlacann Ollscoil Chathair
            Bhaile Átha Cliath le dliteanas in aon chás den sórt
            sin ná as aon iarmhairt a d’eascródh astu. Cuimhnigh ar
            an timpeallacht le do thoil sula gcuireann tú an
            ríomhphost seo i gcló."/

            /


            _______________________________________________
            Moses-support mailing list
            [email protected] <mailto:[email protected]>
            http://mailman.mit.edu/mailman/listinfo/moses-support

-- Hieu Hoang
            http://www.hoang.co.uk/hieu












--
Hieu Hoang
http://www.hoang.co.uk/hieu
