On 02/11/2015 19:12, Arefeh Kazemi wrote:
Hi Hieu
On 2 November 2015 at 21:37, Hieu Hoang wrote:
OK. I can't find the input sentence you gave me as an example.
However, I think I know the issue. It's not a bug, but we haven't
made clear that input and output sentences in the phrase-based and
chart decoders are slightly different.
In the chart decoder, there are implied <s> and </s> at the
beginning and end of each input and output sentence. They are
not displayed, but the alignment still refers to them. So the
input sentence
"darya alexandrovna went alone to her room ."
has 8 words, but in the decoder, it's actually
"<s> darya alexandrovna went alone to her room . </s>"
Does that explain your problem?
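A small illustration of the indexing described above, assuming 0-based positions and that the decoder simply wraps the tokens in <s> and </s>. `pad_sentence` is a hypothetical helper name, not part of Moses:

```python
def pad_sentence(sentence):
    """Return the tokens with the implied boundary markers added."""
    # Assumption: whitespace tokenisation, <s> first and </s> last.
    return ["<s>"] + sentence.split() + ["</s>"]


if __name__ == "__main__":
    padded = pad_sentence("darya alexandrovna went alone to her room .")
    # Print each token next to the 0-based position the alignment would use.
    for index, token in enumerate(padded):
        print(index, token)
```

The 8 words land on positions 1 to 8, with <s> at 0 and </s> at 9, which matches the highest source index in the alignment from the original report.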
I think so! Thanks!
I noticed that for all input sentences the highest index in the
alignment is the sentence length + 1, so the alignment covers
sentence length + 2 positions.
If I change the alignment file and remove every alignment point that
has 0, or (source sentence length + 1) on the left, or (target
sentence length + 1) on the right, the problem will be solved, right?
yep
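For what it's worth, that clean-up might look like the sketch below. It assumes 0-based source-target pairs, that <s> sits at index 0 and </s> at index length + 1 on each side, and that the remaining points should be shifted down by one so they index the visible words. `strip_boundary_alignments` is a made-up name, and the target length of 7 in the example is inferred from the alignment, since the translation itself isn't shown in this thread.

```python
def strip_boundary_alignments(alignment_line, src_len, tgt_len):
    """Drop points that touch the implied <s>/</s> and shift the rest by one.

    src_len and tgt_len are the numbers of visible words (without markers).
    """
    kept = []
    for point in alignment_line.split():
        src, tgt = (int(i) for i in point.split("-"))
        # Assumption: index 0 is <s> and index length + 1 is </s>.
        if src in (0, src_len + 1) or tgt in (0, tgt_len + 1):
            continue
        # Shift down so that the first visible word gets index 0.
        kept.append(f"{src - 1}-{tgt - 1}")
    return " ".join(kept)


if __name__ == "__main__":
    line = "0-0 1-1 2-1 3-6 4-3 5-2 6-5 7-4 8-7 9-8"
    # tgt_len=7 is an assumption inferred from the highest target index (8).
    print(strip_boundary_alignments(line, src_len=8, tgt_len=7))
    # prints: 0-0 1-0 2-5 3-2 4-1 5-4 6-3 7-6
```

Whether the shift is wanted depends on what consumes the file afterwards, so treat that part as optional.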
And another question:
To use reordering evaluation metrics such as Kendall's tau or Hamming
distance, we need an aligned dev file. Have you considered this type of
alignment for that too? Should I change the input alignment file for
tuning with these metrics?
I don't think the metrics were developed with hiero models in mind.
You may need to add an extra argument to let them know the alignments
need to be shifted.
Thanks
Arefeh
On 02/11/2015 17:19, Arefeh Kazemi wrote:
Hi Hieu
This is the command:
(I've not tuned the system, so I use the initial moses.ini file)
the moses.ini is attached.
nohup nice ~/mosesdecoder/bin/moses_chart \
-f ~/moses.ini -alignment-output-file ~/align.txt \
< ~/toyCorpus/mizan-test-toy.en \
> ~/mizan-translated.fa \
2> ~/mizan-test.out
~/mosesdecoder/scripts/generic/multi-bleu.perl \
-lc ~/toyCorpus/mizan-test-toy.fa \
< ~/mizan-translated.fa
Thanks again
Arefeh
On 2 November 2015 at 20:07, Hieu Hoang wrote:
Err. What exactly do I run to reproduce the problem? What is
the input? Which ini file? I don't need the extract file or
the corpus.
On 01/11/2015 11:41, Arefeh Kazemi wrote:
extract.inv.sorted.gz
<https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjLVp6X2ZQNm5TUHM/view?usp=drive_web>
extract.sorted.gz
<https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjaHF6Nms5dEtJWFE/view?usp=drive_web>
other.zip
<https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjdFh2NkhIaVFEQVU/view?usp=drive_web>
Hi Hieu
Thanks for the reply.
My original files are huge, so I attached a toy model
for which the mismatch also happens.
Thanks again.
Arefeh
toyCorpus.zip
<https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjS0dSSnA2NjVBTnc/view?usp=drive_web>
toyLM.zip
<https://drive.google.com/a/dcu.ie/file/d/0B37sY2C6IhcjM2xiVkVYaDY3ZFk/view?usp=drive_web>
On 1 November 2015 at 01:10, Hieu Hoang wrote:
That should never happen. Can you please make the model
and input files available for download so I can check it?
On 31/10/2015 10:30, Arefeh Kazemi wrote:
Hi
I needed the word alignment between the source and the
output translation, so I used the -alignment-output-file
parameter. It gives me an alignment file, but there are
mismatches between the source sentences' lengths and the
alignment: the highest index in the alignment is greater
than the sentence length.
For example, for the source sentence
"darya alexandrovna went alone to her room ."
the alignment is:
0-0 1-1 2-1 3-6 4-3 5-2 6-5 7-4 8-7 9-8
I checked the sentences, but there are no strange strings
in them.
Does anyone know why this happens?
Regards
Arefeh
/
*Email Disclaimer*
/"This e-mail and any files transmitted with it are
confidential and are intended solely for use by the
addressee. Any unauthorised dissemination, distribution
or copying of this message and any attachments is
strictly prohibited. If you have received this e-mail
in error, please notify the sender and delete the
message. Any views or opinions presented in this e-mail
may solely be the views of the author and cannot be
relied upon as being those of Dublin City University.
E-mail communications such as this cannot be guaranteed
to be virus-free, timely, secure or error-free and
Dublin City University does not accept liability for
any such matters or their consequences. Please
consider the environment before printing this e-mail."/
/
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
--
Hieu Hoang
http://www.hoang.co.uk/hieu