The intent of this code was clearly to output every factor for every
token.
Given how outputFactorOrder[i] is used, removing this line is not a good
idea: many tokens now get an empty factor output, and the factor order
could come out wrong whenever one of the factors is missing for a given
token (if that can happen).
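
To illustrate the ordering concern from the consumer side, here is a minimal standalone sketch. It is not Moses code: splitFactors and hasExpectedFactorCount are hypothetical helpers, and the factor values in the usage note are made up. It assumes the factor delimiter is '|', as in the output quoted further down.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split one factored token such as "taking|VBG|take"
// on the factor delimiter and return the fields in order.
std::vector<std::string> splitFactors(const std::string& token, char delim = '|') {
  std::vector<std::string> fields;
  std::size_t start = 0;
  while (true) {
    const std::size_t pos = token.find(delim, start);
    if (pos == std::string::npos) {
      fields.push_back(token.substr(start));  // last (or only) field
      break;
    }
    fields.push_back(token.substr(start, pos - start));
    start = pos + 1;
  }
  return fields;
}

// Hypothetical check: does the token carry exactly the number of fields
// (surface form plus factors) that the consumer expects?
bool hasExpectedFactorCount(const std::string& token, std::size_t expected) {
  return splitFactors(token).size() == expected;
}
```

With three expected fields, "taking|VBG|take" passes the check, while "taking|take" does not, and the consumer cannot tell which factor was dropped: "take" now sits in the position where the second field was expected.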
It might have been better to keep emitting a value, just not UNK, which
stands for "unknown token", but rather something like UND, standing for
"undefined factor".
The complete solution would be to have every required factor actually
defined for each token, with a properly set value...
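
As a rough sketch of the UND idea, here is a standalone example, independent of the Moses classes, that always emits one field per factor. The names formatToken and kUndefinedFactor are hypothetical, "UND" is only the placeholder suggested above, and the code assumes C++17 for std::optional.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Placeholder for a factor that has no value for this token.
// "UND" is the name suggested in this thread, not something taken from Moses.
const std::string kUndefinedFactor = "UND";

// Hypothetical formatter: joins the factors of one token with the delimiter,
// writing the placeholder wherever a factor is missing, so every token
// always carries the same number of fields in the same order.
std::string formatToken(const std::vector<std::optional<std::string>>& factors,
                        const std::string& delim = "|") {
  std::string out;
  for (std::size_t i = 0; i < factors.size(); ++i) {
    if (i > 0) out += delim;
    out += factors[i] ? *factors[i] : kUndefinedFactor;
  }
  return out;
}
```

Called with the factors "taking", missing, "take", this yields "taking|UND|take", so the field count stays constant and a missing factor remains distinguishable from an unknown word.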
On 15/06/2017 at 15:27, Hieu Hoang wrote:
thanks. Committed
https://github.com/moses-smt/mosesdecoder/commit/4b0560b5c9bd95d7c55cb0451e8947de0eee1d6d
* Looking for MT/NLP opportunities *
Hieu Hoang
http://moses-smt.org/
On 15 June 2017 at 14:07, Etienne Monneret (LM)
<[email protected]> wrote:
In Manager.cpp, OutputSurface(..), replace:

    for (size_t i = 1 ; i < outputFactorOrder.size() ; i++) {
      const Factor *factor = phrase.GetFactor(pos, outputFactorOrder[i]);
      if (factor) out << fd << *factor;
      else out << fd << UNKNOWN_FACTOR;
    }

with:

    for (size_t i = 1 ; i < outputFactorOrder.size() ; i++) {
      const Factor *factor = phrase.GetFactor(pos, outputFactorOrder[i]);
      if (factor) out << fd << *factor;
      // else out << fd << UNKNOWN_FACTOR;
    }

Best regards,
Etienne
On 07/06/2017 at 17:23, Hieu Hoang wrote:
there's probably a bug somewhere in the server code
* Looking for MT/NLP opportunities *
Hieu Hoang
http://moses-smt.org/
On 7 June 2017 at 11:02, Etienne Monneret (LM)
<[email protected]> wrote:
Hi!
I just re-compiled a Moses server.
Now, the "report-all-factors" option is marking all words as UNK:
taking|UNK|UNK|UNK advantage|UNK|UNK|UNK of|UNK|UNK|UNK the|UNK|UNK|UNK |0-1| mushrooms|UNK|UNK|UNK |2-2| relaxante|UNK|UNK|UNK |6-6| is|UNK|UNK|UNK an|UNK|UNK|UNK activity|UNK|UNK|UNK |3-5| .|UNK|UNK|UNK |7-7|
This happens only in the XML-RPC reply: in the mosesserver log, only the
right words are marked as UNK:
[moses/server/TranslationRequest.cpp:472] BEST TRANSLATION: taking advantage of the mushrooms relaxante|UNK|UNK|UNK is an activity . [11111111] [total=-108.208] core=(-100.000,-10.000,5.000,-6.851,-16.168,-4.675,-21.479,-8.000,-60.608)
Is there a new way to do this?
Best regards,
Etienne
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support