Re: joshua_api

Matt Post Wed, 27 Apr 2016 04:48:32 -0700

Kellen,

Great. I had a chance to start looking over the ReworkedExtractions branch,
and I'll have some more time today. It looks good to me so far. Is there
anything else you plan to do, or does that branch contain basically all of it
(apart from the recapitalization fix, which I see should be applied more
selectively, maybe only when a -recapitalize flag is present, to save time)?


matt


> On Apr 26, 2016, at 1:56 AM, kellen sunderland <kellen.sunderl...@gmail.com> 
> wrote:
> 
> Hey Matt,
> 
> I've opened a new pull request with a few of our commits, feel free to take
> a look when you have some time.
> 
> More importantly I've pushed our queue of upcoming commits to the following
> branch in my fork:
> https://github.com/KellenSunderland/incubator-joshua/commits/ReworkedExtractions
> .  From there you can get an idea of the work we've done so far.  I
> haven't opened a PR yet for these commits because there's still some
> merging I have to do (there are a few failing tests and I had to temporarily
> comment out some of your casing code).  Once that's fixed I'll do a proper
> PR for these commits.
> 
> -Kellen
> 
> On Mon, Apr 25, 2016 at 1:35 PM, Matt Post <p...@cs.jhu.edu> wrote:
> 
>> Great. On that first point, I meant that translate() would return a
>> Translation object, which would know its hypergraph and could iterate over
>> a KBestExtractor. In any case, though, it sounds like you are a bit ahead
>> of me on this, so I'll wait for a push that I can see, and then we can
>> converge on the design.
>> 
>> matt
>> 
>> 
>>> On Apr 25, 2016, at 4:10 PM, Hieber, Felix <fhie...@amazon.de> wrote:
>>> 
>>> Hi Matt,
>>> 
>>> These are some nice suggestions. Most of the work we have done is in
>> line with what you propose, so I would agree with Kellen that we should
>> synchronize and compare sooner rather than later.
>>> 
>>> Best,
>>> Felix
>>> 
>>>> On 25.04.2016, at 07:44, kellen sunderland <kellen.sunderl...@gmail.com>
>> wrote:
>>>> 
>>>> Hey Matt,
>>>> 
>>>> Sorry for the late reply.  The Joshua-6 folder and tst may have just
>> been
>>>> artifacts of some symlinks I have locally.  Sorry they may have been
>> pushed
>>>> by mistake, I can clean that up.
>>>> 
>>>> Good idea to have the api code in a separate branch.  We can merge the
>> work
>>>> that we've done some time next week.
>>>> 
>>>> KBestExtractor is one of the things we want to return via the API.  We
>>>> already have some of this implemented though as you suggest.  I'll try
>> and
>>>> push the remaining work we've done into my github branch so you can
>> compare.
>>>> 
>>>> -Kellen
>>>> 
>>>>> On Mon, Apr 25, 2016 at 6:11 AM, Matt Post <p...@cs.jhu.edu> wrote:
>>>>> 
>>>>> Okay, after looking at this a bit more, I have a better understanding,
>> and
>>>>> an idea for how to move forward.
>>>>> 
>>>>> First, I see that Translation.java has provisions for structured
>> output.
>>>>> I'm guessing StructuredTranslation was added by mistake?
>>>>> 
>>>>> Moving forward, on the joshua_api branch, I was thinking of the
>> following,
>>>>> but want to make sure it doesn't collide with what you've done or are
>> doing:
>>>>> 
>>>>> - Factor KBestExtractor to return Translation objects instead of
>> printing,
>>>>> and also turn it into an iterator
>>>>> 
>>>>> - There's a real discrepancy with competing forest representations.
>> There
>>>>> are operations on the hypergraph (via WalkerFunction), and then also
>>>>> operations on Derivations. This leads to code that operates on both. It
>>>>> would be nice if the KBestExtractor just returned something like a
>> reduced
>>>>> "slice" of a forest, with new nodes containing only single back
>> pointers,
>>>>> representing exactly the nth-best derivation. Then we could
>> generically use
>>>>> the WalkerFunctions on that (e.g., viterbi extraction), and get rid of
>> many
>>>>> of the DerivationVisitor classes
>>>>> 
>>>>> - Related: constructing the k-best list is expensive, even for just the
>>>>> first item, since you have to set up all the candidate lists and so on.
>>>>> This led to me implementing top-n = 0, where you can get the
>> translation
>>>>> and some limited information (not replayed features) via Viterbi
>> extractors
>>>>> on the hypergraph, and you only have to call KBestExtractor if you
>> actually
>>>>> want k-best lists. This leads to dual code, e.g., substitutions of
>>>>> output_format in multiple places. The first item the KBestIterator
>> returns
>>>>> should be constructed more efficiently, on the assumption that the
>> caller
>>>>> might not ask for more items. The StructuredTranslation object already
>> is
>>>>> lazy about returning things that are asked for (e.g., it will only
>> replay
>>>>> features if you ask for the feature functions).
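
The iterator idea in the bullets above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Joshua's code: SketchTranslation, LazyKBestIterator, and the two Supplier arguments are hypothetical stand-ins (one for a cheap Viterbi-style 1-best, one for the expensive k-best setup), and the sketch assumes the first entry of the ranked list agrees with the cheap 1-best.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Hypothetical stand-in for a Translation object; not Joshua's actual class. */
final class SketchTranslation {
  final int rank;
  final String text;

  SketchTranslation(int rank, String text) {
    this.rank = rank;
    this.text = text;
  }
}

/**
 * Serves the 1-best from a cheap path and runs the expensive k-best setup
 * only if the caller asks about a second item.
 */
final class LazyKBestIterator implements Iterator<SketchTranslation> {
  private final Supplier<String> viterbi;          // cheap 1-best (assumed)
  private final Supplier<List<String>> kBestSetup; // expensive ranked list (assumed)
  private final int topN;
  private List<String> ranked = null;              // filled on demand
  private int next = 1;                            // 1-based rank of the next item
  int setupCalls = 0;                              // exposed for illustration only

  LazyKBestIterator(Supplier<String> viterbi, Supplier<List<String>> kBestSetup, int topN) {
    this.viterbi = viterbi;
    this.kBestSetup = kBestSetup;
    this.topN = topN;
  }

  @Override
  public boolean hasNext() {
    if (next > topN) return false;
    if (next == 1) return true;        // the sketch assumes a 1-best always exists
    return next <= ranked().size();    // first use triggers the expensive setup
  }

  @Override
  public SketchTranslation next() {
    if (!hasNext()) throw new NoSuchElementException();
    String text = (next == 1) ? viterbi.get() : ranked().get(next - 1);
    return new SketchTranslation(next++, text);
  }

  private List<String> ranked() {
    if (ranked == null) {
      ranked = kBestSetup.get();
      setupCalls++;
    }
    return ranked;
  }
}

final class LazyKBestDemo {
  public static void main(String[] args) {
    LazyKBestIterator it = new LazyKBestIterator(
        () -> "best", () -> Arrays.asList("best", "second", "third"), 3);
    System.out.println(it.next().text);   // served without the k-best setup
    while (it.hasNext()) {
      System.out.println(it.next().text); // setup happens once, on demand
    }
  }
}
```

A caller that only wants the single best translation never pays for the candidate lists in this arrangement; whether that maps cleanly onto the existing top-n = 0 path is a design question the sketch does not settle.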
>>>>> 
>>>>> I will probably implement most of these tonight and tomorrow unless
>> there
>>>>> are objections from anyone (including an objection asking for more
>> time to
>>>>> evaluate!)
>>>>> 
>>>>> matt
>>>>> 
>>>>> 
>>>>>> On Apr 23, 2016, at 7:22 PM, Matt Post <p...@cs.jhu.edu> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> Kellen suggested we create a Joshua API, which I think is an excellent
>>>>> idea. I've just made a start at this. It is not done and needs more
>> work,
>>>>> but I know that the Amazon folks have done some things on the backend,
>> and
>>>>> I wanted to make sure not to duplicate any work they might have done.
>> Also,
>>>>> it's something we should discuss.
>>>>>> 
>>>>>> First, I was a bit confused about the joshua-6 subdirectory, and the
>>>>> files there (also, what is tst/? Both of these were from a recent
>> commit).
>>>>> I moved those over and then things didn't compile. I got things
>> compiling
>>>>> and then made a few changes to StructuredTranslation.
>>>>>> 
>>>>>> The biggest change I hope doesn't create problems is that I simplified
>>>>> StructuredTranslation to no longer contain the Hypergraph object;
>> instead,
>>>>> it contains a DerivationState object. This represents a particular
>> k-best
>>>>> derivation, using Huang & Chiang (2005)-style ranked back pointers. The
>>>>> nice thing is that you can simply define a DerivationVisitor class
>> and
>>>>> pass it to DerivationState::visit, and it will see every node in a
>>>>> particular derivation.
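
A minimal sketch of that visitor arrangement, assuming a much simplified derivation node: SketchState, SketchVisitor, and BracketPrinter are hypothetical names, and only the before/after hooks and the depth-first order are modeled on what the commit's diff shows.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical visitor interface with hooks called before and after a node's children. */
interface SketchVisitor {
  void before(SketchState state, int level);
  void after(SketchState state, int level);
}

/** Hypothetical stand-in for a derivation state: a node plus its chosen child derivations. */
final class SketchState {
  final String label;
  final List<SketchState> children = new ArrayList<>();

  SketchState(String label, SketchState... kids) {
    this.label = label;
    for (SketchState kid : kids) {
      children.add(kid);
    }
  }

  /** Depth-first walk over exactly this derivation; returns the visitor for chaining. */
  SketchVisitor visit(SketchVisitor visitor) {
    return visit(visitor, 0);
  }

  SketchVisitor visit(SketchVisitor visitor, int level) {
    visitor.before(this, level);
    for (SketchState child : children) {
      child.visit(visitor, level + 1);
    }
    visitor.after(this, level);
    return visitor;
  }
}

/** Example visitor: renders the derivation as a bracketed tree. */
final class BracketPrinter implements SketchVisitor {
  private final StringBuilder sb = new StringBuilder();

  @Override
  public void before(SketchState state, int level) {
    if (sb.length() > 0) sb.append(' ');
    if (state.children.isEmpty()) {
      sb.append(state.label);                  // leaf: just the word
    } else {
      sb.append('(').append(state.label);      // internal node: open a bracket
    }
  }

  @Override
  public void after(SketchState state, int level) {
    if (!state.children.isEmpty()) sb.append(')');
  }

  @Override
  public String toString() {
    return sb.toString();
  }
}

final class VisitorDemo {
  public static void main(String[] args) {
    SketchState root = new SketchState("S",
        new SketchState("NP", new SketchState("the"), new SketchState("cat")),
        new SketchState("VP", new SketchState("sat")));
    System.out.println(root.visit(new BracketPrinter()));
  }
}
```

Because the walk follows only the children recorded for this one derivation, the visitor never sees the alternatives that remain in the full hypergraph.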
>>>>>> 
>>>>>> This is distinct from WalkerFunction, which walks an entire
>> *HyperGraph*.
>>>>>> 
>>>>>> Let me know what you guys think about these changes, and maybe we can
>>>>> spec out the API, and then clean things up inside a bit to use it
>> (there's
>>>>> no reason to be passing output stream writers to KBestExtractor, for
>>>>> example...).
>>>>>> 
>>>>>> matt
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> Begin forwarded message:
>>>>>>> 
>>>>>>> From: mjp...@apache.org
>>>>>>> Subject: incubator-joshua git commit: Simplified
>> StructuredTranslation
>>>>> to use derivations instead of hypergraphs, now using in KBestExtractor
>>>>>>> Date: April 23, 2016 at 7:12:19 PM EDT
>>>>>>> To: comm...@joshua.incubator.apache.org
>>>>>>> Reply-To: dev@joshua.incubator.apache.org
>>>>>>> 
>>>>>>> Repository: incubator-joshua
>>>>>>> Updated Branches:
>>>>>>> refs/heads/joshua_api [created] 824319561
>>>>>>> 
>>>>>>> 
>>>>>>> Simplified StructuredTranslation to use derivations instead of
>>>>> hypergraphs, now using in KBestExtractor
>>>>>>> 
>>>>>>> The StructuredTranslation object is a great idea. I rewrote it here
>> to
>>>>> do the following:
>>>>>>> 
>>>>>>> - It now compiles. I'm not sure why it was tucked under
>>>>> $JOSHUA/joshua-6, but I just noticed this, and when I brought it in, it
>>>>> didn't work
>>>>>>> -  I rewrote it to be based on a single (k-best) derivation, instead
>> of
>>>>> knowing about the whole hypergraph. We should also build a more general
>>>>> object that knows about all the StructuredTranslation objects (maybe
>> with
>>>>> some renaming)
>>>>>>> -  I changed it to have an option to only compute each of the items
>>>>> (e.g., features) if it was requested. The non-lazy version remains the
>>>>> default.
>>>>>>> -  KBestExtractor now uses these. This is the first step to making a
>>>>> proper API. My thinking is that a large object (maybe Translation?)
>> will
>>>>> contain the k-best extractor and can return StructuredTranslation
>> objects
>>>>> as requested (again, we may want to jiggle the names a bit)
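
The compute-on-request behaviour described in the list above can be sketched like this. It is a simplified illustration, not the actual StructuredTranslation: LazyTranslationSketch is a hypothetical class, the Supplier stands in for a derivation, and the extractions counter exists only to make the laziness observable.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of fields that are computed on first request and then cached. */
final class LazyTranslationSketch {
  private final Supplier<String> hypothesisSource; // stands in for a derivation (assumed)
  private String translationString = null;
  private List<String> translationTokens = null;
  int extractions = 0;                             // how often the source was consulted

  /** When 'now' is true everything is computed up front; otherwise on demand. */
  LazyTranslationSketch(Supplier<String> hypothesisSource, boolean now) {
    this.hypothesisSource = hypothesisSource;
    if (now) {
      getTranslationTokens();                      // also fills translationString
    }
  }

  String getTranslationString() {
    if (translationString == null) {
      translationString = hypothesisSource.get();
      extractions++;
    }
    return translationString;
  }

  List<String> getTranslationTokens() {
    if (translationTokens == null) {
      String trans = getTranslationString();
      translationTokens = trans.isEmpty()
          ? Collections.<String>emptyList()
          : Arrays.asList(trans.split("\\s+"));
    }
    return translationTokens;
  }
}

final class LazyTranslationDemo {
  public static void main(String[] args) {
    LazyTranslationSketch lazy = new LazyTranslationSketch(() -> "a small test", false);
    System.out.println(lazy.extractions);              // nothing computed yet
    System.out.println(lazy.getTranslationTokens());   // computed here, then cached
    System.out.println(lazy.extractions);
  }
}
```

Each getter checks its cached field first, so repeated calls do not repeat the work, and the eager constructor path simply calls the same getters.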
>>>>>>> 
>>>>>>> 
>>>>>>> Project:
>> http://git-wip-us.apache.org/repos/asf/incubator-joshua/repo
>>>>>>> Commit:
>>>>> 
>> http://git-wip-us.apache.org/repos/asf/incubator-joshua/commit/82431956
>>>>>>> Tree:
>>>>> http://git-wip-us.apache.org/repos/asf/incubator-joshua/tree/82431956
>>>>>>> Diff:
>>>>> http://git-wip-us.apache.org/repos/asf/incubator-joshua/diff/82431956
>>>>>>> 
>>>>>>> Branch: refs/heads/joshua_api
>>>>>>> Commit: 8243195611a17e0ef067ec7dbf6c4a57612d041b
>>>>>>> Parents: bc83a1a
>>>>>>> Author: Matt Post <p...@cs.jhu.edu>
>>>>>>> Authored: Sat Apr 23 19:12:12 2016 -0400
>>>>>>> Committer: Matt Post <p...@cs.jhu.edu>
>>>>>>> Committed: Sat Apr 23 19:12:12 2016 -0400
>>>>>>> 
>>>>>>> 
>> ----------------------------------------------------------------------
>>>>>>> src/joshua/decoder/StructuredTranslation.java   | 144
>>>>> ++++++++++---------
>>>>>>> .../decoder/hypergraph/KBestExtractor.java      |  47 +++---
>>>>>>> 2 files changed, 98 insertions(+), 93 deletions(-)
>>>>>>> 
>> ----------------------------------------------------------------------
>>>>> 
>> http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/82431956/src/joshua/decoder/StructuredTranslation.java
>>>>>>> 
>> ----------------------------------------------------------------------
>>>>>>> diff --git a/src/joshua/decoder/StructuredTranslation.java
>>>>> b/src/joshua/decoder/StructuredTranslation.java
>>>>>>> index 1939ea0..e3018b4 100644
>>>>>>> --- a/src/joshua/decoder/StructuredTranslation.java
>>>>>>> +++ b/src/joshua/decoder/StructuredTranslation.java
>>>>>>> @@ -10,7 +10,10 @@ import java.util.List;
>>>>>>> import java.util.Map;
>>>>>>> 
>>>>>>> import joshua.decoder.ff.FeatureFunction;
>>>>>>> +import joshua.decoder.ff.FeatureVector;
>>>>>>> import joshua.decoder.hypergraph.HyperGraph;
>>>>>>> +import joshua.decoder.hypergraph.KBestExtractor.DerivationState;
>>>>>>> +import joshua.decoder.io.DeNormalize;
>>>>>>> import joshua.decoder.hypergraph.ViterbiFeatureVectorWalkerFunction;
>>>>>>> import joshua.decoder.hypergraph.ViterbiOutputStringWalkerFunction;
>>>>>>> import joshua.decoder.hypergraph.WalkerFunction;
>>>>>>> @@ -30,77 +33,51 @@ import joshua.decoder.segment_file.Sentence;
>>>>>>> public class StructuredTranslation {
>>>>>>> 
>>>>>>> private final Sentence sourceSentence;
>>>>>>> -  private final List<FeatureFunction> featureFunctions;
>>>>>>> +  private final DerivationState derivationRoot;
>>>>>>> +  private final JoshuaConfiguration joshuaConfiguration;
>>>>>>> 
>>>>>>> -  private final String translationString;
>>>>>>> -  private final List<String> translationTokens;
>>>>>>> -  private final float translationScore;
>>>>>>> -  private List<List<Integer>> translationWordAlignments;
>>>>>>> -  private Map<String,Float> translationFeatures;
>>>>>>> -  private final float extractionTime;
>>>>>>> +  private String translationString = null;
>>>>>>> +  private List<String> translationTokens = null;
>>>>>>> +  private String translationWordAlignments = null;
>>>>>>> +  private FeatureVector translationFeatures = null;
>>>>>>> +  private float extractionTime = 0.0f;
>>>>>>> +  private float translationScore = 0.0f;
>>>>>>> 
>>>>>>> +  /* If we need to replay the features, this will get set to true,
>> so
>>>>> that it's only done once */
>>>>>>> +  private boolean featuresReplayed = false;
>>>>>>> +
>>>>>>> public StructuredTranslation(final Sentence sourceSentence,
>>>>>>> -      final HyperGraph hypergraph,
>>>>>>> -      final List<FeatureFunction> featureFunctions) {
>>>>>>> -
>>>>>>> -      final long startTime = System.currentTimeMillis();
>>>>>>> -
>>>>>>> -      this.sourceSentence = sourceSentence;
>>>>>>> -      this.featureFunctions = featureFunctions;
>>>>>>> -      this.translationString = extractViterbiString(hypergraph);
>>>>>>> -      this.translationTokens = extractTranslationTokens();
>>>>>>> -      this.translationScore = extractTranslationScore(hypergraph);
>>>>>>> -      this.translationFeatures = extractViterbiFeatures(hypergraph);
>>>>>>> -      this.translationWordAlignments =
>>>>> extractViterbiWordAlignment(hypergraph);
>>>>>>> -      this.extractionTime = (System.currentTimeMillis() -
>> startTime) /
>>>>> 1000.0f;
>>>>>>> -  }
>>>>>>> -
>>>>>>> -  private Map<String,Float> extractViterbiFeatures(final HyperGraph
>>>>> hypergraph) {
>>>>>>> -    if (hypergraph == null) {
>>>>>>> -      return emptyMap();
>>>>>>> -    } else {
>>>>>>> -      ViterbiFeatureVectorWalkerFunction viterbiFeatureVectorWalker
>> =
>>>>> new ViterbiFeatureVectorWalkerFunction(featureFunctions,
>> sourceSentence);
>>>>>>> -      walk(hypergraph.goalNode, viterbiFeatureVectorWalker);
>>>>>>> -      return new
>>>>> HashMap<String,Float>(viterbiFeatureVectorWalker.getFeaturesMap());
>>>>>>> -    }
>>>>>>> -  }
>>>>>>> +      final DerivationState derivationRoot,
>>>>>>> +      JoshuaConfiguration config) {
>>>>>>> 
>>>>>>> -  private List<List<Integer>> extractViterbiWordAlignment(final
>>>>> HyperGraph hypergraph) {
>>>>>>> -    if (hypergraph == null) {
>>>>>>> -      return emptyList();
>>>>>>> -    } else {
>>>>>>> -      final WordAlignmentExtractor wordAlignmentWalker = new
>>>>> WordAlignmentExtractor();
>>>>>>> -      walk(hypergraph.goalNode, wordAlignmentWalker);
>>>>>>> -      return wordAlignmentWalker.getFinalWordAlignments();
>>>>>>> -    }
>>>>>>> -  }
>>>>>>> -
>>>>>>> -  private float extractTranslationScore(final HyperGraph
>> hypergraph) {
>>>>>>> -    if (hypergraph == null) {
>>>>>>> -      return 0;
>>>>>>> -    } else {
>>>>>>> -      return hypergraph.goalNode.getScore();
>>>>>>> -    }
>>>>>>> -  }
>>>>>>> -
>>>>>>> -  private String extractViterbiString(final HyperGraph hypergraph) {
>>>>>>> -    if (hypergraph == null) {
>>>>>>> -      return sourceSentence.source();
>>>>>>> -    } else {
>>>>>>> -      final WalkerFunction viterbiOutputStringWalker = new
>>>>> ViterbiOutputStringWalkerFunction();
>>>>>>> -      walk(hypergraph.goalNode, viterbiOutputStringWalker);
>>>>>>> -      return viterbiOutputStringWalker.toString();
>>>>>>> -    }
>>>>>>> +    this(sourceSentence, derivationRoot, config, true);
>>>>>>> }
>>>>>>> +
>>>>>>> 
>>>>>>> -  private List<String> extractTranslationTokens() {
>>>>>>> -    if (translationString.isEmpty()) {
>>>>>>> -      return emptyList();
>>>>>>> -    } else {
>>>>>>> -      return asList(translationString.split("\\s+"));
>>>>>>> +  public StructuredTranslation(final Sentence sourceSentence,
>>>>>>> +      final DerivationState derivationRoot,
>>>>>>> +      JoshuaConfiguration config,
>>>>>>> +      boolean now) {
>>>>>>> +
>>>>>>> +    final long startTime = System.currentTimeMillis();
>>>>>>> +
>>>>>>> +    this.sourceSentence = sourceSentence;
>>>>>>> +    this.derivationRoot = derivationRoot;
>>>>>>> +    this.joshuaConfiguration = config;
>>>>>>> +
>>>>>>> +    if (now) {
>>>>>>> +      getTranslationString();
>>>>>>> +      getTranslationTokens();
>>>>>>> +      getTranslationScore();
>>>>>>> +      getTranslationFeatures();
>>>>>>> +      getTranslationWordAlignments();
>>>>>>> }
>>>>>>> +    this.translationScore = getTranslationScore();
>>>>>>> +
>>>>>>> +    this.extractionTime = (System.currentTimeMillis() - startTime) /
>>>>> 1000.0f;
>>>>>>> }
>>>>>>> 
>>>>>>> +
>>>>>>> // Getters to use upstream
>>>>>>> 
>>>>>>> public Sentence getSourceSentence() {
>>>>>>> @@ -112,25 +89,60 @@ public class StructuredTranslation {
>>>>>>> }
>>>>>>> 
>>>>>>> public String getTranslationString() {
>>>>>>> -    return translationString;
>>>>>>> +    if (this.translationString == null) {
>>>>>>> +      if (derivationRoot == null) {
>>>>>>> +        this.translationString = sourceSentence.source();
>>>>>>> +      } else {
>>>>>>> +        this.translationString = derivationRoot.getHypothesis();
>>>>>>> +      }
>>>>>>> +    }
>>>>>>> +    return this.translationString;
>>>>>>> }
>>>>>>> 
>>>>>>> public List<String> getTranslationTokens() {
>>>>>>> +    if (this.translationTokens == null) {
>>>>>>> +      String trans = getTranslationString();
>>>>>>> +      if (trans.isEmpty()) {
>>>>>>> +        this.translationTokens = emptyList();
>>>>>>> +      } else {
>>>>>>> +        this.translationTokens = asList(trans.split("\\s+"));
>>>>>>> +      }
>>>>>>> +    }
>>>>>>> +
>>>>>>> return translationTokens;
>>>>>>> }
>>>>>>> 
>>>>>>> public float getTranslationScore() {
>>>>>>> +    if (derivationRoot == null) {
>>>>>>> +      this.translationScore = 0.0f;
>>>>>>> +    } else {
>>>>>>> +      this.translationScore = derivationRoot.getModelCost();
>>>>>>> +    }
>>>>>>> +
>>>>>>> return translationScore;
>>>>>>> }
>>>>>>> 
>>>>>>> /**
>>>>>>> * Returns a list of target to source alignments.
>>>>>>> */
>>>>>>> -  public List<List<Integer>> getTranslationWordAlignments() {
>>>>>>> -    return translationWordAlignments;
>>>>>>> +  public String getTranslationWordAlignments() {
>>>>>>> +    if (this.translationWordAlignments == null) {
>>>>>>> +      if (derivationRoot == null)
>>>>>>> +        this.translationWordAlignments = "";
>>>>>>> +      else {
>>>>>>> +        WordAlignmentExtractor wordAlignmentExtractor = new
>>>>> WordAlignmentExtractor();
>>>>>>> +        derivationRoot.visit(wordAlignmentExtractor);
>>>>>>> +        this.translationWordAlignments =
>>>>> wordAlignmentExtractor.toString();
>>>>>>> +      }
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    return this.translationWordAlignments;
>>>>>>> }
>>>>>>> 
>>>>>>> -  public Map<String,Float> getTranslationFeatures() {
>>>>>>> +  public FeatureVector getTranslationFeatures() {
>>>>>>> +    if (this.translationFeatures == null)
>>>>>>> +      this.translationFeatures = derivationRoot.replayFeatures();
>>>>>>> +
>>>>>>> return translationFeatures;
>>>>>>> }
>>>>> 
>> http://git-wip-us.apache.org/repos/asf/incubator-joshua/blob/82431956/src/joshua/decoder/hypergraph/KBestExtractor.java
>>>>>>> 
>> ----------------------------------------------------------------------
>>>>>>> diff --git a/src/joshua/decoder/hypergraph/KBestExtractor.java
>>>>> b/src/joshua/decoder/hypergraph/KBestExtractor.java
>>>>>>> index 42539cc..ea6ca73 100644
>>>>>>> --- a/src/joshua/decoder/hypergraph/KBestExtractor.java
>>>>>>> +++ b/src/joshua/decoder/hypergraph/KBestExtractor.java
>>>>>>> @@ -34,6 +34,7 @@ import java.util.regex.Matcher;
>>>>>>> import joshua.corpus.Vocabulary;
>>>>>>> import joshua.decoder.BLEU;
>>>>>>> import joshua.decoder.JoshuaConfiguration;
>>>>>>> +import joshua.decoder.StructuredTranslation;
>>>>>>> import joshua.decoder.chart_parser.ComputeNodeResult;
>>>>>>> import joshua.decoder.ff.FeatureFunction;
>>>>>>> import joshua.decoder.ff.FeatureVector;
>>>>>>> @@ -167,33 +168,25 @@ public class KBestExtractor {
>>>>>>> // Determine the k-best hypotheses at each HGNode
>>>>>>> VirtualNode virtualNode = getVirtualNode(node);
>>>>>>> DerivationState derivationState =
>>>>> virtualNode.lazyKBestExtractOnNode(this, k);
>>>>>>> +
>>>>>>> //    DerivationState derivationState = getKthDerivation(node, k);
>>>>>>> if (derivationState != null) {
>>>>>>> -      // ==== read the kbest from each hgnode and convert to output
>>>>> format
>>>>>>> -      FeatureVector features = new FeatureVector();
>>>>>>> 
>>>>>>> -      /*
>>>>>>> -       * To save space, the decoder only stores the model cost, no
>> the
>>>>> individual feature values. If
>>>>>>> -       * you want to output them, you have to replay them.
>>>>>>> -       */
>>>>>>> -      String hypothesis = null;
>>>>>>> -      if (joshuaConfiguration.outputFormat.contains("%f")
>>>>>>> -          || joshuaConfiguration.outputFormat.contains("%d"))
>>>>>>> -        features = derivationState.replayFeatures();
>>>>>>> -
>>>>>>> -      hypothesis = derivationState.getHypothesis()
>>>>>>> +      StructuredTranslation translation = new StructuredTranslation(
>>>>>>> +          sentence, derivationState, joshuaConfiguration);
>>>>>>> +
>>>>>>> +      String hypothesis = translation.getTranslationString()
>>>>>>>       .replaceAll("-lsb-", "[")
>>>>>>>       .replaceAll("-rsb-", "]")
>>>>>>>       .replaceAll("-pipe-", "|");
>>>>>>> 
>>>>>>> -
>>>>>>>   outputString = joshuaConfiguration.outputFormat
>>>>>>>       .replace("%k", Integer.toString(k))
>>>>>>>       .replace("%s", hypothesis)
>>>>>>>       .replace("%S", DeNormalize.processSingleLine(hypothesis))
>>>>>>>       .replace("%i", Integer.toString(sentence.id()))
>>>>>>> -          .replace("%f", joshuaConfiguration.moses ?
>>>>> features.mosesString() : features.toString())
>>>>>>> -          .replace("%c", String.format("%.3f",
>> derivationState.cost));
>>>>>>> +          .replace("%f", joshuaConfiguration.moses ?
>>>>> translation.getTranslationFeatures().mosesString() :
>>>>> translation.getTranslationFeatures().toString())
>>>>>>> +          .replace("%c", String.format("%.3f",
>>>>> translation.getTranslationScore()));
>>>>>>> 
>>>>>>>   if (joshuaConfiguration.outputFormat.contains("%t")) {
>>>>>>>     outputString = outputString.replace("%t",
>>>>> derivationState.getTree());
>>>>>>> @@ -250,11 +243,11 @@ public class KBestExtractor {
>>>>>>>   return;
>>>>>>> 
>>>>>>> for (int k = 1; k <= topN; k++) {
>>>>>>> -      String hypStr = getKthHyp(hg.goalNode, k);
>>>>>>> -      if (null == hypStr)
>>>>>>> +      String translation = getKthHyp(hg.goalNode, k);
>>>>>>> +      if (null == translation)
>>>>>>>     break;
>>>>>>> 
>>>>>>> -      out.write(hypStr);
>>>>>>> +      out.write(translation);
>>>>>>>   out.write("\n");
>>>>>>>   out.flush();
>>>>>>> }
>>>>>>> @@ -704,11 +697,11 @@ public class KBestExtractor {
>>>>>>> /**
>>>>>>>  * Visits every state in the derivation in a depth-first order.
>>>>>>>  */
>>>>>>> -    private DerivationVisitor visit(DerivationVisitor visitor) {
>>>>>>> +    public DerivationVisitor visit(DerivationVisitor visitor) {
>>>>>>>   return visit(visitor, 0);
>>>>>>> }
>>>>>>> 
>>>>>>> -    private DerivationVisitor visit(DerivationVisitor visitor, int
>>>>> indent) {
>>>>>>> +    public DerivationVisitor visit(DerivationVisitor visitor, int
>>>>> indent) {
>>>>>>> 
>>>>>>>   visitor.before(this, indent);
>>>>>>> 
>>>>>>> @@ -733,25 +726,25 @@ public class KBestExtractor {
>>>>>>>   return visitor;
>>>>>>> }
>>>>>>> 
>>>>>>> -    private String getHypothesis() {
>>>>>>> +    public String getHypothesis() {
>>>>>>>   return getHypothesis(defaultSide);
>>>>>>> }
>>>>>>> 
>>>>>>> -    private String getTree() {
>>>>>>> +    public String getTree() {
>>>>>>>   return visit(new TreeExtractor()).toString();
>>>>>>> }
>>>>>>> 
>>>>>>> -    private String getHypothesis(Side side) {
>>>>>>> +    public String getHypothesis(Side side) {
>>>>>>>   return visit(new HypothesisExtractor(side)).toString();
>>>>>>> }
>>>>>>> 
>>>>>>> -    private FeatureVector replayFeatures() {
>>>>>>> +    public FeatureVector replayFeatures() {
>>>>>>>   FeatureReplayer fp = new FeatureReplayer();
>>>>>>>   visit(fp);
>>>>>>>   return fp.getFeatures();
>>>>>>> }
>>>>>>> 
>>>>>>> -    private String getDerivation() {
>>>>>>> +    public String getDerivation() {
>>>>>>>   return visit(new DerivationExtractor()).toString();
>>>>>>> }
>>>>>>> 
>>>>>>> @@ -811,7 +804,7 @@ public class KBestExtractor {
>>>>>>>  */
>>>>>>> void after(DerivationState state, int level);
>>>>>>> }
>>>>>>> -
>>>>>>> +
>>>>>>> /**
>>>>>>> * Extracts the hypothesis from the leaves of the tree using the
>>>>> generic (depth-first) visitor.
>>>>>>> * Since we're using the visitor, we can't just print out the words as
>>>>> we see them. We have to
>>>>>>> @@ -878,7 +871,7 @@ public class KBestExtractor {
>>>>>>>   return outputs.pop().replaceAll("<s> ", "").replace(" </s>", "");
>>>>>>> }
>>>>>>> }
>>>>>>> -
>>>>>>> +
>>>>>>> /**
>>>>>>> * Assembles a Penn treebank format tree for a given derivation.
>>>>>>> */
>>>>> 
>>>>> 
>>> Amazon Development Center Germany GmbH
>>> Berlin - Dresden - Aachen
>>> main office: Krausenstr. 38, 10117 Berlin
>>> Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
>>> Ust-ID: DE289237879
>>> Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
>>> 
>> 
>>
