In my local copy I have these methods in the interface:
 Map<String, Double> scoreMap(String text);
 SortedMap<Double, Set<String>> sortedScoreMap(String text);

and these implementations of them in the ME impl:


  // Maps each category name to the score categorize(text) assigns it.
  public Map<String, Double> scoreMap(String text) {
    Map<String, Double> probDist = new HashMap<String, Double>();

    double[] categorize = categorize(text);
    int catSize = getNumberOfCategories();
    for (int i = 0; i < catSize; i++) {
      String category = getCategory(i);
      probDist.put(category, categorize[getIndex(category)]);
    }
    return probDist;
  }

  // Groups categories by score, highest score first. Categories that share
  // a score end up in the same set.
  public SortedMap<Double, Set<String>> sortedScoreMap(String text) {
    SortedMap<Double, Set<String>> descendingMap =
        new TreeMap<Double, Set<String>>().descendingMap();
    double[] categorize = categorize(text);
    int catSize = getNumberOfCategories();
    for (int i = 0; i < catSize; i++) {
      String category = getCategory(i);
      double score = categorize[getIndex(category)];
      if (descendingMap.containsKey(score)) {
        descendingMap.get(score).add(category);
      } else {
        Set<String> newset = new HashSet<>();
        newset.add(category);
        descendingMap.put(score, newset);
      }
    }
    return descendingMap;
  }


They are pretty simple, but if everyone agrees I can commit them (with some
Javadoc).
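
For illustration, here is a rough standalone sketch of how a caller might use
the two methods, for example to get the best category with a single call. The
class name ScoreMapSketch, the category names and the fixed scores are made up
for this example; they only stand in for the real categorize(text) and
getCategory(i), so treat it as a sketch rather than the actual implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Standalone sketch. CATEGORIES and the scores below are hypothetical
// stand-ins for what the real categorizer would return.
class ScoreMapSketch {

  static final String[] CATEGORIES = {"sports", "politics", "tech"};

  // Hypothetical stand-in for categorize(text): one score per category index.
  static double[] categorize(String text) {
    return new double[] {0.25, 0.5, 0.25};
  }

  // Category name -> score.
  static Map<String, Double> scoreMap(String text) {
    Map<String, Double> scores = new HashMap<String, Double>();
    double[] outcome = categorize(text);
    for (int i = 0; i < CATEGORIES.length; i++) {
      scores.put(CATEGORIES[i], outcome[i]);
    }
    return scores;
  }

  // Score -> categories with that score, iterated from highest to lowest.
  static SortedMap<Double, Set<String>> sortedScoreMap(String text) {
    SortedMap<Double, Set<String>> byScore =
        new TreeMap<Double, Set<String>>().descendingMap();
    double[] outcome = categorize(text);
    for (int i = 0; i < CATEGORIES.length; i++) {
      Set<String> sameScore = byScore.get(outcome[i]);
      if (sameScore == null) {
        sameScore = new HashSet<String>();
        byScore.put(outcome[i], sameScore);
      }
      sameScore.add(CATEGORIES[i]);
    }
    return byScore;
  }

  public static void main(String[] args) {
    SortedMap<Double, Set<String>> sorted = sortedScoreMap("some document text");
    // The map is in descending order, so firstKey() is the highest score.
    Double best = sorted.firstKey();
    System.out.println(best + " -> " + sorted.get(best));
    System.out.println(scoreMap("some document text"));
  }
}
```

Since the sorted map is keyed by score, ties are kept together in one set
instead of one category overwriting the other.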





On Sat, Apr 26, 2014 at 8:39 AM, Jörn Kottmann <kottm...@gmail.com> wrote:

> On Thu, 2014-04-24 at 19:54 -0300, William Colen wrote:
> > Yes, it looks nice. Maybe we should redo all the DocumentCategorizer
> > interface. It is different from other tools, for example, we can't get
> > the best category of one document with only one call, we need to use
> > two methods.
>
> Yes that is right. +1 to change it. Can we deprecate the old methods and
> just add new ones to not break backward compatibility?
>
> Jörn
>
>
