[ 
https://issues.apache.org/jira/browse/OPENNLP-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733894#comment-17733894
 ] 

ASF GitHub Bot commented on OPENNLP-1174:
-----------------------------------------

rzo1 commented on code in PR #541:
URL: https://github.com/apache/opennlp/pull/541#discussion_r1233251530


##########
opennlp-tools/src/main/java/opennlp/tools/util/featuregen/GeneratorFactory.java:
##########
@@ -95,25 +87,235 @@
 public class GeneratorFactory {
 
   /**
-   * The {@link XmlFeatureGeneratorFactory} is responsible to construct
-   * an {@link AdaptiveFeatureGenerator} from an given XML {@link Element}
-   * which contains all necessary configuration if any.
+   * Creates an {@link AdaptiveFeatureGenerator} from an provided XML 
descriptor.
+   * <p>
+   * Usually this XML descriptor contains a set of nested feature generators
+   * which are then used to generate the features by one of the opennlp
+   * components.
+   *
+   * @param xmlDescriptorIn the {@link InputStream} from which the descriptor
+   *                        is read, the stream remains open and must be 
closed by the caller.
+   * @param resourceManager the resource manager which is used to resolve 
resources
+   *                        referenced by a key in the descriptor
+   * @return created feature generators
+   * @throws IOException if an error occurs during reading from the descriptor
+   *                     {@link InputStream}
    */
-  @Deprecated // TODO: (OPENNLP-1174) just remove when back-compat is no 
longer needed
-  interface XmlFeatureGeneratorFactory {
+  public static AdaptiveFeatureGenerator create(InputStream xmlDescriptorIn,
+                                                
FeatureGeneratorResourceProvider resourceManager)
+      throws IOException {
 
-    /**
-     * Creates an {@link AdaptiveFeatureGenerator} from a the describing
-     * XML element.
-     *
-     * @param generatorElement the element which contains the configuration
-     * @param resourceManager the resource manager which could be used
-     *     to access referenced resources
-     *
-     * @return the configured {@link AdaptiveFeatureGenerator}
-     */
-    AdaptiveFeatureGenerator create(Element generatorElement,
-        FeatureGeneratorResourceProvider resourceManager) throws 
InvalidFormatException;
+    final org.w3c.dom.Document xmlDescriptorDOM = createDOM(xmlDescriptorIn);
+    final Element generatorElement = xmlDescriptorDOM.getDocumentElement();
+
+    return createGenerator(generatorElement, resourceManager);
+  }
+
+  /**
+   * Creates a {@link AdaptiveFeatureGenerator} for the provided element.
+   * To accomplish this it looks up the corresponding factory by the
+   * element tag name. The factory is then responsible for the creation
+   * of the generator from the element.
+   *
+   * @param generatorElement must not be {@code null}
+   * @param resourceManager  may be {@code null}
+   * @return a {@link AdaptiveFeatureGenerator}
+   */
+  private static AdaptiveFeatureGenerator buildGenerator(Element 
generatorElement,
+                                                         
FeatureGeneratorResourceProvider resourceManager)
+      throws InvalidFormatException {
+
+    if (generatorElement == null) {
+      throw new IllegalArgumentException("generatorElement must not be NULL");
+    }
+
+    final String className = generatorElement.getAttribute("class");
+    if (className.isBlank()) {
+      throw new InvalidFormatException("generator must have class attribute");
+    } else {
+      try {
+        final Class<?> factoryClass = Class.forName(className);
+        try {
+          final Constructor<?> constructor = factoryClass.getConstructor();
+          final AbstractXmlFeatureGeneratorFactory factory =
+              (AbstractXmlFeatureGeneratorFactory) constructor.newInstance();
+          factory.init(generatorElement, resourceManager);
+          return factory.create();
+        } catch (NoSuchMethodException | InvocationTargetException | 
InstantiationException
+                 | IllegalAccessException e) {
+          throw new RuntimeException(e);
+        }
+      } catch (ClassNotFoundException e) {
+        throw new RuntimeException(e);
+      }
+    }
+  }
+
+  /**
+   * Creates a {@link AdaptiveFeatureGenerator} for the provided element.
+   * To accomplish this it looks up the corresponding factory by the
+   * element tag name. The factory is then responsible for the creation
+   * of the generator from the element.
+   *
+   * @param generatorElement must not be {@code null}
+   * @param resourceManager  may be {@code null}
+   * @return a {@link AdaptiveFeatureGenerator}
+   */
+  private static AdaptiveFeatureGenerator createGenerator(Element 
generatorElement,
+                                                          
FeatureGeneratorResourceProvider resourceManager)
+      throws InvalidFormatException {
+
+    if (generatorElement == null) {
+      throw new IllegalArgumentException("generatorElement must not be NULL");
+    }
+
+    final String elementName = generatorElement.getTagName();
+
+    // check it is new format?
+    if ("featureGenerators".equals(elementName)) {
+
+      final List<AdaptiveFeatureGenerator> generators = new ArrayList<>();
+      final NodeList childNodes = generatorElement.getChildNodes();
+      for (int i = 0; i < childNodes.getLength(); i++) {
+        final Node childNode = childNodes.item(i);
+        if (childNode instanceof Element elem) {
+          final String type = elem.getTagName();
+          if ("generator".equals(type)) {
+            generators.add(buildGenerator(elem, resourceManager));
+          } else {
+            throw new InvalidFormatException("Unexpected element: " + 
elementName);
+          }
+        }
+      }
+
+      AdaptiveFeatureGenerator featureGenerator;
+      if (generators.size() == 1) {
+        featureGenerator = generators.get(0);
+      } else if (generators.size() > 1) {
+        featureGenerator = new AggregatedFeatureGenerator(generators.toArray(
+            new AdaptiveFeatureGenerator[0]));
+      } else {
+        throw new InvalidFormatException("featureGenerators must have one or 
more generators");
+      }
+
+      // disallow manually specifying CachedFeatureGenerator
+      if (featureGenerator instanceof CachedFeatureGenerator) {
+        throw new InvalidFormatException("CachedFeatureGeneratorFactory cannot 
be specified manually." +
+            "Use cache=\"true\" attribute in featureGenerators element 
instead.");
+      }
+
+      // check cache usage
+      if (Boolean.parseBoolean(generatorElement.getAttribute("cache"))) {
+        return new CachedFeatureGenerator(featureGenerator);
+      } else {
+        return featureGenerator;
+      }
+    } else {
+      throw new IllegalArgumentException(
+          "[OPENNLP-1174] - Classic configuration format is no longer 
supported!");
+    }
+  }
+
+  private static org.w3c.dom.Document createDOM(InputStream xmlDescriptorIn)
+      throws IOException {
+
+    final DocumentBuilder documentBuilder = XmlUtil.createDocumentBuilder();
+
+    org.w3c.dom.Document xmlDescriptorDOM;
+
+    try {
+      xmlDescriptorDOM = documentBuilder.parse(xmlDescriptorIn);
+    } catch (SAXException e) {
+      throw new InvalidFormatException("Descriptor is not valid XML!", e);
+    }
+    return xmlDescriptorDOM;
+  }
+
+  public static Map<String, ArtifactSerializer<?>> 
extractArtifactSerializerMappings(
+      InputStream xmlDescriptorIn) throws IOException {
+
+    final org.w3c.dom.Document xmlDescriptorDOM = createDOM(xmlDescriptorIn);
+    final Element element = xmlDescriptorDOM.getDocumentElement();
+
+    final String elementName = element.getTagName();
+
+    // check it is new format?
+    if ("featureGenerators".equals(elementName)) {
+      return addMappingsFromXmlChildren(element.getChildNodes(), new 
HashMap<>());
+    } else {
+      throw new IllegalArgumentException(
+          "[OPENNLP-1174] - Classic configuration format is no longer 
supported!");
+    }
+  }
+
+  private static void extractArtifactSerializerMappings(
+      Map<String, ArtifactSerializer<?>> mapping, Element element) {
+    final String className = element.getAttribute("class");
+    if (!className.isBlank()) {
+      try {
+        final Class<?> factoryClass = Class.forName(className);
+        try {
+          final Constructor<?> constructor = factoryClass.getConstructor();
+          final AbstractXmlFeatureGeneratorFactory factory =
+              (AbstractXmlFeatureGeneratorFactory) constructor.newInstance();
+          factory.init(element, null);
+          final Map<String, ArtifactSerializer<?>> map = 
factory.getArtifactSerializerMapping();
+          if (map != null) {
+            mapping.putAll(map);
+          }
+        } catch (NoSuchMethodException | InvocationTargetException | 
InstantiationException
+                 | IllegalAccessException e) {
+          throw new RuntimeException(e);
+        } catch (InvalidFormatException ignored) {
+        }
+      } catch (ClassNotFoundException e) {
+        throw new RuntimeException(e);
+      }
+    }
+
+    addMappingsFromXmlChildren(element.getChildNodes(), mapping);
+  }
+
+  private static Map<String, ArtifactSerializer<?>> addMappingsFromXmlChildren(
+      final NodeList nodes, final Map<String, ArtifactSerializer<?>> mapping) {
+    for (int i = 0; i < nodes.getLength(); i++) {
+      if (nodes.item(i) instanceof Element childElem) {
+        if ("generator".equals(childElem.getTagName())) {
+          extractArtifactSerializerMappings(mapping, childElem);
+        }
+      }
+    }
+    return mapping;
+  }
+
+  /**
+   * Provides a list with all the elements in the xml feature descriptor.
+   *
+   * @param xmlDescriptorIn the xml feature descriptor
+   * @return a list containing all elements
+   * @throws IOException            if inputstream cannot be open
+   * @throws InvalidFormatException if xml is not well-formed
+   */
+  public static List<Element> getDescriptorElements(InputStream 
xmlDescriptorIn)
+      throws IOException {
+
+    final List<Element> elements = new ArrayList<>();
+    final org.w3c.dom.Document xmlDescriptorDOM = createDOM(xmlDescriptorIn);
+    final XPath xPath = XPathFactory.newInstance().newXPath();
+    NodeList allElements;
+    try {
+      final XPathExpression exp = xPath.compile("//*");
+      allElements = (NodeList) 
exp.evaluate(xmlDescriptorDOM.getDocumentElement(), XPathConstants.NODESET);
+    } catch (XPathExpressionException e) {
+      throw new IllegalStateException("The hard coded XPath expression should 
always be valid!");

Review Comment:
   Should be `InvalidFormatException` here, which is already documented in the 
JavaDoc but code was misbehaving. Changed to `InvalidFormatException`





> remove classic format support in feature generator XML config when it is no 
> longer needed
> -----------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-1174
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1174
>             Project: OpenNLP
>          Issue Type: Planned Work
>            Reporter: Koji Sekiguchi
>            Assignee: Richard Zowalla
>            Priority: Minor
>
> I put many "TODO" marks in the patch for OPENNLP-1154, e.g.:
> {code:java}
> @Deprecated // TODO: just remove when back-compat is no longer needed
> static void register(Map<String, GeneratorFactory.XmlFeatureGeneratorFactory> 
> factoryMap) {
>   factoryMap.put("generators", new AggregatedFeatureGeneratorFactory());
> }
> {code}
> Before merging it, I'd like to create this ticket to get ticket number and 
> add the number in the patch for OPENNLP-1154 so that we can easily find 
> unnecessary codes when we want to remove in the future..



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to