Hi,

Can you share an example document which shows the behavior?
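
In the meantime, it may help to check which code points actually end up in the text. What follows is only a rough sketch, not something taken from POI. It assumes (unconfirmed for your file) that the symbol was inserted from a symbol font such as Symbol or Wingdings; as far as I know, Word tends to store such glyphs as private-use code points rather than as the real Unicode character U+2265. The class name CodePointDump and the sample string are made up for illustration; you would pass in the text coming out of your own document instead.

```java
import java.util.StringJoiner;

public class CodePointDump {

    // True for any Unicode private-use code point (general category Co).
    static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    // Lists every non-ASCII code point in the text, one per line,
    // and marks the ones that fall into a private-use area.
    static String describe(String text) {
        StringJoiner lines = new StringJoiner("\n");
        text.codePoints()
            .filter(cp -> cp > 0x7F)
            .forEach(cp -> lines.add(String.format("U+%04X %s%s",
                    cp,
                    Character.getName(cp),
                    isPrivateUse(cp) ? " (private use)" : "")));
        return lines.toString();
    }

    public static void main(String[] args) {
        // Hypothetical input: a real check would use text extracted from the document.
        System.out.println(describe("x \u2265 5 and y \uF0B3 7"));
    }
}
```

If the dump shows private-use code points where you expect the symbol, the character is probably font-specific and would need to be mapped to its Unicode equivalent by hand.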

Thanks... Dominik.


On Sun, Oct 6, 2019 at 6:48 AM Teresa Kim
<teresa....@linguamatics.com.invalid> wrote:

> Hi
>
>
> I have documents (either 'doc' or 'docx') that contain a special character
> for 'greater than or equal to'. When I convert them with
> 'WordToHtmlConverter' using the code below, those characters come out as '('.
>
> I tried with the latest Apache POI release, 4.1.0.
>
>
> My Java code is:
>
>
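> import java.io.ByteArrayOutputStream;
> import java.io.FileInputStream;
> import java.nio.charset.StandardCharsets;
>
> import javax.xml.parsers.DocumentBuilderFactory;
> import javax.xml.transform.OutputKeys;
> import javax.xml.transform.Transformer;
> import javax.xml.transform.TransformerFactory;
> import javax.xml.transform.dom.DOMSource;
> import javax.xml.transform.stream.StreamResult;
>
> import org.apache.poi.hwpf.HWPFDocumentCore;
> import org.apache.poi.hwpf.converter.WordToHtmlConverter;
> import org.apache.poi.hwpf.converter.WordToHtmlUtils;
> import org.w3c.dom.Document;
>
> public class TestWordtoHtmlConverter {
>
>     public static void main(String[] args) {
>         // args[0] is the path to the Word document
>         try (FileInputStream in = new FileInputStream(args[0])) {
>             HWPFDocumentCore wordDocument = WordToHtmlUtils.loadDoc(in);
>
>             WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
>                     DocumentBuilderFactory.newInstance().newDocumentBuilder()
>                             .newDocument());
>
>             wordToHtmlConverter.processDocument(wordDocument);
>             Document htmlDocument = wordToHtmlConverter.getDocument();
>             ByteArrayOutputStream out = new ByteArrayOutputStream();
>             DOMSource domSource = new DOMSource(htmlDocument);
>             StreamResult streamResult = new StreamResult(out);
>
>             // Serialize the DOM as UTF-8 encoded HTML
>             TransformerFactory tf = TransformerFactory.newInstance();
>             Transformer serializer = tf.newTransformer();
>             serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
>             serializer.setOutputProperty(OutputKeys.INDENT, "yes");
>             serializer.setOutputProperty(OutputKeys.METHOD, "html");
>             serializer.transform(domSource, streamResult);
>
>             // Decode with the same charset the serializer wrote
>             String result = new String(out.toByteArray(), StandardCharsets.UTF_8);
>             System.out.println(result);
>         } catch (Exception e) {
>             e.printStackTrace(); // do not swallow failures silently
>         }
>     }
> }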
>
> Is there any way I can correctly identify these symbols?
>
>
> In the sample document, I am interested in getting 'bad one'.
>
>
> Thanks
>
> T.
>
>
>
>
>
