Dear Ahmed,

There are two distinct steps to consider when inserting your HTML
string into an ODF document:

   1. Transform the HTML to ODF
   2. Map the ODF string to API calls that create the new ODF content

In addition, the best approach depends on which of the following three
cases applies. The HTML you insert represents ...

a) a complete new document (unlikely from your description)
b) one or more parts of the document (and you know exactly where these
parts belong)
c) one or more parts of the document (where the parts may vary, and
sometimes only formatting is added to existing content)

Regarding 1)
If you control the HTML, this step is far easier, because you can rely on
it always being valid XHTML.
If not, I suggest adding HTML Tidy to your pipeline. A quick Google search
turned up the following:
https://www.ibm.com/developerworks/library/x-tiptidy/index.html
Mapping HTML to ODF can be done in many ways, for instance via XSLT or
with code driven by DOM or SAX events. Some existing work can be found on
the Internet that you may be able to build on.
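
To make the XSLT route a bit more concrete, here is a minimal sketch using
only the JDK's built-in XSLT processor. It is an illustration under
assumptions, not a complete mapping: the class name XhtmlToOdfSketch and the
style name "Bold" are made up for this example, the style would still have to
be defined in the target document (with fo:font-weight="bold"), and the input
is assumed to be well-formed XHTML in the XHTML namespace with a single
<body> root.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XhtmlToOdfSketch {

    // Tiny stylesheet: body -> office:text, p -> text:p, strong -> text:span.
    // "Bold" is a hypothetical style name; it must exist in the ODF document.
    private static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:h='http://www.w3.org/1999/xhtml'"
        + " xmlns:office='urn:oasis:names:tc:opendocument:xmlns:office:1.0'"
        + " xmlns:text='urn:oasis:names:tc:opendocument:xmlns:text:1.0'"
        + " exclude-result-prefixes='h'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='h:body'>"
        + "<office:text><xsl:apply-templates/></office:text>"
        + "</xsl:template>"
        + "<xsl:template match='h:p'>"
        + "<text:p><xsl:apply-templates/></text:p>"
        + "</xsl:template>"
        + "<xsl:template match='h:strong'>"
        + "<text:span text:style-name='Bold'><xsl:apply-templates/></text:span>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms a well-formed XHTML string into an ODF XML string. */
    static String toOdf(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Malformed input or stylesheet; surfaced unchecked to keep the sketch short.
            throw new IllegalStateException("XHTML to ODF transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<body xmlns='http://www.w3.org/1999/xhtml'>"
            + "<p><strong>test with Bold Description of an ESB</strong></p>"
            + "</body>";
        System.out.println(toOdf(xhtml));
    }
}
```

Elements without a template (such as <font>) fall through to XSLT's built-in
rules, so their text is kept but the formatting is lost. The string from your
mail would also need a wrapping root element and a replacement for &nbsp;
before an XML parser accepts it, which is where Tidy would come in.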

Regarding 2)
If you have scenario a) or b), with ODF XML destined for known places in
the document, you can traverse the ODF DOM via the ODFDOM API to the
destination and parse the ODF XML string into a DOM to be added there.
I could not quickly find a regression test that uses this functionality,
but even if the API does not provide it now, it could be extended to do so.
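
As a rough sketch of the "parse and add" idea, the following uses only the
standard W3C DOM classes from the JDK. It assumes the ODF content DOM can be
handled like an ordinary org.w3c.dom.Document, which I have not verified for
ODFDOM in this form; the class OdfFragmentInsertSketch and its methods are
hypothetical names, and a hand-built document stands in for the real content
DOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class OdfFragmentInsertSketch {

    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /**
     * Copies every child of the fragment's root element to the end of the
     * first office:text element in the target document.
     */
    static void appendFragment(Document target, String odfFragment) {
        Element destination =
            (Element) target.getElementsByTagNameNS(OFFICE_NS, "text").item(0);
        if (destination == null) {
            throw new IllegalArgumentException("Target has no office:text element");
        }
        Element fragmentRoot = parse(odfFragment).getDocumentElement();
        for (Node n = fragmentRoot.getFirstChild(); n != null; n = n.getNextSibling()) {
            // importNode makes a deep copy owned by the target document.
            destination.appendChild(target.importNode(n, true));
        }
    }

    public static void main(String[] args) {
        String ns = " xmlns:office='" + OFFICE_NS + "' xmlns:text='" + TEXT_NS + "'";
        // Stand-in for the real content DOM of the document.
        Document target = parse("<office:document-content" + ns + "><office:body>"
            + "<office:text><text:p>existing</text:p></office:text>"
            + "</office:body></office:document-content>");
        appendFragment(target, "<office:text" + ns + "><text:p>"
            + "<text:span text:style-name='Bold'>inserted</text:span>"
            + "</text:p></office:text>");
        int count = target.getElementsByTagNameNS(TEXT_NS, "p").getLength();
        System.out.println("paragraphs after insert: " + count);
    }
}
```

For scenario b) you would navigate to the known destination element instead
of taking the first office:text, and use insertBefore where the order
matters.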
If it is use case c), it would be wiser to transform the XHTML into ODF
change events (the names of the events might still change, but they would
be easy to adapt).

Hope I could help you,
Svante



2018-06-28 15:40 GMT+02:00 Ahmed I Ibrahim <[email protected]>:

> Dear all,
>
> I have a string that I receive in my code which its content is HTML
> formatted like the following
>
> String inputString ="<p>    <strong>test with Bold Description of an
> ESB</strong></p><p>    This is a test with &lt;wierd characters&gt;&nbsp;
> that "break" the processing.</p><p>    <font color="#E3A49C"><strong>Test
> with colors MORNING</strong></font></p><br />"
>
> And I want to insert it into the document but to get it in the right
> format which means that  for example the part <strong>test with Bold
> Description of an ESB</strong> should be processed to remove the <strong>
> tags and should be formatted as bold.
>
> How can I do that? Should I use an HTML parser to parse the string and
> handle it using my code? Or is there a way using ODFDOM for example?
>
> I really appreciate your support in this.
>
> Thanks & Best Regards
>
> Ahmed Ibrahim
> MBA, Executive Architect,
> Egypt Client Innovation Center, IBM Global Business Services
> IT Architect Egypt Profession Lead
> Egypt Mobile: +20 100 1615 506
> Qatar Mobile: +974 3366 5808
>
>
