[tex4ht] [bug #340] Math issues in the ODF export

2023-10-05 Thread Michal Hoftich
Update of bug #340 (project tex4ht):

 Open/Closed:Open => Closed 


___

Reply to this item at:

  

___
  Message sent via/by Puszcza
  http://puszcza.gnu.org.ua/



[tex4ht] [bug #340] Math issues in the ODF export

2023-10-05 Thread Michal Hoftich
Follow-up Comment #10, bug #340 (project tex4ht):

It seems that most of these issues were resolved in the meantime. The only
remaining issue was incorrect support for some relation operators:

2. Instead of "<" characters at start of array columns, upside-down "?" are
displayed. 

I think it is LibreOffice's bug, but as a workaround, I've found that empty
 inserted before these operators work. The fix is included in the
development version of make4ht.
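
The workaround amounts to a small post-processing pass over the MathML. The
following Python sketch is only an illustration of the idea, not the code used
in make4ht: the filler element name (`mrow`), the set of relation operators and
the function name `pad_leading_relations` are all assumptions.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
ET.register_namespace("", MATHML_NS)

# Hypothetical placeholder: "mrow" is an assumption chosen for illustration,
# not necessarily the element used by the real fix.
FILLER_TAG = f"{{{MATHML_NS}}}mrow"
OPERATOR_TAG = f"{{{MATHML_NS}}}mo"
RELATIONS = {"<", ">", "="}


def pad_leading_relations(root):
    """Insert an empty filler element before every relation operator that is
    the first child of its parent. Returns the number of insertions."""
    inserted = 0
    # Take a snapshot first, because the tree is modified while we walk it.
    for parent in list(root.iter()):
        children = list(parent)
        if not children:
            continue
        first = children[0]
        if first.tag == OPERATOR_TAG and (first.text or "").strip() in RELATIONS:
            parent.insert(0, ET.Element(FILLER_TAG))
            inserted += 1
    return inserted


sample = (
    f'<math xmlns="{MATHML_NS}">'
    "<mrow><mo>&lt;</mo><mi>c</mi></mrow>"
    "</math>"
)
root = ET.fromstring(sample)
count = pad_leading_relations(root)
print(count, ET.tostring(root, encoding="unicode"))
```

Running the function a second time on the same tree inserts nothing, because
the operator is then no longer the first child.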




Re: [tex4ht] [bug #340] Math issues in the ODF export

2016-11-26 Thread Michal Hoftich
> I have never dived into the code for tex4ht, but this last
> description suggests to me that xtpipes is an interface to
> James Clark's "xt", which is an implementation of XSLT
> transformations via java libraries.
> 
> See http://www.jclark.com/xml/
> 
> I would not expect xtpipes to work properly unless xt is
> installed on the user's platform.  There are 3 jars:
> xp/xp.jar, xt/sax.jar, and xt/xt.jar.  The installation I
> have is from 1999 and occupies about 4 megabytes in
> /usr/local.  If it would help I can send along a shell
> script for running it from the command line.

I am not sure whether it is the same thing. Maybe it was inspired by it? Here
is the source [1] and the documentation [2].
> 
> New DomObject?  Sounds interesting.  Is there a doc for it?

It is in the development version of LuaXML [3].  Documentation is only
in the source comments at the moment; I am still working on it. There is
also a CSS selector library [4], which can be used for traversing and
matching elements using CSS selector syntax. I plan to use it for
conversions from XML to LaTeX or other formats (StarMath?), but it is
not usable yet.  Anyway, the usage can be seen in the unit tests ([5],
[6]).

[1] 
http://svn.gnu.org.ua/viewvc/tex4ht/trunk/src/java/xtpipes/Xtpipes.java?revision=4=markup
[2] http://michal-h21.github.io/src4ht/xtpipes.html
[3] https://github.com/michal-h21/LuaXML/blob/master/luaxml-domobject.lua
[4] https://github.com/michal-h21/LuaXML/blob/master/luaxml-cssquery.lua
[5] https://github.com/michal-h21/LuaXML/blob/master/test/dom-test.lua
[6] https://github.com/michal-h21/LuaXML/blob/master/test/cssquery-test.lua




Re: [tex4ht] [bug #340] Math issues in the ODF export

2016-11-26 Thread William F Hammond
Michal writes:

> Thanks Karl, I thought that there is no XSLT processor included.

> I think that xtpipes works, for example cross-references
> doesn't work without it. As I understand it, it combines
> SAX processor with XSLT or calls to Java.

I have never dived into the code for tex4ht, but this last
description suggests to me that xtpipes is an interface to
James Clark's "xt", which is an implementation of XSLT
transformations via java libraries.

See http://www.jclark.com/xml/

I would not expect xtpipes to work properly unless xt is
installed on the user's platform.  There are 3 jars:
xp/xp.jar, xt/sax.jar, and xt/xt.jar.  The installation I
have is from 1999 and occupies about 4 megabytes in
/usr/local.  If it would help I can send along a shell
script for running it from the command line.

> I guess that all what xtpipes does can be done using Lua
> and the new Domobject, which will be used in next make4ht
> version. Maybe the Domobject will be usable also for
> conversion of mathml to starmath or TeX annotations.

New DomObject?  Sounds interesting.  Is there a doc for it?

-- Bill




[tex4ht] [bug #340] Math issues in the ODF export

2016-11-24 Thread Michal Hoftich
Follow-up Comment #7, bug #340 (project tex4ht):

Thanks Karl. 

Yes, maybe it is not a bad idea to contact the LO people before I try to write
a MathML to StarMath converter in Lua.

I have one question. Is any XSLT processor included in TeX Live? I know that
our xtpipes includes an XSLT processor, but does anybody know how it works? 

I am interested in MathML to TeX conversion and found one potential solution
[1], which is based on XSLT. I would like to see whether it could be used to
add TeX as an annotation for MathML.

[1] https://github.com/transpect/mml2tex




[tex4ht] [bug #340] Math issues in the ODF export

2016-11-23 Thread Karl Berry
Follow-up Comment #6, bug #340 (project tex4ht):

committed the new mathml.4ht and ooffice-mml.4ht to TL.

regarding star(t)math, i suppose only the LO people can help, if they invented
it. maybe they'd be willing to work with you on improving the overall
situation wrt tex4ht/LO/mathml ...




Re: [tex4ht] [bug #340] Math issues in the ODF export

2016-11-23 Thread Michal Hoftich
Hi Bill,

> 
> For MathML I think it's better -- and sometimes more
> inter-operable -- simply to insert the full MathML namespace
> via the xmlns attribute in each <math> element and avoid
> prefixing altogether.  Of course, if inside a <math> element
> you want to insert something that is not MathML, that (a) is
> probably going to cause a problem somewhere and (b) would
> require either prefixing or inserting an xmlns attribute on
> each external subelement.
> 
> With HTML5 (as text/html) using prefixes will cause
> breakage.  Using an xmlns attribute on a <math> element in
> HTML5 is unnecessary but should go as unrecognized and be
> ignored by web browsers.  It is helpful when one wants the
> code to work for both the text/html and the
> application/xhtml+xml serializations of HTML5.  I'm guessing
> that this technique would be good for xtpipes.
> 

The prefixes are used only in the OpenOffice output (and probably other
formats based on XML, I guess), not in HTML. I think that Eitan used
them because in the main XML file in the ODF file, everything is
prefixed. But each math instance is saved in a standalone XML file and
included as a picture from the main document, so it seems that prefixing
is not necessary. At least in the examples I've found, the prefixes
weren't used.
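
To a namespace-aware XML parser, a prefixed element and an unprefixed element
under a default namespace declaration have the same name. A minimal Python
sketch of that equivalence (the two sample fragments and the helper name
`expanded_names` are made up for illustration; this says nothing about how LO
or xtpipes actually parse the files):

```python
import xml.etree.ElementTree as ET

# The same fragment written with a prefix and with a default namespace.
prefixed = (
    '<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">'
    "<math:mi>c</math:mi></math:math>"
)
unprefixed = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mi>c</mi></math>"
)


def expanded_names(xml_text):
    """Return the namespace-expanded tag of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


# Both spellings expand to {namespace-uri}local-name, so the lists are equal.
print(expanded_names(prefixed) == expanded_names(unprefixed))
```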

We still don't have HTML5 output, unfortunately. The basic structure
shouldn't be that hard to support, but what about the new semantic or
accessibility attributes? Interesting ideas are contained in the Scholarly
HTML format [1], although it is a little too prescriptive for my taste.
For example, for every math instance, an annotation in TeX format is
required. This is doable in tex4ht, but not easy.


> 
> And, alas, failing to include presentation mathml in the
> html namespace was a way to make support of mathml in web
> browsers have secondary importance.  There are problems in
> browser handling of HTML5 that result from having MathML be
> external.  For example, a comma following an inline 
> element may wind up on the following line, whereas a comma
> following an (inline)  element will not.
> 

Sure, it seems that no one who creates reading applications likes
MathML. Some issues in browsers can at least be fixed using MathJax,
but this is not an option in office suites. 

MathML support is totally tragic in Epub 3 readers, where MathML is part
of the standard. Those that support it do so using MathJax [2]. This was
painfully slow the last time I tried one such application on my phone.

One interesting idea is to use MathJax to convert MathML to HTML; it
uses some CSS tricks to display the math correctly. MathJax is now
supported in Node.js [3], with a CLI tool for such conversion. I've played
with it a bit and created a library for make4ht [4]. It can be used with
the following make4ht build file:


---
local mathjax_node = require "mathnode"

Make:htlatex{}

local format = "woff"
Make:match("html$", mathjax_node, {fontdir = format, fontformat = format})


It assumes that Mathjax fonts in "woff" format are stored in the "woff"
directory. You can see sample result here [5].

I was also able to compile the previous example to an Epub file, which
could be displayed in readers with modern CSS support. I also tried to
convert that epub file to kindle format, but it didn't work, as
expected.


Best regards,
Michal


[1] https://w3c.github.io/scholarly-html/
[2] http://docs.mathjax.org/en/latest/misc/epub.html
[3] https://github.com/mathjax/MathJax-node
[4] https://github.com/michal-h21/make4ht/blob/master/mathnode.lua
[5] http://michal-h21.github.io/mathjaxsample/sample.html



[tex4ht] [bug #340] Math issues in the ODF export

2016-11-23 Thread Michal Hoftich
Follow-up Comment #5, bug #340 (project tex4ht):

I've just found that LO can read MathML files without the prefix; I must have
made some mistake previously when it didn't work. The issue with prefix-less
MathML is that processing with xtpipes doesn't work. 

I've also found an interesting thread [1], where ODT export from tex4ht is
discussed. It seems that, at least 7 years ago, fixing bugs in the MathML
import wasn't important for the OO devs (I totally understand that they didn't
have enough developers, let alone developers who understand MathML). It also
seems that an annotation in StarMath format is required for a valid ODF file,
which again leads to a question: is there any usable MathML to StarMath
converter?

[1] https://bz.apache.org/ooo/show_bug.cgi?id=69088




[tex4ht] [bug #340] Math issues in the ODF export

2016-11-23 Thread Michal Hoftich
Follow-up Comment #4, bug #340 (project tex4ht):

Yes, you can push it to TL, it fixes the issue with <mfenced>. We should leave
this issue open, as we should add prefixes to all mathml attributes, and
address the LO's mathml handling.




[tex4ht] [bug #340] Math issues in the ODF export

2016-11-23 Thread Michal Hoftich
Follow-up Comment #3, bug #340 (project tex4ht):

Thanks Karl. The HTML+MathML produced from the sample document is valid. I
can't find a functional ODF validator; the online one [1] gives me "Internal
server error". I've tried to validate the MathML files included in the ODF
in the HTML validator. It failed because of the math: prefix; when I removed
it, they passed as valid. LibreOffice doesn't open the MathML files without
a namespace, so this shouldn't be an issue. So I guess this really is a bug
in LibreOffice.
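
Removing the prefix for such a validation run can be done with a few text
substitutions. This is a rough Python sketch under stated assumptions: the
helper name `strip_math_prefix` and the sample fragment are hypothetical, and
the regular expressions work on the raw text, so they are only suitable for
simple generated files, not arbitrary XML.

```python
import re


def strip_math_prefix(xml_text, prefix="math"):
    """Rewrite prefixed MathML so that it uses a default namespace instead."""
    p = re.escape(prefix)
    # <math:mi> -> <mi> and </math:mi> -> </mi>
    text = re.sub(rf"<(/?){p}:", r"<\1", xml_text)
    # xmlns:math="..." -> xmlns="..."
    text = re.sub(rf"\bxmlns:{p}=", "xmlns=", text)
    # math:open="(" -> open="("
    text = re.sub(rf"(\s){p}:([\w.-]+=)", r"\1\2", text)
    return text


# Sample input made up for this sketch; it is not output copied from tex4ht.
sample = (
    '<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">'
    '<math:mfenced math:open="{" math:close="">'
    "<math:mi>c</math:mi>"
    "</math:mfenced></math:math>"
)
cleaned = strip_math_prefix(sample)
print(cleaned)
```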

I've also figured out the substance of another issue, which I found earlier
but didn't understand. Word can open ODT files, but by default it cannot
display the math from tex4ht. But when you open the ODT file in LibreOffice
and save it, then it can be read. It seems that LO converts the MathML to its
own format called StarMath, which can then be edited in LO's equation editor.
Word seems to understand only StarMath, so it can read the math in ODT files
only after it is added by LO. 

I can't find much information about StarMath. There is an element reference
[2], and it seems that it is based on Troff's Eqn format [3]. I can't find a
MathML to StarMath or Eqn converter, so we probably need to rely on LO's
MathML support, or write a custom MathML to StarMath converter. 

BTW, I've created a simple DOM library for LuaXML; the next version of make4ht
will provide Lua filters based on it, in addition to regular expression
filters. It can do some really funny stuff.

[1] https://odf-validator.rhcloud.com/
[2] https://wiki.documentfoundation.org/images/2/26/MG44-MathGuide.pdf
[3] http://manpages.ubuntu.com/manpages/precise/man1/eqn.1.html




[tex4ht] [bug #340] Math issues in the ODF export

2016-11-22 Thread Karl Berry
Follow-up Comment #2, bug #340 (project tex4ht):

p.s. regarding your commit (r200), is this something i should push into TL
now? thanks for all ...




[tex4ht] [bug #340] Math issues in the ODF export

2016-11-22 Thread Karl Berry
Follow-up Comment #1, bug #340 (project tex4ht):

Hi Michal - regarding whether the mathml fragment is ok, I can only suggest
passing it (a whole doc using it) through the W3C (or other) validator and
see. It's hard to imagine what could be wrong with it, but who knows. (And
whether reporting bugs to libreoffice is worthwhile, I also don't know.)




[tex4ht] [bug #340] Math issues in the ODF export

2016-11-22 Thread Michal Hoftich
URL:
  

 Summary: Math issues in the ODF export
 Project: tex4ht
Submitted by: michal_h21
Submitted on: Tue 22 Nov 2016 05:39:54 PM EET
Category: None
Priority: 5 - Normal
Severity: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
Originator Email: 
 Open/Closed: Open
 Discussion Lock: Any

___

Details:

When I tried to compile the code from a question on TeX.sx [1], I found
several issues in ODT export:

1. The braces from \left are small, they don't cover the three lines in the
multi-line equation.

2. Instead of "<" characters at start of array columns, upside-down "?" are
displayed. 

Ad 1: Definitions of \Configure{left} and \Configure{right} are redefined in
ooffice-mml.4ht. It is generated from tex4ht-ooffice.tex. There is a comment
in the sources:

> OO doesn't seem to hono mfenced

I've tried to delete the configurations for `left` and `right` from
ooffice-mml.4ht, so the default mathml configuration was used. This resulted
in brackets of the correct size, but the wrong form: "(" was used instead of
"{", and the right bracket shouldn't be displayed at all. 

I took a look at the generated MathML code. For each math, one file named
"filename-m{count}/content.xml" is created. The automatically sized brackets
are contained in an `<mfenced>` element, with attributes `left` and `right`,
where the bracket character is specified. The MathML used in ODF uses the
`math:` prefix on each element, and this prefix must also be used on
attributes. In our case the `left` and `right` attributes didn't have this
prefix, so they weren't taken into account and the default brackets were used.
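
To inspect these per-formula files, one can list them straight from the
package. The Python sketch below assumes that the "filename-m{count}/content.xml"
files end up under those names inside the ODT zip archive; the helper name
`formula_files` and the demo archive are hypothetical.

```python
import io
import re
import zipfile

# Matches names such as "sample-m1/content.xml" (assumed layout).
FORMULA_PATH = re.compile(r"^.+-m\d+/content\.xml$")


def formula_files(odt_bytes):
    """Return the per-formula content.xml member names, sorted by name."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as archive:
        return sorted(n for n in archive.namelist() if FORMULA_PATH.match(n))


# Build a tiny in-memory archive standing in for a real ODT file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("content.xml", "<x/>")
    archive.writestr("sample-m2/content.xml", "<x/>")
    archive.writestr("sample-m1/content.xml", "<x/>")
    archive.writestr("sample-m1/settings.xml", "<x/>")

found = formula_files(buffer.getvalue())
print(found)
```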

The prefixes are added using the `\a:mathml` command in tex4ht-mathml.tex. It
is empty by default, but ooffice uses the `math:` prefix. It is used on all
element names and on most attributes, but it is missing on some of them, in
particular in all configurations which use the `<mfenced>` element. 

I will add the prefix for the attributes to all configurations which use the
`<mfenced>` element and remove the configurations of "left" and "right" from
ooffice-mml.4ht. But I guess there are many more instances of prefix-less
attributes which need to be fixed.
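
Finding the remaining prefix-less attributes could be automated. A minimal
Python sketch, assuming the generated files are well-formed XML; the helper
name `unprefixed_attributes` is hypothetical, and the sample (an `mfenced`
element with standard MathML `open` and `close` attributes) is made up rather
than copied from tex4ht output.

```python
import xml.etree.ElementTree as ET


def unprefixed_attributes(xml_text):
    """Return (element local name, attribute name) pairs for every attribute
    that carries no namespace prefix."""
    found = []
    for element in ET.fromstring(xml_text).iter():
        local_name = element.tag.rsplit("}", 1)[-1]
        for name in element.attrib:
            # ElementTree stores prefixed attributes as "{uri}name".
            if not name.startswith("{"):
                found.append((local_name, name))
    return found


sample = (
    '<math:math xmlns:math="http://www.w3.org/1998/Math/MathML">'
    '<math:mfenced math:separators="" open="{" close="">'
    "<math:mi>c</math:mi>"
    "</math:mfenced></math:math>"
)
report = unprefixed_attributes(sample)
print(report)
```

Namespace declarations such as `xmlns:math` are not reported, since the parser
does not expose them as ordinary attributes.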

Also, maybe it is worth checking whether all mathml fixes in ooffice-mml.4ht
are really useful, or if there were only some minor bugs in mathml.4ht as in
this case.

Ad 2: It seems to be a bug in the LibreOffice MathML handling. A minimal
example which shows this issue is ${} < c$

This results in the following MathML:

http://www.w3.org/1998/Math/MathML;
xmlns:xlink="http://www.w3.org/1999/xlink;>   
  c

this seems like valid code and Firefox for instance has no problem in
displaying that. It can be fixed if we add `` tag before
``.

So my question is: is it really a bug in LO, or is there also some issue with
the mathml from tex4ht? If it is bug in tex4ht, can we insert ``
automatically in the place of {} in the math context? Or is some
post-processing of the XML needed?

[1]  http://tex.stackexchange.com/q/340322/2891



