Re: XML stream writer library
On 1/12/21 6:19 PM, Thibaut Cuvelier wrote:
> On Tue, 12 Jan 2021 at 16:33, Lorenzo Bertini
> <lorenzobertin...@gmail.com> wrote:
>> On 08/01/21 03:00, Thibaut Cuvelier wrote:
>>> A tour of some C++ libraries for XML:
>>> [...]
>>> If LyX is being really serious about XML (i.e. moving as many
>>> things as possible to XML technologies), Saxon is probably the way
>>> to go. Otherwise, it's going to be too heavy to ship Saxon and a
>>> JVM along with LyX. Instead, pugixml seems to me like a good
>>> choice: a few features (XPath is the most relevant for LyX, and
>>> included in the base library, no need for addons), good
>>> performance, still maintained (there is a chance to have bugs
>>> fixed in a newer version, plus security vulnerabilities taken care
>>> of).
>>
>> Was this addressed in the virtual meeting?
>
> As far as I know, it wasn't discussed.

We were pretty focused on planning for 2.4.0.

>> Anyhow, I think that for a start we'd need only the most basic
>> features (tag insertion, indent), as was the purpose of #12055 in
>> the first place (I'm sorry to have opened this Pandora's box), so
>> maybe no harm will come if we start wrapping pugi.
>>
>> Let me know what you think, and if this is not the time for this, as
>> with LyX 2.4 coming out there might be other things that need focus.
>
> It looks like the patches cannot get integrated into the master
> development branch before 2.4 is out (or at least branched). However,
> in the meantime, I think I can create a feature branch and push your
> patches there (https://www.lyx.org/trac/browser/features).

Yes, that would be the way to go.

Riki

--
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: XML stream writer library
On Tue, 12 Jan 2021 at 16:33, Lorenzo Bertini wrote:
> On 08/01/21 03:00, Thibaut Cuvelier wrote:
>> A tour of some C++ libraries for XML:
>> [...]
>> If LyX is being really serious about XML (i.e. moving as many things
>> as possible to XML technologies), Saxon is probably the way to go.
>> Otherwise, it's going to be too heavy to ship Saxon and a JVM along
>> with LyX. Instead, pugixml seems to me like a good choice: a few
>> features (XPath is the most relevant for LyX, and included in the
>> base library, no need for addons), good performance, still
>> maintained (there is a chance to have bugs fixed in a newer version,
>> plus security vulnerabilities taken care of).
>
> Was this addressed in the virtual meeting?

As far as I know, it wasn't discussed.

> Also, since Xerces-C was the most feature-rich and mature after
> Saxon-C, I was curious as to why you didn't mention it.

Actually, Xerces-C and Xerces-C++ are the same thing (the official name
being Xerces-C++ and the name of the packages Xerces-C, if I got it
correctly).

> Anyhow, I think that for a start we'd need only the most basic
> features (tag insertion, indent), as was the purpose of #12055 in the
> first place (I'm sorry to have opened this Pandora's box), so maybe no
> harm will come if we start wrapping pugi.
>
> Let me know what you think, and if this is not the time for this, as
> with LyX 2.4 coming out there might be other things that need focus.

It looks like the patches cannot get integrated into the master
development branch before 2.4 is out (or at least branched). However,
in the meantime, I think I can create a feature branch and push your
patches there (https://www.lyx.org/trac/browser/features).
Re: XML stream writer library
On 08/01/21 03:00, Thibaut Cuvelier wrote:
> A tour of some C++ libraries for XML:
> - RapidXML: mostly unmaintained since 2013, no support for namespaces
>   (except in forks: https://github.com/dwd/rapidxml)
> - Boost Property Tree: no XML parser, which limits further use (it can
>   use RapidXML though, see above)
> - libstudxml: C++ library, designed for speed, no DOM
> - libxml2: C library, designed for features and not speed (also
>   includes XPath and XSLT, DTD and XML Schema, namespaces), "mature"
>   and barely evolving anymore
> - libxml++: depends on glibmm2
> - Xerces-C++: C++ library, designed for features and not speed (also
>   includes XPath, DTD and XML Schema, namespaces), "mature" and barely
>   evolving anymore; no XSLT (Xalan could be used, but it only works
>   with an ancient version of Xerces; XQilla implemented XPath 2, but
>   has not been developed since 2016)
> - Expat: C library, designed for speed, no DOM by default (provided by
>   https://github.com/kolotsey/expat-dom), with namespaces
> - tinyxml2: C++ library, designed for speed only (also includes XPath
>   through the unmaintained https://github.com/stanthomas/tinyxml2-ex,
>   no validation, no namespaces), mature and slowly evolving
> - pugixml: C++ library, designed for speed with a few features (like
>   XPath, no validation, no namespaces), mature and evolving
> - libroxml: C library, no clear design goal (includes XPath,
>   namespaces, no validation), evolving
> - Saxon-C: C/C++ wrapper of the state-of-the-art Java library, largest
>   number of features (XPath and XSLT 3; DTD and XML Schema validation,
>   with a RelaxNG extension at http://www.cfoster.net/saxon-jing/;
>   namespaces), very mature, really evolving (both performance and
>   features), but it requires a JVM (Excelsior is built-in, even though
>   it's not been maintained for quite a long time)
> - Qt: no, I was joking :). Qt XML is not supported anymore, it's
>   recommended to switch to QXmlStreamReader and QXmlStreamWriter
>   (which are only SAX-like). Qt XML Patterns used to have XPath, XSLT,
>   and XML Schema, but it was deprecated a while ago (Qt 5.13 for the
>   last wake-up call, but it hasn't been touched since Qt 4, basically)
>
> If LyX is being really serious about XML (i.e. moving as many things
> as possible to XML technologies), Saxon is probably the way to go.
> Otherwise, it's going to be too heavy to ship Saxon and a JVM along
> with LyX. Instead, pugixml seems to me like a good choice: a few
> features (XPath is the most relevant for LyX, and included in the base
> library, no need for addons), good performance, still maintained
> (there is a chance to have bugs fixed in a newer version, plus
> security vulnerabilities taken care of).

Was this addressed in the virtual meeting? Also, since Xerces-C was the
most feature-rich and mature after Saxon-C, I was curious as to why you
didn't mention it.

Anyhow, I think that for a start we'd need only the most basic features
(tag insertion, indent), as was the purpose of #12055 in the first place
(I'm sorry to have opened this Pandora's box), so maybe no harm will
come if we start wrapping pugi.

Let me know what you think, and if this is not the time for this, as
with LyX 2.4 coming out there might be other things that need focus.
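To make the "tag insertion, indent" requirement concrete, here is a minimal sketch in standard C++ only. The class IndentingWriterSketch and the function sketchIndentedSection are hypothetical names invented for this illustration; they are not part of LyX, pugixml, or Qt, and the two-space indentation step is an arbitrary assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: a writer that tracks open tags and indents by depth.
class IndentingWriterSketch {
public:
    // Emit an opening tag on its own line and go one level deeper.
    void startTag(const std::string& name) {
        indent();
        out_ += "<" + name + ">\n";
        open_.push_back(name);
    }
    // Emit one line of text; the caller is assumed to have escaped it.
    void textLine(const std::string& escapedText) {
        indent();
        out_ += escapedText + "\n";
    }
    // Close the most recently opened tag; does nothing if none is open.
    void endTag() {
        if (open_.empty()) return;
        const std::string name = open_.back();
        open_.pop_back();
        indent();
        out_ += "</" + name + ">\n";
    }
    const std::string& str() const { return out_; }
private:
    // Two spaces per currently open tag (an assumption of this sketch).
    void indent() { out_.append(2 * open_.size(), ' '); }
    std::string out_;
    std::vector<std::string> open_;
};

// Example use: a section containing a title.
std::string sketchIndentedSection() {
    IndentingWriterSketch w;
    w.startTag("section");
    w.startTag("title");
    w.textLine("Intro");
    w.endTag();
    w.endTag();
    return w.str();
}
```

Note that indenting this way adds whitespace inside elements, which matters wherever whitespace is significant (mixed content, preformatted text), so a real writer would need a way to switch indentation off per element.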
Re: XML stream writer library
On Thu, 7 Jan 2021 at 18:23, Thibaut Cuvelier wrote:
> On Thu, 7 Jan 2021, 12:52 Lorenzo Bertini wrote:
>> [...]
>> I'm unfamiliar with the concept of "wrapping" libraries and "layers":
>> is it when you write your own classes and methods on top of some
>> common stuff those libraries do, so if for whatever reason you have
>> to switch you can "plug" another easily?
>
> Yes, exactly.
Below is my take on
https://stackoverflow.com/questions/9387610/what-xml-parser-should-i-use-in-c
and https://github.com/fffaraz/awesome-cpp#xml

XPath would be very useful if LyX switches to an XML representation
(easy queries on an XML document, think of SQL for XML).

XSLT is a way to describe transformations from XML to anything. If LyX
switches to an XML representation, it might be used to replace C++
exporters (but formula conversion will be a pain!). It might lower the
barrier to entry for new contributors, even though XSLT is not an easy
language.

XQuery is a scripting language for XML processing.

Apart from Java libraries, only versions 1.0 are implemented: apart
from XPath, it really limits their use… A state-of-the-art
implementation of the current standards is Saxon, which has a C
binding.

To allow for validation of XML files (i.e. checking that they respect
some grammar), DTD is the oldest way (inherited from SGML); XML Schema
adds many features over DTD (like types). The best technology nowadays
is RelaxNG (it's not recent: 2005), which is much more powerful than
XML Schema.

XInclude is the XML way of specifying includes of other files (not
necessarily XML). Think \input in LaTeX or LyX child documents with a
few more features.

Namespaces are similar to those of C++, and are especially useful when
mixing several standards (like MathML and DocBook).
A tour of some C++ libraries for XML:
- RapidXML: mostly unmaintained since 2013, no support for namespaces
  (except in forks: https://github.com/dwd/rapidxml)
- Boost Property Tree: no XML parser, which limits further use (it can
  use RapidXML though, see above)
- libstudxml: C++ library, designed for speed, no DOM
- libxml2: C library, designed for features and not speed (also
  includes XPath and XSLT, DTD and XML Schema, namespaces), "mature"
  and barely evolving anymore
- libxml++: depends on glibmm2
- Xerces-C++: C++ library, designed for features and not speed (also
  includes XPath, DTD and XML Schema, namespaces), "mature" and barely
  evolving anymore; no XSLT (Xalan could be used, but it only works
  with an ancient version of Xerces; XQilla implemented XPath 2, but
  has not been developed since 2016)
- Expat: C library, designed for speed, no DOM by default (provided by
  https://github.com/kolotsey/expat-dom), with namespaces
- tinyxml2: C++ library, designed for speed only (also includes XPath
  through the unmaintained https://github.com/stanthomas/tinyxml2-ex,
  no validation, no namespaces), mature and slowly evolving
- pugixml: C++ library, designed for speed with a few features (like
  XPath, no validation, no namespaces), mature and evolving
- libroxml: C library, no clear design goal (includes XPath,
  namespaces, no validation), evolving
- Saxon-C: C/C++ wrapper of the state-of-the-art Java library, largest
  amount of fea
Re: XML stream writer library
On Thu, 7 Jan 2021, 12:52 Lorenzo Bertini wrote:
> I think almost all the options are on the table at this point. For the
> sake of completeness I think it's worth mentioning the DOM library
> Boost Property Tree, which popped up frequently while searching.
>
> I think Thibaut is right when saying that, for the way LyX is
> structured now, a SAX writer would be more appropriate, because we
> won't work on XML directly, but convert the LyX file. However, most of
> the libraries have a DOM approach, and also, if someday we convert the
> LyX format to something XML-like, we might have to start all of this
> again.
>
> I did a small benchmark with pugixml to both read and write an XML
> document of 2.2 MB, equivalent to ~100/120 pages chock full of math:
> it takes negligible time to both read and write on my really modest
> laptop (A10-9600). Peak memory consumption was 14 MB, but since some
> MathML was corrupted (it has trouble with backslash \) it's possible
> it might be way less once fixed: LyX consumption opening the
> corresponding LyX file was ~120 MB. The benchmark table in
> http://rapidxml.sourceforge.net/manual.html#namespacerapidxml_1performance
> seems to indicate that pugixml and RapidXML are within about one order
> of magnitude of strlen, so I don't think parse time will ever be a
> problem.

Thanks for your benchmark. For me, the major difference between the two
libraries is that pugixml is still maintained, while RapidXML is not
really. And XML parsing is very often a source of security problems
(not just XXE).

> I'm unfamiliar with the concept of "wrapping" libraries and "layers":
> is it when you write your own classes and methods on top of some
> common stuff those libraries do, so if for whatever reason you have to
> switch you can "plug" another easily?

Yes, exactly.
Re: XML stream writer library
I think almost all the options are on the table at this point. For the
sake of completeness I think it's worth mentioning the DOM library Boost
Property Tree, which popped up frequently while searching.

I think Thibaut is right when saying that, for the way LyX is structured
now, a SAX writer would be more appropriate, because we won't work on
XML directly, but convert the LyX file. However, most of the libraries
have a DOM approach, and also, if someday we convert the LyX format to
something XML-like, we might have to start all of this again.

I did a small benchmark with pugixml to both read and write an XML
document of 2.2 MB, equivalent to ~100/120 pages chock full of math: it
takes negligible time to both read and write on my really modest laptop
(A10-9600). Peak memory consumption was 14 MB, but since some MathML was
corrupted (it has trouble with backslash \) it's possible it might be
way less once fixed: LyX consumption opening the corresponding LyX file
was ~120 MB. The benchmark table in
http://rapidxml.sourceforge.net/manual.html#namespacerapidxml_1performance
seems to indicate that pugixml and RapidXML are within about one order
of magnitude of strlen, so I don't think parse time will ever be a
problem.

I'm unfamiliar with the concept of "wrapping" libraries and "layers": is
it when you write your own classes and methods on top of some common
stuff those libraries do, so if for whatever reason you have to switch
you can "plug" another easily?

Thanks,
Lo.
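The "wrapping" idea asked about above can be sketched as follows, using only the C++ standard library. Every name here (XmlBackend, StringBackend, exportTitle) is hypothetical and invented for the illustration; a real layer would put a pugixml- or Qt-based class behind the same interface.

```cpp
#include <string>

// Hypothetical LyX-side interface: export code would depend only on this.
class XmlBackend {
public:
    virtual ~XmlBackend() = default;
    virtual void startTag(const std::string& name) = 0;
    virtual void text(const std::string& escapedText) = 0;
    virtual void endTag(const std::string& name) = 0;
    virtual std::string result() const = 0;
};

// One possible backend: plain string appending, no external library.
// Another class wrapping a third-party library could replace it.
class StringBackend : public XmlBackend {
public:
    void startTag(const std::string& name) override { out_ += "<" + name + ">"; }
    void text(const std::string& escapedText) override { out_ += escapedText; }
    void endTag(const std::string& name) override { out_ += "</" + name + ">"; }
    std::string result() const override { return out_; }
private:
    std::string out_;
};

// Calling code sees only the interface, so switching libraries means
// writing one new backend class rather than touching every exporter.
std::string exportTitle(XmlBackend& backend, const std::string& title) {
    backend.startTag("title");
    backend.text(title);
    backend.endTag("title");
    return backend.result();
}
```

The trade-off is that the interface can only expose what every candidate library supports, so it tends to stay small.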
Re: XML stream writer library
On Tue, 5 Jan 2021 at 10:37, Joel Kulesza wrote:
> On Tue, Jan 5, 2021 at 1:19 AM Pavel Sanda wrote:
>> On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
>>> There are multiple issues here. What is needed to generate HTML and
>>> DocBook is a simple SAX writer, not a parser. I've done plenty of
>>> research about it, there's no XML library that does that. Most of
>>> them are using a DOM, which is a total waste of memory for such an
>>> application: it stores a complete XML tree in memory before
>>> serialising it. With SAX, you just need a string backend, which is
>>> much more lightweight (by several factors).
>>
>> After a little more thinking, is using DOM actually that big an
>> issue? I mean, how much does it take: for a document of length n,
>> it's O(n) in space?
>>
>> Sure, it might be cut to constant, but practically speaking, when you
>> have a 100-page document, what is the real time/memory consumption?
>> Timewise, you spend 1s in XML compared to the next 30s converting
>> figures to PDF or whatever format? Spacewise, probably one more time
>> than what we already allocated for the document itself.
>>
>> If using a more heavyweight XML library is not a pain from the API
>> point of view (and I do not know, you are the expert here) then we
>> might actually consider it, given the difficulties in the SAX space?
>
> I had a similar thought and will note that I've had good success on
> other projects with pugixml.

It's typical to have a DOM tree that is two to five times larger than
the raw text, which is not always negligible (Xerces is close to 2, Java
implementations anywhere between 2 and 5; I haven't checked pugixml or
TinyXML2 for this specific criterion).

But that's not the real issue: for generating HTML and DocBook, for now,
DOM is not so useful from a developer's point of view. DOM is more
suitable for handling or modifying an existing document, not really for
generating one from scratch.
A SAX writer is really what's most appropriate, given the way LyX is
internally structured: there is very little need to go backward when
generating the file (e.g., add something to the header when encountering
some LyX inset). Using DOM will not really simplify the code (I'm
speaking for the DocBook export, which is highly similar to HTML).
However, it might make its logic easier to understand for a newcomer.

Nevertheless, DOM comes with more complex syntax: with SAX, you are only
appending content to the file, with only strings; with DOM, you have to
indicate where you want to write something (with methods like
InsertEndChild), and you pass around complete XML nodes (built from the
same strings).

More specifically, in SAX (where stream is mostly a large string object
with helper methods):

    stream.writeStartTag("tag");

With DOM, taking the example of TinyXML2 (where document is the root of
the DOM tree and node the node in the tree that is being filled):

    node->InsertEndChild( document->NewElement("tag") );

Both are perfectly good choices, though. If we write a thin layer on top
of a DOM writer (as Riki suggested, this would allow decoupling from the
actual XML library), we might be able to have a syntax close to that of
SAX while having the extra flexibility of DOM. This way, the LyX code
would be clean, and avoid current intricacies to output things at the
right place (in DocBook, especially the tag).

More specifically, @Pavel: for DocBook, you spend 0% of your time
dealing with images, as it's supposed to be done by the DocBook
processor afterwards. Any gain in the XML part of LyX will be noticeable
by the user for large documents (book-sized). (And I won't say that
something being O(n) is negligible in this case: I use exponential-time
algorithms daily that run much faster in practice than polynomial-time
ones…)
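As an illustration of the "large string object with helper methods" description above, a SAX-style writer can be sketched in a few lines of standard C++. The class name XmlStreamSketch, the methods other than writeStartTag, and the function sketchParagraph are assumptions made for this example; this is not LyX's XmlStream or any library's API.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal SAX-style writer: it only ever appends to a string.
class XmlStreamSketch {
public:
    // Open <name> and remember it so that it can be closed later.
    void writeStartTag(const std::string& name) {
        out_ += "<" + name + ">";
        open_.push_back(name);
    }
    // Append character data, escaping the characters that are unsafe
    // in element content.
    void writeText(const std::string& text) {
        for (char c : text) {
            switch (c) {
            case '&': out_ += "&amp;"; break;
            case '<': out_ += "&lt;"; break;
            case '>': out_ += "&gt;"; break;
            default:  out_ += c;
            }
        }
    }
    // Close the most recently opened tag; does nothing if none is open.
    void writeEndTag() {
        if (open_.empty()) return;
        out_ += "</" + open_.back() + ">";
        open_.pop_back();
    }
    const std::string& str() const { return out_; }
private:
    std::string out_;                // the string backend
    std::vector<std::string> open_;  // stack of currently open tags
};

// Example use: one paragraph with characters that need escaping.
std::string sketchParagraph() {
    XmlStreamSketch stream;
    stream.writeStartTag("para");
    stream.writeText("a < b & c");
    stream.writeEndTag();
    return stream.str();
}
```

The only state kept besides the output is the stack of open tag names, which is why the memory overhead stays small compared with a full tree. Attributes and namespaces are left out of the sketch.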
Re: XML stream writer library
On Tue, Jan 5, 2021 at 1:19 AM Pavel Sanda wrote:
> On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
>> There are multiple issues here. What is needed to generate HTML and
>> DocBook is a simple SAX writer, not a parser. I've done plenty of
>> research about it, there's no XML library that does that. Most of
>> them are using a DOM, which is a total waste of memory for such an
>> application: it stores a complete XML tree in memory before
>> serialising it. With SAX, you just need a string backend, which is
>> much more lightweight (by several factors).
>
> After a little more thinking, is using DOM actually that big an issue?
> I mean, how much does it take: for a document of length n, it's O(n)
> in space?
>
> Sure, it might be cut to constant, but practically speaking, when you
> have a 100-page document, what is the real time/memory consumption?
> Timewise, you spend 1s in XML compared to the next 30s converting
> figures to PDF or whatever format? Spacewise, probably one more time
> than what we already allocated for the document itself.
>
> If using a more heavyweight XML library is not a pain from the API
> point of view (and I do not know, you are the expert here) then we
> might actually consider it, given the difficulties in the SAX space?

I had a similar thought and will note that I've had good success on
other projects with pugixml.

Regards,
Joel
Re: XML stream writer library
On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
> There are multiple issues here. What is needed to generate HTML and
> DocBook is a simple SAX writer, not a parser. I've done plenty of
> research about it, there's no XML library that does that. Most of them
> are using a DOM, which is a total waste of memory for such an
> application: it stores a complete XML tree in memory before
> serialising it. With SAX, you just need a string backend, which is
> much more lightweight (by several factors).

After a little more thinking, is using DOM actually that big an issue?
I mean, how much does it take: for a document of length n, it's O(n) in
space?

Sure, it might be cut to constant, but practically speaking, when you
have a 100-page document, what is the real time/memory consumption?
Timewise, you spend 1s in XML compared to the next 30s converting
figures to PDF or whatever format? Spacewise, probably one more time
than what we already allocated for the document itself.

If using a more heavyweight XML library is not a pain from the API point
of view (and I do not know, you are the expert here) then we might
actually consider it, given the difficulties in the SAX space?

Pavel
Re: XML stream writer library
On 1/4/21 5:10 PM, Pavel Sanda wrote:
> On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
>> My recommendation, based on a quite long study of XML libraries (i.e.
>> several years, but quite far from full-time): either use
>> QXmlStreamWriter (which is mostly a SAX implementation in C++) or
>> write our own. QXmlStreamWriter is almost 4k lines long, but it can
>> be substantially simplified in our case (
>> https://github.com/qt/qtbase/blob/54875be84de059374920e4c0deacd13a41caaa13/src/corelib/serialization/qxmlstream.cpp).
>>
>> TinyXML2 (https://github.com/leethomason/tinyxml2), pugixml
>> (https://github.com/zeux/pugixml), and Xerces-C++
>> (https://xerces.apache.org/xerces-c/) are only DOM-based. There are
>> quite a few C libraries, like libxml2, that can be SAX-like, but C
>> libraries are horrible to use
>> (http://www.xmlsoft.org/examples/testWriter.c).

I did some searching and, yes, I see the problem. Word is that recent
versions of libxml and libxml2 have dependencies on Gnome libraries that
we don't want. I'll let you know if I get any answers to my question on
the Fedora list.

> I do not dare to make any qualified recommendation between the choices
> above. But thinking aloud -- if there de facto isn't an alternative
> to QXmlStreamWriter, would it be hard to separate that class from
> the rest of Qt, fork and include it as an internal lyx routine?
> We would have full control over that code without unnecessary
> surprises of Qt's development.

I was going to suggest something in this spirit. If, as our usual policy
has been, we confine QXmlStreamWriter to support/, then what that
basically means is writing our own LyX API as a kind of wrapper around
the Qt stuff. (Thibaut, if you haven't already, you might look at the
FileName class. Much of it is a wrapper around QFile.) Some, even many,
of the routines might just directly call the Qt equivalent (probably
after a call to toqstr, from qstring_helpers).
This would be a relatively quick way to get something that worked and
was easy to use, and work on adapting DocBook and HTML export to this
code could proceed.

At that point, we could then write our own XML backend, possibly
adapting it from the Qt code. There are quite a few dependencies there,
but I'll guess some of them we do not need (e.g., the QApplication and
QFile dependencies). Or we build a lightweight library from scratch. (It
does seem like maybe there's a general need for that.) With the already
functioning backend from QXmlStreamWriter, it would be easy to test our
own code and make sure it was producing the same output.

Riki
Re: XML stream writer library
On Mon, Jan 04, 2021 at 09:48:42PM +0100, Thibaut Cuvelier wrote:
> My recommendation, based on a quite long study of XML libraries (i.e.
> several years, but quite far from full-time): either use
> QXmlStreamWriter (which is mostly a SAX implementation in C++) or
> write our own. QXmlStreamWriter is almost 4k lines long, but it can be
> substantially simplified in our case (
> https://github.com/qt/qtbase/blob/54875be84de059374920e4c0deacd13a41caaa13/src/corelib/serialization/qxmlstream.cpp).
>
> TinyXML2 (https://github.com/leethomason/tinyxml2), pugixml
> (https://github.com/zeux/pugixml), and Xerces-C++
> (https://xerces.apache.org/xerces-c/) are only DOM-based. There are
> quite a few C libraries, like libxml2, that can be SAX-like, but C
> libraries are horrible to use
> (http://www.xmlsoft.org/examples/testWriter.c).

I do not dare to make any qualified recommendation between the choices
above. But thinking aloud -- if there de facto isn't an alternative to
QXmlStreamWriter, would it be hard to separate that class from the rest
of Qt, fork and include it as an internal lyx routine? We would have
full control over that code without unnecessary surprises of Qt's
development.

Pavel
Re: XML stream writer library
> TinyXML2 (https://github.com/leethomason/tinyxml2), pugixml
> (https://github.com/zeux/pugixml), and Xerces-C++
> (https://xerces.apache.org/xerces-c/) are only DOM-based. There are
> quite a few C libraries, like libxml2, that can be SAX-like, but C
> libraries are horrible to use
> (http://www.xmlsoft.org/examples/testWriter.c).

There are several C++ wrappers for libxml2 on GitHub. Maybe they can be
useful:

https://github.com/libxmlplusplus/libxmlplusplus
https://github.com/rioki/libxmlmm

Yuriy
Re: XML stream writer library
On Mon, 4 Jan 2021 at 20:30, Richard Kimberly Heck wrote:
> On 1/3/21 3:37 PM, Lorenzo Bertini wrote:
>> Hello list,
>>
>> In #12055 (https://www.lyx.org/trac/ticket/12055), discussing the
>> merge of some MathMLStream and XmlStream components, we were
>> contemplating the possibility of using an external library to handle
>> XML streams, for example with indentation and tag insertion. One of
>> the candidates was the QXmlStreamWriter class
>> (https://doc.qt.io/qt-5/qxmlstreamwriter.html), but with the talk
>> about removing unnecessary Qt components we thought to ask the list.
>>
>> Let us know what you think is the best course, and if you know of
>> other libraries we should look at.
>
> As I mention in the bug, I looked over various XML libraries a while
> ago, when I was thinking about the long-standing idea of converting
> LyX's own format to XML. There seemed to be a myriad of options, and I
> never settled upon one. But it looks like there's a general feeling
> that we don't want to get too married to Qt, any more than we already
> are. That is in part because Qt seems to break itself fairly
> frequently (especially on OSX) and partly because they keep changing
> their attitude towards open source. There was something not long ago
> about how recent updates would only be available to paid subscribers
> right away, or something like that.
>
> So I'd generally suggest searching around for good, well-maintained
> XML libraries, maybe asking on Stack Exchange what people like. I'll
> send an email to the Fedora list and see what suggestions pop up.

There are multiple issues here. What is needed to generate HTML and
DocBook is a simple SAX writer, not a parser. I've done plenty of
research about it, there's no XML library that does that. Most of them
are using a DOM, which is a total waste of memory for such an
application: it stores a complete XML tree in memory before serialising
it.
With SAX, you just need a string backend, which is much more lightweight (by several factors). In this case, as the content is generated without ever looking back, SAX is the best choice. You have more choices in the Java world, and the standard library is often enough (well, the standard extensions javax and JAXP). If you need a good XML tool, chances are it will be written in Java, especially if it's open source (Saxon for XSLT or XQuery, eXist or MarkLogic for XML databases).

On the other hand, if you want to represent a complete LyX document and work on it, you'd rather go for DOM, as you will always have the whole structure in memory: you may want to edit things at any point in the document. (Unless there is never an operation on the file structures, and only on the set of insets of the document.)

My recommendation, based on quite a long study of XML libraries (i.e. several years, but quite far from full-time): either use QXmlStreamWriter (which is mostly a SAX implementation in C++) or write our own. QXmlStreamWriter is almost 4,000 lines long, but it can be substantially simplified in our case (https://github.com/qt/qtbase/blob/54875be84de059374920e4c0deacd13a41caaa13/src/corelib/serialization/qxmlstream.cpp).

TinyXML2 (https://github.com/leethomason/tinyxml2), pugixml (https://github.com/zeux/pugixml), and Xerces-C++ (https://xerces.apache.org/xerces-c/) are only DOM-based. There are quite a few C libraries, like libxml2, that can be SAX-like, but C libraries are horrible to use (http://www.xmlsoft.org/examples/testWriter.c).
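The SAX-style writing described above can be sketched roughly as follows. This is only an illustration, not QXmlStreamWriter and not LyX's own XMLStream: the class name StreamWriter and its methods are hypothetical. The point it tries to show is that a streaming writer only has to remember the names of the currently open elements, not a tree of the whole document.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical minimal streaming XML writer (illustration only).
class StreamWriter {
public:
    explicit StreamWriter(std::ostream & os) : os_(os) {}

    // Emit the opening tag right away and remember its name.
    void startElement(std::string const & name)
    {
        os_ << '<' << name << '>';
        open_.push_back(name);
    }

    // Emit character data, escaping the characters that are special in content.
    void characters(std::string const & text)
    {
        for (char c : text) {
            switch (c) {
            case '&': os_ << "&amp;"; break;
            case '<': os_ << "&lt;"; break;
            case '>': os_ << "&gt;"; break;
            default:  os_ << c;
            }
        }
    }

    // Close the most recently opened element.
    void endElement()
    {
        assert(!open_.empty());
        os_ << "</" << open_.back() << '>';
        open_.pop_back();
    }

private:
    std::ostream & os_;
    std::vector<std::string> open_; // the only state kept: names of open elements
};

// Example: write a tiny DocBook-like fragment into a string.
std::string writeSample()
{
    std::ostringstream os;
    StreamWriter w(os);
    w.startElement("para");
    w.characters("a < b & c");
    w.endElement();
    return os.str();
}
```

With a file stream instead of a string stream, the writer's own memory use would grow with the nesting depth rather than with the size of the document, which is the difference from a DOM that the message points at.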
Re: XML stream writer library
On 1/3/21 3:37 PM, Lorenzo Bertini wrote: > > Hello list, > > In 12055 <https://www.lyx.org/trac/ticket/12055>, discussing the merge > of some MathMLStream and XmlStream components, we were contemplating > the possibility of using an external library to handle XML streams, > for example with indentation and tag insertion. One of the candidates > was QXmlStreamWriter <https://doc.qt.io/qt-5/qxmlstreamwriter.html> > class, but with the talk about removing unnecessary Qt components we > thought to ask the list. > > Lest us know what do you think it's the best course, and if you know > of other libraries we should look. > As I mention in the bug, I looked over various XML libraries a while ago, when I was thinking about the long-standing idea of converting LyX's own format to XML. There seemed to be a myriad of options, and I never settled upon one. But it looks like there's a general feeling that we don't want to get too married to Qt---any more than we already are. That is in part because Qt seems to break itself fairly frequently (especially on OSX) and partly because they keep changing their attitude towards open source. There was some thing not long ago about how recent updates would only be available to paid subscribers right away, or something like that. So I'd generally suggest searching around for good, well-maintained XML libraries, maybe asking on Stack Exchange what people like. I'll send an email to the Fedora list and see what suggestions pop up. Riki -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
XML stream writer library
Hello list, In ticket 12055 <https://www.lyx.org/trac/ticket/12055>, while discussing the merge of some MathMLStream and XmlStream components, we considered using an external library to handle XML streams, for example for indentation and tag insertion. One of the candidates was the QXmlStreamWriter <https://doc.qt.io/qt-5/qxmlstreamwriter.html> class, but given the talk about removing unnecessary Qt components, we thought we should ask the list. Let us know what you think is the best course, and whether you know of other libraries we should look at. Lo (lynx in trac).
Re: [LyX/master] DocBook: fix XML in comments (-- forbidden for some historical reason).
On 7/31/20 10:07 PM, Thibaut Cuvelier wrote: > On Sat, 1 Aug 2020 at 03:51, Thibaut Cuvelier <mailto:tcuvel...@lyx.org>> wrote: > > On Sat, 1 Aug 2020 at 01:01, Kornel Benko <mailto:kor...@lyx.org>> wrote: > > Am Sat, 1 Aug 2020 00:42:20 +0200 > schrieb Thibaut Cuvelier <mailto:tcuvel...@lyx.org>>: > > > On Sat, 1 Aug 2020 at 00:28, Kornel Benko <mailto:kor...@lyx.org>> wrote: > > > > > Am Fri, 31 Jul 2020 23:37:33 +0200 (CEST) > > > schrieb Thibaut Cuvelier <mailto:tcuvel...@lyx.org>>: > > > > > > > commit 85946aae2b94fedf5ce9bd35e91ba500986b5121 > > > > Author: Thibaut Cuvelier <mailto:tcuvel...@lyx.org>> > > > > Date: Sat Aug 1 00:02:36 2020 +0200 > > > > > > > > DocBook: fix XML in comments (-- forbidden for some > historical > > > reason). > > > > --- > > > > autotests/export/docbook/deutsches_ert.lyx | 14 +- > > > > autotests/export/docbook/deutsches_ert.xml | 2 +- > > > > src/insets/InsetERT.cpp | 14 -- > > > > src/xml.cpp | 80 > > > ++- > > > > src/xml.h | 5 +- > > > > 5 files changed, 68 insertions(+), 47 deletions(-) > > > > > > Cannot compile ... > > > I get now > > > /usr2/src/lyx/lyx-git/src/xml.cpp:78:79: error: explicit > qualification in > > > declaration of > > > ‘lyx::docstring lyx::xml::escapeString(const docstring&, > > > lyx::XMLStream::EscapeSettings)’ > > > docstring xml::escapeString(docstring const & raw, > > > XMLStream::EscapeSettings e) ^ > > > src/CMakeFiles/lyx2.4.dir/build.make:1441: recipe for target > > > > 'src/CMakeFiles/lyx2.4.dir/usr2/src/lyx/lyx-git/src/xml.cpp.o' > failed > > > > > > > Is the new version better? > > Looks good :) > Now the remaining problematic case is > 3718 - > > > export/examples/Articles/American_Astronomical_Society_%28AASTeX_v._6.2%29_docbook5 > (Failed) > ('\'' in param) > ... > Newell, E. B., > and O'Neil, E. J. 1978, > , 37, 27 xml:id='Ortolani et al. > (1985)'>Ortolani, S., Rosino, L., and Sandage, A. 1985, , 90, 473 > > ... 
> > (Mark, this is from the examples directory (not templates dir > with the same filename)) > > > That one is fixed too, thanks! > > > Still on that American_Astronomical_Society test case, I try to > improve things a bit. The main choke point for now is appendices: why > is there a layout Appendix, which does not work like the usual "start > of appendix". Maybe the constructor ofParagraphParameters should just > check whether the LaTeX command is \appendix? Just to chime in: Great work, both of you. I hope to be able to adapt some of this to the XHTML output once it stabilizes. (Mostly the font stuff.)( Riki -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: [LyX/master] DocBook: fix XML in comments (-- forbidden for some historical reason).
On Sat, 1 Aug 2020 at 03:51, Thibaut Cuvelier wrote: > On Sat, 1 Aug 2020 at 01:01, Kornel Benko wrote: > >> Am Sat, 1 Aug 2020 00:42:20 +0200 >> schrieb Thibaut Cuvelier : >> >> > On Sat, 1 Aug 2020 at 00:28, Kornel Benko wrote: >> > >> > > Am Fri, 31 Jul 2020 23:37:33 +0200 (CEST) >> > > schrieb Thibaut Cuvelier : >> > > >> > > > commit 85946aae2b94fedf5ce9bd35e91ba500986b5121 >> > > > Author: Thibaut Cuvelier >> > > > Date: Sat Aug 1 00:02:36 2020 +0200 >> > > > >> > > > DocBook: fix XML in comments (-- forbidden for some historical >> > > reason). >> > > > --- >> > > > autotests/export/docbook/deutsches_ert.lyx | 14 +- >> > > > autotests/export/docbook/deutsches_ert.xml |2 +- >> > > > src/insets/InsetERT.cpp| 14 -- >> > > > src/xml.cpp| 80 >> > > ++- >> > > > src/xml.h |5 +- >> > > > 5 files changed, 68 insertions(+), 47 deletions(-) >> > > >> > > Cannot compile ... >> > > I get now >> > > /usr2/src/lyx/lyx-git/src/xml.cpp:78:79: error: explicit >> qualification in >> > > declaration of >> > > ‘lyx::docstring lyx::xml::escapeString(const docstring&, >> > > lyx::XMLStream::EscapeSettings)’ >> > > docstring xml::escapeString(docstring const & raw, >> > > XMLStream::EscapeSettings e) ^ >> > > src/CMakeFiles/lyx2.4.dir/build.make:1441: recipe for target >> > > 'src/CMakeFiles/lyx2.4.dir/usr2/src/lyx/lyx-git/src/xml.cpp.o' failed >> > > >> > >> > Is the new version better? >> >> Looks good :) >> Now the remaining problematic case is >> 3718 - >> >> export/examples/Articles/American_Astronomical_Society_%28AASTeX_v._6.2%29_docbook5 >> (Failed) >> ('\'' in param) >> ... >> Newell, E. B., and O'Neil, >> E. J. 1978, >> , 37, 27 Ortolani, S., Rosino, L., and Sandage, A. 1985, , 90, >> 473 >> >> ... >> >> (Mark, this is from the examples directory (not templates dir with the >> same filename)) >> > > That one is fixed too, thanks! > Still on that American_Astronomical_Society test case, I try to improve things a bit. 
The main choke point for now is appendices: why is there an Appendix layout that does not work like the usual "start of appendix"? Maybe the constructor of ParagraphParameters should just check whether the LaTeX command is \appendix?
Re: [LyX/master] DocBook: fix XML in comments (-- forbidden for some historical reason).
On Sat, 1 Aug 2020 at 01:01, Kornel Benko wrote: > Am Sat, 1 Aug 2020 00:42:20 +0200 > schrieb Thibaut Cuvelier : > > > On Sat, 1 Aug 2020 at 00:28, Kornel Benko wrote: > > > > > Am Fri, 31 Jul 2020 23:37:33 +0200 (CEST) > > > schrieb Thibaut Cuvelier : > > > > > > > commit 85946aae2b94fedf5ce9bd35e91ba500986b5121 > > > > Author: Thibaut Cuvelier > > > > Date: Sat Aug 1 00:02:36 2020 +0200 > > > > > > > > DocBook: fix XML in comments (-- forbidden for some historical > > > reason). > > > > --- > > > > autotests/export/docbook/deutsches_ert.lyx | 14 +- > > > > autotests/export/docbook/deutsches_ert.xml |2 +- > > > > src/insets/InsetERT.cpp| 14 -- > > > > src/xml.cpp| 80 > > > ++- > > > > src/xml.h |5 +- > > > > 5 files changed, 68 insertions(+), 47 deletions(-) > > > > > > Cannot compile ... > > > I get now > > > /usr2/src/lyx/lyx-git/src/xml.cpp:78:79: error: explicit qualification > in > > > declaration of > > > ‘lyx::docstring lyx::xml::escapeString(const docstring&, > > > lyx::XMLStream::EscapeSettings)’ > > > docstring xml::escapeString(docstring const & raw, > > > XMLStream::EscapeSettings e) ^ > > > src/CMakeFiles/lyx2.4.dir/build.make:1441: recipe for target > > > 'src/CMakeFiles/lyx2.4.dir/usr2/src/lyx/lyx-git/src/xml.cpp.o' failed > > > > > > > Is the new version better? > > Looks good :) > Now the remaining problematic case is > 3718 - > > export/examples/Articles/American_Astronomical_Society_%28AASTeX_v._6.2%29_docbook5 > (Failed) > ('\'' in param) > ... > Newell, E. B., and O'Neil, > E. J. 1978, > , 37, 27 Ortolani, S., Rosino, L., and Sandage, A. 1985, , 90, > 473 > > ... > > (Mark, this is from the examples directory (not templates dir with the > same filename)) > That one is fixed too, thanks! -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: [LyX/master] DocBook: fix XML in comments (-- forbidden for some historical reason).
On Sat, 1 Aug 2020 00:42:20 +0200, Thibaut Cuvelier wrote:
> On Sat, 1 Aug 2020 at 00:28, Kornel Benko wrote:
> > On Fri, 31 Jul 2020 23:37:33 +0200 (CEST), Thibaut Cuvelier wrote:
> > > commit 85946aae2b94fedf5ce9bd35e91ba500986b5121
> > > Author: Thibaut Cuvelier
> > > Date: Sat Aug 1 00:02:36 2020 +0200
> > >
> > > DocBook: fix XML in comments (-- forbidden for some historical reason).
> > > ---
> > > autotests/export/docbook/deutsches_ert.lyx | 14 +-
> > > autotests/export/docbook/deutsches_ert.xml | 2 +-
> > > src/insets/InsetERT.cpp | 14 --
> > > src/xml.cpp | 80 ++-
> > > src/xml.h | 5 +-
> > > 5 files changed, 68 insertions(+), 47 deletions(-)
> >
> > Cannot compile ...
> > I get now
> > /usr2/src/lyx/lyx-git/src/xml.cpp:78:79: error: explicit qualification in declaration of
> > ‘lyx::docstring lyx::xml::escapeString(const docstring&, lyx::XMLStream::EscapeSettings)’
> > docstring xml::escapeString(docstring const & raw, XMLStream::EscapeSettings e)
> > src/CMakeFiles/lyx2.4.dir/build.make:1441: recipe for target
> > 'src/CMakeFiles/lyx2.4.dir/usr2/src/lyx/lyx-git/src/xml.cpp.o' failed
>
> Is the new version better?

Looks good :)
Now the remaining problematic case is
3718 - export/examples/Articles/American_Astronomical_Society_%28AASTeX_v._6.2%29_docbook5 (Failed)
('\'' in param)
...
Newell, E. B., and O'Neil, E. J. 1978, , 37, 27 Ortolani, S., Rosino, L., and Sandage, A. 1985, , 90, 473
...

(Mark, this is from the examples directory (not the templates dir with the same filename))

Kornel
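The failing case above involves an apostrophe in a parameter. As a rough sketch of why that matters: if a value containing ' is written into a single-quoted attribute without escaping, the attribute ends early and the XML is no longer well-formed. The helper names below are hypothetical and are not taken from src/xml.cpp; the attribute name "role" is only an example.

```cpp
#include <string>

// Hypothetical helper: escape a string for use inside an attribute value.
// Both quote characters are escaped, so either quoting style stays safe.
std::string escapeAttribute(std::string const & raw)
{
    std::string out;
    for (char c : raw) {
        switch (c) {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += c;
        }
    }
    return out;
}

// Build name='value' with the value escaped.
std::string attribute(std::string const & name, std::string const & value)
{
    return name + "='" + escapeAttribute(value) + "'";
}
```

For example, attribute("role", "O'Neil") gives role='O&apos;Neil', which a parser reads back as O'Neil.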
Re: [LyX/master] DocBook: fix XML in comments (-- forbidden for some historical reason).
On Sat, 1 Aug 2020 at 00:28, Kornel Benko wrote: > Am Fri, 31 Jul 2020 23:37:33 +0200 (CEST) > schrieb Thibaut Cuvelier : > > > commit 85946aae2b94fedf5ce9bd35e91ba500986b5121 > > Author: Thibaut Cuvelier > > Date: Sat Aug 1 00:02:36 2020 +0200 > > > > DocBook: fix XML in comments (-- forbidden for some historical > reason). > > --- > > autotests/export/docbook/deutsches_ert.lyx | 14 +- > > autotests/export/docbook/deutsches_ert.xml |2 +- > > src/insets/InsetERT.cpp| 14 -- > > src/xml.cpp| 80 > ++- > > src/xml.h |5 +- > > 5 files changed, 68 insertions(+), 47 deletions(-) > > Cannot compile ... > I get now > /usr2/src/lyx/lyx-git/src/xml.cpp:78:79: error: explicit qualification in > declaration of > ‘lyx::docstring lyx::xml::escapeString(const docstring&, > lyx::XMLStream::EscapeSettings)’ > docstring xml::escapeString(docstring const & raw, > XMLStream::EscapeSettings e) ^ > src/CMakeFiles/lyx2.4.dir/build.make:1441: recipe for target > 'src/CMakeFiles/lyx2.4.dir/usr2/src/lyx/lyx-git/src/xml.cpp.o' failed > Is the new version better? -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
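For context on the commit subject quoted above: XML 1.0 does not allow the string -- inside a comment (a compatibility rule inherited from SGML), and a comment may not end with a hyphen either. A minimal sketch of one way to make arbitrary text safe for a comment follows. The function names are hypothetical, and separating hyphens with spaces is an assumption made for illustration, not necessarily what commit 85946aae does.

```cpp
#include <string>

// Hypothetical helper: make text safe to place between <!-- and -->.
// A space is inserted between consecutive hyphens, and after a trailing
// hyphen, so that "--" never occurs and the text never ends with '-'.
std::string makeCommentSafe(std::string const & text)
{
    std::string out;
    for (char c : text) {
        if (c == '-' && !out.empty() && out.back() == '-')
            out += ' ';
        out += c;
    }
    if (!out.empty() && out.back() == '-')
        out += ' ';
    return out;
}

// Wrap the sanitized text in comment delimiters.
std::string toComment(std::string const & text)
{
    return "<!--" + makeCommentSafe(text) + "-->";
}
```

This changes the text (a--b becomes a- -b), so it only suits comments meant for human readers, such as ERT content that cannot be exported otherwise.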
Re: [LyX/master] DocBook: fix XML in comments (-- forbidden for some historical reason).
On Fri, 31 Jul 2020 23:37:33 +0200 (CEST), Thibaut Cuvelier wrote:
> commit 85946aae2b94fedf5ce9bd35e91ba500986b5121
> Author: Thibaut Cuvelier
> Date: Sat Aug 1 00:02:36 2020 +0200
>
> DocBook: fix XML in comments (-- forbidden for some historical reason).
> ---
> autotests/export/docbook/deutsches_ert.lyx | 14 +-
> autotests/export/docbook/deutsches_ert.xml | 2 +-
> src/insets/InsetERT.cpp | 14 --
> src/xml.cpp | 80 ++-
> src/xml.h | 5 +-
> 5 files changed, 68 insertions(+), 47 deletions(-)

Cannot compile ... I now get

/usr2/src/lyx/lyx-git/src/xml.cpp:78:79: error: explicit qualification in declaration of ‘lyx::docstring lyx::xml::escapeString(const docstring&, lyx::XMLStream::EscapeSettings)’
docstring xml::escapeString(docstring const & raw, XMLStream::EscapeSettings e)
src/CMakeFiles/lyx2.4.dir/build.make:1441: recipe for target 'src/CMakeFiles/lyx2.4.dir/usr2/src/lyx/lyx-git/src/xml.cpp.o' failed

Kornel
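GCC typically reports "explicit qualification in declaration" when a function is defined with a qualified name while the definition already sits inside the namespace that the qualifier names. Whether that is exactly what happened in src/xml.cpp is an assumption, since the file itself is not shown in this thread. A sketch with hypothetical names (the namespace demo and this simplified escapeString are stand-ins, not LyX code):

```cpp
#include <string>

namespace demo {
namespace xml {
// Declaration, as it might appear in a header.
std::string escapeString(std::string const & raw);
} // namespace xml
} // namespace demo

namespace demo {
namespace xml {

// Rejected by GCC, because we are already inside demo::xml:
//     std::string xml::escapeString(std::string const & raw) { ... }
//
// Accepted alternatives: the unqualified definition below, or a fully
// qualified definition (demo::xml::escapeString) written outside both
// namespaces.
std::string escapeString(std::string const & raw)
{
    std::string out;
    for (char c : raw) {
        if (c == '&')      out += "&amp;";
        else if (c == '<') out += "&lt;";
        else if (c == '>') out += "&gt;";
        else               out += c;
    }
    return out;
}

} // namespace xml
} // namespace demo
```

Some other compilers accept the qualified form as an extension, which would explain a patch that builds on one platform and fails on another.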
Re: Tweaking lib/symbols for XML entities
Guillaume just pointed out a limitation of this patch for MathFactory: with \def, the column for the XML entity was not parsed. This is fixed in the new stand-alone patch (along with small updates in lib/symbols for things I forgot).

Thibaut Cuvelier

On Thu, 18 Jun 2020 at 23:40, Pavel Sanda wrote:
> On Thu, Jun 18, 2020 at 11:35:39PM +0200, Thibaut Cuvelier wrote:
> > indicating that no real translation was available. I highly suspect that
> > \varGamma was displayed as "varGamma", i.e. as text, instead of the right
> > symbol. As far as I know, there is no HTML entity for rare symbols like
> > \varGamma.
>
> Yep, that's the case. With your patch it gets unicode instead of 'varGamma'
> and it actually seem to work on less antiquated systems than I have :)
>
> Pavel

Attachment: 0006-Fix-in-symbols-handling-parse-the-XML-entity-with-de.patch
Re: Tweaking lib/symbols for XML entities
On Thu, Jun 18, 2020 at 11:35:39PM +0200, Thibaut Cuvelier wrote:
> indicating that no real translation was available. I highly suspect that
> \varGamma was displayed as "varGamma", i.e. as text, instead of the right
> symbol. As far as I know, there is no HTML entity for rare symbols like
> \varGamma.

Yep, that's the case. With your patch it gets Unicode instead of 'varGamma', and it actually seems to work on less antiquated systems than mine :)

Pavel
Re: Tweaking lib/symbols for XML entities
That effect was not intended: few new Unicode-based XML entities should be output in HTML. I therefore investigated the case of varGamma further. Before, it was simply not translated into MathML: the symbols file only contained an x (https://github.com/cburschka/lyx/blob/57272e837b148975817440bdc6a66b9935fa00a3/lib/symbols#L753), indicating that no real translation was available. I highly suspect that \varGamma was displayed as "varGamma", i.e. as text, instead of the right symbol. As far as I know, there is no HTML entity for rare symbols like \varGamma. (By the way, I could not make a formula with \varGamma in MathType, so I can't really compare.)

Thibaut Cuvelier

On Thu, 18 Jun 2020 at 22:03, Pavel Sanda wrote:
> On Thu, Jun 18, 2020 at 09:41:10PM +0200, Kornel Benko wrote:
> > > Mint 19.3 Tara.
> >
> > Tricia!
>
> Ok, debian based distros seem to be covered, I just checked with another
> win machine and its fine. I'll comit the last patch.
> Pavel
Re: Tweaking lib/symbols for XML entities
On Thu, Jun 18, 2020 at 09:41:10PM +0200, Kornel Benko wrote:
> > Mint 19.3 Tara.
>
> Tricia!

OK, Debian-based distros seem to be covered. I just checked on another Windows machine and it's fine. I'll commit the last patch.

Pavel
Re: Tweaking lib/symbols for XML entities
On Thu, 18 Jun 2020 21:40:25 +0200, Kornel Benko wrote:
> On Thu, 18 Jun 2020 21:24:45 +0200, Pavel Sanda wrote:
> > On Thu, Jun 18, 2020 at 09:08:07PM +0200, Kornel Benko wrote:
> > > I have no problems, if that is the expected line.
> >
> > Good, what distro? Pavel
>
> Mint 19.3 Tara.

Tricia!

> Kornel
Re: Tweaking lib/symbols for XML entities
On Thu, 18 Jun 2020 21:24:45 +0200, Pavel Sanda wrote:
> On Thu, Jun 18, 2020 at 09:08:07PM +0200, Kornel Benko wrote:
> > I have no problems, if that is the expected line.
>
> Good, what distro? Pavel

Mint 19.3 Tara.

Kornel
Re: Tweaking lib/symbols for XML entities
On Thu, Jun 18, 2020 at 09:08:07PM +0200, Kornel Benko wrote: > I have no problems, if that is the expected line. Good, what distro? Pavel -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
Am Thu, 18 Jun 2020 21:08:07 +0200 schrieb Kornel Benko : > Am Thu, 18 Jun 2020 20:54:08 +0200 > schrieb Pavel Sanda : > > > On Thu, Jun 18, 2020 at 08:46:08PM +0200, Kornel Benko wrote: > > > > > If you go to > > > > > https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols > > > > > and check for the U+1D6Ex line do you see some meaningful symbols > > > > > like \varGamma in your browsers? > > > Sorry, I misread the question. Dont know about fonts used by browsers. > > > Are they the > > > same as the installed system fonts? > > > > Simply launch in your browser the page on the link above and check whether > > you see > > reasonable characters at line U+1D6Ex > > > > I have many fonts you listed, but my browser doesn't show a thing. > > > > Pavel > > I have no problems, if that is the expected line. > > Kornel Just checked my firefox fonts ... I see there as standard font: DejaVu Serif (in about:preferences) Kornel pgpeNTTQYhwUg.pgp Description: Digitale Signatur von OpenPGP -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
On Thu, 18 Jun 2020 20:54:08 +0200, Pavel Sanda wrote:
> On Thu, Jun 18, 2020 at 08:46:08PM +0200, Kornel Benko wrote:
> > > > If you go to
> > > > https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols
> > > > and check for the U+1D6Ex line do you see some meaningful symbols
> > > > like \varGamma in your browsers?
> > Sorry, I misread the question. Dont know about fonts used by browsers. Are
> > they the same as the installed system fonts?
>
> Simply launch in your browser the page on the link above and check whether
> you see reasonable characters at line U+1D6Ex
>
> I have many fonts you listed, but my browser doesn't show a thing.
>
> Pavel

I have no problems, if that is the expected line.

Kornel
Re: Tweaking lib/symbols for XML entities
On Thu, Jun 18, 2020 at 08:46:08PM +0200, Kornel Benko wrote: > > > If you go to > > > https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols > > > and check for the U+1D6Ex line do you see some meaningful symbols > > > like \varGamma in your browsers? > Sorry, I misread the question. Dont know about fonts used by browsers. Are > they the same > as the installed system fonts? Simply launch in your browser the page on the link above and check whether you see reasonable characters at line U+1D6Ex I have many fonts you listed, but my browser doesn't show a thing. Pavel -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
Am Thu, 18 Jun 2020 20:44:20 +0200 schrieb Kornel Benko : > Am Thu, 18 Jun 2020 20:01:18 +0200 > schrieb Pavel Sanda : > > > On Mon, Jun 15, 2020 at 02:46:47PM +0200, Thibaut Cuvelier wrote: > > > Here is a new version of the first patch. Indeed, I forgot to completely > > > refactor a few font-related things. It should be much better now. (I also > > > removed a method declaration that belongs to another DocBook commit down > > > the line, and I reworked spaces at a few places to reduce the patch > > > size.) > > > > The first patch should be one already, second one was just committed. > > In the third I see the following changes in the export of Math Manual: > > > > - sqint > > + > > > > - varGamma > > + > > > > > > I looked more deeply in the symbols and clearly many symbols will be > > print out via unicode now. > > The question is how widespread are fonts for supporting it. > > > > My (outdated) setup does not have those fonts but I would like to ask > > other devs about win/mac/mainstream lin distros situation. > > > > If you go to > > https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols > > and check for the U+1D6Ex line do you see some meaningful symbols > > like \varGamma in your browsers? > > > > Pavel > > You could try > perl development/tools/listFontWithLang.pl -c u+x1D6E4 > On my system I got 54 fonts. > Among others, there are Deja Vu, Libertine Math, Libertinus Math, STIX, TeX > Gyre. > > Kornel Sorry, I misread the question. Dont know about fonts used by browsers. Are they the same as the installed system fonts? Kornel pgpZp427SsBAT.pgp Description: Digitale Signatur von OpenPGP -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
On Thu, 18 Jun 2020 20:01:18 +0200, Pavel Sanda wrote:
> On Mon, Jun 15, 2020 at 02:46:47PM +0200, Thibaut Cuvelier wrote:
> > Here is a new version of the first patch. Indeed, I forgot to completely
> > refactor a few font-related things. It should be much better now. (I also
> > removed a method declaration that belongs to another DocBook commit down
> > the line, and I reworked spaces at a few places to reduce the patch size.)
>
> The first patch should be one already, second one was just committed.
> In the third I see the following changes in the export of Math Manual:
>
> - sqint
> +
>
> - varGamma
> +
>
> I looked more deeply in the symbols and clearly many symbols will be
> print out via unicode now.
> The question is how widespread are fonts for supporting it.
>
> My (outdated) setup does not have those fonts but I would like to ask
> other devs about win/mac/mainstream lin distros situation.
>
> If you go to
> https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols
> and check for the U+1D6Ex line do you see some meaningful symbols
> like \varGamma in your browsers?
>
> Pavel

You could try

perl development/tools/listFontWithLang.pl -c u+x1D6E4

On my system I got 54 fonts. Among others, there are DejaVu, Libertine Math, Libertinus Math, STIX, and TeX Gyre.

Kornel
Re: Tweaking lib/symbols for XML entities
On Mon, Jun 15, 2020 at 02:46:47PM +0200, Thibaut Cuvelier wrote:
> Here is a new version of the first patch. Indeed, I forgot to completely
> refactor a few font-related things. It should be much better now. (I also
> removed a method declaration that belongs to another DocBook commit down
> the line, and I reworked spaces at a few places to reduce the patch size.)

The first patch should be one already, the second one was just committed. In the third I see the following changes in the export of the Math Manual:

- sqint
+

- varGamma
+

I looked more deeply into the symbols, and clearly many symbols will now be printed out as Unicode. The question is how widespread the fonts supporting them are.

My (outdated) setup does not have those fonts, but I would like to ask other devs about the situation on Windows, Mac, and mainstream Linux distros.

If you go to https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols and check the U+1D6Ex line, do you see meaningful symbols like \varGamma in your browsers?

Pavel
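To make the font question concrete: the U+1D6Ex row of that table includes U+1D6E4, MATHEMATICAL ITALIC CAPITAL GAMMA, which lies outside the Basic Multilingual Plane. In XML or HTML output such a character can appear either as a numeric character reference or as raw UTF-8 bytes. The thread does not say which form LyX emits, so the sketch below shows both; the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a code point as an XML numeric character reference, e.g. &#x1D6E4;
std::string numericCharRef(std::uint32_t cp)
{
    char buf[16];
    std::snprintf(buf, sizeof buf, "&#x%lX;", static_cast<unsigned long>(cp));
    return buf;
}

// Encode a code point as UTF-8 (assumes cp is a valid Unicode scalar value).
std::string toUtf8(std::uint32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Either way the browser needs a font that covers the Mathematical Alphanumeric Symbols block; the encoding choice does not change what gets displayed.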
Re: Tweaking lib/symbols for XML entities
On Thu, Jun 18, 2020 at 05:26:34PM +0200, Jean-Marc Lasgouttes wrote:
> On 18/06/2020 at 17:19, Pavel Sanda wrote:
> > Thanks, I committed the first patch with consts added (binding to
> > temporaries).
> > I propose to drop the second patch altogether. Unless you use simple
> > expressions chaining is not particularly safe, << is not a sequence
> > point so order of evaluation is undefined.
>
> What do you mean? We do that all over the place with streams. I thought it
> was how streams were supposed to be used.

Depends what you mean by "that". Simply put, chaining is OK when the expressions (out << expr1 << expr2 ...;) have no chance to interact with each other. So "simple" code like out << var1 << var2 << string3 ...; is just fine.

If, on the other hand, exprX modifies variables that are used in exprY, then you get undefined behaviour, because the standard leaves the order undefined. Putting ';' (or any other sequence point) between the expressions gives you a guaranteed order.

If you think we already have the second case in the code, we had better review it :)

Pavel
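A small sketch of the two cases discussed above, using a hypothetical helper nextId that has a side effect. One note that goes beyond the thread: since C++17 the operands of << are evaluated left to right, so the concern applies to code built with older standards. In this particular sketch the pre-C++17 result is an unspecified order of the two calls rather than a crash.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper with a side effect: each call returns the next number.
int nextId(int & counter) { return ++counter; }

// Chained form: both calls to nextId are operands of a single expression.
// Before C++17 a compiler may evaluate them in either order, so the result
// can be "1 2" or "2 1". Since C++17 the result is "1 2".
std::string chained()
{
    int counter = 0;
    std::ostringstream os;
    os << nextId(counter) << ' ' << nextId(counter);
    return os.str();
}

// Split form: every statement finishes before the next one starts, so the
// order is guaranteed under every standard.
std::string split()
{
    int counter = 0;
    std::ostringstream os;
    os << nextId(counter);
    os << ' ';
    os << nextId(counter);
    return os.str();
}
```

Chains such as out << var1 << var2, where the operands are plain variables or otherwise independent expressions, are unaffected in any standard.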
Re: Tweaking lib/symbols for XML entities
On 18/06/2020 at 17:19, Pavel Sanda wrote:
> Thanks, I committed the first patch with consts added (binding to temporaries).
> I propose to drop the second patch altogether. Unless you use simple expressions
> chaining is not particularly safe, << is not a sequence point so order of evaluation
> is undefined.

What do you mean? We do that all over the place with streams. I thought it was how streams were supposed to be used.

JMarc
Re: Tweaking lib/symbols for XML entities
On Thu, Jun 18, 2020 at 02:03:38AM +0200, Thibaut Cuvelier wrote:
> On Mon, 15 Jun 2020 at 23:51, Pavel Sanda wrote:
> > On Mon, Jun 15, 2020 at 02:46:47PM +0200, Thibaut Cuvelier wrote:
> > > Here is a new version of the first patch. Indeed, I forgot to completely
> > > refactor a few font-related things. It should be much better now. (I also
> >
> > Ok, we are bit closer, I committed update of first patch.
> >
> > There is still missing id='magicparlabel-XX' part which need to be fixed.
> > Please check xHTML output of e.g. Math manual to be identical.
>
> Here is a new version of the patches. Among the changes:
> - removal of a TODO for xml::ParTag (the constructor with parid is no more
> used anywhere). ParTag still exists, as it is used to perform some dispatch
> in the stream.
> - restoration of the parids (looks like it was a reversed condition at some
> point).
> - slight code cleanups (chain the << calls in streams, as is done in most
> places), in a separate commit.
> A quick note: the MathML patches remove useless spaces around symbols (they
> are not present in output from MathType, for instance). This is similar to
> how simpler things are already handled: before this patch, LyX output
> A but ; now, there are no spaces for lambda.
> The remaining differences are due to previous differences in treatment
> between the old DocBook and the existing HTML exporters worked.

Thanks, I committed the first patch with consts added (binding to temporaries). I propose to drop the second patch altogether. Unless you only use simple expressions, chaining is not particularly safe: << is not a sequence point, so the order of evaluation is undefined.

Pavel
Re: Tweaking lib/symbols for XML entities
On Mon, 15 Jun 2020 at 23:51, Pavel Sanda wrote:
> On Mon, Jun 15, 2020 at 02:46:47PM +0200, Thibaut Cuvelier wrote:
> > Here is a new version of the first patch. Indeed, I forgot to completely
> > refactor a few font-related things. It should be much better now. (I also
>
> Ok, we are bit closer, I committed update of first patch.
>
> There is still missing id='magicparlabel-XX' part which need to be fixed.
> Please check xHTML output of e.g. Math manual to be identical.

Here is a new version of the patches. Among the changes:
- removal of a TODO for xml::ParTag (the constructor with parid is no longer used anywhere). ParTag still exists, as it is used to perform some dispatch in the stream.
- restoration of the parids (it looks like a condition was reversed at some point).
- slight code cleanups (chain the << calls in streams, as is done in most places), in a separate commit.

A quick note: the MathML patches remove useless spaces around symbols (they are not present in output from MathType, for instance). This is similar to how simpler things are already handled: before this patch, LyX output A but ; now, there are no spaces for lambda. The remaining differences are due to earlier differences between how the old DocBook exporter and the existing HTML exporter worked.

I attach two patches: the first one gets the parid right (and eliminates a TODO); I can merge it with the first patch of the series if you prefer. The second one only has the cosmetic changes.

Cheers,

> Pavel

Attachments: 0005-Slight-cleaning-chain-calls-to-more-uniform-indentat.patch, 0004-Get-rid-of-parid.patch
Re: Tweaking lib/symbols for XML entities
On Mon, Jun 15, 2020 at 02:46:47PM +0200, Thibaut Cuvelier wrote:
> Here is a new version of the first patch. Indeed, I forgot to completely
> refactor a few font-related things. It should be much better now. (I also

OK, we are a bit closer; I committed the update of the first patch.

The id='magicparlabel-XX' part is still missing and needs to be fixed.
Please check that the XHTML output of e.g. the Math manual is identical.

Cheers,
Pavel
Re: Tweaking lib/symbols for XML entities
On Mon, Jun 15, 2020 at 01:48:50PM +0200, Thibaut Cuvelier wrote:
> The goal being to completely overhaul that export (i.e. rewrite it), I
> don't know if it's helpful to ensure it still works with this patch :/.

No, it's not.

Pavel
Re: Tweaking lib/symbols for XML entities
On Tue, Jun 09, 2020 at 07:02:23PM +0200, Thibaut Cuvelier wrote:
> This patch is made so that there should not be any change to the XHTML
> output.

Oops, I think I was too fast to accept your changes in the first patch.
I see a huge diff between the current Math Manual and the pre-a6b07608d8e9de state.
E.g. loss of id='magicparlabel-XX' or ->

Can you comment?

Pavel
Re: Tweaking lib/symbols for XML entities
On Mon, 15 Jun 2020 at 13:30, Kornel Benko wrote: > Am Mon, 15 Jun 2020 13:21:40 +0200 > schrieb Thibaut Cuvelier : > > > On Mon, 15 Jun 2020 at 12:57, Pavel Sanda wrote: > > > > > On Mon, Jun 15, 2020 at 08:43:39AM +0200, Kornel Benko wrote: > > > > Thanks, no warnings. > > > > > > > > My try to export was not successful though. > > > > Error: Couldn't export file > > > > ---- > > > > No information for exporting the format MS Excel Office Open > XML. > > > > > > These patches are aimed at docbook (to be completed in future) not for > MS > > > Excel Office Open XML export. > > > > > Hm, OK. Somehow I forgot ... > Nonetheless, the output for exporting to docbook creates nearly the same > error message. > The goal being to completely overhaul that export (i.e. rewrite it), I don't know if it's helpful to ensure it still works with this patch :/. (The new version is complete, almost entirely reviewed by Guillaume, so that the functionality can still be present in LyX 2.4: https://gitlab.com/gadmm/lyx-unstable/-/merge_requests/3.) Nevertheless, on my machine, exporting something as DocBook yields an entirely unrelated error (yes, that's a Windows machine, hence no cp): 'cp' is not recognized as an internal or external command, operable program or batch file. support\Systemcall.cpp (291): Systemcall: 'cp "DocBook_Article__28SGML_29.sgml" "DocBook_Article__28SGML_29.xml"' finish ed with exit code 1 Error: Cannot convert file An error occurred while running: cp "DocBook_Article__28SGML_29.sgml" "DocBook_Article__28SGML_29.xml" > > Unless I broke something in that export, but I don't know in which cases > it > > is available. > > > > Pavel > > Kornel > -- > lyx-devel mailing list > lyx-devel@lists.lyx.org > http://lists.lyx.org/mailman/listinfo/lyx-devel > -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
Am Mon, 15 Jun 2020 13:21:40 +0200 schrieb Thibaut Cuvelier : > On Mon, 15 Jun 2020 at 12:57, Pavel Sanda wrote: > > > On Mon, Jun 15, 2020 at 08:43:39AM +0200, Kornel Benko wrote: > > > Thanks, no warnings. > > > > > > My try to export was not successful though. > > > Error: Couldn't export file > > > > > > No information for exporting the format MS Excel Office Open XML. > > > > These patches are aimed at docbook (to be completed in future) not for MS > > Excel Office Open XML export. > > Hm, OK. Somehow I forgot ... Nonetheless, the output for exporting to docbook creates nearly the same error message. > Unless I broke something in that export, but I don't know in which cases it > is available. > > Pavel Kornel pgp8JGEvGNaZl.pgp Description: Digitale Signatur von OpenPGP -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
On Mon, 15 Jun 2020 at 12:57, Pavel Sanda wrote:
> On Mon, Jun 15, 2020 at 08:43:39AM +0200, Kornel Benko wrote:
> > Thanks, no warnings.
> >
> > My try to export was not successful though.
> > Error: Couldn't export file
> >
> > No information for exporting the format MS Excel Office Open XML.
>
> These patches are aimed at docbook (to be completed in future) not for MS
> Excel Office Open XML export.

Unless I broke something in that export, but I don't know in which cases it is available.

> Pavel
Re: Tweaking lib/symbols for XML entities
On Mon, Jun 15, 2020 at 08:43:39AM +0200, Kornel Benko wrote:
> Thanks, no warnings.
>
> My try to export was not successful though.
> Error: Couldn't export file
>
> No information for exporting the format MS Excel Office Open XML.

These patches are aimed at DocBook (to be completed in the future), not at MS Excel Office Open XML export.

Pavel
Re: Tweaking lib/symbols for XML entities
On Mon, 15 Jun 2020 01:54:57 +0200, Thibaut Cuvelier wrote:
...
> Here is a new version addressing this! I let the function return a docstring,
> to avoid changing too much the initial code without a good reason to do so.

Thanks, no warnings.

My attempt to export was not successful, though:

Error: Couldn't export file
No information for exporting the format MS Excel Office Open XML.

Is this the wrong export? Do I need some external converter?

Kornel
Re: Tweaking lib/symbols for XML entities
Am Mon, 15 Jun 2020 00:57:06 +0200 schrieb Thibaut Cuvelier : > Dear list, > > Here is a new version of the patches with the following changes: > - Kornel's patches for tests/dummy_functions.cpp and tex2lyx/dummy_impl.cpp > - rebased on top of 57272e8 > <https://github.com/cburschka/lyx/commit/57272e837b148975817440bdc6a66b9935fa00a3> > - eliminated warnings in fontToTagDocBook (output_docbook), as signalled by > Pavel: it now returns a string, like fontToTag in output_xhtml, instead of > a reference to docstring > - eliminated warnings in InsetCounter and InsetLabel > It seems to work for me, when compiling LyX, tex2lyx, and check_layout > (indeed, as I'm using CMake, only LyX got compiled). I think no other issue > has been raised on that code (otherwise, just send a reminder :)). > > Thibaut Cuvelier > > > On Sun, 14 Jun 2020 at 23:35, Kornel Benko wrote: > > > Am Sun, 14 Jun 2020 23:07:50 +0200 > > schrieb Pavel Sanda : > > > > > On Sun, Jun 14, 2020 at 10:12:31PM +0200, Kornel Benko wrote: > > > > > Can you try to import my patch into dummy_functions.cpp ? > > > > > Pavel > > > > > > > > Works, if inserted also > > > > #include "xml.h" > > > > into src/tests/dummy_functions.cpp. > > > > > > Cool, can you post the patch for convenience? P > > > > Attached. > > > > Kornel > > -- > > lyx-devel mailing list > > lyx-devel@lists.lyx.org > > http://lists.lyx.org/mailman/listinfo/lyx-devel > > Compiles fine, with some warnings. ... /usr2/src/lyx/lyx-test/src/output_docbook.cpp:44:19: warning: ‘const string lyx::{anonymous}::fontToDocBookTag(lyx::xml::FontTypes)’ defined but not used [-Wunused-function] std::string const fontToDocBookTag(xml::FontTypes type) { ^~~~ ... 
/usr2/src/lyx/lyx-test/src/output_xhtml.cpp:52:34: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("em"); ^ /usr2/src/lyx/lyx-test/src/output_xhtml.cpp:54:33: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("b"); ^ /usr2/src/lyx/lyx-test/src/output_xhtml.cpp:56:35: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("dfn"); ^ /usr2/src/lyx/lyx-test/src/output_xhtml.cpp:60:33: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("u"); ^ /usr2/src/lyx/lyx-test/src/output_xhtml.cpp:63:35: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("del"); ^ /usr2/src/lyx/lyx-test/src/output_xhtml.cpp:65:33: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("i"); ^ /usr2/src/lyx/lyx-test/src/output_xhtml.cpp:84:36: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("span"); ^ /usr2/src/lyx/lyx-test/src/output_xhtml.cpp:87:22: warning: returning reference to temporary [-Wreturn-local-addr] return docstring(); ... Kornel pgpMprHK_zSlA.pgp Description: Digitale Signatur von OpenPGP -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
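For readers unfamiliar with the -Wreturn-local-addr warnings listed above, here is a minimal sketch of the problem and of two possible fixes. It uses std::string instead of docstring, and the enum `FontType` and both function names are hypothetical stand-ins, not the actual LyX declarations. Returning by value corresponds to the fix described earlier in the thread.

```cpp
#include <string>

// Hypothetical stand-in for an enumeration such as xml::FontTypes.
enum class FontType { Emph, Bold, Other };

// Problematic shape (kept as a comment because the result would dangle):
//   std::string const & fontToTag(FontType) { return std::string("em"); }
// The temporary is destroyed at the end of the return statement, so the
// caller receives a reference to a dead object.

// Fix 1: return by value. The result is moved or elided, never dangling.
std::string fontToTag(FontType type) {
    switch (type) {
    case FontType::Emph: return "em";
    case FontType::Bold: return "b";
    default:             return std::string();
    }
}

// Fix 2: keep the reference return type, but refer only to objects with
// static storage duration, which outlive every caller.
std::string const & fontToTagRef(FontType type) {
    static std::string const em = "em";
    static std::string const bold = "b";
    static std::string const none;
    switch (type) {
    case FontType::Emph: return em;
    case FontType::Bold: return bold;
    default:             return none;
    }
}
```

Returning by value is usually the simpler choice for short tag names; the static variant only pays off when callers really need a stable reference.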
Re: Tweaking lib/symbols for XML entities
Am Sun, 14 Jun 2020 23:07:50 +0200 schrieb Pavel Sanda : > On Sun, Jun 14, 2020 at 10:12:31PM +0200, Kornel Benko wrote: > > > Can you try to import my patch into dummy_functions.cpp ? > > > Pavel > > > > Works, if inserted also > > #include "xml.h" > > into src/tests/dummy_functions.cpp. > > Cool, can you post the patch for convenience? P Attached. Kornel diff --git a/src/tests/dummy_functions.cpp b/src/tests/dummy_functions.cpp index ca6edc38d0..dd68d7a022 100644 --- a/src/tests/dummy_functions.cpp +++ b/src/tests/dummy_functions.cpp @@ -3,10 +3,11 @@ #include "Format.h" #include "LayoutEnums.h" #include "LyXRC.h" #include "support/Messages.h" +#include "xml.h" using namespace std; namespace lyx { @@ -47,6 +48,11 @@ Formats & theFormats() string alignmentToCSS(LyXAlignment) { return string(); } +namespace xml { +docstring StartTag::writeTag() const { return docstring();}; +docstring StartTag::writeEndTag() const {return docstring();}; +bool StartTag::operator==(FontTag const & rhs) const {return rhs == *this;}; +} } // namespace lyx pgpvwjtdvm8dv.pgp Description: Digitale Signatur von OpenPGP -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
On Sun, Jun 14, 2020 at 10:12:31PM +0200, Kornel Benko wrote:
> > Can you try to import my patch into dummy_functions.cpp ?
> > Pavel
>
> Works, if inserted also
> #include "xml.h"
> into src/tests/dummy_functions.cpp.

Cool, can you post the patch for convenience?

P
Re: Tweaking lib/symbols for XML entities
Am Sun, 14 Jun 2020 21:37:12 +0200 schrieb Pavel Sanda : > On Sat, Jun 13, 2020 at 02:39:18PM +0200, Kornel Benko wrote: > > > Just checked, same error in cmake build for tex2lyx without Pavel's patch. > > > > > > With the patch it compiles. > > > > > > Kornel > > > > And as expected, breaks building of check_layout: > > Can you try to import my patch into dummy_functions.cpp ? > Pavel Works, if inserted also #include "xml.h" into src/tests/dummy_functions.cpp. (Without the include I got: /usr2/src/lyx/lyx-test/src/tests/dummy_functions.cpp:53:11: error: ‘StartTag’ has not been declared docstring StartTag::writeTag() const { return docstring();}; ^~~~ /usr2/src/lyx/lyx-test/src/tests/dummy_functions.cpp:53:32: error: non-member function ‘lyx::docstring lyx::xml::writeTag()’ cannot have cv-qualifier docstring StartTag::writeTag() const { return docstring();}; ^ ... ) Kornel pgp67siSOsjQV.pgp Description: Digitale Signatur von OpenPGP -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
On Sat, Jun 13, 2020 at 02:39:18PM +0200, Kornel Benko wrote:
> > Just checked, same error in cmake build for tex2lyx without Pavel's patch.
> >
> > With the patch it compiles.
> >
> > Kornel
>
> And as expected, breaks building of check_layout:

Can you try to import my patch into dummy_functions.cpp?

Pavel
Re: Tweaking lib/symbols for XML entities
On 13/06/2020 at 10:21, Kornel Benko wrote:
>> Or you can add ../xml.o to the link list. Does this work? I think this
>> is in general better than to use a hack.
>
> I do not see it as a hack. This is unused code in tex2lyx. And dummy*.cpp
> is exactly for this purpose there.

I know that; I just advocate first trying to use the real .o file. Pavel did try that, and it did not work, so the second-best thing is to use dummy definitions.

JMarc
Re: Tweaking lib/symbols for XML entities
Am Sat, 13 Jun 2020 14:36:34 +0200 schrieb Kornel Benko : > Am Sat, 13 Jun 2020 10:09:34 +0200 > schrieb Kornel Benko : > > > Am Fri, 12 Jun 2020 13:15:19 +0200 > > schrieb Pavel Sanda : > > > > > On Thu, Jun 11, 2020 at 06:30:22PM +0200, Thibaut Cuvelier wrote: > > > > Yes, the output is identical. I also added some comments. > > > > > > > > I'm adding a third patch in this, I forgot it was necessary??? It is a > > > > refactoring of XHTMLStream as a more generic XMLStream (required for > > > > DocBook, and it has a few implications for MathML, but not that much). > > > > > > > > > > The refactoring looks reasonable, unfortunately it breaks tex2lyx here. > > > Apart from harmless warning (which should be fixed though) linker > > > for tex2lyx screams as new dependencies appear. Most likely doesn't > > > appear for you because cmake links everything? > > > > The other possibility is that he does not create tex2lyx. > > Under linux-cmake you can use 'make lyx' alone. > > Just checked, same error in cmake build for tex2lyx without Pavel's patch. > > With the patch it compiles. 
> > Kornel And as expected, breaks building of check_layout: [ 97%] Linking CXX executable ../../bin/check_layout cd /BUILD/BUILDMint18/BuildBisectLyx/src/tests && /usr/bin/cmake -E cmake_link_script CMakeFiles/check_layout.dir/link.txt --verbose=1 /usr/bin/c++ -Wall -Wunused-parameter --std=c++14 -fno-strict-aliasing -O0 -g3 -D_DEBUG -rdynamic CMakeFiles/check_layout.dir/__/insets/InsetLayout.cpp.o CMakeFiles/check_layout.dir/__/CiteEnginesList.cpp.o CMakeFiles/check_layout.dir/__/Color.cpp.o CMakeFiles/check_layout.dir/__/Counters.cpp.o CMakeFiles/check_layout.dir/__/Floating.cpp.o CMakeFiles/check_layout.dir/__/FloatList.cpp.o CMakeFiles/check_layout.dir/__/FontInfo.cpp.o CMakeFiles/check_layout.dir/__/Layout.cpp.o CMakeFiles/check_layout.dir/__/LayoutFile.cpp.o CMakeFiles/check_layout.dir/__/Lexer.cpp.o CMakeFiles/check_layout.dir/__/ModuleList.cpp.o CMakeFiles/check_layout.dir/__/Spacing.cpp.o CMakeFiles/check_layout.dir/__/TextClass.cpp.o CMakeFiles/check_layout.dir/check_layout.cpp.o CMakeFiles/check_layout.dir/boost.cpp.o CMakeFiles/check_layout.dir/dummy_functions.cpp.o -o ../../bin/check_layout ../../lib/libsupport.a -lz -lmagic /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5.9.5 /usr/lib/x86_64-linux-gnu/libQt5Core.so.5.9.5 CMakeFiles/check_layout.dir/__/Layout.cpp.o: In Funktion »lyx::xml::StartTag::StartTag(std::__cxx11::basic_string, std::allocator > const&)«: /usr2/src/lyx/lyx-test/src/xml.h:164: Warnung: undefinierter Verweis auf »vtable for lyx::xml::StartTag« CMakeFiles/check_layout.dir/__/Layout.cpp.o: In Funktion »lyx::xml::StartTag::~StartTag()«: /usr2/src/lyx/lyx-test/src/xml.h:180: Warnung: undefinierter Verweis auf »vtable for lyx::xml::StartTag« collect2: error: ld returned 1 exit status src/tests/CMakeFiles/check_layout.dir/build.make:336: recipe for target 'bin/check_layout' failed make[2]: *** [bin/check_layout] Error 1 make[2]: Verzeichnis „/BUILD/BUILDMint18/BuildBisectLyx“ wird verlassen CMakeFiles/Makefile2:1413: recipe for target 
'src/tests/CMakeFiles/check_layout.dir/all' failed make[1]: *** [src/tests/CMakeFiles/check_layout.dir/all] Error 2 make[1]: Verzeichnis „/BUILD/BUILDMint18/BuildBisectLyx“ wird verlassen Makefile:185: recipe for target 'all' failed make: *** [all] Error 2 Kornel pgpuDnUTSwsFj.pgp Description: Digitale Signatur von OpenPGP -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
On Sat, 13 Jun 2020 10:09:34 +0200, Kornel Benko wrote:
> On Fri, 12 Jun 2020 13:15:19 +0200, Pavel Sanda wrote:
>
> > On Thu, Jun 11, 2020 at 06:30:22PM +0200, Thibaut Cuvelier wrote:
> > > Yes, the output is identical. I also added some comments.
> > >
> > > I'm adding a third patch in this, I forgot it was necessary... It is a
> > > refactoring of XHTMLStream as a more generic XMLStream (required for
> > > DocBook, and it has a few implications for MathML, but not that much).
> >
> > The refactoring looks reasonable, unfortunately it breaks tex2lyx here.
> > Apart from harmless warning (which should be fixed though) linker
> > for tex2lyx screams as new dependencies appear. Most likely doesn't
> > appear for you because cmake links everything?
>
> The other possibility is that he does not create tex2lyx.
> Under linux-cmake you can use 'make lyx' alone.

Just checked: same error in the cmake build for tex2lyx without Pavel's patch.

With the patch it compiles.

Kornel
Re: Tweaking lib/symbols for XML entities
On Sat, 13 Jun 2020 00:33:20 +0200, Jean-Marc Lasgouttes wrote:
> On 12/06/2020 at 23:28, Pavel Sanda wrote:
> > On Fri, Jun 12, 2020 at 01:15:19PM +0200, Pavel Sanda wrote:
> > For this part, the attached patch seem to help with autotools.

IMHO it would help on the cmake build too.

> Or you can add ../xml.o to the link list. Does this work? I think this
> is in general better than to use a hack.

I do not see it as a hack. This is unused code in tex2lyx, and dummy*.cpp is there for exactly this purpose.
The same could probably also apply to check_(layout|ExternalTransforms|Length|ListingsCaption) with dummy_functions.cpp (used in tests).

> It all depends whether xml.o
> brings some other cruft with it.
>
> JMarc

Kornel
Re: Tweaking lib/symbols for XML entities
On Fri, 12 Jun 2020 13:15:19 +0200, Pavel Sanda wrote:
> On Thu, Jun 11, 2020 at 06:30:22PM +0200, Thibaut Cuvelier wrote:
> > Yes, the output is identical. I also added some comments.
> >
> > I'm adding a third patch in this, I forgot it was necessary... It is a
> > refactoring of XHTMLStream as a more generic XMLStream (required for
> > DocBook, and it has a few implications for MathML, but not that much).
>
> The refactoring looks reasonable, unfortunately it breaks tex2lyx here.
> Apart from harmless warning (which should be fixed though) linker
> for tex2lyx screams as new dependencies appear. Most likely doesn't
> appear for you because cmake links everything?

The other possibility is that he does not build tex2lyx.
Under linux-cmake you can use 'make lyx' alone.

> I can look next week, but I spent all my avail lyx-time for today when trying
> to fix it.
>
> 0001 patch gives me:
> ...
> Pavel

Kornel
Re: Tweaking lib/symbols for XML entities
On Sat, Jun 13, 2020 at 12:33:20AM +0200, Jean-Marc Lasgouttes wrote:
> On 12/06/2020 at 23:28, Pavel Sanda wrote:
> > On Fri, Jun 12, 2020 at 01:15:19PM +0200, Pavel Sanda wrote:
> > For this part, the attached patch seem to help with autotools.
>
> Or you can add ../xml.o to the link list. Does this work? I think this is in

That was the first thing I tried, and it brings in a bunch of other dependencies...

Pavel
Re: Tweaking lib/symbols for XML entities
Le 12/06/2020 à 23:28, Pavel Sanda a écrit : On Fri, Jun 12, 2020 at 01:15:19PM +0200, Pavel Sanda wrote: For this part, the attached patch seem to help with autotools. Or you can add ../xml.o to the link list. Does this work? I think this is in general better than to use a hack. It all depends whether xml.o brings some other cruft with it. JMarc CXX dummy_impl.o CXXLDtex2lyx dummy_impl.o: In function `lyx::xml::StartTag::StartTag(std::__cxx11::basic_string, std::allocator > const&)': /home/lyx/devel/src/tex2lyx/../../src/xml.h:164: undefined reference to `vtable for lyx::xml::StartTag' dummy_impl.o: In function `lyx::xml::StartTag::~StartTag()': /home/lyx/devel/src/tex2lyx/../../src/xml.h:180: undefined reference to `vtable for lyx::xml::StartTag' dummy_impl.o: In function `lyx::xml::StartTag::~StartTag()': /home/lyx/devel/src/tex2lyx/../../src/xml.h:180: undefined reference to `vtable for lyx::xml::StartTag' ../Layout.o: In function `lyx::xml::StartTag::StartTag(std::__cxx11::basic_string, std::allocator > const&)': /home/lyx/devel/src/xml.h:164: undefined reference to `vtable for lyx::xml::StartTag' collect2: error: ld returned 1 exit status Makefile:630: recipe for target 'tex2lyx' failed make[4]: *** [tex2lyx] Error 1 Pavel -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
Re: Tweaking lib/symbols for XML entities
On Fri, Jun 12, 2020 at 01:15:19PM +0200, Pavel Sanda wrote: For this part, the attached patch seem to help with autotools. > CXX dummy_impl.o > CXXLDtex2lyx > dummy_impl.o: In function > `lyx::xml::StartTag::StartTag(std::__cxx11::basic_string std::char_traits, std::allocator > const&)': > /home/lyx/devel/src/tex2lyx/../../src/xml.h:164: undefined reference to > `vtable for lyx::xml::StartTag' > dummy_impl.o: In function `lyx::xml::StartTag::~StartTag()': > /home/lyx/devel/src/tex2lyx/../../src/xml.h:180: undefined reference to > `vtable for lyx::xml::StartTag' > dummy_impl.o: In function `lyx::xml::StartTag::~StartTag()': > /home/lyx/devel/src/tex2lyx/../../src/xml.h:180: undefined reference to > `vtable for lyx::xml::StartTag' > ../Layout.o: In function > `lyx::xml::StartTag::StartTag(std::__cxx11::basic_string std::char_traits, std::allocator > const&)': > /home/lyx/devel/src/xml.h:164: undefined reference to `vtable for > lyx::xml::StartTag' > collect2: error: ld returned 1 exit status > Makefile:630: recipe for target 'tex2lyx' failed > make[4]: *** [tex2lyx] Error 1 > > > Pavel diff --git a/src/tex2lyx/dummy_impl.cpp b/src/tex2lyx/dummy_impl.cpp index c850cd97c1..2dde13d348 100644 @@ -106,6 +106,11 @@ string alignmentToCSS(LyXAlignment) return string(); } +namespace xml { +docstring StartTag::writeTag() const { return docstring();}; +docstring StartTag::writeEndTag() const {return docstring();}; +bool StartTag::operator==(FontTag const & rhs) const {return rhs == *this;}; +} // // Keep the linker happy on Windows -- lyx-devel mailing list lyx-devel@lists.lyx.org http://lists.lyx.org/mailman/listinfo/lyx-devel
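Some background on why stub definitions silence the "undefined reference to `vtable for ...'" errors quoted above. With GCC, the vtable of a class is normally emitted in the translation unit that defines its first non-inline, non-pure virtual member (the "key function"). If the header provides inline constructors and destructors but the file defining the virtual members is not linked, every object file that constructs or destroys the class refers to a vtable nobody emitted. The miniature below is a hypothetical sketch; `Tag` and its members are illustrative names, and it is only an assumption that the real StartTag follows this pattern.

```cpp
#include <string>

// Hypothetical miniature of a header such as xml.h: the constructor and the
// destructor are inline, while the virtual members are only declared.
struct Tag {
    explicit Tag(std::string const & name) : name_(name) {}
    virtual ~Tag() {}
    virtual std::string writeTag() const;     // key function: first non-inline virtual
    virtual std::string writeEndTag() const;
    std::string name_;
};

// Stub definitions in the spirit of dummy_impl.cpp. Defining the key function
// makes this translation unit emit the vtable. Without these definitions, and
// without the real implementation file, linking any code that constructs a
// Tag fails with "undefined reference to `vtable for Tag'".
std::string Tag::writeTag() const { return std::string(); }
std::string Tag::writeEndTag() const { return std::string(); }
```

The trade-off discussed in the thread still applies: stubs keep a tool's dependency set small, whereas linking the real object file keeps behavior identical but may pull in further dependencies.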
Re: Tweaking lib/symbols for XML entities
On Thu, Jun 11, 2020 at 06:30:22PM +0200, Thibaut Cuvelier wrote: > Yes, the output is identical. I also added some comments. > > I'm adding a third patch in this, I forgot it was necessary??? It is a > refactoring of XHTMLStream as a more generic XMLStream (required for > DocBook, and it has a few implications for MathML, but not that much). The refactoring looks reasonable, unfortunately it breaks tex2lyx here. Apart from harmless warning (which should be fixed though) linker for tex2lyx screams as new dependencies appear. Most likely doesn't appear for you because cmake links everything? I can look next week, but I spent all my avail lyx-time for today when trying to fix it. 0001 patch gives me: output_docbook.cpp: In function 'const docstring& lyx::{anonymous}::fontToDocBookTag(lyx::xml::FontTypes)': output_docbook.cpp:48:40: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("emphasis"); ^ output_docbook.cpp:50:38: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("person"); ^ output_docbook.cpp:62:40: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("emphasis"); ^ output_docbook.cpp:64:36: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("code"); ^ output_docbook.cpp:77:40: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("emphasis"); ^ output_docbook.cpp:79:30: warning: returning reference to temporary [-Wreturn-local-addr] return docstring(); ^ output_docbook.cpp: At global scope: output_docbook.cpp:44:19: warning: 'const docstring& lyx::{anonymous}::fontToDocBookTag(lyx::xml::FontTypes)' defined but not used [-Wunused-function] docstring const & fontToDocBookTag(xml::FontTypes type) { ^~~~ CXX output_xhtml.o output_xhtml.cpp: In function 'const docstring& lyx::fontToHtmlTag(lyx::xml::FontTypes)': output_xhtml.cpp:52:34: warning: returning reference to temporary [-Wreturn-local-addr] return 
from_utf8("em"); ^ output_xhtml.cpp:54:33: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("b"); ^ output_xhtml.cpp:56:35: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("dfn"); ^ output_xhtml.cpp:60:33: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("u"); ^ output_xhtml.cpp:63:35: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("del"); ^ output_xhtml.cpp:65:33: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("i"); ^ output_xhtml.cpp:84:36: warning: returning reference to temporary [-Wreturn-local-addr] return from_utf8("span"); ^ output_xhtml.cpp:87:22: warning: returning reference to temporary [-Wreturn-local-addr] return docstring(); ^ insets/InsetCounter.cpp: In member function 'virtual lyx::docstring lyx::InsetCounter::xhtml(lyx::XMLStream&, const lyx::OutputParams&) const': insets/InsetCounter.cpp:197:43: warning: unused parameter 'xs' [-Wunused-parameter] docstring InsetCounter::xhtml(XMLStream & xs, OutputParams const &) const insets/InsetLabel.cpp: In member function 'virtual int lyx::InsetLabel::docbook(lyx::odocstream&, const lyx::OutputParams&) const': insets/InsetLabel.cpp:354:63: warning: unused parameter 'runparams' [-Wunused-parameter] int InsetLabel::docbook(odocstream & os, OutputParams const & runparams) const CXX dummy_impl.o CXXLDtex2lyx dummy_impl.o: In function `lyx::xml::StartTag::StartTag(std::__cxx11::basic_string, std::allocator > const&)': /home/lyx/devel/src/tex2lyx/../../src/xml.h:164: undefined reference to `vtable for lyx::xml::StartTag' dummy_impl.o: In function `lyx::xml::StartTag::~StartTag()': /home/lyx/devel/src/tex2lyx/../../src/xml.h:180: undefined reference to `vtable for lyx::xml::StartTag' dummy_impl.o: In function `lyx::xml::StartTag::~StartTag()': /home/lyx/devel/src/tex2lyx/../../src/xml.h:180: undefined reference to `vtable for lyx::xml::StartTag' 
../Layout.o: In function `lyx::xml::StartTag::StartTag(std::__cxx11::b
Re: Tweaking lib/symbols for XML entities
On Tue, Jun 09, 2020 at 07:02:23PM +0200, Thibaut Cuvelier wrote:
> This patch is made so that there should not be any change to the XHTML
> output. All features are opt-in (name spaces and XML entities). XHTML
> output always uses HTML entities (which are very lenient), XML ones should
> never be used once this patch is applied.
>
> That nbsp shouldn't have been modified in the first patch. Here is a new
> version of the set of patches so that the change is only applied in XML
> mode (again, which should not be used right now, because it will only be
> useful for DocBook), with no change in the first one.

Thanks, the patch still does not apply to current master.

Another two bits:
1. Error check: if you export the Math manual to HTML with and without your patch, the output is identical, right?
2. Please add comments to those new members (I know the others don't have them either, but that does not make it right :)

+ std::string xmlns_;
+ ///
+ bool xml_mode_;

Pavel
Re: Tweaking lib/symbols for XML entities
On Wed, May 20, 2020 at 03:02:21AM +0200, Thibaut Cuvelier wrote:
> Hi all,
>
> Again, a new version of the second patch, as I find new things that should
> be adapted. Is it good enough to be merged?

I skimmed through the patches. Given the obsolete status of DocBook export, I do not care that much about the new XML part, but I wonder about the changes to the current XHTML export.

The first patch looked more or less harmless. The second seems mostly to be a huge amount of work to add the new xml column to symbols.

- Could you clarify whether, and how many, output changes there will be in the XHTML output? (It's hard to tell from the diffs whether you just add new xml values or whether some HTML tags were changed as well.)
- There are many "string" -> from_ascii(ms.namespacedTag("mo")) conversions. Do these have an impact on the XHTML output?
- I saw hardcoded -> unicode char change. Is this related to the XHTML output as well? (I wonder whether that generally works with more alternative browsers like lynx.)

I couldn't test, as the patches couldn't be applied (either master moved or you diffed against a slightly different tree).

Apart from these small bits, as a whole it seems to me we could merge it. Riki, do you have an opinion?

Pavel
Re: Tweaking lib/symbols for XML entities
On Thu, 14 May 2020 at 20:40, Guenter Milde wrote:
> Dear Thibaut,
>
> On 2020-05-10, Thibaut Cuvelier wrote:
>
> > In order to ensure a valid DocBook entity with math formulae, the MathML
> > generator must produce valid XML. Right now, it "only" produces valid HTML
> > (which is already quite an achievement!). The difference is in the
> > entities: in HTML, you can use many entities, like . This is no more
> > the case in XML, where you have to define all entities (that is, besides
> > , , , , ). A solution for DocBook would be to
> > define the needed entities in the XML document, but that would require
> > generating all math formulas, remembering the needed entities, then output
> > the mapping at the *beginning* of the XML document.
>
> Why don't you simply use Unicode literal characters?

Many of these characters require more than one byte in UTF-8; is it ensured that they are read correctly? Right now, the output is exclusively ASCII, so no questions asked. Furthermore, XML processors are more used to entities than to exotic characters; similarly, MathType uses this kind of entity when exporting MathML. When users display the XML document, they must use a font that has all the needed characters, which is far from guaranteed.

> Günter
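One middle ground between named HTML entities and literal Unicode characters is the numeric character reference. XML predefines only five named entities (`&lt;`, `&gt;`, `&amp;`, `&apos;`, `&quot;`), but a numeric reference such as `&#x3BB;` is valid in any XML document without a DTD and keeps the output pure ASCII. The sketch below only illustrates that idea; `numericCharRef` is a hypothetical helper, and nothing here claims that the patches under discussion work this way.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a Unicode code point as an XML numeric character reference,
// e.g. U+03BB (Greek small letter lambda) becomes "&#x3BB;".
std::string numericCharRef(std::uint32_t codepoint) {
    // "&#x" + at most 8 hex digits + ";" + terminating NUL fits in 16 bytes.
    char buf[16];
    std::snprintf(buf, sizeof buf, "&#x%lX;",
                  static_cast<unsigned long>(codepoint));
    return buf;
}
```

A numeric reference avoids both concerns raised above about encoding and entity declarations, at the cost of being less readable in the raw XML than a named entity.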
Re: Tweaking lib/symbols for XML entities
Dear Thibaut,

On 2020-05-10, Thibaut Cuvelier wrote:

> In order to ensure a valid DocBook entity with math formulae, the MathML
> generator must produce valid XML. Right now, it "only" produces valid HTML
> (which is already quite an achievement!). The difference is in the
> entities: in HTML, you can use many entities, like . This is no more
> the case in XML, where you have to define all entities (that is, besides
> , , , , ). A solution for DocBook would be to
> define the needed entities in the XML document, but that would require
> generating all math formulas, remembering the needed entities, then output
> the mapping at the *beginning* of the XML document.

Why don't you simply use Unicode literal characters?

Günter
Re: Tweaking lib/symbols for XML entities
On Wed, May 13, 2020 at 01:44:43AM +0200, Thibaut Cuvelier wrote:
> I am attaching a new version of the patch that does this.

Can we get this line in symbols right?

#symbolfont charid charid-in-fallback-Xsymbol-font

I hope my understanding is correct that with your patch the structure now is:

#symbolfont charid charid-in-fallback-Xsymbol-font math-class xHTML XML

Pavel
Re: Tweaking lib/symbols for XML entities
On 5/9/20 9:25 PM, Thibaut Cuvelier wrote:
> Dear list,
>
> In order to ensure a valid DocBook document with math formulae, the
> MathML generator must produce valid XML. Right now, it "only" produces
> valid HTML (which is already quite an achievement!). The difference is
> in the entities: in HTML, you can use many named entities. This is no
> longer the case in XML, where you have to define all entities (that is,
> besides &amp;, &lt;, &gt;, &quot;, &apos;). A solution for DocBook would
> be to define the needed entities in the XML document, but that would
> require generating all math formulas, remembering the needed entities,
> then outputting the mapping at the /beginning/ of the XML document.

We do this kind of thing already: the validate() routines collect various information that needs to be output to the document preamble. For LaTeX, for example, we need to know whether to load various packages, so the various insets tell us what they require. Whether that's the right way to proceed here is not clear. You'd have, in effect, to construct the XML and note which entities were used, and then construct it again for actual output. But it certainly could be done.

> There are mostly two places where these entities are hard-coded in
> LyX: InsetMathDecoration, with only a few entities hard-coded in
> source code;

I should move those to lib/symbols!

> lib/symbols, a much harder thing to change.
>
> Here is what I came up with:
> https://gitlab.com/gadmm/lyx-unstable/-/merge_requests/3/diffs?commit_id=0c0fc7624caad400f22072442f9132291ee3036d#e90e8f11b4a89e64b3c9958e7af650b2f526
> It adds a parameter to MathStream to enable outputting XML-valid
> entities. Mappings for InsetMathDecoration are done by slightly
> adapting the data structure. However, for the other entities, I
> hard-coded a mapping in InsetMathSymbol (hundreds of entities…),
> because I could not get my head around lib/symbols. (By the way, in
> this file, are the "x" mappings symbols that are not yet allowed in
> output?)

Yes, the x just means that we don't have anything (at the moment) we can use for output.

> Would the patch be acceptable as-is?
> Otherwise, could a lib/symbols expert (I've heard that there might be
> one roaming around) help me with this? As I understand it, it would be
> adding a new column in this file to propose an XML entity after the
> HTML one.

It probably would be better to do this in lib/symbols, since otherwise we have this same kind of information spread out in different places. It probably wouldn't be that hard to change it. It is read by initSymbols in MathFactory.cpp. All the 'character' lines would need an extra column, and this bit of code:

    is >> charid >> fallbackid >> tmp.extra >> tmp.xmlname;

would need to be adapted to read it, with the latexkeys class (in MathParser.h) picking up an extra member. (That might as well just be a struct.)

> I also attach two patches for MathStream: the second one is my current
> attempt at implementing XML entities; the first one is about adding
> XML-name-spaces support (and not really related to the question above,
> but the second one relies on it to avoid conflicts when merging).

Am I right that the first patch, as it is, just allows for namespaces and doesn't actually use them? It's pretty long but seems to be straightforward, really.

Riki
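
To make the "extra column" idea concrete, here is a minimal sketch of reading one whitespace-separated 'character' line with a trailing XML column. It is not the code from initSymbols: the struct `SymbolEntry`, the function `parseSymbolLine`, the column order, and the sample values are assumptions chosen for illustration, and the "x" default only mirrors the "nothing available for output" convention described above.

```cpp
#include <sstream>
#include <string>

// Hypothetical record for one 'character' line (not LyX's latexkeys class).
struct SymbolEntry {
	std::string name;
	std::string font;
	int charid = 0;
	int fallbackid = 0;
	std::string mathClass;
	std::string htmlName = "x"; // "x": nothing usable for HTML output
	std::string xmlName = "x";  // "x": nothing usable for XML output
};

// Parse "name font charid fallbackid class [html [xml]]".
// Returns false if one of the five mandatory columns is missing or malformed.
bool parseSymbolLine(std::string const & line, SymbolEntry & entry)
{
	std::istringstream is(line);
	if (!(is >> entry.name >> entry.font >> entry.charid
	         >> entry.fallbackid >> entry.mathClass))
		return false;
	// The two trailing columns are optional in this sketch, so that lines
	// written before the new column existed would still be accepted.
	if (!(is >> entry.htmlName)) {
		entry.htmlName = "x";
		entry.xmlName = "x";
		return true;
	}
	if (!(is >> entry.xmlName))
		entry.xmlName = "x";
	return true;
}
```

Keeping the new column optional is a design choice of the sketch; the real file format could just as well make it mandatory and update every line at once.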
Tweaking lib/symbols for XML entities
Dear list,

In order to ensure a valid DocBook document with math formulae, the MathML generator must produce valid XML. Right now, it "only" produces valid HTML (which is already quite an achievement!). The difference is in the entities: in HTML, you can use many named entities. This is no longer the case in XML, where you have to define all entities (that is, besides &amp;, &lt;, &gt;, &quot;, &apos;). A solution for DocBook would be to define the needed entities in the XML document, but that would require generating all math formulas, remembering the needed entities, then outputting the mapping at the *beginning* of the XML document.

There are mostly two places where these entities are hard-coded in LyX:
- InsetMathDecoration, with only a few entities hard-coded in source code;
- lib/symbols, a much harder thing to change.

Here is what I came up with:
https://gitlab.com/gadmm/lyx-unstable/-/merge_requests/3/diffs?commit_id=0c0fc7624caad400f22072442f9132291ee3036d#e90e8f11b4a89e64b3c9958e7af650b2f526
It adds a parameter to MathStream to enable outputting XML-valid entities. Mappings for InsetMathDecoration are done by slightly adapting the data structure. However, for the other entities, I hard-coded a mapping in InsetMathSymbol (hundreds of entities…), because I could not get my head around lib/symbols. (By the way, in this file, are the "x" mappings symbols that are not yet allowed in output?)

Would the patch be acceptable as-is? Otherwise, could a lib/symbols expert (I've heard that there might be one roaming around) help me with this? As I understand it, it would mean adding a new column in this file to propose an XML entity after the HTML one.

I also attach two patches for MathStream: the second one is my current attempt at implementing XML entities; the first one is about adding XML-name-spaces support (and not really related to the question above, but the second one relies on it to avoid conflicts when merging). Only the second one has been reviewed by Guillaume.

Of course, if these patches look OK to you, I'd be happy to see them merged! It's unlikely I'll need to change anything there in the near future.

Kind regards,
Thibaut Cuvelier

Attachments:
0005-MathML-stream-allows-for-name-spaces.patch
0016-Convert-HTML-entities-to-XML-entities.patch
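
As a rough illustration of the "define the needed entities in the XML document" option: once the set of used entity names is known, the declarations can be emitted as an internal DTD subset ahead of the root element. This is only a sketch under assumptions; `entityDoctype`, the root element name, and the name-to-replacement table are hypothetical and are not taken from the attached patches.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: build a DOCTYPE with an internal subset declaring
// every entity that was actually used while generating the formulas.
// 'definitions' maps an entity name to its replacement text, here a
// numeric character reference.
std::string entityDoctype(std::string const & root,
                          std::set<std::string> const & used,
                          std::map<std::string, std::string> const & definitions)
{
	std::ostringstream out;
	out << "<!DOCTYPE " << root << " [\n";
	for (std::string const & name : used) {
		auto const it = definitions.find(name);
		if (it == definitions.end())
			continue; // unknown name: silently skipped in this sketch
		out << "  <!ENTITY " << name << " \"" << it->second << "\">\n";
	}
	out << "]>\n";
	return out.str();
}
```

The catch mentioned in the message remains: the set of used names is only complete after all formulas have been generated, so the body has to be buffered or produced twice before this header can be written.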
Re: LyX<-->word --> Latex to XML
On Tuesday, 23 April 2019 22.23.41 WEST Nico Williams wrote:
> FWIW, I've done some LyX->LyXHTML->XML conversions, as well as
> LyX->XML via Python (to fix text styling open/close mismatching) and
> then XSLT.
>
> You can find that work here: https://github.com/nicowilliams/lyx2rfc
>
> It's a bit old and rotted, but it illustrates something like
> LaTeX->XML, but for LyX in particular.

Hi Nico,
thank you for the interesting example. It is nice to see a real use case of DocBook in LyX. :-)

Best regards,
--
José Abílio
Re: Exporting ePub / XML: always SIGSEV
On 26/04/2019 at 10:52, Stephan Witt wrote:
> If I try to interpret it, I can see that line (6) is the last place where
> our code is involved before the crash. Here the method named
> lyx::frontend::Action::action is called, and this method does nearly
> nothing except calling some configured method indirectly. A crash here
> means the internal memory structures of the running LyX got corrupted at
> some earlier point in time. Unfortunately these things happen unnoticed
> at the time the code to blame is executed. Probably some memory is
> passed back to the OS too early.

Stephan, you can try to run under valgrind.

JMarc
Re: Exporting ePub / XML: always SIGSEV
> 8
> ( 20) 20 CoreFoundation 0x7fff30050dd1 __CFRunLoopDoSources0 : 20 CoreFoundation 0x7fff30050dd1 __CFRunLoopDoSources0 + 195
> ( 21) 21 CoreFoundation 0x7fff3005037a __CFRunLoopRun : 21 CoreFoundation 0x7fff3005037a __CFRunLoopRun + 1219
> ( 22) 22 CoreFoundation 0x7fff3004fc64 CFRunLoopRunSpecific : 22 CoreFoundation 0x7fff3004fc64 CFRunLoopRunSpecific + 463
> ( 23) 23 HIToolbox 0x7fff2f2e6ab5 RunCurrentEventLoopInMode : 23 HIToolbox 0x7fff2f2e6ab5 RunCurrentEventLoopInMode + 293
> ( 24) 24 HIToolbox 0x7fff2f2e66f4 ReceiveNextEventCommon : 24 HIToolbox 0x7fff2f2e66f4 ReceiveNextEventCommon + 371
> ( 25) 25 HIToolbox 0x7fff2f2e6568 _BlockUntilNextEventMatchingListInModeWithFilter : 25 HIToolbox 0x7fff2f2e6568 _BlockUntilNextEventMatchingListInModeWithFilter + 64
> ( 26) 26 AppKit 0x7fff2d5a1363 _DPSNextEvent : 26 AppKit 0x7fff2d5a1363 _DPSNextEvent + 997
> ( 27) 27 AppKit 0x7fff2d5a0102 -[NSApplication: 27 AppKit 0x7fff2d5a0102 -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 1362
> ( 28) 28 AppKit 0x7fff2d59a165 -[NSApplication run] : 28 AppKit 0x7fff2d59a165 -[NSApplication run] + 699
> ( 29) 29 libqcocoa.dylib 0x00010925f73d _ZN21QCocoaEventDispatcher13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE : 29 libqcocoa.dylib 0x00010925f73d _ZN21QCocoaEventDispatcher13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE + 2445
> ( 30) 30 QtCore 0x0001067536e1 _ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE : 30 QtCore 0x0001067536e1 _ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE + 417
> ( 31) 31 QtCore 0x000106758288 _ZN16QCoreApplication4execEv : 31 QtCore 0x000106758288 _ZN16QCoreApplication4execEv + 392
> ( 32) 32 lyx 0x0001053a2841 _ZN3lyx3LyX4execERiPPc : 32 lyx 0x0001053a2841 _ZN3lyx3LyX4execERiPPc + 1009
> ( 33) 33 lyx 0x000105284d1f main : 33 lyx 0x000105284d1f main + 79
> ( 34) 34 libdyld.dylib 0x7fff5d29ded9 start : 34 libdyld.dylib 0x7fff5d29ded9 start + 1
>
> On 25 Apr 2019, at 22:50 +0200, Stephan Witt wrote:
>> On 25.04.2019 at 22:14, jezZiFeR wrote:
>>>
>>> Dear Stephan,
>>>
>>> with the tutorial it seems that I also do not get a SIGSEGV in some
>>> cases, but I also get it when I go via: File – Export – Export As…
>>>
>>> When I tried this for .html I also got this message: »Kann keinen
>>> LaTeX-Befehl für das Zeichen '⌃' (Code-Punkt 0x2303) finden.
>>> Einige Zeichen Ihres Dokuments sind mit der gewählten Kodierung
>>> wahrscheinlich nicht darstellbar.
>>> Eine Änderung der Dokumentkodierung auf 'utf8' könnte helfen.«
>>>
>>> This means something like: there is no LaTeX command for »^«, and it
>>> is suggested to change the document's encoding to UTF-8, which is not
>>> possible for the tutorial file.
>>
>> Yes, this is a problem with the tutorial in LyX 2.3.2 and will be fixed
>> in 2.3.3.
>>
>>> Maybe it is interesting that the dialogs of two different documents
>>> (mine and the German tutorial) offer different export possibilities,
>>> and the tutorial has many more. I add two screenshots to make it
>>> clearer.
>>
>> Hmmm, the export possibilities effectively depend on the existing
>> converters. The longer list doesn't necessarily mean more
>> possibilities… I don't know offhand how the lists are constructed and
>> why they contain useless entries.
>>
>>> Excuse me, what is the »stack trace«? Should I send you the details of
>>> the SIGSEGV?
>>
>> Yes, please.
>>
>> Best regards,
>> Stephan
>>
>>> I will continue trying some more documents tomorrow.
>>>
>>> Thank you, all best
>>> Jess
>>>
>>> On 25 Apr 2019, at 21:52 +0200, Stephan Witt wrote:
>>>> On 25.04.2019 at 18:56, jezZiFeR wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I have been trying to export ePubs for a while now, and it has never
>>>>> worked in different configurations. LyX was always freezing. At the
>>>>> moment I use OS 10.14.3, Intel, and LyX 2.3.2 with TeXLive 2018.
>>>>>
>>>>> When I do the following:
>>>>> + File – Export As – DocBook (XML) and then try to save, I get the
>>>>> following error:
>>>>> »Keine Informationen vorhanden, um das Format DocBook (XML) zu
>>>>> exportieren.«
>>>>> In English this means something like »There is no information to
>>>>> export the format DocBook (XML).«
>>>>> + Now I click »cancel« and get a SIGSEGV signal, like every time.
>>>>>
>>>>> Does anybody have a hint what I could do? It would be very good to
>>>>> use XML; maybe there is a workaround.
>>>>>
>>>>> Thanks, all best
>>>>> Jess
>>>>
>>>> Hello Jess,
>>>>
>>>> I'm on Mac OS 10.14.4 with LyX 2.3.2. I'm unable to export to DocBook
>>>> too. But, at least with the Tutorial, I don't get a crash on Cancel.
>>>>
>>>> Can you send me the stack trace please and tell me if it happens with
>>>> every LyX document or with some of your documents only?
>>>>
>>>> Best regards,
>>>> Stephan
Re: LyX<-->word --> Latex to XML
FWIW, I've done some LyX->LyXHTML->XML conversions, as well as LyX->XML via Python (to fix text styling open/close mismatching) and then XSLT. You can find that work here: https://github.com/nicowilliams/lyx2rfc It's a bit old and rotted, but it illustrates something like LaTeX->XML, but for LyX in particular.
Re: public identifier for DocBook XML export + unescaped '&' in
On 01/08/2017 05:30 PM, Martin A. Brown wrote:
> Hello,
>
>>> Yes, I can, and yes, it works. I have attached the patch. I
>>> have never touched C++ before, so this is just the dumbest thing
>>> I could suggest, though it seems to do the trick.
>> Thanks, I've committed these. They're sufficiently minor that we
>> don't really NEED a license agreement, but if you'd like to be
>> added to the LyX credits (especially if we're going to have you
>> help us clean up the DocBook export), just send a message to this
>> list saying something like: "I hereby license my contributions to
>> LyX under the General Public License, version 2 or any later
>> version."
> For my contributions to LyX, I hereby license my contributions to
> LyX under the General Public License, version 2 or any later
> version.

I've added you to the credits.

Richard
Re: public identifier for DocBook XML export + unescaped '&' in
On Sun, Jan 08, 2017 at 02:30:09PM -0800, Martin A. Brown wrote:
>
> Hello,
>
> >> Yes, I can, and yes, it works. I have attached the patch. I
> >> have never touched C++ before, so this is just the dumbest thing
> >> I could suggest, though it seems to do the trick.
> >
> >Thanks, I've committed these. They're sufficiently minor that we
> >don't really NEED a license agreement, but if you'd like to be
> >added to the LyX credits (especially if we're going to have you
> >help us clean up the DocBook export), just send a message to this
> >list saying something like: "I hereby license my contributions to
> >LyX under the General Public License, version 2 or any later
> >version."
>
> For my contributions to LyX, I hereby license my contributions to
> LyX under the General Public License, version 2 or any later
> version.
>
> -Martin
>
> P.S. Thanks for the direction on that, Richard.

Thanks for your patch, Martin. DocBook in LyX hasn't received much love lately. If you're curious about trying to solve other LyX + DocBook issues, take a look at the following open issues:

http://www.lyx.org/trac/query?status=!closed=docbook+export

Scott
Re: public identifier for DocBook XML export + unescaped '&' in
Hello,

>> Yes, I can, and yes, it works. I have attached the patch. I
>> have never touched C++ before, so this is just the dumbest thing
>> I could suggest, though it seems to do the trick.
>
>Thanks, I've committed these. They're sufficiently minor that we
>don't really NEED a license agreement, but if you'd like to be
>added to the LyX credits (especially if we're going to have you
>help us clean up the DocBook export), just send a message to this
>list saying something like: "I hereby license my contributions to
>LyX under the General Public License, version 2 or any later
>version."

For my contributions to LyX, I hereby license my contributions to LyX under the General Public License, version 2 or any later version.

-Martin

P.S. Thanks for the direction on that, Richard.

--
Martin A. Brown
http://linux-ip.net/
Re: public identifier for DocBook XML export + unescaped '&' in
On 01/07/2017 06:55 PM, Martin A. Brown wrote:
> Richard,
>
>>> I have used LyX on and off for many years and, working
>>> sporadically with TLDP [0], I have handled a few documents that
>>> were written in LyX. Thank you to the LyX team for your work on
>>> this tool over the years.
>>>
>>> I have two questions today, after examining the DocBook XML
>>> output from the 2.2.x series.
>>>
>>> Question 1
>>> ----------
>>> Is it possible to change the public identifier for the DocBook
>>> XML 4.2 output processor to use:
>>>
>>>   -//OASIS//DTD DocBook XML V4.2//EN   # -- my suggestion
>>>   -//OASIS//DTD DocBook XML//EN        # -- current identifier [1]
>>>
>>> I have checked the XML catalogs on several different platforms
>>> and I cannot find a reference to the latter identifier, and I
>>> think it may simply be an oversight. The system identifier (the
>>> URL [2]) is correct.
>> This code goes way, way back to 2004. It seems to have been
>> introduced at 33243f700. It appears that the one without "V4.2" was
>> meant to be for XML, whereas the other one was meant to be for
>> SGML. It's easy enough to change it so we output the same thing
>> both times, but let me cc José and see if he has any thoughts.
> OK, great! And thank you for the quick reply!
>
> The SGML public identifier (on line 2032) is correct.
>
>   -//OASIS//DTD DocBook V4.2//EN       # -- for DocBook SGML at V4.2
>   -//OASIS//DTD DocBook XML V4.2//EN   # -- for DocBook XML at V4.2
>
>>> Question 2
>>> ----------
>>> When running the DocBook XML export function, I discover that not
>>> all text with '&' is getting properly escaped with the XML
>>> entity &amp;. There's clearly code to handle that:
>>>
>>>   http://www.lyx.org/trac/browser/lyxgit/src/sgml.cpp#L46
>>>
>>> To the best of my ability I traced down a case of a Hyperlink
>>> whose text is not properly XML-escaped. I think this is the
>>> line, but I'm not certain:
>>>
>>>   http://www.lyx.org/trac/browser/lyxgit/src/insets/InsetHyperlink.cpp#L235
>>>
>>> int InsetHyperlink::docbook(odocstream & os, OutputParams const &) const
>>> {
>>>     os << "<ulink url=\""
>>>        << subst(getParam("target"), from_ascii("&"),
>>>                 from_ascii("&amp;"))
>>>        << "\">"
>>>        << getParam("name")
>>>        << "</ulink>";
>>>     return 0;
>>> }
>>>
>>> I think that getParam("name") also needs to be run through
>>> sgml::escapeString.
>> Yes, that seems right. Since you have the git repo, can you make
>> this change and test it? I'm not sure anyone on the development
>> team actually uses the docbook classes.
> Yes, I can, and yes, it works. I have attached the patch. I have never
> touched C++ before, so this is just the dumbest thing I could suggest, though
> it seems to do the trick.

Thanks, I've committed these. They're sufficiently minor that we don't really NEED a license agreement, but if you'd like to be added to the LyX credits (especially if we're going to have you help us clean up the DocBook export), just send a message to this list saying something like: "I hereby license my contributions to LyX under the General Public License, version 2 or any later version."

Richard
Re: public identifier for DocBook XML export + unescaped '&' in
>> I have two questions today, after examining the DocBook XML output
>> from the 2.2.x series.
>
>You might want to read http://www.lyx.org/trac/ticket/7009

I see.

>I believe if someone competent told us (hint) what we are supposed
>to output, the patch to port docbook for something up-to-date would
>be actually pretty small.

Well, I think the only tricky bit for generating valid DocBook XML from LyX has been the bits and author specification. While I'm very comfortable with DocBook, I confess to being little more than a sometime user of LyX. I may be able to help and provide some suggestions, but if I can produce valid DocBook XML 4.2 with the export tool, that's great with me!

I think there would be a bit of work to switch the output routines to produce DocBook in the 5.x series, but I have only stumbled across the two (minor) issues as a result of a recent resubmission of documents to TLDP.

I cannot claim significant experience with DocBook 5.x. I have a bit more with DocBook 4.x and would be happy to advise if there are pending questions. I will trawl the LyX ticket queue.

Thank you again to all contributors for the LyX tool,

-Martin

--
Martin A. Brown
http://linux-ip.net/
Re: public identifier for DocBook XML export + unescaped '&' in
Richard,

>> I have used LyX on and off for many years and, working
>> sporadically with TLDP [0], I have handled a few documents that
>> were written in LyX. Thank you to the LyX team for your work on
>> this tool over the years.
>>
>> I have two questions today, after examining the DocBook XML
>> output from the 2.2.x series.
>>
>> Question 1
>> ----------
>> Is it possible to change the public identifier for the DocBook
>> XML 4.2 output processor to use:
>>
>>   -//OASIS//DTD DocBook XML V4.2//EN   # -- my suggestion
>>   -//OASIS//DTD DocBook XML//EN        # -- current identifier [1]
>>
>> I have checked the XML catalogs on several different platforms
>> and I cannot find a reference to the latter identifier, and I
>> think it may simply be an oversight. The system identifier (the
>> URL [2]) is correct.
>
>This code goes way, way back to 2004. It seems to have been
>introduced at 33243f700. It appears that the one without "V4.2" was
>meant to be for XML, whereas the other one was meant to be for
>SGML. It's easy enough to change it so we output the same thing
>both times, but let me cc José and see if he has any thoughts.

OK, great! And thank you for the quick reply!

The SGML public identifier (on line 2032) is correct.

  -//OASIS//DTD DocBook V4.2//EN       # -- for DocBook SGML at V4.2
  -//OASIS//DTD DocBook XML V4.2//EN   # -- for DocBook XML at V4.2

>> Question 2
>> ----------
>> When running the DocBook XML export function, I discover that not
>> all text with '&' is getting properly escaped with the XML
>> entity &amp;. There's clearly code to handle that:
>>
>>   http://www.lyx.org/trac/browser/lyxgit/src/sgml.cpp#L46
>>
>> To the best of my ability I traced down a case of a Hyperlink
>> whose text is not properly XML-escaped. I think this is the
>> line, but I'm not certain:
>>
>>   http://www.lyx.org/trac/browser/lyxgit/src/insets/InsetHyperlink.cpp#L235
>>
>> int InsetHyperlink::docbook(odocstream & os, OutputParams const &) const
>> {
>>     os << "<ulink url=\""
>>        << subst(getParam("target"), from_ascii("&"),
>>                 from_ascii("&amp;"))
>>        << "\">"
>>        << getParam("name")
>>        << "</ulink>";
>>     return 0;
>> }
>>
>> I think that getParam("name") also needs to be run through
>> sgml::escapeString.
>
>Yes, that seems right. Since you have the git repo, can you make
>this change and test it? I'm not sure anyone on the development
>team actually uses the docbook classes.

Yes, I can, and yes, it works. I have attached the patch. I have never touched C++ before, so this is just the dumbest thing I could suggest, though it seems to do the trick.

Best regards,

-Martin

--
Martin A. Brown
http://linux-ip.net/

diff --git a/src/Buffer.cpp b/src/Buffer.cpp
index 2b2660e..94e94fa 100644
--- a/src/Buffer.cpp
+++ b/src/Buffer.cpp
@@ -2026,7 +2026,7 @@ void Buffer::writeDocBookSource(odocstream & os, string const & fname,
 		if (! tclass.class_header().empty())
 			os << from_ascii(tclass.class_header());
 		else if (runparams.flavor == OutputParams::XML)
-			os << "PUBLIC \"-//OASIS//DTD DocBook XML//EN\" "
+			os << "PUBLIC \"-//OASIS//DTD DocBook XML V4.2//EN\" "
 			    << "\"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd\"";
 		else
 			os << " PUBLIC \"-//OASIS//DTD DocBook V4.2//EN\"";
diff --git a/src/insets/InsetHyperlink.cpp b/src/insets/InsetHyperlink.cpp
index 54f1f2c..039f553 100644
--- a/src/insets/InsetHyperlink.cpp
+++ b/src/insets/InsetHyperlink.cpp
@@ -22,6 +22,7 @@
 #include "LaTeXFeatures.h"
 #include "OutputParams.h"
 #include "output_xhtml.h"
+#include "sgml.h"
 
 #include "support/docstream.h"
 #include "support/FileName.h"
@@ -232,7 +233,7 @@ int InsetHyperlink::docbook(odocstream & os, OutputParams const &) const
 	os << "<ulink url=\""
 	   << subst(getParam("target"), from_ascii("&"), from_ascii("&amp;"))
 	   << "\">"
-	   << getParam("name")
+	   << sgml::escapeString(getParam("name"))
 	   << "</ulink>";
 	return 0;
 }
Re: public identifier for DocBook XML export + unescaped '&' in
Martin A. Brown wrote:
> I have two questions today, after examining the DocBook XML output
> from the 2.2.x series.

You might want to read http://www.lyx.org/trac/ticket/7009

I believe that if someone competent told us (hint) what we are supposed to output, the patch to port DocBook to something up-to-date would actually be pretty small.

Pavel
Re: public identifier for DocBook XML export + unescaped '&' in
On 01/07/2017 12:46 PM, Martin A. Brown wrote:
> Hello all,
>
> I have used LyX on and off for many years and, working sporadically with TLDP
> [0], I have handled a few documents that were written in LyX. Thank you to
> the LyX team for your work on this tool over the years.
>
> I have two questions today, after examining the DocBook XML output from the
> 2.2.x series.
>
> Question 1
> ----------
> Is it possible to change the public identifier for the DocBook XML 4.2 output
> processor to use:
>
>   -//OASIS//DTD DocBook XML V4.2//EN   # -- my suggestion
>   -//OASIS//DTD DocBook XML//EN        # -- current identifier [1]
>
> I have checked the XML catalogs on several different platforms and I cannot
> find a reference to the latter identifier, and I think it may simply be an
> oversight. The system identifier (the URL [2]) is correct.

This code goes way, way back to 2004. It seems to have been introduced at 33243f700. It appears that the one without "V4.2" was meant to be for XML, whereas the other one was meant to be for SGML. It's easy enough to change it so we output the same thing both times, but let me cc José and see if he has any thoughts.

> Question 2
> ----------
> When running the DocBook XML export function, I discover that not all text
> with '&' is getting properly escaped with the XML entity &amp;. There's
> clearly code to handle that:
>
>   http://www.lyx.org/trac/browser/lyxgit/src/sgml.cpp#L46
>
> To the best of my ability I traced down a case of a Hyperlink whose text is
> not properly XML-escaped. I think this is the line, but I'm not certain:
>
>   http://www.lyx.org/trac/browser/lyxgit/src/insets/InsetHyperlink.cpp#L235
>
> int InsetHyperlink::docbook(odocstream & os, OutputParams const &) const
> {
>     os << "<ulink url=\""
>        << subst(getParam("target"), from_ascii("&"),
>                 from_ascii("&amp;"))
>        << "\">"
>        << getParam("name")
>        << "</ulink>";
>     return 0;
> }
>
> I think that getParam("name") also needs to be run through sgml::escapeString.

Yes, that seems right. Since you have the git repo, can you make this change and test it? I'm not sure anyone on the development team actually uses the docbook classes.

Richard
public identifier for DocBook XML export + unescaped '&' in
Hello all,

I have used LyX on and off for many years and, working sporadically with TLDP [0], I have handled a few documents that were written in LyX. Thank you to the LyX team for your work on this tool over the years.

I have two questions today, after examining the DocBook XML output from the 2.2.x series.

Question 1
----------
Is it possible to change the public identifier for the DocBook XML 4.2 output processor to use:

  -//OASIS//DTD DocBook XML V4.2//EN   # -- my suggestion
  -//OASIS//DTD DocBook XML//EN        # -- current identifier [1]

I have checked the XML catalogs on several different platforms and I cannot find a reference to the latter identifier, and I think it may simply be an oversight. The system identifier (the URL [2]) is correct.

(You may wonder why I noticed this. Any network-connected system will likely use the system identifier to fetch the DocBook DTDs. If running 'xsltproc --nonet', then the DTDs have to be located using the public identifier, which is used as a key into the local system's XML catalogs. No matching DTD can be found for the current public identifier, hence the document cannot be processed.)

I have not contributed to LyX before, though I have a checkout of the LyX git repo and could provide a patch. (If this is an acceptable change, though, then I'd be happy if anybody wished to make the change; no worries around attribution.)

Question 2
----------
When running the DocBook XML export function, I discover that not all text with '&' is getting properly escaped with the XML entity &amp;. There's clearly code to handle that:

  http://www.lyx.org/trac/browser/lyxgit/src/sgml.cpp#L46

To the best of my ability I traced down a case of a Hyperlink whose text is not properly XML-escaped. I think this is the line, but I'm not certain:

  http://www.lyx.org/trac/browser/lyxgit/src/insets/InsetHyperlink.cpp#L235

int InsetHyperlink::docbook(odocstream & os, OutputParams const &) const
{
    os << "<ulink url=\""
       << subst(getParam("target"), from_ascii("&"),
                from_ascii("&amp;"))
       << "\">"
       << getParam("name")
       << "</ulink>";
    return 0;
}

I think that getParam("name") also needs to be run through sgml::escapeString.

Thank you in advance for any consideration,

-Martin

[0] http://tldp.org/
[1] http://www.lyx.org/trac/browser/lyxgit/src/Buffer.cpp#L2026
[2] http://www.lyx.org/trac/browser/lyxgit/src/Buffer.cpp#L2027

--
Martin A. Brown
http://linux-ip.net/
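
For readers who have not looked at sgml.cpp, the kind of escaping being asked for can be sketched as follows. This is a generic stand-in written on plain std::string, not the actual sgml::escapeString (which works on LyX's docstring and may treat more characters); the function name `escapeXml` is an assumption.

```cpp
#include <string>

// Hypothetical stand-in for an XML text escaper (NOT LyX's
// sgml::escapeString): replace the characters that must not appear
// literally in XML character data or in a double-quoted attribute value.
std::string escapeXml(std::string const & text)
{
	std::string out;
	out.reserve(text.size());
	for (char const c : text) {
		switch (c) {
		case '&': out += "&amp;";  break;
		case '<': out += "&lt;";   break;
		case '>': out += "&gt;";   break;
		case '"': out += "&quot;"; break;
		default:  out += c;        break;
		}
	}
	return out;
}
```

With something like this applied to the link text, a hyperlink named `Q&A` would be written as `Q&amp;A`, which is what an XML parser expects.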
Re: LyX<-->word --> Latex to XML
http://jblevins.org/log/xml-tools

This is another link, where the author has written about his experiences with various LaTeX-to-XML converters.

On Sun, Mar 2, 2014 at 11:35 AM, Prannoy Pilligundla <prannoy.b...@gmail.com> wrote:

> I found another piece of software which does the same thing:
> http://www-sop.inria.fr/marelle/tralics/
>
> On Sat, Mar 1, 2014 at 8:33 PM, stefano franchi <stefano.fran...@gmail.com> wrote:
>
>> I just discovered the LaTeXML project:
>>
>> http://dlmf.nist.gov/LaTeXML/
>>
>> which is actively developed and may actually come very close to what
>> we are aiming at. It is a Perl-based attempt to recreate a subset of
>> TeX with XML output. It is very math-oriented and, from a first look,
>> not so bibliography-oriented (although it does parse BibTeX).
>>
>> Does anyone know of it? I am going to try it on our test document and
>> will report back on its current performance.
>> In the meantime, if you have any reactions to such an approach, do
>> not hesitate to share them.
>>
>> Stefano
>>
>> --
>> __
>> Stefano Franchi
>> Associate Research Professor
>> Department of Hispanic Studies   Ph: +1 (979) 845-2125
>> Texas A&M University             Fax: +1 (979) 845-6421
>> College Station, Texas, USA
>>
>> stef...@tamu.edu
>> http://stefano.cleinias.org
LyX--word -- Latex to XML
I just discovered the LaTeXtoXML project: http://dlmf.nist.gov/LaTeXML/ which is actively developed and may actually come very close to what we are aiming at. It is a perl-based attempt to recreate a subset of TeX with XML output. It is very math-oriented and, from a first look, not so bibliorgaphy oriented (although it does parse bibtex). did anyone know of it? I am going to try it on our test document and will report back on its current performance. In the meanwhile, if you have any reactions to such an approach, do not hesitate to share them. Stefano -- __ Stefano Franchi Associate Research Professor Department of Hispanic Studies Ph: +1 (979) 845-2125 Texas AM University Fax: +1 (979) 845-6421 College Station, Texas, USA stef...@tamu.edu http://stefano.cleinias.org
Re: LyX<-->word --> Latex to XML
I found another piece of software which does the same thing:
http://www-sop.inria.fr/marelle/tralics/

On Sat, Mar 1, 2014 at 8:33 PM, stefano franchi <stefano.fran...@gmail.com> wrote:
> I just discovered the LaTeXML project:
>
> http://dlmf.nist.gov/LaTeXML/
>
> which is actively developed and may actually come very close to what
> we are aiming at. It is a Perl-based attempt to recreate a subset of
> TeX with XML output. It is very math-oriented and, from a first look,
> not so bibliography-oriented (although it does parse BibTeX).
>
> Did anyone know of it? I am going to try it on our test document and
> will report back on its current performance.
> In the meantime, if you have any reactions to such an approach, do
> not hesitate to share them.
>
> Stefano
>
> --
> Stefano Franchi
> Associate Research Professor
> Department of Hispanic Studies    Ph: +1 (979) 845-2125
> Texas A&M University    Fax: +1 (979) 845-6421
> College Station, Texas, USA
>
> stef...@tamu.edu
> http://stefano.cleinias.org
Re: About XML LyX file format [was: [GSoC 2014]Interested in Round trip conversion between LyX and .docx formats]
On 02/27/2014 03:58 PM, Georg Baum wrote:
> Richard Heck wrote:
>> On 02/27/2014 11:27 AM, Alex Vergara Gil wrote:
>>> I'm a LyX enthusiast and I can see how great this software is, because
>>> I have used it for 5 years now. I've always asked on this list for
>>> a static target LyX format that should be an intrinsic XML format,
>>> which can evolve without changing its structure and has some great
>>> advantages over the current plain text format. Conversely, elyxer,
>>> lyx2lyx and other scripts would need an upgrade.
>>> My point is: unless you have defined a static LyX format in which
>>> everyone can work without worrying about format changes, you cannot
>>> have a robust plugin system. Developers would have more time to develop
>>> new features instead of parsing every new format.
>>> If XML is selected as the static format, then a docx round trip will
>>> become easier to achieve, because it is a matter of converting XML
>>> structures, and the tooling for handling XML is vast!
>>
>> I think it's broadly agreed that LyX should have such a format. The
>> problem is finding the time to do it. It's on my radar, hopefully for
>> this summer.
>
> Do you mean a native LyX file format which would be the primary format, or
> an auxiliary format for interfacing with external tools? The native file
> format would need to be non-static and would need format changes as LyX
> develops (so it would pose similar problems for external tools as the
> current .lyx format), and an intermediate static format would face similar
> problems regarding new features as the current lyx2lyx does when converting
> into an older format. In both cases I don't see how the "format change"
> problem could be solved.

I just meant an XML-based format that would at least always be parsable. Of course new constructs would be added, e.g., new kinds of insets.

Richard
Re: About XML LyX file format [was: [GSoC 2014]Interested in Round trip conversion between LyX and .docx formats]
Alex Vergara Gil wrote:
>> Do you mean a native LyX file format which would be the primary format,
>> or an auxiliary format for interfacing with external tools?
>
> I mean XML as primary format.

Well, I wanted to know what Richard meant, but your opinion is welcome as well ;-)

> If you have XML as primary format you don't need to change the format
> anymore, just the content, which can be added dynamically since XML is
> self-descriptive. So you don't have the problem of format changes; new
> features can be added as new self-described objects. A converter then only
> has to deal with XML objects, and of course older versions may not be
> fully compatible with new features, but they would still be able to read
> most of the files created with newer versions. Moreover, a converter can
> handle insets, figures, etc. in a more robust way if they are grouped into
> XML objects. I mean: the objects it can interpret, it handles; those it
> cannot, it stores as text or metadata, as you wish, but the file would
> still be readable.

OK, so by static you mean something other than what I thought: a format that has some standard ways in which the contents are formatted, but which would not forbid new stuff, e.g. a new inset. This is something we more or less have already (of course not in XML, but an inset sits between \begin_inset and \end_inset, its parameters use a standard form as well, etc.). This is of course possible and desirable.

My point is that any converter which should be part of a reliable export or round trip must be written for a specific version of the LyX (native or intermediate) file format. This is also true for XML-based formats. They make it easier to handle unknown stuff gracefully and have a lot of other advantages, but if you want to really understand stuff and export it to a different format, then you need additional knowledge which you only have if you know the version.

Let me give an example: Suppose an inset has a parameter named "foo" which can take two different values: "0" and "1". Now some LyX developer extends the inset so that it can be used in different ways, and this requires tweaking its parameters. The result is that the parameter "foo" can now take one of the four values "a", "b", "c" and "d". The old value "0" is 1:1 equivalent to "a", the old value "1" becomes either "b" or "c", depending on another setting, and the new value "d" has no equivalent old value.

Now assume that this is an important inset that is converted to some object in the output file of the converter. If the converter now gets the new format, how should it know how to handle the new values? Of course it could ignore them and assume some default, but this would result in a less accurate conversion. If you feed it the correct file format version, then all goes well.

Therefore I am pretty sure that any converter which operates on a file produced by LyX (native or intermediate, XML or not) needs to be written for a specific version of this file format, and if the format evolves there should be a way to convert the newer versions into the one expected by the converter (using something like lyx2lyx, some XSLT transformations, or whatever is suitable). Otherwise you need too much guessing, and this does not work (the tex2lyx predecessor reLyX proved that).

Georg
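The "foo" example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything taken from lyx2lyx: the version numbers 1 and 2, the function names, and the boolean `other_setting` (standing in for the "another setting" that decides between "b" and "c") are all hypothetical.

```python
# Hypothetical file format versions:
#   version 1: foo is "0" or "1"
#   version 2: foo is "a", "b", "c" or "d"

def upgrade_foo(value, other_setting):
    """Convert foo from version 1 to version 2.

    other_setting is a made-up stand-in for the extra setting
    that decides whether old "1" becomes "b" or "c".
    """
    if value == "0":
        return "a"
    if value == "1":
        return "b" if other_setting else "c"
    raise ValueError("not a version 1 value: %r" % value)


def downgrade_foo(value):
    """Convert foo from version 2 to version 1.

    Returns None for "d", which has no version 1 equivalent, so the
    caller has to decide explicitly what to do instead of guessing.
    """
    table = {"a": "0", "b": "1", "c": "1", "d": None}
    if value not in table:
        raise ValueError("not a version 2 value: %r" % value)
    return table[value]
```

Both directions need to know which version they are looking at: the string "1" is only meaningful as a version 1 value, and the distinction between "b" and "c" is lost when going back to version 1.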
About XML LyX file format [was: [GSoC 2014]Interested in Round trip conversion between LyX and .docx formats]
> The downside to any Python-based approach, though, is that the LyX format
> is a moving target. The script would need to be updated with every syntax
> change.
> I assume this problem would persist with a pandoc approach, wouldn't it?
> The LyX reader module would still be format-dependent, unless we go with
> LaTeX.
>
> Stefano

Dear all,

I'm a LyX enthusiast and I can see how great this software is, because I have used it for 5 years now. I've always asked on this list for a static target LyX format that should be an intrinsic XML format, which can evolve without changing its structure and has some great advantages over the current plain text format. Conversely, elyxer, lyx2lyx and other scripts would need an upgrade.

My point is: unless you have defined a static LyX format in which everyone can work without worrying about format changes, you cannot have a robust plugin system. Developers would have more time to develop new features instead of parsing every new format.

If XML is selected as the static format, then a docx round trip will become easier to achieve, because it is a matter of converting XML structures, and the tooling for handling XML is vast!

I only wanted to drop my 2 cents into this discussion, so please see this mail as a personal point of view.

Regards!
Alex Vergara Gil
MSc Nuclear Physics
SSDL, CPHR, Havana, Cuba
http://www.cphr.edu.cu
Re: About XML LyX file format [was: [GSoC 2014]Interested in Round trip conversion between LyX and .docx formats]
On 02/27/2014 11:27 AM, Alex Vergara Gil wrote:
>> The downside to any Python-based approach, though, is that the LyX format
>> is a moving target. The script would need to be updated with every syntax
>> change.
>> I assume this problem would persist with a pandoc approach, wouldn't it?
>> The LyX reader module would still be format-dependent, unless we go with
>> LaTeX.
>>
>> Stefano
>
> Dear all,
>
> I'm a LyX enthusiast and I can see how great this software is, because
> I have used it for 5 years now. I've always asked on this list for
> a static target LyX format that should be an intrinsic XML format,
> which can evolve without changing its structure and has some great
> advantages over the current plain text format. Conversely, elyxer,
> lyx2lyx and other scripts would need an upgrade.
> My point is: unless you have defined a static LyX format in which
> everyone can work without worrying about format changes, you cannot
> have a robust plugin system. Developers would have more time to develop
> new features instead of parsing every new format.
> If XML is selected as the static format, then a docx round trip will
> become easier to achieve, because it is a matter of converting XML
> structures, and the tooling for handling XML is vast!

I think it's broadly agreed that LyX should have such a format. The problem is finding the time to do it. It's on my radar, hopefully for this summer.

Richard
Re: About XML LyX file format [was: [GSoC 2014]Interested in Round trip conversion between LyX and .docx formats]
Richard Heck wrote:
> On 02/27/2014 11:27 AM, Alex Vergara Gil wrote:
>> I'm a LyX enthusiast and I can see how great this software is, because
>> I have used it for 5 years now. I've always asked on this list for
>> a static target LyX format that should be an intrinsic XML format,
>> which can evolve without changing its structure and has some great
>> advantages over the current plain text format. Conversely, elyxer,
>> lyx2lyx and other scripts would need an upgrade.
>> My point is: unless you have defined a static LyX format in which
>> everyone can work without worrying about format changes, you cannot
>> have a robust plugin system. Developers would have more time to develop
>> new features instead of parsing every new format.
>> If XML is selected as the static format, then a docx round trip will
>> become easier to achieve, because it is a matter of converting XML
>> structures, and the tooling for handling XML is vast!
>
> I think it's broadly agreed that LyX should have such a format. The
> problem is finding the time to do it. It's on my radar, hopefully for
> this summer.

Do you mean a native LyX file format which would be the primary format, or an auxiliary format for interfacing with external tools? The native file format would need to be non-static and would need format changes as LyX develops (so it would pose similar problems for external tools as the current .lyx format), and an intermediate static format would face similar problems regarding new features as the current lyx2lyx does when converting into an older format. In both cases I don't see how the "format change" problem could be solved.

Georg
Re: About XML LyX file format [was: [GSoC 2014]Interested in Round trip conversion between LyX and .docx formats]
> Do you mean a native LyX file format which would be the primary format, or
> an auxiliary format for interfacing with external tools?

I mean XML as primary format.

> The native file format would need to be non-static and would need format
> changes as LyX develops (so it would pose similar problems for external
> tools as the current .lyx format), and an intermediate static format would
> face similar problems regarding new features as the current lyx2lyx does
> when converting into an older format. In both cases I don't see how the
> "format change" problem could be solved.
>
> Georg

If you have XML as primary format you don't need to change the format anymore, just the content, which can be added dynamically since XML is self-descriptive. So you don't have the problem of format changes; new features can be added as new self-described objects. A converter then only has to deal with XML objects, and of course older versions may not be fully compatible with new features, but they would still be able to read most of the files created with newer versions. Moreover, a converter can handle insets, figures, etc. in a more robust way if they are grouped into XML objects. I mean: the objects it can interpret, it handles; those it cannot, it stores as text or metadata, as you wish, but the file would still be readable.

Alex
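The fallback described in this message (handle the objects you understand, keep the rest as text) might look roughly like the following Python sketch. The element names `lyx`, `paragraph`, `figure` and `hologram` and the `file` attribute are invented for illustration; no actual LyX XML schema is implied.

```python
import xml.etree.ElementTree as ET


def convert(xml_text):
    """Walk the top-level elements of a hypothetical XML document.

    Known elements get a dedicated conversion; unknown ones are not
    dropped but kept as plain text, tagged with their element name.
    """
    root = ET.fromstring(xml_text)
    out = []
    for child in root:
        if child.tag == "paragraph":
            out.append("P: " + (child.text or ""))
        elif child.tag == "figure":
            out.append("FIG: " + child.get("file", ""))
        else:
            # Unknown element (e.g. written by a newer version):
            # preserve its text content instead of failing.
            text = "".join(child.itertext())
            out.append("UNKNOWN(" + child.tag + "): " + text)
    return out
```

For example, a document containing a `paragraph` followed by an unknown `hologram` element still converts; the unknown element survives as labeled text. What this does not give you is the meaning of the unknown element, which is the part that depends on knowing the format version.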
Re: XML Parsing Library [was Re: XML For LyX]
On 05/11/2013 07:11 AM, Abdelrazak Younes wrote:
> On Sat, May 11, 2013 at 12:03 PM, Abdelrazak Younes <you...@lyx.org> wrote:
>> On Sat, May 11, 2013 at 8:40 AM, Pavel Sanda <sa...@lyx.org> wrote:
>>> Abdelrazak Younes wrote:
>>>> I will discuss that face2face during the meeting.
>>>
>>> You should bring a mirror then, no one else in this thread is in Milano.
>>>
>>> Anyway it's too late, Richard has already barricaded himself in the
>>> underground garage of his house and won't show up until 378 patches
>>> implementing XML are done, as I infer from the last testament.
>>
>> I just discussed with Lars. He agrees that using Qt is a good option...
>> what a shock! :-)
>> Vincent and JMarc don't care what we use.
>>
>> I am talking about QXmlStreamReader and (as a second step)
>> QXmlStreamWriter. Our lexer class can just use QXmlStreamReader
>> internally; we don't have to spread its use all over the place.
>>
>> So Richard, let's use a new "feature" repo for that. This is agreed with
>> Lars, Vincent and JMarc. Then we would create an "xml" branch in that
>> repo. Lars and Vincent are setting this up right now :-)
>
> So now it is set up, look (and check) at the documentation here:
> http://wiki.lyx.org/Devel/LyXGit
>
> I have erased all old branches because we want only feature branches based
> on "master". I just created an "xml" branch in there.
>
> Richard, I guess you are still sleeping so I hope you agree with all that.
> My goal is that we collaborate on the XML support using this shared branch
> and repo.

This is all fine with me. I'll look at the feature branch business probably tomorrow. Busy today.

My intention was to work on writing a LyX file first, to try to stabilize the format, and then work on reading it.

As far as the Lexer goes, is the proposal to add some XML methods that will be implemented using QXmlStreamReader? If so, I'm not sure I see the advantage of adding them to the Lexer, as opposed to creating a new class for reading XML files. Is the suggestion then also to write some sort of wrapper for QXmlStreamWriter rather than to use its methods directly?

While we're at this, I note that QXmlStream* wants a QIODevice on which to operate, probably a QFile in our case. Any idea about how that should be handled? What about zipped files?

Richard
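On the question of zipped files: one common pattern is to hand the streaming parser a decompressing stream instead of the raw file, so the parsing code does not need to know about compression at all. The sketch below shows that idea in Python rather than Qt, purely as an analogy; it says nothing about how QXmlStreamReader or QIODevice would be wired up in LyX, and the helper names are hypothetical. It assumes gzip compression and detects it by the two-byte gzip magic number.

```python
import gzip
import io
import xml.etree.ElementTree as ET

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def open_maybe_gzipped(raw):
    """Return a readable byte stream for raw, decompressing if needed."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.GzipFile(fileobj=io.BytesIO(raw))
    return io.BytesIO(raw)


def element_names(raw):
    """Stream-parse the XML in raw and list element names in document order."""
    stream = open_maybe_gzipped(raw)
    # iterparse only needs an object with read(); it does not care
    # whether the bytes come from a plain or a decompressing stream.
    return [elem.tag for _event, elem in ET.iterparse(stream, events=("start",))]
```

The same `element_names` call works on the plain bytes of an XML document and on its gzip-compressed form, because only `open_maybe_gzipped` knows about compression.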