Re: [NTG-context] EPUB XHTML Format

2013-09-12 Thread Alan BRASLAU

On Thu, 5 Sep 2013 19:22:42
Aditya Mahajan adit...@umich.edu wrote:

 How easy is it to create a new export format? IIRC, context keeps track of 
 the entire document tree, and flushes the XML output only at the end. Is 
 it possible to make this pluggable so that users can write their own 
 transformers (in Lua) on how the document tree can be written? This will 
 enable more output formats (OpenDocument and (shudder) LaTeX).

Or, (gasp!) MSword .docx

Alan
___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___

Re: [NTG-context] EPUB XHTML Format

2013-09-07 Thread Hans Hagen


On 9/6/2013 10:20 PM, Thangalin wrote:

Hi,

The best reader imho is iBooks on the iPad, nothing else, from what
I've seen, comes close. But that is one expensive eReader. :(


We'll just have everybody in the world who has a Kindle, Kobo, or other
reader exchange their existing hardware, and then purchase an iPad plus
iBook. Problem solved? ;-)

ConTeXt TeX reading xml -> export -> optional transform -> EPUB + CSS
you want 'direct epub html from context' (no xslt) but on the other
hand use xslt to map onto context while context can do xml directly
... chicken egg


Well, given that ConTeXt doesn't actually produce validating EPUB
documents, I suspect not many people will actually use that feature.
It's great in theory, but if it produces books that don't actually work
on the Kindle or Kobo, then it's unusable in practice -- never mind not
being able to add the books to online marketplaces (such as Amazon)
because, again, the output does not validate.


context doesn't produce epub (which at this moment is so floating that i 
would keep updating, which is fine if i'd use it myself or in projects 
at pragma, but not for the sake of keeping up) but does an export to xml 
(*.export)


as a bonus it can output some extra stuff so that in a browser that can 
deal with xml+css (and a few xhtml tags for hyperlinks) we can preview


then there is mtx-epub that can make an epub but that is a moving target 
(at some point we stopped extending waiting for a decent standard)


so, i'd never claim that context produces epub but it can be used in a 
workflow that involves epub as it outputs xml which can be transformed


supporting all variants of epub in the backend would be the same as 
hardcoding all kinds of xml dtds in the frontend (docbook, tei, whatever); 
instead we provide a general xml handler and a general xml export


Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-

Re: [NTG-context] EPUB XHTML Format

2013-09-07 Thread Thangalin

Hi,

so, i'd never claim that context produces epub but it can be used in a
 workflow that involves epub as it outputs xml which can be transformed


That's a distinction that either might not matter or sometimes is lost:

http://tex.stackexchange.com/a/17642/2148
http://wiki.contextgarden.net/epub
ConTeXt has preliminary epub (http://en.wikipedia.org/wiki/EPUB) support...

Does ConTeXt refer to a suite of tools, or only the context command?
Either way, it appears that the line between the command and the tool set
is blurred a bit. This is completely understandable, too, as you wouldn't
want to write "the ConTeXt suite of tools includes a command, mtxrun, that
can produce EPUB files" all the time when talking about EPUBs.


 supporting all variants of epub in the backend would be the same as
 hardcoding all kinds of xml dtds in the frontend (docbook, tei, whatever);
 instead we provide a general xml handler and a general xml export


That paragraph would be an excellent addition to the wiki; not sure where
though.

Kind regards.

Re: [NTG-context] EPUB XHTML Format

2013-09-06 Thread Mica Semrick

Another small note, since I just walked down the EPUB path: you'll be very
sad to find out that the rendering engines of a lot of popular readers are not
consistent and won't render standard XHTML markup correctly (nest an ordered
list within an unordered list and then look at it in Adobe Digital Editions
and several other readers). "But it is just XHTML + CSS!" you'll cry, "How
can they not render it correctly?" I don't know, but it was an extremely
frustrating process. I even contacted Adobe to try to report this nested
list bug to them... their suggestion was that I could *pay* them to work
with content experts who would help me correct my source so that it
would render correctly.

The best reader imho is iBooks on the iPad, nothing else, from what I've
seen, comes close. But that is one expensive eReader. :(


On Thu, Sep 5, 2013 at 3:00 PM, Thangalin thanga...@gmail.com wrote:

 Hi,

 handle XML+CSS well. However, most (all?) EPUB readers don't. So, the
 question is asking if instead ConTeXt could generate a XHTML


 Precisely.


  If you need both EPUB and PDF, start with a semantically rich XML
 vocabulary, e.g. DocBook. In this case you can relatively easily transform


 My database doesn't generate DocBook. It generates a custom XML document
 from which I generate a web page, and a LaTeX document (though soon to be
 ConTeXt!). There is no reason, technically, why I cannot convert the source
 XML to either DocBook or directly to EPUB. There are, however, problems
 doing that, which Aditya correctly surmises:


 - Automatic section numbering taking care of different conversions.
 - Automatic index generation and sorting
 - Inserting hyphenation points at the appropriate place in the generated
 output (so that the browser can effectively rely on TeX's hyphenation
 algorithm to do line-breaking).

 - Convert TeX math to MathML.

 The current ConTeXt XML source can translate a well formed ConTeXt
 document into a XML document with the above features.


 Those are exactly the issues that I would love to resolve using ConTeXt
 for generating an EPUB. (The MathML isn't as important to me, but I can see
 other people wanting such a feature.)

 What about accessibility? I expect that visually impaired people would
 depend on document structure rather than its visualisation.


 That is a good point. The current XML structure produced by ConTeXt (Hans
 correct me here if I'm mistaken) is not accessible, as it doesn't adhere to
 strict XHTML. I suspect that div tags would not be accessible -- the only
 way to provide true accessibility in EPUB format would be by using the
 strict XHTML tags.

 for instance, we have more levels than H1..H6, so how to do H7? if someone
 has to deal with that, he/she can as well transform all into H1 with some
 class which is a local solution then


 I realize there is not going to be a one-to-one map of all possible
 ConTeXt macros to XHTML. For someone who has 7 levels of nested sections
 they would either have to rewrite some Lua or perform some post-processing
 (e.g., with XSLT). I would posit that a document with 7 levels of nested
 sections is not going to be a common occurrence.

 When I talk about strict XHTML, I'm proposing that a _simple_ ConTeXt
 document (up to 6 header levels, numbered and unnumbered lists, images,
 text emphasis, etc.) should generate a simple, validating XHTML document.
 Trying to attain 100% coverage of ConTeXt transmogrification to XHTML is
 ridiculous when, I suspect, 80% coverage would meet most needs. :-)

 It is definitely possible to translate the ConTeXt EPUB output to XHTML.
 However, there are practical realities that hinder such an approach.
 Architecturally, if anyone is going to translate an XML document to EPUB
 format, it certainly won't be this way:

 *XML + XSLT -> ConTeXt File -> ConTeXt EPUB XML + XSLT -> EPUB + CSS*

 It'll be this way, which is less time-consuming, less complex, and less
 susceptible to error:

 *XML + XSLT (or API) -> EPUB + CSS*

 However, it does not, as we all know, produce as feature rich output as
 leveraging the ConTeXt abilities that Aditya mentioned, which was the point:

 *XML + XSLT -> ConTeXt TeX -> EPUB + CSS*

 Kindest regards.



Re: [NTG-context] EPUB XHTML Format

2013-09-06 Thread Hans Hagen


On 9/6/2013 12:00 AM, Thangalin wrote:


That is a good point. The current XML structure produced by ConTeXt
(Hans correct me here if I'm mistaken) is not accessible, as it doesn't
adhere to strict XHTML. I suspect that div tags would not be
accessible -- the only way to provide true accessibility in EPUB format
would be by using the strict XHTML tags.


html is not rich enough .. one ends up with abusing tags which in turn 
is confusing for accessibility ... i once saw an epub where h1 was used 
for the chapter number and h2 for the chapter title



When I talk about strict XHTML, I'm proposing that a _simple_ ConTeXt
document (up to 6 header levels, numbered and unnumbered lists, images,
text emphasis, etc.) should generate a simple, validating XHTML
document. Trying to attain 100% coverage of ConTeXt transmogrification
to XHTML is ridiculous when, I suspect, 80% coverage would meet most
needs.. :-)


in that case a few page transformation could do, isn't it?


*XML + XSLT -> ConTeXt TeX -> EPUB + CSS*


probably ok for novels but there is no way to limit the user ... so 
in the end we still have a complex mix to deal with ... i'd rather have


ConTeXt TeX reading xml -> export -> optional transform -> EPUB + CSS

you want 'direct epub html from context' (no xslt) but on the other hand 
use xslt to map onto context while context can do xml directly ... 
chicken egg


Hans


Re: [NTG-context] EPUB XHTML Format

2013-09-06 Thread Thangalin

Hi,

The best reader imho is iBooks on the iPad, nothing else, from what I've
 seen, comes close. But that is one expensive eReader. :(


We'll just have everybody in the world who has a Kindle, Kobo, or other
reader exchange their existing hardware, and then purchase an iPad plus
iBook. Problem solved? ;-)

ConTeXt TeX reading xml -> export -> optional transform -> EPUB + CSS
 you want 'direct epub html from context' (no xslt) but on the other hand
 use xslt to map onto context while context can do xml directly ... chicken
 egg


Well, given that ConTeXt doesn't actually produce validating EPUB
documents, I suspect not many people will actually use that feature. It's
great in theory, but if it produces books that don't actually work on the
Kindle or Kobo, then it's unusable in practice -- never mind not being able
to add the books to online marketplaces (such as Amazon) because, again,
the output does not validate.
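
As a rough illustration of what "validating" involves at the container level, here is a minimal Python sketch that checks two requirements of the EPUB container format: the first ZIP entry must be an uncompressed file named mimetype containing application/epub+zip, and META-INF/container.xml must exist. The function names check_epub_container and build_demo_epub are made up for this example. Passing this check says nothing about whether the XHTML inside validates; a full validator such as EPUBCheck covers far more.

```python
import io
import zipfile


def check_epub_container(data):
    """Return a list of container-level problems (empty if none were found)."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("first entry is not 'mimetype'")
        else:
            if entries[0].compress_type != zipfile.ZIP_STORED:
                problems.append("'mimetype' is compressed")
            if archive.read("mimetype") != b"application/epub+zip":
                problems.append("'mimetype' has the wrong content")
        if "META-INF/container.xml" not in archive.namelist():
            problems.append("META-INF/container.xml is missing")
    return problems


def build_demo_epub(mimetype_first):
    """Build a tiny in-memory archive for demonstration (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        if mimetype_first:
            archive.writestr("mimetype", "application/epub+zip",
                             compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", "<container/>")
        if not mimetype_first:
            archive.writestr("mimetype", "application/epub+zip",
                             compress_type=zipfile.ZIP_STORED)
    return buffer.getvalue()


print(check_epub_container(build_demo_epub(mimetype_first=True)))
print(check_epub_container(build_demo_epub(mimetype_first=False)))
```

A check like this only catches packaging mistakes; markup problems inside the book still need a real validator.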

Kind regards.

Re: [NTG-context] EPUB XHTML Format

2013-09-06 Thread Aditya Mahajan


On Fri, 6 Sep 2013, Thangalin wrote:


Hi,

never mind not being able to add the books to online marketplaces (such as

Amazon) because, again, the output does not validate.



I think the simplest thing to do would be to update the wiki and have a
note that informs readers that while ConTeXt can be used to generate an
EPUB, it is likely that that EPUB will be unusable for devices without
further transformation of the XML content. At least that way the knowledge
is out there and people are forewarned that not all EPUB documents are
equivalent.


It would also be nice to add a table that lists the EPUB readers (hardware 
and software) and tells whether ConTeXt-produced EPUB documents work on 
them.


Aditya

Re: [NTG-context] EPUB XHTML Format

2013-09-06 Thread Thangalin

Hi,

never mind not being able to add the books to online marketplaces (such as
 Amazon) because, again, the output does not validate.


I think the simplest thing to do would be to update the wiki and have a
note that informs readers that while ConTeXt can be used to generate an
EPUB, it is likely that that EPUB will be unusable for devices without
further transformation of the XML content. At least that way the knowledge
is out there and people are forewarned that not all EPUB documents are
equivalent.

Kindest regards.

Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Hans Hagen


On 9/4/2013 7:55 PM, Thangalin wrote:

Hi.

of course we could alternatively export all as <div
class="tag-subtag-..."> but i don't like that too much; html itself
is not rich enough for our purpose

What about giving developers the ability to change the destination
element? For example:

\setuplist[chapter][
   xml={\starttag[h1]#1\stoptag}
]

Would produce, upon export:

<h1>Chapter</h1>


export doesn't happen at that level; something like that would add an 
ugly overhead; it's way easier to make some xslt script that converts 
the rather systematic export to something like that and it only has to 
be written once by someone (not me)



Or (using export instead of xml; I don't care what it is named):

\setuplist[chapter][
   export={\starttag[div]\startattribute[class]{chapter}#1\stopattribute\stoptag}
]

Similarly, this would produce:

<div class="chapter">Chapter</div>


you use some tex syntax but it all happens in lua; also, the only way to 
provide some kind of different tagging is to support plugins (read: lua 
functions) that could override default behaviour (but again, it's quite 
easy to do that as a postprocessing step)



This would offer the flexibility of custom XML documents without
affecting the default behaviour.

   * Generates XHTML headers (including <!DOCTYPE> and <html>...)

not needed as we're 'standalone'

Having the ability to produce the <!DOCTYPE...> and <html> elements
could be as simple as:

\setupexport[
   standalone=no,
]

   * Produces images as img tags, rather than float tags.

the css can deal with them (info is written to files for that)

Yes, but they aren't standard. There is an ecosystem of tools (e.g.,
Calibre, normalizing CSS templates, etc.), not to mention a widespread
knowledge-base, that groks the minimal XHTML specification. Plus, using
XML tags that are not in the minimal XHTML spec. means more testing on
more devices to make sure that their XHTML parsers render correctly.


most of the xml we get here is a funny mix of whatever tags and html 
(often for tables) and normally there is way more structure than in the 
average html document; the export is meant to be close to the source and 
turning it into some html / div mixture makes it messy


for instance, we have more levels than H1..H6, so how to do H7? if 
someone has to deal with that, he/she can as well transform all into H1 
with some class which is a local solution then
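
One way to picture such a local solution for deep nesting is the small Python sketch below. It is not part of ConTeXt or its export; the function name heading_markup and the level-N class names are assumptions. Anything deeper than six levels is clamped to h6, and the real depth is kept in a class attribute so that CSS can still tell the levels apart.

```python
from html import escape


def heading_markup(level, title):
    """Render a section title as an XHTML heading.

    Levels 1..6 map to h1..h6; deeper levels reuse h6 and keep the
    real depth in a (hypothetical) class name such as "level-7".
    """
    if level < 1:
        raise ValueError("section level must be 1 or greater")
    tag = "h%d" % min(level, 6)
    return '<%s class="level-%d">%s</%s>' % (tag, level, escape(title), tag)


print(heading_markup(2, "Introduction"))
print(heading_markup(7, "Deep & narrow"))
```

Mapping everything to h1 with a class, as suggested above, would only change the line that picks the tag.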



xhtml has no typical tags .. it's xml + css (or xslt) ...
unfortunately browsers have

That is, a Strictly Conforming XHTML Document, as per:

http://www.w3.org/TR/2000/REC-xhtml1-20000126/#docconf

the export of context is in fact just xml, and by tagging it as
xhtml we can apply css to it; but if someone has a workflow for
producing epub an option is to postprocess that xml file into
whatever epub one wants


indeed. that was the idea: export xml, tag it as xhtml (with the option 
to provide hyperlinks, an exception), provide some standard css as 
starter and then let users deal with matters the way they like; you can 
be pretty sure that what you want is not the same as what someone else 
wants; and if more people want it, they can together write a 
transformation script (or hire someone)


keep in mind that the export itself is already tricky enough and for me 
it doesn't pay off to provide tons of additional functionality (well, it 
doesn't pay off to export anyway)



I could transform the ConTeXt-generated XML into strictly conforming
XHTML, but it was a step I was hoping to avoid. Right now my process is:

 1. Convert XML data to a ConTeXt .tex file.
 2. Convert ConTeXt to either PDF or EPUB.
 3. Stylize EPUB using CSS.


but writing the transform that suits you is just one step (with you 
spending the time on it) while extending the export into a complete 
transformation and configuration thing would put the burden on me -)



I want to use ConTeXt here (instead of going directly from XML data to
EPUB) because ConTeXt provides functionality such as multiple indexes,
table-of-contents, and bundling the .epub. Having an extra step to
generate strictly conforming XHTML is architecturally painful as it
means transforming the document three times (XML -> ConTeXt, ConTeXt ->
XML, then XML -> XHTML).


why is it painful? the export is quite generic and will not change; it 
is also flexible as it honors user defined sectioning and styling



Every time we look into epub there's another issue ... it's not a
standard but a reverse-engineered application mess (happens often
with xml: turn some application data structures into xml and call it
a standard)


Some book vendors only accept validating EPUBs. ConTeXt is documented as
being able to generate EPUBs. The documentation should state the EPUBs
do not validate and do not generate strictly conforming XHTML.


well, i, luigi and some others did tests: the thing is that epub is 
evolving

Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Aditya Mahajan


On Thu, 5 Sep 2013, Hans Hagen wrote:


On 9/4/2013 11:20 AM, Hans Hagen wrote:


you get a representation in xml indeed, but not verbatim, but as close
as possible to the generic (parent) structure elements in context


probably the most straightforward xhtml export is a file with only

<div class="section"> ...
<div class="..."> ...
   <div>
</div>

i.e. only divs and spans


How easy is it to create a new export format? IIRC, context keeps track of 
the entire document tree, and flushes the XML output only at the end. Is 
it possible to make this pluggable so that users can write their own 
transformers (in Lua) on how the document tree can be written? This will 
enable more output formats (OpenDocument and (shudder) LaTeX).


Aditya

Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Hans Hagen


On 9/5/2013 7:57 PM, Khaled Hosny wrote:

On Thu, Sep 05, 2013 at 09:57:59AM -0700, Thangalin wrote:

Hi,

<div class="section"> ...

 <div class="..."> ...
 <div>
</div>

i.e. only divs and spans



I think that would be a more robust output format, technically, easier to
adapt, and more readily conform to the strict XHTML tag subset.


What about accessibility? I expect that visually impaired people would
depend on document structure rather than its visualisation.


For that purpose I'd make a nice special doc. But the basic export has 
at least a similar structure to the original. (After all, it's one of 
the reasons why we *can do* an export.)


Hans



Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Hans Hagen


On 9/5/2013 8:20 PM, Aditya Mahajan wrote:


The typical ConTeXt document has a lot of structure, and the XML export
generates a well structured XML output. That can be directly used in
most modern browsers that handle XML+CSS well. However, most (all?) EPUB
readers don't. So, the question is asking if instead ConTeXt could
generate a XHTML


but how hard would it be to make an xslt transformation from 
context.export to epub variants (ok, at some point i can look into it 
but only if there is a robust standard and i have devices to test it on)


and indeed the quality of the source is important

Hans


Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Khaled Hosny

On Thu, Sep 05, 2013 at 09:57:59AM -0700, Thangalin wrote:
 Hi,
 
 <div class="section"> ...
  <div class="..."> ...
  <div>
  </div>
 
  i.e. only divs and spans
 
 
 I think that would be a more robust output format, technically, easier to
 adapt, and more readily conform to the strict XHTML tag subset.

What about accessibility? I expect that visually impaired people would
depend on document structure rather than its visualisation.

Regards,
Khaled

Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Michael Hallgren

On 05/09/2013 20:24, Hans Hagen wrote:
 On 9/5/2013 8:20 PM, Aditya Mahajan wrote:

 The typical ConTeXt document has a lot of structure, and the XML export
 generates a well structured XML output. That can be directly used in
 most modern browsers that handle XML+CSS well. However, most (all?) EPUB
 readers don't. So, the question is asking if instead ConTeXt could
 generate a XHTML

 but how hard would it be to make an xslt transformation from
 context.export to epub variants (ok, at some point i can look into it
 but only if there is a robust standard and i have devices to test it on)

 and indeed the quality of the source is important


Sounds by far to be the cleanest approach.

Cheers,

mh


 Hans


Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Hans Hagen


On 9/4/2013 11:20 AM, Hans Hagen wrote:


you get a representation in xml indeed, but not verbatim, but as close
as possible to the generic (parent) structure elements in context


probably the most straightforward xhtml export is a file with only

<div class="section"> ...
<div class="..."> ...
<div>
</div>

i.e. only divs and spans
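
A div/span-only rendering of this kind can also be produced after the fact. The sketch below is a hedged Python illustration, not ConTeXt code: it rewrites every element of an XML tree as a div (or a span for inline elements) whose class is the original element name. The element names section, sectiontitle, paragraph and highlight, and the INLINE_TAGS set, are placeholders rather than the actual names used in the export file; attributes and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of element names that should become <span> instead of <div>.
INLINE_TAGS = {"highlight", "link"}


def to_div_span(element):
    """Return a copy of the tree that uses only div and span elements."""
    tag = "span" if element.tag in INLINE_TAGS else "div"
    converted = ET.Element(tag, {"class": element.tag})
    converted.text = element.text  # text before the first child
    converted.tail = element.tail  # text following this element
    for child in element:
        converted.append(to_div_span(child))
    return converted


# Stand-in for an exported document; the real export uses its own element names.
source = ET.fromstring(
    "<section><sectiontitle>Intro</sectiontitle>"
    "<paragraph>Some <highlight>bold</highlight> text.</paragraph></section>"
)
result = ET.tostring(to_div_span(source), encoding="unicode")
print(result)
```

Because the structure survives as class names, the CSS written for the original element names needs only mechanical changes.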



Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Thangalin

Hi,

<div class="section"> ...
 <div class="..."> ...
 <div>
 </div>

 i.e. only divs and spans


I think that would be a more robust output format, technically, easier to
adapt, and more readily conform to the strict XHTML tag subset.

The other issue I encountered was this:

\startfrontmatter
  \startstandardmakeup
Title page
  \stopstandardmakeup

  \startstandardmakeup
Copyright
  \stopstandardmakeup

  \completecontent
\stopfrontmatter


This produced *Title pageCopyright* as text without any markup, which
makes the EPUB output a bit difficult to parse. I thought the software
should output something like:

<div class="frontmatter">
  <div id="standardmakeup1" class="standardmakeup">Title page</div>
  <div id="standardmakeup2" class="standardmakeup">Copyright</div>
  <div class="contents"><!-- etc... --></div>
</div>


This way the title and copyright pages can be styled independently.

Kindest regards.

Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Aditya Mahajan


On Thu, 5 Sep 2013, honyk wrote:


On 2013-09-04 Thangalin wrote:


What needs to happen to take a minimal ConTeXt file (such as the
attached) to produce a minimum viable EPUB that:


It is always difficult to parse and further process not well structured
plain text without advanced semantics. Garbage in, garbage out.


The typical ConTeXt document has a lot of structure, and the XML export 
generates a well structured XML output. That can be directly used in most 
modern browsers that handle XML+CSS well. However, most (all?) EPUB 
readers don't. So, the question is asking if instead ConTeXt could 
generate a XHTML



If you need both EPUB and PDF, start with a semantically rich XML
vocabulary, e.g. DocBook. In this case you can relatively easily transform
(XSLT) input data into almost any format. These basic outputs like EPUB or
PDF (via XSL-FO) you can get out-of-the-box. The Context output can be
generated using dbcontext: http://dblatex.sourceforge.net/

In sum, use XML as your primary source and from it derive everything else.


I haven't used XML-only toolchains. Is it possible to handle:

- Automatic section numbering taking care of different conversions.
- Automatic index generation and sorting
- Inserting hyphenation points at the appropriate place in the generated 
output (so that the browser can effectively rely on TeX's hyphenation 
algorithm to do line breaking).

- Convert TeX math to MathML.

The current ConTeXt XML source can translate a well formed ConTeXt 
document into a XML document with the above features.


Aditya

Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Hans Hagen


On 9/5/2013 7:22 PM, Aditya Mahajan wrote:

On Thu, 5 Sep 2013, Hans Hagen wrote:


On 9/4/2013 11:20 AM, Hans Hagen wrote:


you get a representation in xml indeed, but not verbatim, but as close
as possible to the generic (parent) structure elements in context


probably the most straightforward xhtml export is a file with only

<div class="section"> ...
<div class="..."> ...
   <div>
</div>

i.e. only divs and spans


How easy is it to create a new export format? IIRC, context keeps track
of the entire document tree, and flushes the XML output only at the end.
Is it possible to make this pluggable so that users can write their own
transformers (in Lua) on how the document tree can be written? This will
enable more output formats (OpenDocument and (shudder) LaTeX).


sure, but first i want to clean up some code (it's rather complex) ... 
in principle there is a document tree so one can plug into that; 
alternatively one can load the xml tree and mess with that (probably 
easier if we provide some styles for it)
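
The second route (load the XML tree and work on that) could look roughly like the following Python sketch, in which each element name can get its own handler function and everything else falls back to a default. This only illustrates the plug-in idea on an exported file; it is not ConTeXt's Lua interface, and the element names sectiontitle and paragraph as well as the handler decorator are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

HANDLERS = {}  # maps an element name to the function that renders it


def handler(tag):
    """Decorator that registers a rendering function for one element name."""
    def register(function):
        HANDLERS[tag] = function
        return function
    return register


def default_handler(element, inner):
    # Fallback: keep the structure as a div carrying the original name.
    return '<div class="%s">%s</div>' % (escape(element.tag), inner)


def render(element):
    """Render an element and its children depth-first into an XHTML string."""
    inner = escape(element.text or "")
    for child in element:
        inner += render(child) + escape(child.tail or "")
    return HANDLERS.get(element.tag, default_handler)(element, inner)


@handler("sectiontitle")
def render_title(element, inner):
    return "<h1>%s</h1>" % inner


@handler("paragraph")
def render_paragraph(element, inner):
    return "<p>%s</p>" % inner


tree = ET.fromstring(
    "<section><sectiontitle>Intro</sectiontitle>"
    "<paragraph>A &amp; B</paragraph></section>"
)
print(render(tree))
```

Adding another output flavour then means registering a different set of handlers, without touching the traversal.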


Hans



Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread honyk

On 2013-09-04 Thangalin wrote:
 
 What needs to happen to take a minimal ConTeXt file (such as the
 attached) to produce a minimum viable EPUB that:
 

It is always difficult to parse and further process not well structured
plain text without advanced semantics. Garbage in, garbage out.

If you need both EPUB and PDF, start with a semantically rich XML
vocabulary, e.g. DocBook. In this case you can relatively easily transform
(XSLT) input data into almost any format. These basic outputs like EPUB or
PDF (via XSL-FO) you can get out-of-the-box. The Context output can be
generated using dbcontext: http://dblatex.sourceforge.net/

In sum, use XML as your primary source and from it derive everything else.

Jan


Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Thangalin

Hi,

handle XML+CSS well. However, most (all?) EPUB readers don't. So, the
 question is asking if instead ConTeXt could generate a XHTML


Precisely.


  If you need both EPUB and PDF, start with a semantically rich XML
 vocabulary, e.g. DocBook. In this case you can relatively easily transform


My database doesn't generate DocBook. It generates a custom XML document
from which I generate a web page, and a LaTeX document (though soon to be
ConTeXt!). There is no reason, technically, why I cannot convert the source
XML to either DocBook or directly to EPUB. There are, however, problems
doing that, which Aditya correctly surmises:


 - Automatic section numbering taking care of different conversions.
 - Automatic index generation and sorting
 - Inserting hyphenation points at the appropriate place in the generated
 output (so that the browser can effectively rely on TeX's hyphenation
 algorithm to do line-breaking).
 - Convert TeX math to MathML.

 The current ConTeXt XML source can translate a well-formed ConTeXt
 document into an XML document with the above features.


Those are exactly the issues that I would love to resolve using ConTeXt for
generating an EPUB. (The MathML isn't as important to me, but I can see
other people wanting such a feature.)

What about accessibility? I expect that visually impaired people would
 depend on document structure rather than its visualisation.


That is a good point. The current XML structure produced by ConTeXt (Hans
correct me here if I'm mistaken) is not accessible, as it doesn't adhere to
strict XHTML. I suspect that div tags would not be accessible -- the only
way to provide true accessibility in EPUB format would be by using the
strict XHTML tags.

for instance, we have more levels than H1..H6, so how to do H7? if someone
 has to deal with that, he/she can as well transform all into H1 with some
 class which is a local solution then


I realize there is not going to be a one-to-one map of all possible ConTeXt
macros to XHTML. For someone who has 7 levels of nested sections they would
either have to rewrite some Lua or perform some post-processing (e.g., with
XSLT). I would posit that a document with 7 levels of nested sections is
not going to be a common occurrence.

When I talk about strict XHTML, I'm proposing that a _simple_ ConTeXt
document (up to 6 header levels, numbered and unnumbered lists, images,
text emphasis, etc.) should generate a simple, validating XHTML document.
Trying to attain 100% coverage of ConTeXt transmogrification to XHTML is
ridiculous when, I suspect, 80% coverage would meet most needs. :-)
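
To make the heading question concrete, here is a small Python sketch of one possible policy: depths 1 to 6 map to h1..h6, and anything deeper is clamped to h6 while a class records the real depth. The function name and the "level-N" class scheme are made up for illustration; the alternative mentioned above (everything as h1 plus a class) would only change the clamp.

```python
def heading_tag(depth, max_level=6):
    """Map a 1-based section depth to an (XHTML tag, CSS class) pair.

    Depths beyond max_level reuse the deepest heading tag; the class
    keeps the true depth so a stylesheet can still tell them apart.
    """
    if depth < 1:
        raise ValueError("depth must be 1 or greater")
    level = min(depth, max_level)
    return "h%d" % level, "level-%d" % depth


for depth in (1, 6, 7):
    tag, css_class = heading_tag(depth)
    print('<%s class="%s">Title</%s>' % (tag, css_class, tag))
```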

It is definitely possible to translate the ConTeXt EPUB output to XHTML.
However, there are practical realities that hinder such an approach.
Architecturally, if anyone is going to translate an XML document to EPUB
format, it certainly won't be this way:

*XML + XSLT -> ConTeXt File -> ConTeXt EPUB XML + XSLT -> EPUB + CSS*

It'll be this way, which is less time-consuming, less complex, and less
susceptible to error:

*XML + XSLT (or API) -> EPUB + CSS*

However, it does not, as we all know, produce output as feature-rich as
leveraging the ConTeXt abilities that Aditya mentioned, which was the point:

*XML + XSLT -> ConTeXt TeX -> EPUB + CSS*

Kindest regards.

Re: [NTG-context] EPUB XHTML Format

2013-09-05 Thread Mica Semrick

I'd say use an xml source (docbook, TEI, or DITA) and then write a ConTeXt
stylesheet to typeset your XML. See http://wiki.contextgarden.net/TEI_xml

I think that TEI-lite is a nice, very general XML vocabulary...

Best,
Mica


On Thu, Sep 5, 2013 at 11:24 AM, Hans Hagen pra...@wxs.nl wrote:

 On 9/5/2013 8:20 PM, Aditya Mahajan wrote:

  The typical ConTeXt document has a lot of structure, and the XML export
 generates a well structured XML output. That can be directly used in
 most modern browsers that handle XML+CSS well. However, most (all?) EPUB
 readers don't. So, the question is asking if instead ConTeXt could
 generate a XHTML


 but how hard would it be to make an xslt transformation from context.export
 to epub variants (ok, at some point i can look into it but only if there is
 a robust standard and i have devices to test it on)

 and indeed the quality of the source is important


 Hans


Re: [NTG-context] EPUB XHTML Format

2013-09-04 Thread Hans Hagen


On 9/4/2013 3:19 AM, Thangalin wrote:

Hi,

The attached t.tex file produces the attached t.xhtml file. I have
looked at the following documents:

  * http://en.wikipedia.org/wiki/EPUB#Open_Publication_Structure_2.0.1
  * http://en.wikipedia.org/wiki/DTBook
  * http://www.idpf.org/epub/20/spec/OPS_2.0.1_draft.htm
  * http://www.w3.org/TR/xhtml11/doctype.html
  * http://www.w3.org/TR/html5/sections.html

It seems that the macros in t.tex are being written out as XML elements,
verbatim. It is my understanding that these XML elements, however, do
not conform to the minimal content models associated with XHTML 1.1.


you get a representation in xml indeed, not verbatim but as close 
as possible to the generic (parent) structure elements in context


of course we could alternatively export all as <div 
class="tag-subtag-..."> but i don't like that too much; html itself is 
not rich enough for our purpose



What needs to happen to take a minimal ConTeXt file (such as the
attached) to produce a minimum viable EPUB that:

  * Generates XHTML headers (including <!DOCTYPE ...> and <html ...>)


not needed as we're 'standalone'


  * Produces images as img tags, rather than float tags.


the css can deal with them (info is written to files for that)

the only real problematic thing is hyperlinks as css has no provision 
for that so there's an option to inject <a ...>



  * Uses typical XHTML tags for body elements (e.g., ol for ordered
lists).


xhtml has no typical tags .. it's xml + css (or xslt) ... unfortunately 
browsers have messed up html so much (extensions, too tolerant support 
for unmatched tags, different rendering models) that xhtml never really 
took off


the export of context is in fact just xml, and by tagging it as xhtml we 
can apply css to it; but if someone has a workflow for producing epub, an 
option is to postprocess that xml file into whatever epub one wants 
(i.e. the export is generic and carries as much info as possible)



Ideally, I would like to do something such as:

  * context t.tex
  * mtxrun --script epub --make t.specification

to generate an EPUB that passes validation of epubcheck
http://code.google.com/p/epubcheck/wiki/Library, with an output XHTML
file that more closely matches the XHTML specification.


Every time we look into epub there's another issue ... it's not a 
standard but a reverse-engineered application mess (happens often with 
xml: turn some application data structures into xml and call it a standard)


I only tested (long ago already) with some firefox plugin (i don't have 
a recent epub device, only an old first generation one which is dead 
slow, never really used, probably broken by now) and i refuse to buy a 
new one till resolution is decent (and i only want generic devices, not 
something bound to some shop)



How can I help?


by testing

as i have no real use/demand for epub it's not something i look into on 
a daily basis


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
 | www.pragma-pod.nl
-

Re: [NTG-context] EPUB XHTML Format

2013-09-04 Thread Thangalin

Hi.

of course we could alternatively export all as <div class="tag-subtag-...">
 but i don't like that too much; html itself is not rich enough for our
 purpose


What about giving developers the ability to change the destination element?
For example:

\setuplist[chapter][
  xml={\starttag[h1]#1\stoptag}
]


Would produce, upon export:

<h1>Chapter</h1>


Or (using export instead of xml; I don't care what it is named):

\setuplist[chapter][
  export={\starttag[div]\startattribute[class]{chapter}#1\stopattribute\stoptag}
]


Similarly, this would produce:

<div class="chapter">Chapter</div>


This would offer the flexibility of custom XML documents without affecting
the default behaviour.

  * Generates XHTML headers (including !DOCTYPE and html...)

 not needed as we're 'standalone'


Having the ability to produce the <!DOCTYPE ...> and <html> elements could
be as simple as:

\setupexport[
  standalone=no,
]



   * Produces images as img tags, rather than float tags.

 the css can deal with them (info is written to files for that)


Yes, but they aren't standard. There is an ecosystem of tools (e.g.,
Calibre, normalizing CSS templates, etc.), not to mention a widespread
knowledge-base, that groks the minimal XHTML specification. Plus, using XML
tags that are not in the minimal XHTML spec. means more testing on more
devices to make sure that their XHTML parsers render correctly.


 xhtml has no typical tags .. it's xml + css (or xslt) ... unfortunately
 browsers have


That is, a Strictly Conforming XHTML Document, as per:

http://www.w3.org/TR/2000/REC-xhtml1-20000126/#docconf

the export of context is in fact just xml, and by tagging it as xhtml we
 can apply css to it; but if someone has a workflow for producing epub an
 option if to postprocess that xml file into whatever epub one wants


I could transform the ConTeXt-generated XML into strictly conforming XHTML,
but it was a step I was hoping to avoid. Right now my process is:

   1. Convert XML data to a ConTeXt .tex file.
   2. Convert ConTeXt to either PDF or EPUB.
   3. Stylize EPUB using CSS.

I want to use ConTeXt here (instead of going directly from XML data to
EPUB) because ConTeXt provides functionality such as multiple indexes,
table-of-contents, and bundling the .epub. Having an extra step to generate
strictly conforming XHTML is architecturally painful as it means
transforming the document three times (XML -> ConTeXt, ConTeXt -> XML, then
XML -> XHTML).


 Every time we look into epub there's another issue ... it's not a standard
 but a reverse-engineered application mess (happens often with xml: turn some
 application data structures into xml and call it a standard)


Some book vendors only accept validating EPUBs. ConTeXt is documented as
being able to generate EPUBs. The documentation should state the EPUBs do
not validate and do not generate strictly conforming XHTML.

I have spent the last three weeks converting documents from LaTeX to
ConTeXt because the documentation stated that ConTeXt can produce EPUBs.
While true, the documentation did not mention its shortcomings. Had I known
in advance, I probably would have gone straight to EPUB using Java or, with
a little revulsion, PHP classes. ;-) That said, I probably should have
tested this feature sooner. :-)

as i have no real use/demand for epub it's not something i look into on a
 daily basis


How can I help resolve these issues?

Merely testing (which I am happy to do) isn't going to produce a strictly
conforming XHTML document.

Kindest regards.

[NTG-context] EPUB XHTML Format

2013-09-03 Thread Thangalin

Hi,

The attached t.tex file produces the attached t.xhtml file. I have looked
at the following documents:

   - http://en.wikipedia.org/wiki/EPUB#Open_Publication_Structure_2.0.1
   - http://en.wikipedia.org/wiki/DTBook
   - http://www.idpf.org/epub/20/spec/OPS_2.0.1_draft.htm
   - http://www.w3.org/TR/xhtml11/doctype.html
   - http://www.w3.org/TR/html5/sections.html

It seems that the macros in t.tex are being written out as XML elements,
verbatim. It is my understanding that these XML elements, however, do not
conform to the minimal content models associated with XHTML 1.1.

What needs to happen to take a minimal ConTeXt file (such as the attached)
to produce a minimum viable EPUB that:

   - Generates XHTML headers (including <!DOCTYPE ...> and <html ...>)
   - Produces images as img tags, rather than float tags.
   - Uses typical XHTML tags for body elements (e.g., ol for ordered
   lists).

Ideally, I would like to do something such as:

   - context t.tex
   - mtxrun --script epub --make t.specification

to generate an EPUB that passes validation of epubcheck
<http://code.google.com/p/epubcheck/wiki/Library>, with an output XHTML
file that more closely matches the XHTML specification.
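
For reference, part of what a validator checks is the container itself, independent of the XHTML inside it. The Python sketch below shows only that outer layer, under stated assumptions: the mimetype entry is written first and uncompressed, followed by META-INF/container.xml pointing at a package file. The OEBPS/content.opf path and the build_epub helper are illustrative choices, not anything ConTeXt's epub script is known to use, and a real book still needs the OPF package file, a table of contents, and valid content documents before it would validate.

```python
import io
import zipfile

# Points the reader at the package file; the OEBPS/ path is a common
# convention and is assumed here.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(files):
    """Zip files (dict of archive path -> bytes) into an EPUB container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry must come first and must not be compressed.
        zf.writestr(zipfile.ZipInfo("mimetype"), "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        for path, data in files.items():
            zf.writestr(path, data, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


# Placeholder content only; a real book needs a complete XHTML document.
epub_bytes = build_epub({"OEBPS/t.xhtml": b"<html/>"})
print(len(epub_bytes), "bytes")
```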

How can I help?

Kind regards.


t.tex
Description: TeX document


t.xhtml
Description: application/xhtml


epub-errors.log
Description: Binary data
