Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-04 Thread Steve Litt
On Tue, 4 Feb 2014 07:34:24 +0100
Liviu Andronic landronim...@gmail.com wrote:

 On Tue, Feb 4, 2014 at 2:38 AM, Steve Litt
 sl...@troubleshooters.com wrote:

  I just created a brand new LyX file in 2.0.6, which is the packaged
  LyX for Ubuntu 13.10, and there wasn't a bit of XML in it, well
  formed or otherwise. It was basically the same format human
  parseable format it's been for 10 years, but with a lot more insets
  and options. But as far
 
 Quick question: What are your reasons / needs for being able to
 humanly parse the file format? Would a tool like pLyX address these
 needs: http://wiki.lyx.org/Examples/FindAndReplaceLyXFormatElements ?
 
 Liviu

Thanks for asking, Liviu. Let me answer your second question first:
From the URL you gave, I can't determine how much or how little pLyX
would help me. I can't even tell whether pLyX is runnable as a
command: a must for putting it into shellscripts. But either way, it's
not a substitute for parsability...

Which brings us to your first question: Why do I need a parsable native
format? The best answer I can give you is "general principle". Over
many years with many different programs on many different operating
systems, every time I had one of those programs emitting "magic
format" native formats, eventually I wished I could parse them.

I have tons of files in Micrografx Windows Draw format: best vector
graphics program ever. But it doesn't work in Linux, and in fact the
latest version came out last century. If I could parse it, I could
convert it to SVG and continue with Inkscape. But Noo, it's a trade
secret binary.

I had a Clarion (Rapid Application Development) app whose .app file
suffered a little bit of corruption. It would have been pretty
easy to tweak it back to life with a text editor, but it was a binary
file. I had to drop back to a three-day-old backup: I lost three days
of work.

I needed an automated tweak of an OpenOffice doc, so I could watermark
it before sending it to customers. But OpenOffice/LibreOffice docs are
groups of XML files that, if they were a database, would be considered
horribly denormalized. For all practical purposes, OpenOffice format
might as well be considered secret binary: I had to do it in (gulp), MS
PowerPoint binary.

A few weeks ago I upgraded my daily driver computer to Ubuntu 13.10, and
a few of my Gnumeric files wouldn't open in Gnumeric. The file format
was very complex XML I couldn't understand. I ended up taking the file
to a computer with a different Gnumeric version, exporting to MSExcel,
and then importing that in my daily driver's Gnumeric. If the XML had
been easy enough to understand, I could have used the old "repeatedly
remove half, see if the symptom went away" technique. But with XML,
especially the kind where a single fact is represented in several
branches and they all must agree (denormalized), you're likely to bust
it further, so it's not good for troubleshooting.

Now let's look at the other side of the equation: Times I've had an
accessible and parsable native format...

VimOutliner's (VO) native format is tab-indented ASCII. So far I've
created converters from VO to HTML, VO to Easy Menu Definition Language,
VO to Troubleshooters.Com Linux Library web page, and many, many more.
Others have created all sorts of extensions for VO; one guy made it into
a calendaring program, and a lot of these guys weren't professional
programmers.
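
As a rough illustration of why a tab-indented format is so easy to convert, here is a minimal sketch (not the actual VO to HTML converter) that turns a tab-indented outline into nested HTML lists. The function name is made up for this example, and it assumes indentation only ever deepens by one tab at a time.

```python
from html import escape


def outline_to_html(text):
    """Convert a tab-indented outline into nested <ul> lists.

    Hypothetical helper: assumes each line is indented at most one
    tab deeper than the line before it.
    """
    out = []
    open_lists = 0  # number of <ul> elements currently open
    for raw in text.splitlines():
        if not raw.strip():
            continue  # ignore blank lines
        level = len(raw) - len(raw.lstrip("\t"))  # leading tabs = depth
        if level + 1 > open_lists:
            # Going deeper: open a new list inside the current item.
            while open_lists < level + 1:
                out.append("<ul>")
                open_lists += 1
        else:
            # Same depth or shallower: close the previous item, plus
            # any deeper lists (and their parent items) still open.
            out.append("</li>")
            while open_lists > level + 1:
                out.append("</ul>")
                out.append("</li>")
                open_lists -= 1
        out.append("<li>" + escape(raw.strip()))
    # Close whatever is still open at the end of the outline.
    while open_lists > 0:
        out.append("</li>")
        out.append("</ul>")
        open_lists -= 1
    return "".join(out)


print(outline_to_html("A\n\tB\nC"))
```

The whole parse is a tab count per line, which is the kind of thing a non-professional programmer can write in an afternoon.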

Xhtml is easily parsable with Python's lxml.etree parser. So I used it
to convert a Bluefish-created Xhtml file into ePub. The hardest part of
the job was understanding the ePub spec, complete with device
idiosyncrasies. 
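
To give a flavor of how little code the parsing side takes, here is a minimal sketch that pulls the headings out of an Xhtml document, the raw material for an ePub table of contents. It is not the converter described above: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml.etree (the two share most of their API), and the function name and sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def list_headings(xhtml_text):
    """Return (level, text) pairs for every h1..h6 in document order."""
    root = ET.fromstring(xhtml_text)
    headings = []
    for elem in root.iter():
        # Namespaced tags look like "{http://www.w3.org/1999/xhtml}h1";
        # keep only the local name after the closing brace.
        local = elem.tag.rsplit("}", 1)[-1]
        if len(local) == 2 and local[0] == "h" and local[1] in "123456":
            text = "".join(elem.itertext()).strip()
            headings.append((int(local[1]), text))
    return headings


# Made-up sample document, well formed and without a DOCTYPE.
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<h1>Intro</h1><p>Text</p><h2>Details <em>here</em></h2>"
    "</body></html>"
)
print(list_headings(sample))
```

This only works because the input is well formed; a document with undeclared entities or unclosed tags would be rejected by the parser.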

About 5 to 10 times in the 12 years I've used LyX, there have been LyX
files that couldn't be opened in LyX. No problem, I edited them in Vim,
did the old "repeatedly remove half, see if the symptom went away"
technique, eventually finding that one factor that prevented opening.
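
The "remove half" technique can be sketched as a plain binary search over the lines of the file. This is only an illustration under two assumptions: exactly one line causes the symptom, and the file can still be tested after lines are removed. The opens_ok callback is hypothetical; in practice it would be "try to open the file", and in a real LyX file the cuts would have to fall on boundaries that keep begin/end pairs matched.

```python
def find_culprit(lines, opens_ok):
    """Return the index of the single line whose presence makes
    opens_ok(lines) fail. opens_ok is a hypothetical callback that
    reports whether a list of lines "opens" without the symptom."""
    lo, hi = 0, len(lines)  # the culprit is somewhere in lines[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Remove the first half of the suspect range, keep everything else.
        candidate = lines[:lo] + lines[mid:]
        if opens_ok(candidate):
            hi = mid  # symptom went away: culprit was in the removed half
        else:
            lo = mid  # symptom persists: culprit is in the half we kept
    return lo


lines = ["a", "b", "BAD", "c", "d"]
print(find_culprit(lines, lambda ls: "BAD" not in ls))
```

Each round halves the suspect range, so even a file of several thousand lines needs only a dozen or so tests.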

My (LyX created) instructor notes have a slide-by-slide explanation of
the accompanying (PowerPoint) presentation. From time to time I change
the slides or slide order, so I really don't want to hard-code numbers
into the Instructor Notes: I want the slides to be numbered on the fly
in the Instructor Notes. At the time I first made this, I didn't know
enough about LaTeX counters and LaTeX commands (and LyX didn't have
char styles with which to implement the commands anyway) to do it that
way. So instead, I built a preprocessor that went through my file, and
every time it found blue text, it assigned an incremented number within
LyX itself.
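
A preprocessor of that kind could look roughly like the sketch below. It is not the original script. It assumes that blue text appears in the LyX file as a line reading \color blue followed by the text on a later line, and that "assigning a number" means prefixing that text with a running counter; both are assumptions made for illustration.

```python
def number_blue_text(lines):
    """Prefix the first text line after each "\\color blue" marker
    with an incrementing number. The marker layout is an assumption
    about the file format, not something taken from the LyX sources."""
    out = []
    counter = 0
    pending = False  # True once a blue marker is seen and not yet numbered
    for line in lines:
        if line.strip() == "\\color blue":
            pending = True
        elif pending and line.strip() and not line.startswith("\\"):
            # First plain-text line after the marker gets the number.
            counter += 1
            line = f"{counter}. {line}"
            pending = False
        out.append(line)
    return out


sample = [
    "\\color blue", "Slide title", "\\color inherit",
    "plain", "\\color blue", "Next slide",
]
print("\n".join(number_blue_text(sample)))
```

Because the numbers are assigned on every run, reordering slides in the source file renumbers everything automatically.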

Most of my books are written with LyX. With eBooks, I hate DRM but want
my recipients to understand it's not cool to make unauthorized copies
of my books, because I feed my family by selling those books. So I
personalize every book with the person's name: The footer says "This
book created expressly for John Q Public" or whatever the guy's name
is. Doing this was trivial: The LyX file has a command called Licensee,
and I have a script that copies the LyX file to a 

Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Richard Heck

On 02/03/2014 01:53 AM, Liviu Andronic wrote:

Dear Steve and Alex,

On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt sl...@troubleshooters.com wrote:

On Fri, 31 Jan 2014 22:33:02 +0100
Alex Fernandez ely...@gmail.com wrote:

Ladies and gentlemen, if the preceding paragraph doesn't convince
us we need a good, solid, LyX to ePub and LyX to Mobi conversion



Last year we actually had a GSoC project specifically dealing with
ePub. Josh and Richard made progress on this front, and the code
simply awaits someone with the motivation and the skills to finish the
job. The almost finished feature is available in several GIT branches
here: http://git.lyx.org/?p=gsoc.git;a=summary .


Actually, mostly what this awaits is the release of 2.1, since the main 
work left is to integrate this into trunk. There are still bugs to be 
fixed, of course, in the underlying XHTML export routines, but that's 
always true. Most of the ones Steve mentioned earlier have already been 
fixed.


Richard



Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 03 Feb 2014 10:05:57 -0500
Richard Heck rgh...@lyx.org wrote:

 On 02/03/2014 01:53 AM, Liviu Andronic wrote:
  Dear Steve and Alex,
 
  On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt
  sl...@troubleshooters.com wrote:
  On Fri, 31 Jan 2014 22:33:02 +0100
  Alex Fernandez ely...@gmail.com wrote:
  Ladies and gentlemen, if the preceding paragraph doesn't convince
  us we need a good, solid, LyX to ePub and LyX to Mobi conversion
 
  Last year we actually had a GSoC project specifically dealing with
  ePub. Josh and Richard made progress on this front, and the code
  simply awaits someone with the motivation and the skills to finish
  the job. The almost finished feature is available in several GIT
  branches here: http://git.lyx.org/?p=gsoc.git;a=summary .
 
 Actually, mostly what this awaits is the release of 2.1, since the
 main work left is to integrate this into trunk. There are still bugs
 to be fixed, of course, in the underlying XHTML export routines, but
 that's always true. Most of the ones Steve mentioned earlier have
 already been fixed.
 
 Richard

The instant 2.1 comes out, I'll compile it separate from my
Ubuntu-provided LyX, and try it out.

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 3 Feb 2014 07:53:49 +0100
Liviu Andronic landronim...@gmail.com wrote:

 Dear Steve and Alex,
 
 On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt
 sl...@troubleshooters.com wrote:
  On Fri, 31 Jan 2014 22:33:02 +0100
  Alex Fernandez ely...@gmail.com wrote:
   Ladies and gentlemen, if the preceding paragraph doesn't convince
   us we need a good, solid, LyX to ePub and LyX to Mobi conversion
 
 
 Last year we actually had a GSoC project specifically dealing with
 ePub. Josh and Richard made progress on this front, 

Did the progress include an Xhtml export that preserved semantics and
respected HTML semantics such as using <h?> for leveled headings, <p>
for paragraphs, <pre> for lyx-code and environments derived from
it, and <span> for character styles? If you got that far and can write
it to disk, I can take it from there.

 and the code
 simply awaits someone with the motivation and the skills to finish the
 job. The almost finished feature is available in several GIT branches
 here: http://git.lyx.org/?p=gsoc.git;a=summary .

If I ever get a day or two to familiarize myself with it, I will. But
of course, I'm really lousy at C++, and in fact hate C++.

I think there are several missed opportunities:

1) If LyX insists on having a native format of XML, which by its nature
   is human-confusing, it should at least be *well formed* XML so that
   it can be handled by an XML parser, and if the programmer is more
   skilled than I, XSLT. The day LyX native format is well formed, and
   maybe even with a DTD, I'll make the converter: me, myself and I.

2) The ePub converter shouldn't be a part of LyX: Most of it should be
   standalone. There's no reason for LyX to need to know *anything*
   about ePub. Just provide a sane Xhtml export, one that doesn't
   substitute <div> for <p>, or <p> for <h?>, and calls a <pre> a
   <pre>. Don't throw away the semantics: Don't convert styles to
   appearances, and don't use <div> instead of HTML structural elements
   like <p>, <pre>, and <h?>. Give me that, and I can do the rest.

3) I honestly don't think you're ever going to get one programmer to do
   this whole thing. The likelihood is infinitesimal of finding one
   person who: A) has a burning ePub itch to scratch and B) is great at
   C++ and C) Is familiar with LyX's code base. The way to do this
   is as a pipeline, where person A exports what person B needs, and
   person B converts that to what person C needs, etc. The beauty of
   this would be that, as LyX changes, only person A would need to
   change his code, unless, of course, the rest of the chain needs an
   enhancement. The way I envision it would be either that person A
   would be the person who makes a sane Xhtml export, and person B is
   me. Or, perhaps, if LyX native format is ever well formed and valid
   XML, Person A is me, converting LyX to an Xhtml subset, and person B
   is me, converting the subset to ePub. That way, as LyX changed, I'd
   need to change only the first program.

4) Back when Josh was doing this, neither I nor Rob Oakes knew what we
   know about ePub today. If we had, we could have given Josh and his
   crew much better guidance about what was needed. If anybody wants to
   continue on the road to LyX-ePub, I suggest that person(s) ask us
   lots of questions.
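
To make point 2 concrete, here is a minimal sketch of what "keeping the semantics" could mean for the first stage of such a pipeline. The mapping table is hypothetical: the layout names are ordinary LyX ones, but which HTML element each maps to is a choice made for this example, not something LyX's exporter is known to do.

```python
from html import escape

# Hypothetical mapping from LyX layout names to HTML structural elements.
TAG_FOR_LAYOUT = {
    "Chapter": "h1",
    "Section": "h2",
    "Subsection": "h3",
    "Standard": "p",
    "LyX-Code": "pre",
}


def semantic_element(layout, text):
    """Render one paragraph as a structural element instead of a div."""
    tag = TAG_FOR_LAYOUT.get(layout)
    if tag is None:
        # Unknown style: keep its name as a class rather than turning
        # it into an appearance, so a later stage can still decide.
        return f'<p class="{escape(layout, quote=True)}">{escape(text)}</p>'
    return f"<{tag}>{escape(text)}</{tag}>"


print(semantic_element("Section", "Tools & Tips"))
print(semantic_element("LyX-Code", "x < y"))
print(semantic_element("Quote", "Hi"))
```

With output like this, the next program in the chain never needs to know anything about LyX, only about headings, paragraphs, and preformatted blocks.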

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Richard Heck

On 02/03/2014 11:36 AM, Steve Litt wrote:

On Mon, 03 Feb 2014 10:05:57 -0500
Richard Heck rgh...@lyx.org wrote:


On 02/03/2014 01:53 AM, Liviu Andronic wrote:

Dear Steve and Alex,

On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt
sl...@troubleshooters.com wrote:

On Fri, 31 Jan 2014 22:33:02 +0100
Alex Fernandez ely...@gmail.com wrote:

Ladies and gentlemen, if the preceding paragraph doesn't convince
us we need a good, solid, LyX to ePub and LyX to Mobi conversion

Last year we actually had a GSoC project specifically dealing with
ePub. Josh and Richard made progress on this front, and the code
simply awaits someone with the motivation and the skills to finish
the job. The almost finished feature is available in several GIT
branches here: http://git.lyx.org/?p=gsoc.git;a=summary .

Actually, mostly what this awaits is the release of 2.1, since the
main work left is to integrate this into trunk. There are still bugs
to be fixed, of course, in the underlying XHTML export routines, but
that's always true. Most of the ones Steve mentioned earlier have
already been fixed.

Richard

The instant 2.1 comes out, I'll compile it separate from my
Ubuntu-provided LyX, and try it out.


Sorry, to be clear: The ePub stuff will not be in 2.1. It came too late 
for that. It will go into trunk shortly after 2.1 and be in 2.2, which 
we expect to follow 2.1 much more quickly than usual.


Richard



Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance




Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Richard Heck

On 02/03/2014 11:34 AM, Steve Litt wrote:

On Mon, 3 Feb 2014 07:53:49 +0100
Liviu Andronic landronim...@gmail.com wrote:


Dear Steve and Alex,

On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt
sl...@troubleshooters.com wrote:

On Fri, 31 Jan 2014 22:33:02 +0100
Alex Fernandez ely...@gmail.com wrote:

Ladies and gentlemen, if the preceding paragraph doesn't convince
us we need a good, solid, LyX to ePub and LyX to Mobi conversion



Last year we actually had a GSoC project specifically dealing with
ePub. Josh and Richard made progress on this front,

Did the progress include an Xhtml export that preserved semantics and
respected HTML semantics such as using h? for leveled headings, p
for paragraphs, and pre for lyx-code and environments derived from
it, and span for character styles? If you got that far and can write
it to disk, I can take it from there.


and the code
simply awaits someone with the motivation and the skills to finish the
job. The almost finished feature is available in several GIT branches
here: http://git.lyx.org/?p=gsoc.git;a=summary .

If I ever get a day or two to familiarize myself with it, I will. But
of course, I'm really lousy at C++, and in fact hate C++.

I think there are several missed opportunities:

1) If LyX insists on having a native format of XML, which by its nature
is human-confusing, it should at least be *well formed* XML so that
it can be handled by an XML parser, and if the programmer is more
skilled than I, XSLT. The day LyX native format is well formed, and
maybe even with a DTD, I'll make the converter: me, myself and I.


Unclear if this will ever happen. I started the project, but then got 
busy with other things. It is pretty monumental. But the goal would be 
to use Qt's built-in XML writing and reading, which means it will be 
well-formed.



2) The ePub converter shouldn't be a part of LyX: Most of it should be
standalone. There's no reason for LyX to need to know *anything*
about ePub. Just provide a sane Xhtml export, one that doesn't
substitute div for p, or p for h?, and calls a pre a
pre. Don't throw away the semantics: Don't convert styles to
appearances, and don't use div instead of HTML structural elements
like p, pre, and h?. Give me that, and I can do the rest.


That is what Josh did last summer. The XHTML-to-ePub converter is 
written in Python.


Richard



Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Liviu Andronic
On Mon, Feb 3, 2014 at 8:02 PM, Richard Heck rgh...@lyx.org wrote:
 1) If LyX insists on having a native format of XML, which by its nature
 is human-confusing, it should at least be *well formed* XML so that
 it can be handled by an XML parser, and if the programmer is more
 skilled than I, XSLT. The day LyX native format is well formed, and
 maybe even with a DTD, I'll make the converter: me, myself and I.


 Unclear if this will ever happen. I started the project, but then got busy
 with other things. It is pretty monumental. But the goal would be to use
 Qt's built-in XML writing and reading, which means it will be well-formed.

The advantages to having an XML-based native format are numerous:
proper diff of LyX files; more standardized converters to and from
other XML-based formats; tools like pLyX would likely benefit from
this standardization, too. Some of our long-standing issues would be
addressed by such a change, so well worth a try I'd say.

Liviu


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 3 Feb 2014 20:39:09 +0100
Liviu Andronic landronim...@gmail.com wrote:

 On Mon, Feb 3, 2014 at 8:02 PM, Richard Heck rgh...@lyx.org wrote:
  1) If LyX insists on having a native format of XML, which by its
  nature is human-confusing, it should at least be *well formed* XML
  so that it can be handled by an XML parser, and if the programmer
  is more skilled than I, XSLT. The day LyX native format is well
  formed, and maybe even with a DTD, I'll make the converter: me,
  myself and I.
 
 
  Unclear if this will ever happen. I started the project, but then
  got busy with other things. It is pretty monumental. But the goal
  would be to use Qt's built-in XML writing and reading, which means
  it will be well-formed.
 
 The advantages to having an XML-based native format are numerous:
 proper diff of LyX files; more standardized converters to and from
 other XML-based formats; tools like pLyX would likely benefit from
 this standardization, too. Some of our long-standing issues would be
 addressed by such a change, so well worth a try I'd say.

Liviu,

You *do* mean *well formed* XML-based native format, right?

Yeah, we might as well: We've already done about all the human
readability damage we can do to the format. If it were well formed, at
least it would be simple to make a pretty-print for it.

If we *do* get well formed XML for LyX files, at the same time we
should probably make layout files XML too. Each paragraph style and
character style (do we really still want to call paragraph styles by
the LaTeX-centric name "environment"?) would have one section
describing its appearance in LyX, and sections describing its
appearance in each of several other output formats. The one for LaTeX
would be LaTeX. The one for Xhtml would be CSS. If it were done like
this, an arbitrary conversion program for a new format could just
list the styles, show the file with those styles applied, and allow the
converter program to supply its own definitions of each style.
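
As a sketch of the idea only: the XML below is an invented layout format, not anything LyX reads or writes, and every element and attribute name in it is hypothetical. It shows how a converter could list the styles and pick out the definitions for its own output format, with None marking the styles it would have to define itself.

```python
import xml.etree.ElementTree as ET

# Invented example of an XML layout file; element and attribute names
# are hypothetical.
LAYOUT_XML = """
<layouts>
  <style name="Chapter" kind="paragraph">
    <output format="latex">\\chapter{#1}</output>
    <output format="xhtml">h1 { font-size: 2em; }</output>
  </style>
  <style name="Emphasis" kind="character">
    <output format="xhtml">span.emphasis { font-style: italic; }</output>
  </style>
</layouts>
"""


def definitions_for(layout_xml, fmt):
    """Map each style name to its definition for one output format,
    or to None when the layout file has no section for that format."""
    root = ET.fromstring(layout_xml)
    result = {}
    for style in root.findall("style"):
        name = style.get("name")
        result[name] = None
        for output in style.findall("output"):
            if output.get("format") == fmt:
                result[name] = output.text
    return result


print(definitions_for(LAYOUT_XML, "xhtml"))
print(definitions_for(LAYOUT_XML, "latex"))
```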

This would go a long way toward changing the mission statement of LyX
from "a front end for LaTeX" to "the fastest and easiest way to author
absolutely anything".

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 3 Feb 2014 20:20:03 -0500
Steve Litt sl...@troubleshooters.com wrote:


 Liviu,
 
 You *do* mean *well formed* XML-based native format, right?
 
 Yeah, we might as well: We've already done about all the human
 readability damage we can do to the format. If it were well formed, at
 least it would be simple to make a pretty-print for it.

My apologies, everyone, the preceding is a false statement on my part.

I just created a brand new LyX file in 2.0.6, which is the packaged LyX
for Ubuntu 13.10, and there wasn't a bit of XML in it, well formed or
otherwise. It was basically the same human parseable format it's
been for 10 years, but with a lot more insets and options. But as far
as I can tell, the newlines in the format are meaningful, meaning one
can do a line by line parse, although with today's proliferation of
great features, many implemented as insets, the parse isn't as simple
as it was in 2005.
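
For what it's worth, a line-by-line parse can be as small as the sketch below, which counts how often each paragraph layout is used by looking only at \begin_layout lines. The function name and the sample text are made up; a real file has a header and nested insets that this ignores.

```python
from collections import Counter

PREFIX = "\\begin_layout "


def count_layouts(lyx_text):
    """Count paragraph layouts by scanning for \\begin_layout lines."""
    counts = Counter()
    for line in lyx_text.splitlines():
        if line.startswith(PREFIX):
            counts[line[len(PREFIX):].strip()] += 1
    return counts


# Made-up fragment in the style of a LyX file body.
sample = (
    "\\begin_layout Title\nMy Book\n\\end_layout\n"
    "\\begin_layout Standard\nHello\n\\end_layout\n"
    "\\begin_layout Standard\nWorld\n\\end_layout\n"
)
print(count_layouts(sample))
```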

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Liviu Andronic
On Tue, Feb 4, 2014 at 2:38 AM, Steve Litt sl...@troubleshooters.com wrote:
 On Mon, 3 Feb 2014 20:20:03 -0500
 Steve Litt sl...@troubleshooters.com wrote:


 Liviu,

 You *do* mean *well formed* XML-based native format, right?

 Yeah, we might as well: We've already done about all the human
 readability damage we can do to the format. If it were well formed, at
 least it would be simple to make a pretty-print for it.

 My apologies, everyone, the preceding is a false statement on my part.

 I just created a brand new LyX file in 2.0.6, which is the packaged LyX
 for Ubuntu 13.10, and there wasn't a bit of XML in it, well formed or
 otherwise. It was basically the same format human parseable format it's
 been for 10 years, but with a lot more insets and options. But as far

Quick question: What are your reasons / needs for being able to
humanly parse the file format? Would a tool like pLyX address these
needs: http://wiki.lyx.org/Examples/FindAndReplaceLyXFormatElements ?

Liviu

 as I can tell, the newlines in the format are meaningful, meaning one
 can do a line by line parse, although with today's proliferation of
 great features, many implemented as insets, the parse isn't as simple
 as it was in 2005.

 SteveT

 Steve Litt*  http://www.troubleshooters.com/
 Troubleshooting Training  *  Human Performance



-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 3 Feb 2014 07:53:49 +0100
Liviu Andronic landronim...@gmail.com wrote:

 Dear Steve and Alex,
 
 On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt
 sl...@troubleshooters.com wrote:
  On Fri, 31 Jan 2014 22:33:02 +0100
  Alex Fernandez ely...@gmail.com wrote:
   Ladies and gentlemen, if the preceding paragraph doesn't convince
   us we need a good, solid, LyX to ePub and LyX to Mobi conversion
 
 
 Last year we actually had a GSoC project specifically dealing with
 ePub. Josh and Richard made progress on this front, 

Did the progress include an Xhtml export that preserved semantics and
respected HTML semantics such as using h? for leveled headings, p
for paragraphs, and pre for lyx-code and environments derived from
it, and span for character styles? If you got that far and can write
it to disk, I can take it from there.

 and the code
 simply awaits someone with the motivation and the skills to finish the
 job. The almost finished feature is available in several GIT branches
 here: http://git.lyx.org/?p=gsoc.git;a=summary .

If I ever get a day or two to familiarize myself with it, I will. But
of course, I'm really lousy at C++, and in fact hate C++.

I think there are several missed opportunities:

1) If LyX insists on having a native format of XML, which by its nature
   is human-confusing, it should at least be *well formed* XML so that
   it can be handled by an XML parser, and if the programmer is more
   skilled than I, XSLT. The day LyX native format is well formed, and
   maybe even with a DTD, I'll make the converter: me, myself and I.

2) The ePub converter shouldn't be a part of LyX: Most of it should be
   standalone. There's no reason for LyX to need to know *anything*
   about ePub. Just provide a sane Xhtml export, one that doesn't
   substitute div for p, or p for h?, and calls a pre a
   pre. Don't throw away the semantics: Don't convert styles to
   appearances, and don't use div instead of HTML structural elements
   like p, pre, and h?. Give me that, and I can do the rest.

3) I honestly don't think you're ever going to get one programmer to do
   this whole thing. The likelihood is infinitesimal of finding one
   person who: A) has a burning ePub itch to scratch and B) is great at
   C++ and C) Is familiar with LyX's code base. The way to do this
   is as a pipeline, where person A exports what person B needs, and
   person B converts that to what person C needs, etc. The beauty of
   this would be is that, as LyX changes, only person A would need to
   change his code, unless, of course, the rest of the chain needs an
   enhancement. The way I envision it would be either that person A
   would be the person who makes a sane Xhtml export, and person B is
   me. Or, perhaps, if LyX native format is ever well formed and valid
   XML, Person A is me, converting LyX to an Xhtml subset, and person B
   is me, converting the subset to ePub. That way, as LyX changed, I'd
   need to change only the first program.

4) Back when Josh was doing this, neither I nor Rob Oakes knew what we
   know about ePub today. If we had, we could have given Josh and his
   crew much better guidance about what was needed. If anybody wants to
   continue on the road to LyX-ePub, I suggest that person(s) ask us
   lots of questions.

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Richard Heck

On 02/03/2014 11:36 AM, Steve Litt wrote:

On Mon, 03 Feb 2014 10:05:57 -0500
Richard Heck rgh...@lyx.org wrote:


On 02/03/2014 01:53 AM, Liviu Andronic wrote:

Dear Steve and Alex,

On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt
sl...@troubleshooters.com wrote:

On Fri, 31 Jan 2014 22:33:02 +0100
Alex Fernandez ely...@gmail.com wrote:

Ladies and gentlemen, if the preceding paragraph doesn't convince
us we need a good, solid, LyX to ePub and LyX to Mobi conversion

Last year we actually had a GSoC project specifically dealing with
ePub. Josh and Richard made progress on this front, and the code
simply awaits someone with the motivation and the skills to finish
the job. The almost finished feature is available in several GIT
branches here: http://git.lyx.org/?p=gsoc.git;a=summary .

Actually, mostly what this awaits is the release of 2.1, since the
main work left is to integrate this into trunk. There are still bugs
to be fixed, of course, in the underlying XHTML export routines, but
that's always true. Most of the ones Steve mentioned earlier have
already been fixed.

Richard

The instant 2.1 comes out, I'll compile it separate from my
Ubuntu-provided LyX, and try it out.


Sorry, to be clear: The ePub stuff will not be in 2.1. It came too late 
for that. It will go into trunk shortly after 2.1 and be in 2.2, which 
we expect to follow 2.1 much more quickly than usual.


Richard



Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance




Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Richard Heck

On 02/03/2014 11:34 AM, Steve Litt wrote:

On Mon, 3 Feb 2014 07:53:49 +0100
Liviu Andronic landronim...@gmail.com wrote:


Dear Steve and Alex,

On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt
sl...@troubleshooters.com wrote:

On Fri, 31 Jan 2014 22:33:02 +0100
Alex Fernandez ely...@gmail.com wrote:

Ladies and gentlemen, if the preceding paragraph doesn't convince
us we need a good, solid, LyX to ePub and LyX to Mobi conversion



Last year we actually had a GSoC project specifically dealing with
ePub. Josh and Richard made progress on this front,

Did the progress include an Xhtml export that preserved semantics and
respected HTML semantics such as using h? for leveled headings, p
for paragraphs, and pre for lyx-code and environments derived from
it, and span for character styles? If you got that far and can write
it to disk, I can take it from there.


and the code
simply awaits someone with the motivation and the skills to finish the
job. The almost finished feature is available in several GIT branches
here: http://git.lyx.org/?p=gsoc.git;a=summary .

If I ever get a day or two to familiarize myself with it, I will. But
of course, I'm really lousy at C++, and in fact hate C++.

I think there are several missed opportunities:

1) If LyX insists on having a native format of XML, which by its nature
is human-confusing, it should at least be *well formed* XML so that
it can be handled by an XML parser, and if the programmer is more
skilled than I, XSLT. The day LyX native format is well formed, and
maybe even with a DTD, I'll make the converter: me, myself and I.


Unclear if this will ever happen. I started the project, but then got 
busy with other things. It is pretty monumental. But the goal would be 
to use Qt's built-in XML writing and reading, which means it will be 
well-formed.



2) The ePub converter shouldn't be a part of LyX: Most of it should be
standalone. There's no reason for LyX to need to know *anything*
about ePub. Just provide a sane Xhtml export, one that doesn't
substitute div for p, or p for h?, and calls a pre a
pre. Don't throw away the semantics: Don't convert styles to
appearances, and don't use div instead of HTML structural elements
like p, pre, and h?. Give me that, and I can do the rest.


That is what Josh did last summer. The XHTML -- ePub converter is 
written in Python.


Richard



Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Liviu Andronic
On Mon, Feb 3, 2014 at 8:02 PM, Richard Heck rgh...@lyx.org wrote:
 1) If LyX insists on having a native format of XML, which by its nature
 is human-confusing, it should at least be *well formed* XML so that
 it can be handled by an XML parser, and if the programmer is more
 skilled than I, XSLT. The day LyX native format is well formed, and
 maybe even with a DTD, I'll make the converter: me, myself and I.


 Unclear if this will ever happen. I started the project, but then got busy
 with other things. It is pretty monumental. But the goal would be to use
 Qt's built-in XML writing and reading, which means it will be well-formed.

The advantages to having an XML-based native format are numerous:
proper diff of LyX files; more standardized converters to and from
other XML-based formats; tools like pLyX would likely benefit from
this standardization, too. Some of our long-standing issues would be
addressed by such a change, so well worth a try I'd say.

Liviu


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 3 Feb 2014 20:39:09 +0100
Liviu Andronic landronim...@gmail.com wrote:

 On Mon, Feb 3, 2014 at 8:02 PM, Richard Heck rgh...@lyx.org wrote:
  1) If LyX insists on having a native format of XML, which by its
  nature is human-confusing, it should at least be *well formed* XML
  so that it can be handled by an XML parser, and if the programmer
  is more skilled than I, XSLT. The day LyX native format is well
  formed, and maybe even with a DTD, I'll make the converter: me,
  myself and I.
 
 
  Unclear if this will ever happen. I started the project, but then
  got busy with other things. It is pretty monumental. But the goal
  would be to use Qt's built-in XML writing and reading, which means
  it will be well-formed.
 
 The advantages to having an XML-based native format are numerous:
 proper diff of LyX files; more standardized converters to and from
 other XML-based formats; tools like pLyX would likely benefit from
 this standardization, too. Some of our long-standing issues would be
 addressed by such a change, so well worth a try I'd say.

Liviu,

You *do* mean *well formed* XML-based native format, right?

Yeah, we might as well: We've already done about all the human
readability damage we can do to the format. If it were well formed, at
least it would be simple to make a pretty-print for it.

If we *do* get well formed XML for LyX files, at the same time we
should probably make layout files XML too.  Each paragraph character and
character style (do we really still want to call paragraph styles by
the LaTeX centric name environment?) would have one section
describing its appearance in LyX, and sections describing its
appearance in each of several other output formats. The one for LaTeX
would be LaTeX. The one for Xhtml would be CSS. If it were done like
this, an arbitrary conversion program for a new format could just
list the styles, show the file with those styles applied, and allow the
converter program to supply its own definitions of each style.

This would go a long way toward changing the mission statement of LyX
from a front end for LaTeX to the fastest and easiest way to author
absolutely anything.

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 3 Feb 2014 20:20:03 -0500
Steve Litt sl...@troubleshooters.com wrote:


 Liviu,
 
 You *do* mean *well formed* XML-based native format, right?
 
 Yeah, we might as well: We've already done about all the human
 readability damage we can do to the format. If it were well formed, at
 least it would be simple to make a pretty-print for it.

My apologies, everyone, the preceding is a false statement on my part.

I just created a brand new LyX file in 2.0.6, which is the packaged LyX
for Ubuntu 13.10, and there wasn't a bit of XML in it, well formed or
otherwise. It was basically the same format human parseable format it's
been for 10 years, but with a lot more insets and options. But as far
as I can tell, the newlines in the format are meaningful, meaning one
can do a line by line parse, although with today's proliferation of
great features, many implemented as insets, the parse isn't as simple
as it was in 2005.

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Liviu Andronic
On Tue, Feb 4, 2014 at 2:38 AM, Steve Litt sl...@troubleshooters.com wrote:
 On Mon, 3 Feb 2014 20:20:03 -0500
 Steve Litt sl...@troubleshooters.com wrote:


 Liviu,

 You *do* mean *well formed* XML-based native format, right?

 Yeah, we might as well: We've already done about all the human
 readability damage we can do to the format. If it were well formed, at
 least it would be simple to make a pretty-print for it.

 My apologies, everyone, the preceding is a false statement on my part.

 I just created a brand new LyX file in 2.0.6, which is the packaged LyX
 for Ubuntu 13.10, and there wasn't a bit of XML in it, well formed or
 otherwise. It was basically the same format human parseable format it's
 been for 10 years, but with a lot more insets and options. But as far

Quick question: What are your reasons / needs for being able to
humanly parse the file format? Would a tool like pLyX address these
needs: http://wiki.lyx.org/Examples/FindAndReplaceLyXFormatElements ?

Liviu

 as I can tell, the newlines in the format are meaningful, meaning one
 can do a line by line parse, although with today's proliferation of
 great features, many implemented as insets, the parse isn't as simple
 as it was in 2005.

 SteveT

 Steve Litt*  http://www.troubleshooters.com/
 Troubleshooting Training  *  Human Performance



-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Richard Heck

On 02/03/2014 01:53 AM, Liviu Andronic wrote:

Dear Steve and Alex,

On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt  wrote:

On Fri, 31 Jan 2014 22:33:02 +0100
Alex Fernandez  wrote:

Ladies and gentlemen, if the preceding paragraph doesn't convince
us we need a good, solid, LyX to ePub and LyX to Mobi conversion



Last year we actually had a GSoC project specifically dealing with
ePub. Josh and Richard made progress on this front, and the code
simply awaits someone with the motivation and the skills to finish the
job. The almost finished feature is available in several GIT branches
here: http://git.lyx.org/?p=gsoc.git;a=summary .


Actually, mostly what this awaits is the release of 2.1, since the main 
work left is to integrate this into trunk. There are still bugs to be 
fixed, of course, in the underlying XHTML export routines, but that's 
always true. Most of the ones Steve mentioned earlier have already been 
fixed.


Richard



Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 03 Feb 2014 10:05:57 -0500
Richard Heck  wrote:

> On 02/03/2014 01:53 AM, Liviu Andronic wrote:
> > Dear Steve and Alex,
> >
> > On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt
> >  wrote:
> >> On Fri, 31 Jan 2014 22:33:02 +0100
> >> Alex Fernandez  wrote:
> >>>> Ladies and gentlemen, if the preceding paragraph doesn't convince
> >>>> us we need a good, solid, LyX to ePub and LyX to Mobi conversion
> >>
> > Last year we actually had a GSoC project specifically dealing with
> > ePub. Josh and Richard made progress on this front, and the code
> > simply awaits someone with the motivation and the skills to finish
> > the job. The almost finished feature is available in several GIT
> > branches here: http://git.lyx.org/?p=gsoc.git;a=summary .
> 
> Actually, mostly what this awaits is the release of 2.1, since the
> main work left is to integrate this into trunk. There are still bugs
> to be fixed, of course, in the underlying XHTML export routines, but
> that's always true. Most of the ones Steve mentioned earlier have
> already been fixed.
> 
> Richard

The instant 2.1 comes out, I'll compile it separate from my
Ubuntu-provided LyX, and try it out.

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 3 Feb 2014 07:53:49 +0100
Liviu Andronic  wrote:

> Dear Steve and Alex,
> 
> On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt
>  wrote:
> > On Fri, 31 Jan 2014 22:33:02 +0100
> > Alex Fernandez  wrote:
> >> > Ladies and gentlemen, if the preceding paragraph doesn't convince
> >> > us we need a good, solid, LyX to ePub and LyX to Mobi conversion
> >
> >
> Last year we actually had a GSoC project specifically dealing with
> ePub. Josh and Richard made progress on this front, 

Did the progress include an Xhtml export that preserved semantics and
respected HTML semantics such as using <h?> for leveled headings, <p>
for paragraphs, and <pre> for lyx-code and environments derived from
it, and <span> for character styles? If you got that far and can write
it to disk, I can take it from there.

> and the code
> simply awaits someone with the motivation and the skills to finish the
> job. The almost finished feature is available in several GIT branches
> here: http://git.lyx.org/?p=gsoc.git;a=summary .

If I ever get a day or two to familiarize myself with it, I will. But
of course, I'm really lousy at C++, and in fact hate C++.

I think there are several missed opportunities:

1) If LyX insists on having a native format of XML, which by its nature
   is human-confusing, it should at least be *well formed* XML so that
   it can be handled by an XML parser, and if the programmer is more
   skilled than I, XSLT. The day LyX native format is well formed, and
   maybe even with a DTD, I'll make the converter: me, myself and I.

2) The ePub converter shouldn't be a part of LyX: Most of it should be
   standalone. There's no reason for LyX to need to know *anything*
   about ePub. Just provide a sane Xhtml export, one that doesn't
   substitute <div> for <p>, or <p> for <h?>, and calls a <pre> a
   <pre>. Don't throw away the semantics: Don't convert styles to
   appearances, and don't use <div> instead of HTML structural elements
   like <p>, <pre>, and <h?>. Give me that, and I can do the rest.

3) I honestly don't think you're ever going to get one programmer to do
   this whole thing. The likelihood is infinitesimal of finding one
   person who: A) has a burning ePub itch to scratch, B) is great at
   C++, and C) is familiar with LyX's code base. The way to do this
   is as a pipeline, where person A exports what person B needs, and
   person B converts that to what person C needs, etc. The beauty of
   this is that, as LyX changes, only person A would need to
   change his code, unless, of course, the rest of the chain needs an
   enhancement. The way I envision it, either person A is whoever
   makes a sane Xhtml export, and person B is me; or, if LyX native
   format is ever well formed and valid XML, person A is me, converting
   LyX to an Xhtml subset, and person B is also me, converting the
   subset to ePub. That way, as LyX changed, I'd need to change only
   the first program.

4) Back when Josh was doing this, neither I nor Rob Oakes knew what we
   know about ePub today. If we had, we could have given Josh and his
   crew much better guidance about what was needed. If anybody wants to
   continue on the road to LyX->ePub, I suggest that person(s) ask us
   lots of questions.
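
As a rough illustration of the "sane Xhtml export" stage described in
points 2 and 3, here is a minimal Python sketch. It assumes an earlier
stage has already reduced the document to (layout name, text) pairs.
The LAYOUT_TO_TAG table and the to_xhtml() helper are hypothetical names
made up for this example; they are not part of LyX or of Josh's code.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from LyX layout names to XHTML structural elements.
LAYOUT_TO_TAG = {
    "Chapter": "h1",
    "Section": "h2",
    "Subsection": "h3",
    "Standard": "p",
    "LyX-Code": "pre",
}

def to_xhtml(paragraphs):
    """Turn (layout, text) pairs into an XHTML string, keeping the style name."""
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    body = ET.SubElement(html, "body")
    for layout, text in paragraphs:
        # Unknown layouts fall back to <p>; the class attribute still
        # records the original style so later stages can act on it.
        tag = LAYOUT_TO_TAG.get(layout, "p")
        element = ET.SubElement(body, tag, {"class": layout.lower()})
        element.text = text
    return ET.tostring(html, encoding="unicode")

if __name__ == "__main__":
    print(to_xhtml([("Chapter", "Intro"), ("Standard", "Hello."),
                    ("LyX-Code", "ls -l")]))
```

The point of the design is that the next program in the chain only ever
sees headings, paragraphs, and preformatted blocks with a style name
attached, so it needs to know nothing about LyX.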

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Richard Heck

On 02/03/2014 11:36 AM, Steve Litt wrote:

> On Mon, 03 Feb 2014 10:05:57 -0500
> Richard Heck wrote:
>
>> On 02/03/2014 01:53 AM, Liviu Andronic wrote:
>>> Dear Steve and Alex,
>>>
>>> On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt wrote:
>>>> On Fri, 31 Jan 2014 22:33:02 +0100
>>>> Alex Fernandez wrote:
>>>>>> Ladies and gentlemen, if the preceding paragraph doesn't convince
>>>>>> us we need a good, solid, LyX to ePub and LyX to Mobi conversion
>>>
>>> Last year we actually had a GSoC project specifically dealing with
>>> ePub. Josh and Richard made progress on this front, and the code
>>> simply awaits someone with the motivation and the skills to finish
>>> the job. The almost finished feature is available in several GIT
>>> branches here: http://git.lyx.org/?p=gsoc.git;a=summary .
>>
>> Actually, mostly what this awaits is the release of 2.1, since the
>> main work left is to integrate this into trunk. There are still bugs
>> to be fixed, of course, in the underlying XHTML export routines, but
>> that's always true. Most of the ones Steve mentioned earlier have
>> already been fixed.
>>
>> Richard
>
> The instant 2.1 comes out, I'll compile it separate from my
> Ubuntu-provided LyX, and try it out.

Sorry, to be clear: The ePub stuff will not be in 2.1. It came too late
for that. It will go into trunk shortly after 2.1 and be in 2.2, which
we expect to follow 2.1 much more quickly than usual.

Richard

> Thanks,
>
> SteveT
>
> Steve Litt*  http://www.troubleshooters.com/
> Troubleshooting Training  *  Human Performance




Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Richard Heck

On 02/03/2014 11:34 AM, Steve Litt wrote:

> On Mon, 3 Feb 2014 07:53:49 +0100
> Liviu Andronic wrote:
>
>> Dear Steve and Alex,
>>
>> On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt wrote:
>>> On Fri, 31 Jan 2014 22:33:02 +0100
>>> Alex Fernandez wrote:
>>>>> Ladies and gentlemen, if the preceding paragraph doesn't convince
>>>>> us we need a good, solid, LyX to ePub and LyX to Mobi conversion
>>
>> Last year we actually had a GSoC project specifically dealing with
>> ePub. Josh and Richard made progress on this front,
>
> Did the progress include an Xhtml export that preserved semantics and
> respected HTML semantics such as using <h?> for leveled headings, <p>
> for paragraphs, and <pre> for lyx-code and environments derived from
> it, and <span> for character styles? If you got that far and can write
> it to disk, I can take it from there.
>
>> and the code
>> simply awaits someone with the motivation and the skills to finish the
>> job. The almost finished feature is available in several GIT branches
>> here: http://git.lyx.org/?p=gsoc.git;a=summary .
>
> If I ever get a day or two to familiarize myself with it, I will. But
> of course, I'm really lousy at C++, and in fact hate C++.
>
> I think there are several missed opportunities:
>
> 1) If LyX insists on having a native format of XML, which by its nature
>    is human-confusing, it should at least be *well formed* XML so that
>    it can be handled by an XML parser, and if the programmer is more
>    skilled than I, XSLT. The day LyX native format is well formed, and
>    maybe even with a DTD, I'll make the converter: me, myself and I.


Unclear if this will ever happen. I started the project, but then got 
busy with other things. It is pretty monumental. But the goal would be 
to use Qt's built-in XML writing and reading, which means it will be 
well-formed.



> 2) The ePub converter shouldn't be a part of LyX: Most of it should be
>    standalone. There's no reason for LyX to need to know *anything*
>    about ePub. Just provide a sane Xhtml export, one that doesn't
>    substitute <div> for <p>, or <p> for <h?>, and calls a <pre> a
>    <pre>. Don't throw away the semantics: Don't convert styles to
>    appearances, and don't use <div> instead of HTML structural elements
>    like <p>, <pre>, and <h?>. Give me that, and I can do the rest.


That is what Josh did last summer. The XHTML --> ePub converter is 
written in Python.


Richard



Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Liviu Andronic
On Mon, Feb 3, 2014 at 8:02 PM, Richard Heck  wrote:
>> 1) If LyX insists on having a native format of XML, which by its nature
>> is human-confusing, it should at least be *well formed* XML so that
>> it can be handled by an XML parser, and if the programmer is more
>> skilled than I, XSLT. The day LyX native format is well formed, and
>> maybe even with a DTD, I'll make the converter: me, myself and I.
>
>
> Unclear if this will ever happen. I started the project, but then got busy
> with other things. It is pretty monumental. But the goal would be to use
> Qt's built-in XML writing and reading, which means it will be well-formed.
>
The advantages to having an XML-based native format are numerous:
proper diff of LyX files; more standardized converters to and from
other XML-based formats; tools like pLyX would likely benefit from
this standardization, too. Some of our long-standing issues would be
addressed by such a change, so well worth a try I'd say.

Liviu


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 3 Feb 2014 20:39:09 +0100
Liviu Andronic  wrote:

> On Mon, Feb 3, 2014 at 8:02 PM, Richard Heck  wrote:
> >> 1) If LyX insists on having a native format of XML, which by its
> >> nature is human-confusing, it should at least be *well formed* XML
> >> so that it can be handled by an XML parser, and if the programmer
> >> is more skilled than I, XSLT. The day LyX native format is well
> >> formed, and maybe even with a DTD, I'll make the converter: me,
> >> myself and I.
> >
> >
> > Unclear if this will ever happen. I started the project, but then
> > got busy with other things. It is pretty monumental. But the goal
> > would be to use Qt's built-in XML writing and reading, which means
> > it will be well-formed.
> >
> The advantages to having an XML-based native format are numerous:
> proper diff of LyX files; more standardized converters to and from
> other XML-based formats; tools like pLyX would likely benefit from
> this standardization, too. Some of our long-standing issues would be
> addressed by such a change, so well worth a try I'd say.

Liviu,

You *do* mean *well formed* XML-based native format, right?

Yeah, we might as well: We've already done about all the human
readability damage we can do to the format. If it were well formed, at
least it would be simple to make a pretty-print for it.

If we *do* get well formed XML for LyX files, at the same time we
should probably make layout files XML too. Each paragraph style and
character style (do we really still want to call paragraph styles by
the LaTeX-centric name "environment"?) would have one section
describing its appearance in LyX, and sections describing its
appearance in each of several other output formats. The one for LaTeX
would be LaTeX. The one for Xhtml would be CSS. If it were done like
this, an arbitrary conversion program for a "new" format could just
list the styles, show the file with those styles applied, and supply
its own definition of each style.
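
To make the idea concrete, here is a small Python sketch of what reading
such a layout file might look like. Everything in it is hypothetical:
the element names (layout, style, output), the attributes, and the
definitions_for() helper are invented for illustration, and real LyX
layout files do not look like this.

```python
import xml.etree.ElementTree as ET

# Entirely hypothetical XML layout: one <style> per paragraph or
# character style, one <output> section per target format.
LAYOUT_XML = """
<layout>
  <style name="Emphasis" kind="character">
    <output format="lyx">font-shape: italic</output>
    <output format="latex">\\emph{%s}</output>
    <output format="xhtml">span.emphasis { font-style: italic; }</output>
  </style>
  <style name="LyX-Code" kind="paragraph">
    <output format="latex">\\begin{lyxcode}%s\\end{lyxcode}</output>
    <output format="xhtml">pre.lyx-code { font-family: monospace; }</output>
  </style>
</layout>
"""

def definitions_for(layout_xml, fmt):
    """Return {style name: definition} for one output format."""
    root = ET.fromstring(layout_xml)
    result = {}
    for style in root.iter("style"):
        for out in style.findall("output"):
            if out.get("format") == fmt:
                result[style.get("name")] = out.text
    return result

if __name__ == "__main__":
    # A converter for a "new" format would list the styles it found
    # and then supply its own definition for each one.
    for name, css in definitions_for(LAYOUT_XML, "xhtml").items():
        print(name, "->", css)
```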

This would go a long way toward changing the "mission statement" of LyX
from "a front end for LaTeX" to "the fastest and easiest way to author
absolutely anything".

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Steve Litt
On Mon, 3 Feb 2014 20:20:03 -0500
Steve Litt  wrote:


> Liviu,
> 
> You *do* mean *well formed* XML-based native format, right?
> 
> Yeah, we might as well: We've already done about all the human
> readability damage we can do to the format. If it were well formed, at
> least it would be simple to make a pretty-print for it.

My apologies, everyone, the preceding is a false statement on my part.

I just created a brand new LyX file in 2.0.6, which is the packaged LyX
for Ubuntu 13.10, and there wasn't a bit of XML in it, well formed or
otherwise. It was basically the same human-parseable format it's
been for 10 years, but with a lot more insets and options. But as far
as I can tell, the newlines in the format are meaningful, meaning one
can do a line-by-line parse, although with today's proliferation of
great features, many implemented as insets, the parse isn't as simple
as it was in 2005.
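
A minimal sketch of such a line-by-line parse, in Python. It assumes
only the \begin_layout / \end_layout and \begin_inset / \end_inset
markers of the native format; the paragraphs() helper and the SAMPLE
text are made up for illustration, and the sketch simply skips whatever
sits inside an inset, which is exactly the part that has gotten harder
since 2005.

```python
def paragraphs(lines):
    """Yield (layout, text_lines) for each top-level layout block."""
    layout, text, inset_depth = None, [], 0
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("\\begin_inset"):
            inset_depth += 1
        elif line.startswith("\\end_inset"):
            inset_depth -= 1
        elif inset_depth:
            continue  # this sketch ignores everything inside insets
        elif line.startswith("\\begin_layout"):
            parts = line.split(None, 1)
            layout, text = (parts[1] if len(parts) > 1 else ""), []
        elif line.startswith("\\end_layout"):
            if layout is not None:
                yield layout, text
            layout = None
        elif layout is not None and line and not line.startswith("\\"):
            text.append(line)  # plain text line of the current paragraph

# Hand-written sample in the style of a LyX body, not real LyX output.
SAMPLE = r"""\begin_body

\begin_layout Title
My Book
\end_layout

\begin_layout Standard
Some text
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
a footnote
\end_layout

\end_inset

 and more text.
\end_layout

\end_body
""".splitlines(keepends=True)

if __name__ == "__main__":
    for layout, text in paragraphs(SAMPLE):
        print(layout, text)
```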

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-03 Thread Liviu Andronic
On Tue, Feb 4, 2014 at 2:38 AM, Steve Litt  wrote:
> On Mon, 3 Feb 2014 20:20:03 -0500
> Steve Litt  wrote:
>
>
>> Liviu,
>>
>> You *do* mean *well formed* XML-based native format, right?
>>
>> Yeah, we might as well: We've already done about all the human
>> readability damage we can do to the format. If it were well formed, at
>> least it would be simple to make a pretty-print for it.
>
> My apologies, everyone, the preceding is a false statement on my part.
>
> I just created a brand new LyX file in 2.0.6, which is the packaged LyX
> for Ubuntu 13.10, and there wasn't a bit of XML in it, well formed or
> otherwise. It was basically the same format human parseable format it's
> been for 10 years, but with a lot more insets and options. But as far
>
Quick question: What are your reasons / needs for being able to
"humanly" parse the file format? Would a tool like pLyX address these
needs: http://wiki.lyx.org/Examples/FindAndReplaceLyXFormatElements ?

Liviu

> as I can tell, the newlines in the format are meaningful, meaning one
> can do a line by line parse, although with today's proliferation of
> great features, many implemented as insets, the parse isn't as simple
> as it was in 2005.
>
> SteveT
>
> Steve Litt*  http://www.troubleshooters.com/
> Troubleshooting Training  *  Human Performance



-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-02 Thread Steve Litt
On Sun, 02 Feb 2014 05:53:59 +1100
Alan L Tyree alanty...@gmail.com wrote:

> Well, that's a bit disappointing. Thanks for the full report. I've
> only used Pandoc for simple conversions, so haven't looked deeply at
> the configuration options that might help with the problems that you
> identify below.
>
> One last approach might be interesting to try: What about LyX ->
> (X)html, then process through Pandoc.

Once I have good, solid, semantic Xhtml, Pandoc isn't needed. I already
have most of the code to ePubize a single Xhtml file with lots of
chapters. The tough part is getting your content out of LyX or LaTeX
without dropping the styles.
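
For readers who have not seen what "ePubizing" one Xhtml file involves, here is a minimal sketch of the container side. It is not Steve's code. It assumes an EPUB 2 style layout with hypothetical paths (`OEBPS/book.xhtml`, `OEBPS/content.opf`), and it leaves out the table of contents (NCX), cover and images that a sellable book would need, so the result has not been checked with a validator.

```python
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

OPF_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">{identifier}</dc:identifier>
  </metadata>
  <manifest>
    <item id="book" href="book.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="book"/>
  </spine>
</package>
"""


def write_epub_skeleton(epub_path, xhtml_text, title, identifier):
    """Zip one XHTML file into a bare-bones ePub container."""
    opf = OPF_TEMPLATE.format(title=escape(title), identifier=escape(identifier))
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry has to be the first one and stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for name, data in (
            ("META-INF/container.xml", CONTAINER_XML),
            ("OEBPS/content.opf", opf),
            ("OEBPS/book.xhtml", xhtml_text),
        ):
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
```

The packaging is the easy half, which is consistent with Steve's point that the hard part is getting styled content out of LyX or LaTeX in the first place.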

> And instead of using the LyX
> exporters, try tex4ht to make the html file.

I think I tried lyx->latex->tex4ht once before, and something went
wrong, though I no longer remember what.

> And process the html file
> through tidy before using Pandoc.

That's a good idea, although Python's lxml.etree can read well formed
XML in any form, and has its own pretty print.
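
A small sketch of the "read well formed XML and pretty print it" step. Steve names lxml.etree, where `etree.tostring(root, pretty_print=True)` does the job. To stay within the standard library, this sketch swaps in `xml.etree.ElementTree` and its `indent()` helper (Python 3.9 or later). The function name is illustrative. Note that indenting adds whitespace, which can matter inside mixed content such as `pre`.

```python
import xml.etree.ElementTree as ET


def pretty_xhtml(xml_text):
    """Parse well-formed XML and return it re-indented, two spaces per level."""
    root = ET.fromstring(xml_text)   # raises ET.ParseError if not well formed
    ET.indent(root, space="  ")      # stdlib stand-in for lxml's pretty_print
    return ET.tostring(root, encoding="unicode")
```

Input that is not well formed fails at the parse step, which is the reason running tidy first can still be useful.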
 
> I haven't tried this, so please don't waste your time on it if
> inconvenient. Pandoc seems at its strongest when starting with a
> Markdown file.

LOL, if it leads to LyX to ePub, it will be anything but a waste of
time. I'm going to try that tex4ht again and see what happens.

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-02 Thread Alan L Tyree

<SNIP>
>> And instead of using the LyX
>> exporters, try tex4ht to make the html file.
>
> I think I tried lyx->latex->tex4ht once before, and something went
> wrong, though I no longer remember what.
>
>> And process the html file
>> through tidy before using Pandoc.
>
> That's a good idea, although Python's lxml.etree can read well formed
> XML in any form, and has its own pretty print.
>>
>> I haven't tried this, so please don't waste your time on it if
>> inconvenient. Pandoc seems at its strongest when starting with a
>> Markdown file.
>
> LOL, if it leads to LyX to ePub, it will be anything but a waste of
> time. I'm going to try that tex4ht again and see what happens.

Hi Steve,
According to my notes from 2009, I used tex4ht using the following
commands:

   - htlatex file.tex "xhtml,mathml" "-cunihtf" "-cvalidate"
 
   - tidy -m -asxhtml name.html

I was just trying to make presentable XHTML files, so I don't know how
'good' they are for your purposes.

When the htlatex command runs, it will stop once in a while waiting for
input. Again, my notes say to use 'R <ret>'. I just ran it on a
reasonable size file and needed to use the 'R' command about 3 times.

I should mention that I am on Debian Wheezy.
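
The two commands from the notes above can be wrapped in a small script, sketched here under some assumptions: the function names are made up, htlatex is assumed to write `NAME.html` into the working directory, and nothing is done about the interactive 'R' prompts, so htlatex may still stop and wait for input.

```python
import shutil
import subprocess
from pathlib import Path


def tex4ht_commands(tex_file):
    """Build the argv lists for the htlatex and tidy steps from the notes."""
    html_name = Path(tex_file).with_suffix(".html").name  # assumed output name
    return [
        ["htlatex", str(tex_file), "xhtml,mathml", "-cunihtf", "-cvalidate"],
        ["tidy", "-m", "-asxhtml", html_name],
    ]


def run_pipeline(tex_file):
    """Run both steps in order; stops early if a tool is not installed."""
    for argv in tex4ht_commands(tex_file):
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(argv[0] + " not found on PATH")
        # check=False: tidy exits with status 1 when it only has warnings.
        subprocess.run(argv, check=False)
```

Keeping the command construction separate from the run step makes it easy to print the commands and paste them into a shellscript instead.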

HTH,
Alan


> Thanks,
>
> SteveT
>
> Steve Litt*  http://www.troubleshooters.com/
> Troubleshooting Training  *  Human Performance


-- 
Alan L Tyree   http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206 sip:172...@iptel.org


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-02 Thread Steve Litt
On Fri, 31 Jan 2014 22:33:02 +0100
Alex Fernandez ely...@gmail.com wrote:

> Hi Steve,
>
> On Fri, Jan 31, 2014 at 5:47 PM, Steve Litt
> sl...@troubleshooters.com wrote:
>
> > Ladies and gentlemen, if the preceding paragraph doesn't convince
> > us we need a good, solid, LyX to ePub and LyX to Mobi conversion


[clipped shame, commiseration, and eLyXer's not doing the bad stuff I
spoke of]

Hi Alex...

> > The shame is, in theory, LyX to ePub is simple. Every environment
> > becomes <p class="environmentname">, every character style becomes
> > <span class="charstylename">. Leave <div> out of it except for very
> > special cases. Even lyx-code should become <pre class="lyxcode">,
> > not <div class="lyxcode">.
> >
>
> It should not be difficult to change eLyXer output to be as you
> desire; just take a look at base.cfg
>   https://github.com/alexfernandez/elyxer/blob/master/src/conf/base.cfg

:-)

I've looked at the main eLyXer Python program before. It was almost 10K
lines of code. I'd imagine making a change to it would be difficult.
Later in the email you thought it would take me 1 day to understand
eLyXer's programming. This either greatly overstates my abilities, or
understates the (necessary for the problem domain solved) complexity of
eLyXer.

> where most elements of the output can be configured. There are many
> special cases and a few ugly-ish tricks (such as h? to denote h1-h6),
> but it is mostly there. Or should be.
 
[clip]
 
> > I briefly considered writing Yet Another LyX to HTML Exporter, but
> > found out that in spite of LyX's native format being Non Human
> > Friendly XML, it's not *well formed* XML, so I can't use Python's
> > lxml.etree, let alone Python's xml.etree.ElementTree, to parse it.
> > Perhaps if LyX offered an export to well-formed XML, hopefully with
> > a DTD, I could parse that to produce ePub-friendly Xhtml, but as
> > far as I know that doesn't exist either.
> >
>
> With eLyXer I have already done all the heavy work myself of
> converting LyX documents to an in-memory structure of containers and
> insets. In theory you might just tweak the configuration file
> base.cfg and generate a completely different document structure such
> as ePub. In practice, and as far as I know, it works: I was able to
> make the transition from LyX 1.x to 2.0 just by adding a few insets
> and containers to the configuration file.

Alex, I can't justify working with 10K lines just to, basically, pass
environment and character style names, with their applied text, into
xhtml. eLyXer was designed to do a hugely greater superset of what I
need. 

Of course, the right way would be to capture semantic styles plus
text with Python's lxml.etree, and (easily) convert to Xhtml. If native
LyX ever becomes well-formed XML, I can easily have my way with it
using Python's lxml.etree package. If native LyX were still 2005 all
text all the time format, I could have text-parsed it and converted to
Xhtml. But with this neither-here-nor-there native format, what should
be a very easy task becomes a riot of detours.
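
To make the "(easily) convert to Xhtml" claim concrete, here is a sketch of the output half only, assuming the semantic content has already been captured somehow as (environment, runs) pairs. That input shape and the function name are invented for illustration, and the standard library's `xml.etree.ElementTree` is used instead of lxml. Style names are assumed to be already normalized into valid class names.

```python
import xml.etree.ElementTree as ET


def to_xhtml_body(paragraphs):
    """paragraphs: list of (environment, runs); runs: list of (char_style or None, text)."""
    body = ET.Element("body")
    for env, runs in paragraphs:
        # Every environment becomes <p class=...>; code becomes <pre>, not <div>.
        tag = "pre" if env == "lyxcode" else "p"
        block = ET.SubElement(body, tag, {"class": env})
        last = None  # most recent <span>, so plain text can follow it as tail
        for style, text in runs:
            if style is None:
                if last is None:
                    block.text = (block.text or "") + text
                else:
                    last.tail = (last.tail or "") + text
            else:
                # Every character style becomes <span class=...>.
                last = ET.SubElement(block, "span", {"class": style})
                last.text = text
    return ET.tostring(body, encoding="unicode")
```

With the class names preserved, the link between semantic style and presentation can be made later in a stylesheet.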

[clip]


> If you are already conversant in Python, know ePub and are willing to
> do the pretty boring task of translating between LyX and ePub, then
> you can take eLyXer as the starting point to do the job. It can
> probably be done in a few days:
> - one to understand eLyXer internals,
> - one to solve any stupid design errors that I may have made that
> make ePub support hard, such as configure it to use a different
> base.cfg file,
> - and one or two to recode all commands to output ePub.

You greatly overestimate my coding abilities.

 
> I would encourage you to take eLyXer and run from there, but the high
> probability that you will find strong opposition to integrate the
> resulting converter into eLyXer will probably mean that your effort
> will mostly be useful for you.

Assuming I'm the one who eventually makes the converter, I'm not at all
concerned whether it gets integrated into the official LyX project. As
a matter of fact, I'd feel better about it if it were just an add-on
people would download from Troubleshooters.Com. I'm a big fan of
modularity, and the less the conversion and LyX need to know about each
other, the better I like it. Also, my philosophy differs from the LyX
developers' in that I believe that if a user doesn't have the knowledge
to make a shellscript, batch file, powerscript, whatever it's called
on the various platforms, and doesn't want to learn how to do this
simple task, he/she isn't a good candidate for Free Software.


> So, my advice would be to keep away
> from it. If you are still interested I can of course give you any
> support you need with the source code.

I agree, but for different reasons. I keep coming back to the fact that
it's a huge superset of what I need, and it's just short of 10K lines
of code. IMHO there's *got* to be a better way.

That being said, I'd like to congratulate you on being one of the very
few converters of various types that studiously retains semantic
content and passes it to the output.

Thanks,

SteveT


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-02 Thread Liviu Andronic
Dear Steve and Alex,

On Mon, Feb 3, 2014 at 2:48 AM, Steve Litt sl...@troubleshooters.com wrote:
> On Fri, 31 Jan 2014 22:33:02 +0100
> Alex Fernandez ely...@gmail.com wrote:
>> > Ladies and gentlemen, if the preceding paragraph doesn't convince
>> > us we need a good, solid, LyX to ePub and LyX to Mobi conversion


Last year we actually had a GSoC project specifically dealing with
ePub. Josh and Richard made progress on this front, and the code
simply awaits someone with the motivation and the skills to finish the
job. The almost finished feature is available in several GIT branches
here: http://git.lyx.org/?p=gsoc.git;a=summary .

Regards,
Liviu


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-01 Thread Steve Litt
On Sat, 01 Feb 2014 05:25:58 +1100
Alan L Tyree alanty...@gmail.com wrote:

> Sorry for the top posting, but this is short. My own view is that an
> ePub exporter for LyX would make it a killer application.
>
> Thanks for your comments, Steve. Have you looked at Pandoc?
>
> Cheers,
> Alan

Hi Alan,

I hadn't known about Pandoc. Thanks for the heads-up. I looked up Pandoc
here:

http://johnmacfarlane.net/pandoc/

For the following ascii art, please switch to monospace font...

           .___.               .______.
LyXformat->|LyX|->LaTeXformat->|Pandoc|->ePubFormat
           `---'               `------'

The preceding looked simple enough, so I downloaded and installed
Pandoc, and tried it on one of my simpler books. It indeed produced an
ePub, and in certain respects a good one. But not nearly good enough to
sell. No table of contents. No cover. No images: all images referred to
only by their Alt text. All cross reference labels exposed as text with
arbitrary subscript formatting in places. The good news is it managed
to keep footnotes and the like, but that's not good enough. I spoze
theoretically I could have used other options on my lyx -export
command, or on my pandoc command, so these things wouldn't happen, or
perhaps I could have made restrictions on the authoring of my book in
LyX, but these things would need to be examined later.
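
As a starting point for that later examination, here is a sketch of a pandoc command line that asks for a table of contents and a cover. The `--toc` and `--epub-cover-image` options exist in pandoc, but whether they cure the problems listed above was not tested in this thread. The function name and file names are placeholders.

```python
def pandoc_epub_argv(tex_file, epub_file, cover_image=None):
    """Build a pandoc command line for LaTeX -> ePub with a TOC and optional cover."""
    argv = ["pandoc", "--from=latex", "--to=epub", "--toc",
            "--output=" + epub_file]
    if cover_image:
        argv.append("--epub-cover-image=" + cover_image)
    argv.append(tex_file)  # the input file goes last
    return argv
```

The returned list can be passed to `subprocess.run()` or joined into one line for a shellscript.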

Another way to use Pandoc might be this:

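           .___.               .______.
LyXformat->|LyX|->LaTeXformat->|Pandoc|->xhtml-.
           `---'               `------'        |
 .---------------------------------------------'
 |
 |  ._____________.         ._____________.
 `->|xhtml Tweaker|->xhtml->|My xhtml2epub|->ePub
    `-------------'         `-------------'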

The preceding would depend on:

A) Pandoc retaining enough info, including semantic styles, to make all
   book elements
B) Pandoc producing xhtml sane enough that the tweaker is something
   that can actually be written.

So I tried using Pandoc to convert my LaTeX book to (X)html. The result
had no head (and this might be an advantage), it had all sorts of
garbage characters (perhaps this could be fixed in the head I would
insert), but, the kiss of death is this: It took all my semantic styles
(environments and character styles), kinda-sorta converted them to
presentation, and discarded the semantic styles, so that in my head
I can't specify the link between semantic styles and presentation.
Once those semantic styles are gone, no matter how clever a programmer
I am, I don't have the necessary input info to govern the look of my
eBook. This is a showstopper that cannot be recovered from. 

So unless somebody knows of a way to prevent Pandoc from pulling an
MSWord move and prematurely converting semantic to presentational, my
opinion is that Pandoc is worthless for this task.
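
One way to test any converter for this showstopper is to check which style names survive as class attributes in its output. A minimal sketch, assuming the output is well-formed XML (HTML-only entities would make the parse fail) and using made-up style names and function names:

```python
import xml.etree.ElementTree as ET


def surviving_classes(xhtml_text):
    """Collect every class name that appears anywhere in the document."""
    found = set()
    for element in ET.fromstring(xhtml_text).iter():
        found.update(element.get("class", "").split())
    return found


def lost_styles(expected, xhtml_text):
    """Return the expected style names that no longer appear as classes."""
    return sorted(set(expected) - surviving_classes(xhtml_text))
```

An empty list from `lost_styles()` means the semantic hooks are still there for a stylesheet. Any name in the list has already been flattened to presentation.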

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-01 Thread Alan L Tyree
Well, that's a bit disappointing. Thanks for the full report. I've only
used Pandoc for simple conversions, so haven't looked deeply at the
configuration options that might help with the problems that you
identify below.

One last approach might be interesting to try: What about LyX ->
(X)html, then process through Pandoc. And instead of using the LyX
exporters, try tex4ht to make the html file. And process the html file
through tidy before using Pandoc.

I haven't tried this, so please don't waste your time on it if
inconvenient. Pandoc seems at its strongest when starting with a
Markdown file.

Cheers,
Alan


Steve Litt writes:

> On Sat, 01 Feb 2014 05:25:58 +1100
> Alan L Tyree alanty...@gmail.com wrote:
>
>> Sorry for the top posting, but this is short. My own view is that an
>> ePub exporter for LyX would make it a killer application.
>>
>> Thanks for your comments, Steve. Have you looked at Pandoc?
>>
>> Cheers,
>> Alan
>
> Hi Alan,
>
> I hadn't known about Pandoc. Thanks for the heads-up. I looked up Pandoc
> here:
>
> http://johnmacfarlane.net/pandoc/
>
> For the following ascii art, please switch to monospace font...
>
>            .___.               .______.
> LyXformat->|LyX|->LaTeXformat->|Pandoc|->ePubFormat
>            `---'               `------'
>
> The preceding looked simple enough, so I downloaded and installed
> Pandoc, and tried it on one of my simpler books. It indeed produced an
> ePub, and in certain respects a good one. But not nearly good enough to
> sell. No table of contents. No cover. No images: all images referred to
> only by their Alt text. All cross reference labels exposed as text with
> arbitrary subscript formatting in places. The good news is it managed
> to keep footnotes and the like, but that's not good enough. I spoze
> theoretically I could have used other options on my lyx -export
> command, or on my pandoc command, so these things wouldn't happen, or
> perhaps I could have made restrictions on the authoring of my book in
> LyX, but these things would need to be examined later.
>
> Another way to use Pandoc might be this:
>
>            .___.               .______.
> LyXformat->|LyX|->LaTeXformat->|Pandoc|->xhtml-.
>            `---'               `------'        |
>  .---------------------------------------------'
>  |
>  |  ._____________.         ._____________.
>  `->|xhtml Tweaker|->xhtml->|My xhtml2epub|->ePub
>     `-------------'         `-------------'
>
> The preceding would depend on:
>
> A) Pandoc retaining enough info, including semantic styles, to make all
>    book elements
> B) Pandoc producing xhtml sane enough that the tweaker is something
>    that can actually be written.
>
> So I tried using Pandoc to convert my LaTeX book to (X)html. The result
> had no head (and this might be an advantage), it had all sorts of
> garbage characters (perhaps this could be fixed in the head I would
> insert), but, the kiss of death is this: It took all my semantic styles
> (environments and character styles), kinda-sorta converted them to
> presentation, and discarded the semantic styles, so that in my head
> I can't specify the link between semantic styles and presentation.
> Once those semantic styles are gone, no matter how clever a programmer
> I am, I don't have the necessary input info to govern the look of my
> eBook. This is a showstopper that cannot be recovered from.
>
> So unless somebody knows of a way to prevent Pandoc from pulling an
> MSWord move and prematurely converting semantic to presentational, my
> opinion is that Pandoc is worthless for this task.
>
> Thanks,
>
> SteveT
>
> Steve Litt*  http://www.troubleshooters.com/
> Troubleshooting Training  *  Human Performance


-- 
Alan L Tyree   http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206 sip:172...@iptel.org


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-01 Thread Steve Litt
On Sat, 01 Feb 2014 05:25:58 +1100
Alan L Tyree <alanty...@gmail.com> wrote:

> Sorry for the top posting, but this is short. My own view is that an
> ePub exporter for LyX would make it a killer application. 
> 
> Thanks for your comments, Steve. Have you looked at Pandoc?
> 
> Cheers,
> Alan

Hi Alan,

I hadn't known about Pandoc. Thanks for the heads-up. I looked up Pandoc
here:

http://johnmacfarlane.net/pandoc/

For the following ascii art, please switch to monospace font...

           .___.               .______.
LyXformat->|LyX|->LaTeXformat->|Pandoc|->ePubFormat
           `---'               `------'

The preceding looked simple enough, so I downloaded and installed
Pandoc, and tried it on one of my simpler books. It indeed produced an
ePub, and in certain respects a good one. But not nearly good enough to
sell. No table of contents. No cover. No images: all images referred to
only by their Alt text. All cross reference labels exposed as text with
arbitrary subscript formatting in places. The good news is it managed
to keep footnotes and the like, but that's not good enough. I spoze
theoretically I could have used other options on my lyx -export
command, or on my pandoc command, so these things wouldn't happen, or
perhaps I could have made restrictions on the authoring of my book in
LyX, but these things would need to be examined later.
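For anyone who wants to retry the simple pipeline above with more options, here is a minimal Python sketch that only builds the two command lines. It assumes the LyX "--export latex" option and the Pandoc options --from, --to, --output, --toc and --epub-cover-image; the file names are hypothetical, and whether these options cure the missing table of contents and cover on a real book is untested here.

```python
from pathlib import Path


def lyx_export_cmd(lyx_file):
    """Build the command asking LyX to export a document to LaTeX."""
    return ["lyx", "--export", "latex", lyx_file]


def pandoc_epub_cmd(tex_file, cover=None, toc=True):
    """Build a Pandoc command turning LaTeX into ePub.

    cover and toc are optional extras; file names are placeholders.
    """
    tex = Path(tex_file)
    cmd = ["pandoc", "--from", "latex", "--to", "epub",
           "--output", str(tex.with_suffix(".epub"))]
    if toc:
        cmd.append("--toc")  # ask Pandoc to generate a table of contents
    if cover:
        cmd.append("--epub-cover-image=%s" % cover)
    cmd.append(str(tex))
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing needs installing.
    print(" ".join(lyx_export_cmd("book.lyx")))
    print(" ".join(pandoc_epub_cmd("book.tex", cover="cover.jpg")))
```

Each list can be handed to subprocess.run(cmd, check=True) once LyX and Pandoc are installed. Keeping the commands as lists avoids shell quoting problems with file names that contain spaces.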

Another way to use Pandoc might be this:

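           .___.               .______.
LyXformat->|LyX|->LaTeXformat->|Pandoc|->xhtml-.
           `---'               `------'        |
 .---------------------------------------------'
 |
 |  ._____________.         ._____________.
 `->|xhtml Tweaker|->xhtml->|My xhtml2epub|->ePub
    `-------------'         `-------------'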

The preceding would depend on:

A) Pandoc retaining enough info, including semantic styles, to make all
   book elements
B) Pandoc producing xhtml sane enough that the tweaker is something
   that can actually be written.

So I tried using Pandoc to convert my LaTeX book to (X)html. The result
had no <head> (and this might be an advantage), it had all sorts of
garbage characters (perhaps this could be fixed in the <head> I would
insert), but, the kiss of death is this: It took all my semantic styles
(environments and character styles), kinda-sorta converted them to
presentation, and discarded the semantic styles, so that in my <head>
I can't specify the link between semantic styles and presentation.
Once those semantic styles are gone, no matter how clever a programmer
I am, I don't have the necessary input info to govern the look of my
eBook. This is a showstopper that cannot be recovered from. 

So unless somebody knows of a way to prevent Pandoc from pulling an
MSWord move and prematurely converting semantic to presentational, my
opinion is that Pandoc is worthless for this task.

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-02-01 Thread Alan L Tyree
Well, that's a bit disappointing. Thanks for the full report. I've only
used Pandoc for simple conversions, so haven't looked deeply at the
configuration options that might help with the problems that you
identify below.

One last approach might be interesting to try: What about LyX ->
(X)html, then process through Pandoc. And instead of using the LyX
exporters, try tex4ht to make the html file. And process the html file
through tidy before using Pandoc.

I haven't tried this, so please don't waste your time on it if
inconvenient. Pandoc seems at its strongest when starting with a
Markdown file.

Cheers,
Alan


Steve Litt writes:

> On Sat, 01 Feb 2014 05:25:58 +1100
> Alan L Tyree <alanty...@gmail.com> wrote:
>
>> Sorry for the top posting, but this is short. My own view is that an
>> ePub exporter for LyX would make it a killer application. 
>> 
>> Thanks for your comments, Steve. Have you looked at Pandoc?
>> 
>> Cheers,
>> Alan
>
> Hi Alan,
>
> I hadn't known about Pandoc. Thanks for the heads-up. I looked up Pandoc
> here:
>
> http://johnmacfarlane.net/pandoc/
>
> For the following ascii art, please switch to monospace font...
>
>            .___.               .______.
> LyXformat->|LyX|->LaTeXformat->|Pandoc|->ePubFormat
>            `---'               `------'
>
> The preceding looked simple enough, so I downloaded and installed
> Pandoc, and tried it on one of my simpler books. It indeed produced an
> ePub, and in certain respects a good one. But not nearly good enough to
> sell. No table of contents. No cover. No images: all images referred to
> only by their Alt text. All cross reference labels exposed as text with
> arbitrary subscript formatting in places. The good news is it managed
> to keep footnotes and the like, but that's not good enough. I spoze
> theoretically I could have used other options on my lyx -export
> command, or on my pandoc command, so these things wouldn't happen, or
> perhaps I could have made restrictions on the authoring of my book in
> LyX, but these things would need to be examined later.
>
> Another way to use Pandoc might be this:
>
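>            .___.               .______.
> LyXformat->|LyX|->LaTeXformat->|Pandoc|->xhtml-.
>            `---'               `------'        |
>  .---------------------------------------------'
>  |
>  |  ._____________.         ._____________.
>  `->|xhtml Tweaker|->xhtml->|My xhtml2epub|->ePub
>     `-------------'         `-------------'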
>
> The preceding would depend on:
>
> A) Pandoc retaining enough info, including semantic styles, to make all
>    book elements
> B) Pandoc producing xhtml sane enough that the tweaker is something
>    that can actually be written.
>
> So I tried using Pandoc to convert my LaTeX book to (X)html. The result
> had no <head> (and this might be an advantage), it had all sorts of
> garbage characters (perhaps this could be fixed in the <head> I would
> insert), but, the kiss of death is this: It took all my semantic styles
> (environments and character styles), kinda-sorta converted them to
> presentation, and discarded the semantic styles, so that in my <head>
> I can't specify the link between semantic styles and presentation.
> Once those semantic styles are gone, no matter how clever a programmer
> I am, I don't have the necessary input info to govern the look of my
> eBook. This is a showstopper that cannot be recovered from. 
>
> So unless somebody knows of a way to prevent Pandoc from pulling an
> MSWord move and prematurely converting semantic to presentational, my
> opinion is that Pandoc is worthless for this task.
>
> Thanks,
>
> SteveT
>
> Steve Litt*  http://www.troubleshooters.com/
> Troubleshooting Training  *  Human Performance


-- 
Alan L Tyree   http://www2.austlii.edu.au/~alan
Tel:  04 2748 6206 sip:172...@iptel.org


We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Steve Litt
On Fri, 31 Jan 2014 09:35:46 +
Anthony Campbell a...@acampbell.org.uk wrote:


> This is for printed books. As regards conversion to ebook format, I've
> done this for several books on Smashwords, but that is quite a
> long-winded process because it has to be Word.doc format, which I do
> in LibreOffice (not much fun). Kindle does accept rtf, which would
> help, but as I'd already made Word.doc files I just used those.
>
> Anthony

Ladies and gentlemen, if the preceding paragraph doesn't convince us we
need a good, solid, LyX to ePub and LyX to Mobi conversion (do ePub
first, you can convert ePub into Mobi), then nothing will. Instead of
slamming out his book in LyX, Anthony must use an outside service
(smashwords), meaning a two word modification is, as Joe Biden would
say, a Big Friggin Deal. Further, to satisfy Smashwords he must write it
in LibreOffice to simulate MS Word. Or, if he's just doing Kindle, he
can submit rtf (what could *possibly* go wrong).

You can't base a LyX to ePub converter off either LyX2Xhtml or Alex's
eLyXer: Those produce great (X)html for stuff like footnotes and
bibliographies, but they discard semantic tags (h1-h6) for variously
named divs (yeah, <div>, not even <p>). As I remember, they still use
outdated <a name="whatever"/> instead of giving an ID to a tag. One or
both of them does you the favor of renaming all graphic files to a
numerical sequence: I guess this is to prevent identically named
graphics in different directories from clobbering each other, but there
are better ways of doing this that don't have the anti-debugging
baggage of removing all meaning from graphic names.  Current
LyX exported (X)html files just generally require *huge*
postprocessing, with zillions of special cases, to get them in
reasonable shape to make an ePub. If that were not the case, somebody
would have made a LyX2ePub a long, long time ago, because the demand is
there, and a lot of people have that itch, and I'm not the only one
who has tried to do it. 

Shamefully, because I need to be able to have my books available as
ePubs, after 13 years using LyX to write my books, I'm now using the
Bluefish editor to write my future books. I've written an Xhtml to ePub
converter in Python, and I can write an Xhtml to LaTeX converter just
as easily. But let me ask you something: Have you ever tried to slam
out 2500 words a day in Bluefish? Bluefish will never have the
authoring speed of LyX. But then again, as things stand now, a LyX
authored document will never be convertible to ePub.

The shame is, in theory, LyX to ePub is simple. Every environment
becomes <p class="environmentname">, every character style becomes
<span class="charstylename">. Leave <div> out of it except for very
special cases. Even lyx-code should become <pre class="lyxcode">, not
<div class="lyxcode">.
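The mapping just described can be sketched in a few lines of Python. This is only an illustration under assumptions, not the converter mentioned in this thread: the function names, the class-name normalization and the one-entry override table are all hypothetical.

```python
from html import escape

# Hypothetical table of the "very special cases": environments that should
# not become <p>. Only the lyx-code example from the text is listed.
BLOCK_TAG_OVERRIDES = {"lyxcode": "pre"}


def css_class(style_name):
    """Lower-case a style name and keep only letters and digits."""
    return "".join(ch for ch in style_name.lower() if ch.isalnum())


def render_span(charstyle, text):
    """A character style becomes <span class="charstylename">."""
    return '<span class="%s">%s</span>' % (css_class(charstyle), escape(text))


def render_block(environment, inner_xhtml):
    """An environment becomes <p class="environmentname"> unless overridden.

    inner_xhtml is expected to be escaped already (for example, output of
    render_span), so it is inserted unchanged.
    """
    cls = css_class(environment)
    tag = BLOCK_TAG_OVERRIDES.get(cls, "p")
    return '<%s class="%s">%s</%s>' % (tag, cls, inner_xhtml, tag)


if __name__ == "__main__":
    print(render_block("Quotation", render_span("Noun", "LyX") + " rocks"))
    print(render_block("LyX-Code", "x = 1"))
```

Because the style name survives as the class attribute, the author's CSS can still decide what each environment and character style looks like.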

A special one-per-book configuration file (I did mine in YAML) defines
the assignment of Part, Chapter, Section, Subsection etc to h1, h2,
h3, h4 etc, and defines which go in the table of contents, and
which get numbers and what prefix the number gets (Part, Chapter, etc).
I've already done this: It works. Don't worry about converting LyX
environment and char style defs to CSS, just list all paragraph and
char styles, so that the author can make the necessary CSS. CSS is
*much* easier to define than LaTeX environments and commands. And yes,
let the author know that this export requires the author use only a
subset of LyX's capabilities.
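As a rough illustration of such a per-book configuration, here is a Python sketch. The real file was YAML and its layout is not shown in this thread, so the keys below (tag, toc, numbered, prefix) are assumptions, and a plain dict stands in for the parsed YAML.

```python
from html import escape

# Hypothetical per-book configuration, written as the dict a YAML file
# might load into. The key names are illustrative only.
HEADING_CONFIG = {
    "Part":       {"tag": "h1", "toc": True,  "numbered": True,  "prefix": "Part"},
    "Chapter":    {"tag": "h2", "toc": True,  "numbered": True,  "prefix": "Chapter"},
    "Section":    {"tag": "h3", "toc": True,  "numbered": False, "prefix": ""},
    "Subsection": {"tag": "h4", "toc": False, "numbered": False, "prefix": ""},
}


def render_headings(headings, config=HEADING_CONFIG):
    """Turn (environment, title) pairs into heading tags and TOC entries.

    Returns (xhtml_lines, toc_entries). Counters run through the whole
    book; restarting chapter numbers per part is left out for brevity.
    """
    counters = {}
    lines, toc = [], []
    for index, (env, title) in enumerate(headings, start=1):
        rule = config[env]
        label = title
        if rule["numbered"]:
            counters[env] = counters.get(env, 0) + 1
            label = "%s %d: %s" % (rule["prefix"], counters[env], title)
        anchor = "heading-%d" % index  # an id on the tag, not <a name=...>
        lines.append('<%s id="%s">%s</%s>'
                     % (rule["tag"], anchor, escape(label), rule["tag"]))
        if rule["toc"]:
            toc.append((anchor, label))
    return lines, toc


if __name__ == "__main__":
    book = [("Chapter", "Intro"), ("Section", "Why"),
            ("Subsection", "Detail"), ("Chapter", "Next")]
    xhtml, toc = render_headings(book)
    print("\n".join(xhtml))
    print(toc)
```

The TOC entries pair each anchor id with its label, which is the information an ePub navigation file needs.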

I briefly considered writing Yet Another LyX to HTML Exporter, but
found out that in spite of LyX's native format being Non Human Friendly
XML, it's not *well formed* XML, so I can't use Python's lxml.etree,
let alone Python's xml.etree.ElementTree, to parse it. Perhaps if LyX
offered an export to well-formed XML, hopefully with a DTD, I could
parse that to produce ePub-friendly Xhtml, but as far as I know that
doesn't exist either.
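A quick way to test whether a file can be handed to an XML parser at all is to try parsing it and catch the error. The sketch below uses xml.etree.ElementTree from the Python standard library; the backslash-command fragment is only a made-up sample in the style of a .lyx file, not taken from a real book.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Illustrative fragment in the backslash-command style of a .lyx file.
LYX_LIKE = "\\begin_layout Standard\nHello, world.\n\\end_layout\n"

# The kind of XHTML an ePub-friendly export would need to produce.
XHTML_LIKE = '<p class="standard">Hello, world.</p>'

if __name__ == "__main__":
    print(is_well_formed_xml(LYX_LIKE))
    print(is_well_formed_xml(XHTML_LIKE))
```

A parser such as ElementTree rejects the first sample at once, so a converter working from the native file would need its own tokenizer, or an intermediate export that is well-formed XML.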

Anyway, I would suggest anyone who is working on any portion of a LyX
to ePub conversion talk with me. I'm pretty knowledgeable about ePub,
and I've already identified a lot of the dead ends and blind alleys in
ePub creation, and I know what parts of the LyX document should go into
the ePub, and which parts would be better re-done as either config or
CSS. 

My switch to Bluefish isn't cast in concrete: Once LyX contains a
good, generic, reliable LyX to ePub or even LyX to ePub friendly Xhtml
conversion, I can switch back. If you do it soon enough, I won't even
have to write an Xhtml to LaTeX converter :-)

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Alan L Tyree
Sorry for the top posting, but this is short. My own view is that an
ePub exporter for LyX would make it a killer application. 

Thanks for your comments, Steve. Have you looked at Pandoc?

Cheers,
Alan


-- 
Alan L Tyree   http://www2.austlii.edu.au/~alan
Tel:  04 2748 

Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Scott Kostyshak
I'm CC'ing Josh Hieronymus, who has worked on some of this for GSoC.
He is probably busy with other things, but maybe he is still
interested.

Scott

On Fri, Jan 31, 2014 at 1:25 PM, Alan L Tyree alanty...@gmail.com wrote:
> Sorry for the top posting, but this is short. My own view is that an
> ePub exporter for LyX would make it a killer application.
>
> Thanks for your comments, Steve. Have you looked at Pandoc?
>
> Cheers,
> Alan



Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Alex Fernandez
Hi Steve,

On Fri, Jan 31, 2014 at 5:47 PM, Steve Litt sl...@troubleshooters.com wrote:

> Ladies and gentlemen, if the preceding paragraph doesn't convince us we
> need a good, solid, LyX to ePub and LyX to Mobi conversion (do ePub
> first, you can convert ePub into Mobi), then nothing will. Instead of
> slamming out his book in LyX, Anthony must use an outside service
> (smashwords), meaning a two word modification is, as Joe Biden would
> say, a Big Friggin Deal. Further, to satisfy Smashwords he must write it
> in LibreOffice to simulate MS Word. Or, if he's just doing Kindle, he
> can submit rtf (what could *possibly* go wrong).


Good luck with the convincing part.


> You can't base a LyX to ePub converter off either LyX2Xhtml or Alex's
> eLyXer: Those produce great (X)html for stuff like footnotes and
> bibliographies, but they discard semantic tags (h1-h6) for variously
> named divs (yeah, <div>, not even <p>). As I remember, they still use
> outdated <a name="whatever"/> instead of giving an ID to a tag. One or
> both of them does you the favor of renaming all graphic files to a
> numerical sequence: I guess this is to prevent identically named
> graphics in different directories from clobbering each other, but there
> are better ways of doing this that don't have the anti-debugging
> baggage of removing all meaning from graphic names.  Current
> LyX exported (X)html files just generally require *huge*
> postprocessing, with zillions of special cases, to get them in
> reasonable shape to make an ePub. If that were not the case, somebody
> would have made a LyX2ePub a long, long time ago, because the demand is
> there, and a lot of people have that itch, and I'm not the only one
> who has tried to do it.


eLyXer does not rename any graphic files AFAICR. Also, eLyXer should
provide semantic tags like h1-h6, as you can see here:
  http://elyxer.nongnu.org/math-unicode.html

> Shamefully, because I need to be able to have my books available as
> ePubs, after 13 years using LyX to write my books, I'm now using the
> Bluefish editor to write my future books. I've written an Xhtml to ePub
> converter in Python, and I can write an Xhtml to LaTeX converter just
> as easily. But let me ask you something: Have you ever tried to slam
> out 2500 words a day in Bluefish? Bluefish will never have the
> authoring speed of LyX. But then again, as things stand now, a LyX
> authored document will never be convertible to ePub.


I feel your pain. Not with Bluefish but with other, also less powerful
editors. For good or for bad I have not needed translation to ePub yet.

> The shame is, in theory, LyX to ePub is simple. Every environment
> becomes <p class="environmentname">, every character style becomes
> <span class="charstylename">. Leave <div> out of it except for very
> special cases. Even lyx-code should become <pre class="lyxcode">, not
> <div class="lyxcode">.


It should not be difficult to change eLyXer output to be as you desire;
just take a look at base.cfg
  https://github.com/alexfernandez/elyxer/blob/master/src/conf/base.cfg
where most elements of the output can be configured. There are many special
cases and a few ugly-ish tricks (such as h? to denote h1-h6), but it is
mostly there. Or should be.


> A special one-per-book configuration file (I did mine in YAML) defines
> the assignment of Part, Chapter, Section, Subsection etc to h1, h2,
> h3, h4 etc, and defines which go in the table of contents, and
> which get numbers and what prefix the number gets (Part, Chapter, etc).
> I've already done this: It works. Don't worry about converting LyX
> environment and char style defs to CSS, just list all paragraph and
> char styles, so that the author can make the necessary CSS. CSS is
> *much* easier to define than LaTeX environments and commands. And yes,
> let the author know that this export requires the author use only a
> subset of LyX's capabilities.


With luck, LyX devs will propose a native export which is neither here nor
there. Uphill battle in any case.

> I briefly considered writing Yet Another LyX to HTML Exporter, but
> found out that in spite of LyX's native format being Non Human Friendly
> XML, it's not *well formed* XML, so I can't use Python's lxml.etree,
> let alone Python's xml.etree.ElementTree, to parse it. Perhaps if LyX
> offered an export to well-formed XML, hopefully with a DTD, I could
> parse that to produce ePub-friendly Xhtml, but as far as I know that
> doesn't exist either.


With eLyXer I have already done all the heavy work myself of converting LyX
documents to an in-memory structure of containers and insets. In theory you
might just tweak the configuration file base.cfg and generate a completely
different document structure such as ePub. In practice, and as far as I
know, it works: I was able to make the transition from LyX 1.x to 2.0 just
by adding a few insets and containers to the configuration file.

Sadly, after 7 years, not using LyX anymore and with little help from LyX
developers (with some *laudable exceptions*) I can no longer justify 

We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Steve Litt
On Fri, 31 Jan 2014 09:35:46 +
Anthony Campbell a...@acampbell.org.uk wrote:


 This is for printed books. As regards conversion to ebook format, I've
 done this for several books on Smashwords, but that is quite a
 long-winded process because it has to be Word.doc format, which I do
 in LibreOffice (not much fun). Kindle does accept rtf, which would
 help, but as I'd already made Word.doc files I just used those.
 
 Anthony

Ladies and gentlemen, if the preceding paragraph doesn't convince us we
need a good, solid LyX to ePub and LyX to Mobi conversion (do ePub
first; you can convert ePub into Mobi), then nothing will. Instead of
slamming out his book in LyX, Anthony must use an outside service
(Smashwords), meaning a two-word modification is, as Joe Biden would
say, a Big Friggin Deal. Further, to satisfy Smashwords he must write it
in LibreOffice to simulate MS Word. Or, if he's just doing Kindle, he
can submit rtf (what could *possibly* go wrong).

You can't base a LyX to ePub converter off either LyX2Xhtml or Alex's
eLyXer: Those produce great (X)html for stuff like footnotes and
bibliographies, but they discard semantic tags (h1-h6) for variously
named divs (yeah, <div>, not even <p>), as I remember they still use
outdated <a name="whatever"/> instead of giving an ID to a tag. One or
both of them does you the "favor" of renaming all graphic files to a
numerical sequence: I guess this is to prevent identically named
graphics in different directories from clobbering each other, but there
are better ways of doing this that don't have the anti-debugging
baggage of removing all meaning from graphic names.  Current
LyX exported (X)html files just generally require *huge*
postprocessing, with zillions of special cases, to get them in
reasonable shape to make an ePub. If that were not the case, somebody
would have made a LyX2ePub a long, long time ago, because the demand is
there, and a lot of people have that itch, and I'm not the only one
who has tried to do it. 
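
One of the "better ways" of avoiding name clashes mentioned above can be sketched as follows. This is only an illustration, not what LyX2Xhtml or eLyXer do: the function name flatten_graphic_names and the short hash suffix are assumptions made for the example. The idea is to keep every graphic's original name and to add a short suffix only when two files in different directories share the same file name.

```python
import hashlib
from pathlib import PurePosixPath


def flatten_graphic_names(paths):
    """Map source paths to unique flat file names, keeping names readable.

    Hypothetical helper: a file whose name is unique keeps it unchanged;
    files that clash get a short hash of their full path appended.
    """
    by_name = {}
    for path in paths:
        by_name.setdefault(PurePosixPath(path).name, []).append(path)

    mapping = {}
    for name, group in by_name.items():
        for path in group:
            if len(group) == 1:
                # No clash: the original, meaningful name survives as-is.
                mapping[path] = name
            else:
                # Clash: disambiguate, but keep the stem for debugging.
                p = PurePosixPath(path)
                digest = hashlib.sha1(path.encode("utf-8")).hexdigest()[:8]
                mapping[path] = f"{p.stem}-{digest}{p.suffix}"
    return mapping


if __name__ == "__main__":
    for src, dst in flatten_graphic_names(
        ["ch1/diagram.png", "ch2/diagram.png", "cover.jpg"]
    ).items():
        print(src, "->", dst)
```

With this approach cover.jpg stays cover.jpg, and the two diagram.png files are still recognizable as diagrams when something goes wrong in the generated ePub.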

Shamefully, because I need to be able to have my books available as
ePubs, after 13 years using LyX to write my books, I'm now using the
Bluefish editor to write my future books. I've written an Xhtml to ePub
converter in Python, and I can write an Xhtml to LaTeX converter just
as easily. But let me ask you something: Have you ever tried to slam
out 2500 words a day in Bluefish? Bluefish will never have the
authoring speed of LyX. But then again, as things stand now, a LyX
authored document will never be convertible to ePub.

The shame is, in theory, LyX to ePub is simple. Every environment
becomes <p class="environmentname">, every character style becomes
<span class="charstylename">. Leave <div> out of it except for very
special cases. Even lyx-code should become <pre class="lyxcode">, not
<div class="lyxcode">.
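
A minimal sketch of the mapping just described, assuming the converter already knows each paragraph's environment name and each character style name. The helper names (css_class, render_span, render_paragraph) and the override table are hypothetical and are not part of LyX or eLyXer.

```python
from html import escape

# Assumption for illustration: environments listed here get a tag other
# than <p>. Per the message above, LyX-Code becomes <pre>, not <div>.
BLOCK_TAG_OVERRIDES = {"LyX-Code": "pre"}


def css_class(name):
    """Turn a LyX style name into a CSS class: lowercase, letters and digits only."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def render_span(style, text):
    """Wrap character-styled text in <span class="charstylename">."""
    return f'<span class="{css_class(style)}">{escape(text)}</span>'


def render_paragraph(environment, inner_xhtml):
    """Wrap already-escaped inline markup in <p class="environmentname">."""
    tag = BLOCK_TAG_OVERRIDES.get(environment, "p")
    return f'<{tag} class="{css_class(environment)}">{inner_xhtml}</{tag}>'


if __name__ == "__main__":
    print(render_paragraph("Standard", "Run " + render_span("Code", "ls -l") + " first."))
    print(render_paragraph("LyX-Code", escape("if a < b:")))
```

Nothing here generates CSS; the author would still write rules for p.standard, span.code, pre.lyxcode and so on by hand.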

A special one-per-book configuration file (I did mine in YAML) defines
the assignment of Part, Chapter, Section, Subsection etc to <h1>, <h2>,
<h3>, <h4> etc, and defines which go in the table of contents, and
which get numbers and what prefix the number gets (Part, Chapter, etc).
I've already done this: It works. Don't worry about converting LyX
environment and char style defs to CSS, just list all paragraph and
char styles, so that the author can make the necessary CSS. CSS is
*much* easier to define than LaTeX environments and commands. And yes,
let the author know that this export requires the author use only a
subset of LyX's capabilities.
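
The per-book configuration described above might look roughly like the sketch below. It is not the author's actual YAML file: the field names (tag, in_toc, numbered, prefix) are assumptions, and a plain Python dict stands in for the YAML so that the example needs only the standard library. Numbering is a simple running count per environment, with no reset when a new Part begins.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class HeadingRule:
    tag: str        # which of h1..h6 the environment maps to
    in_toc: bool    # whether it appears in the table of contents
    numbered: bool  # whether it gets a running number
    prefix: str     # text placed before the number ("Part", "Chapter", ...)


# Hypothetical one-per-book configuration (would be loaded from YAML).
BOOK_CONFIG = {
    "Part": HeadingRule("h1", True, True, "Part"),
    "Chapter": HeadingRule("h2", True, True, "Chapter"),
    "Section": HeadingRule("h3", True, False, ""),
    "Subsection": HeadingRule("h4", False, False, ""),
}


def render_headings(items, config=BOOK_CONFIG):
    """Render (environment, title) pairs; return (xhtml_lines, toc_entries)."""
    counters = {}
    xhtml, toc = [], []
    for index, (environment, title) in enumerate(items, start=1):
        rule = config[environment]
        label = title
        if rule.numbered:
            counters[environment] = counters.get(environment, 0) + 1
            label = f"{rule.prefix} {counters[environment]}: {title}"
        anchor = f"heading{index}"
        # The ID goes on the heading itself, not on a separate <a name=...>.
        xhtml.append(f'<{rule.tag} id="{anchor}">{escape(label)}</{rule.tag}>')
        if rule.in_toc:
            toc.append((anchor, label))
    return xhtml, toc


if __name__ == "__main__":
    lines, toc = render_headings([("Part", "Basics"), ("Chapter", "Intro"), ("Section", "Scope")])
    print("\n".join(lines))
    print(toc)
```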

I briefly considered writing Yet Another LyX to HTML Exporter, but
found out that in spite of LyX's native format being Non Human Friendly
XML, it's not *well formed* XML, so I can't use Python's lxml.etree,
let alone Python's xml.etree.ElementTree, to parse it. Perhaps if LyX
offered an export to well-formed XML, hopefully with a DTD, I could
parse that to produce ePub-friendly Xhtml, but as far as I know that
doesn't exist either.
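
To illustrate the parsing problem, here is a small sketch. The sample document is hand-written for this example, assuming the backslash-command lines (\begin_layout, \begin_inset Flex ...) that LyX 2.0 files use; it is not taken from a real book. It shows that Python's xml.etree.ElementTree rejects such text, and that a plain line scan is already enough to list the paragraph and character styles an author would need to write CSS for.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample in the style of a LyX 2.0 file (an assumption for
# illustration, not real author content).
SAMPLE_LYX = r"""#LyX 2.0 created this file. For more info see http://www.lyx.org/
\lyxformat 413
\begin_document
\begin_body

\begin_layout Chapter
Getting Started
\end_layout

\begin_layout Standard
Plain text with
\begin_inset Flex Code
status collapsed

\begin_layout Plain Layout
ls -l
\end_layout

\end_inset

 inside.
\end_layout

\end_body
\end_document
"""

LAYOUT_PREFIX = r"\begin_layout "
FLEX_PREFIX = r"\begin_inset Flex "


def is_well_formed_xml(text):
    """Return True only if ElementTree can parse the text as XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def list_styles(text):
    """Count paragraph layouts and Flex (character style) insets, line by line."""
    layouts, flex_insets = Counter(), Counter()
    for line in text.splitlines():
        if line.startswith(LAYOUT_PREFIX):
            layouts[line[len(LAYOUT_PREFIX):].strip()] += 1
        elif line.startswith(FLEX_PREFIX):
            flex_insets[line[len(FLEX_PREFIX):].strip()] += 1
    return layouts, flex_insets


if __name__ == "__main__":
    print("well-formed XML?", is_well_formed_xml(SAMPLE_LYX))
    layouts, flex_insets = list_styles(SAMPLE_LYX)
    print("paragraph styles:", sorted(layouts))
    print("character styles:", sorted(flex_insets))
```

A real converter would need a proper nested parser for insets; the line scan only covers the "list all paragraph and char styles" step.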

Anyway, I would suggest anyone who is working on any portion of a LyX
to ePub conversion talk with me. I'm pretty knowledgeable about ePub,
and I've already identified a lot of the dead ends and blind alleys in
ePub creation, and I know what parts of the LyX document should go into
the ePub, and which parts would be better re-done as either config or
CSS. 

My switch to Bluefish isn't cast in concrete: Once LyX contains a
good, generic, reliable LyX to ePub or even LyX to "ePub friendly" Xhtml
conversion, I can switch back. If you do it soon enough, I won't even
have to write an Xhtml to LaTeX converter :-)

Thanks,

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance


Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Alan L Tyree
Sorry for the top posting, but this is short. My own view is that an
ePub exporter for LyX would make it a killer application. 

Thanks for your comments, Steve. Have you looked at Pandoc?

Cheers,
Alan


Steve Litt writes:

 On Fri, 31 Jan 2014 09:35:46 +
 Anthony Campbell a...@acampbell.org.uk wrote:


 This is for printed books. As regards conversion to ebook format, I've
 done this for several books on Smashwords, but that is quite a
 long-winded process because it has to be Word.doc format, which I do
 in LibreOffice (not much fun). Kindle does accept rtf, which would
 help, but as I'd already made Word.doc files I just used those.
 
 Anthony

 Ladies and gentlemen, if the preceding paragraph doesn't convince us we
 need a good, solid, LyX to ePub and LyX to Mobi conversion (do ePub
 first, you can convert ePub into Mobi), then nothing will. Instead of
 slamming out his book in LyX, Anthony must use an outside service
 (smashwords), meaning a two word modification is, as Joe Biden would
 say, a Big Friggin Deal. Further, to satisfy Smashwords he must write it
 in LibreOffice to simulate MS Word. Or, if he's just doing Kindle, he
 can submit rtf (what could *possibly* go wrong).

 You can't base a LyX to ePub converter off either LyX2Xhtml or Alex's
 eLyXer: Those produce great (X)html for stuff like footnotes and
 bibliographies, but they discard semantic tags (h1-h6) for variously
 named divs (yeah, <div>, not even <p>), as I remember they still use
 outdated <a name="whatever"/> instead of giving an ID to a tag. One or
 both of them does you the "favor" of renaming all graphic files to a
 numerical sequence: I guess this is to prevent identically named
 graphics in different directories from clobbering each other, but there
 are better ways of doing this that don't have the anti-debugging
 baggage of removing all meaning from graphic names.  Current
 LyX exported (X)html files just generally require *huge*
 postprocessing, with zillions of special cases, to get them in
 reasonable shape to make an ePub. If that were not the case, somebody
 would have made a LyX2ePub a long, long time ago, because the demand is
 there, and a lot of people have that itch, and I'm not the only one
 who has tried to do it. 

 Shamefully, because I need to be able to have my books available as
 ePubs, after 13 years using LyX to write my books, I'm now using the
 Bluefish editor to write my future books. I've written an Xhtml to ePub
 converter in Python, and I can write an Xhtml to LaTeX converter just
 as easily. But let me ask you something: Have you ever tried to slam
 out 2500 words a day in Bluefish? Bluefish will never have the
 authoring speed of LyX. But then again, as things stand now, a LyX
 authored document will never be convertible to ePub.

 The shame is, in theory, LyX to ePub is simple. Every environment
 becomes <p class="environmentname">, every character style becomes
 <span class="charstylename">. Leave <div> out of it except for very
 special cases. Even lyx-code should become <pre class="lyxcode">, not
 <div class="lyxcode">.

 A special one-per-book configuration file (I did mine in YAML) defines
 the assignment of Part, Chapter, Section, Subsection etc to <h1>, <h2>,
 <h3>, <h4> etc, and defines which go in the table of contents, and
 which get numbers and what prefix the number gets (Part, Chapter, etc).
 I've already done this: It works. Don't worry about converting LyX
 environment and char style defs to CSS, just list all paragraph and
 char styles, so that the author can make the necessary CSS. CSS is
 *much* easier to define than LaTeX environments and commands. And yes,
 let the author know that this export requires the author use only a
 subset of LyX's capabilities.

 I briefly considered writing Yet Another LyX to HTML Exporter, but
 found out that in spite of LyX's native format being Non Human Friendly
 XML, it's not *well formed* XML, so I can't use Python's lxml.etree,
 let alone Python's xml.etree.ElementTree, to parse it. Perhaps if LyX
 offered an export to well-formed XML, hopefully with a DTD, I could
 parse that to produce ePub-friendly Xhtml, but as far as I know that
 doesn't exist either.

 Anyway, I would suggest anyone who is working on any portion of a LyX
 to ePub conversion talk with me. I'm pretty knowledgeable about ePub,
 and I've already identified a lot of the dead ends and blind alleys in
 ePub creation, and I know what parts of the LyX document should go into
 the ePub, and which parts would be better re-done as either config or
 CSS. 

 My switch to Bluefish isn't cast in concrete: Once LyX contains a
 good, generic, reliable LyX to ePub or even LyX to "ePub friendly" Xhtml
 conversion, I can switch back. If you do it soon enough, I won't even
 have to write an Xhtml to LaTeX converter :-)

 Thanks,

 SteveT

 Steve Litt*  http://www.troubleshooters.com/
 Troubleshooting Training  *  Human Performance


-- 
Alan L Tyree   http://www2.austlii.edu.au/~alan
Tel:  04 2748 

Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Scott Kostyshak
I'm CC'ing Josh Hieronymus, who has worked on some of this for GSoC.
He is probably busy with other things, but maybe he is still
interested.

Scott

On Fri, Jan 31, 2014 at 1:25 PM, Alan L Tyree alanty...@gmail.com wrote:
 Sorry for the top posting, but this is short. My own view is that an
 ePub exporter for LyX would make it a killer application.

 Thanks for your comments, Steve. Have you looked at Pandoc?

 Cheers,
 Alan


 Steve Litt writes:

 On Fri, 31 Jan 2014 09:35:46 +
 Anthony Campbell a...@acampbell.org.uk wrote:


 This is for printed books. As regards conversion to ebook format, I've
 done this for several books on Smashwords, but that is quite a
 long-winded process because it has to be Word.doc format, which I do
 in LibreOffice (not much fun). Kindle does accept rtf, which would
 help, but as I'd already made Word.doc files I just used those.

 Anthony

 Ladies and gentlemen, if the preceding paragraph doesn't convince us we
 need a good, solid, LyX to ePub and LyX to Mobi conversion (do ePub
 first, you can convert ePub into Mobi), then nothing will. Instead of
 slamming out his book in LyX, Anthony must use an outside service
 (smashwords), meaning a two word modification is, as Joe Biden would
 say, a Big Friggin Deal. Further, to satisfy Smashwords he must write it
 in LibreOffice to simulate MS Word. Or, if he's just doing Kindle, he
 can submit rtf (what could *possibly* go wrong).

 You can't base a LyX to ePub converter off either LyX2Xhtml or Alex's
 eLyXer: Those produce great (X)html for stuff like footnotes and
 bibliographies, but they discard semantic tags (h1-h6) for variously
 named divs (yeah, <div>, not even <p>), as I remember they still use
 outdated <a name="whatever"/> instead of giving an ID to a tag. One or
 both of them does you the "favor" of renaming all graphic files to a
 numerical sequence: I guess this is to prevent identically named
 graphics in different directories from clobbering each other, but there
 are better ways of doing this that don't have the anti-debugging
 baggage of removing all meaning from graphic names.  Current
 LyX exported (X)html files just generally require *huge*
 postprocessing, with zillions of special cases, to get them in
 reasonable shape to make an ePub. If that were not the case, somebody
 would have made a LyX2ePub a long, long time ago, because the demand is
 there, and a lot of people have that itch, and I'm not the only one
 who has tried to do it.

 Shamefully, because I need to be able to have my books available as
 ePubs, after 13 years using LyX to write my books, I'm now using the
 Bluefish editor to write my future books. I've written an Xhtml to ePub
 converter in Python, and I can write an Xhtml to LaTeX converter just
 as easily. But let me ask you something: Have you ever tried to slam
 out 2500 words a day in Bluefish? Bluefish will never have the
 authoring speed of LyX. But then again, as things stand now, a LyX
 authored document will never be convertible to ePub.

 The shame is, in theory, LyX to ePub is simple. Every environment
 becomes <p class="environmentname">, every character style becomes
 <span class="charstylename">. Leave <div> out of it except for very
 special cases. Even lyx-code should become <pre class="lyxcode">, not
 <div class="lyxcode">.

 A special one-per-book configuration file (I did mine in YAML) defines
 the assignment of Part, Chapter, Section, Subsection etc to <h1>, <h2>,
 <h3>, <h4> etc, and defines which go in the table of contents, and
 which get numbers and what prefix the number gets (Part, Chapter, etc).
 I've already done this: It works. Don't worry about converting LyX
 environment and char style defs to CSS, just list all paragraph and
 char styles, so that the author can make the necessary CSS. CSS is
 *much* easier to define than LaTeX environments and commands. And yes,
 let the author know that this export requires the author use only a
 subset of LyX's capabilities.

 I briefly considered writing Yet Another LyX to HTML Exporter, but
 found out that in spite of LyX's native format being Non Human Friendly
 XML, it's not *well formed* XML, so I can't use Python's lxml.etree,
 let alone Python's xml.etree.ElementTree, to parse it. Perhaps if LyX
 offered an export to well-formed XML, hopefully with a DTD, I could
 parse that to produce ePub-friendly Xhtml, but as far as I know that
 doesn't exist either.

 Anyway, I would suggest anyone who is working on any portion of a LyX
 to ePub conversion talk with me. I'm pretty knowledgeable about ePub,
 and I've already identified a lot of the dead ends and blind alleys in
 ePub creation, and I know what parts of the LyX document should go into
 the ePub, and which parts would be better re-done as either config or
 CSS.

 My switch to Bluefish isn't cast in concrete: Once LyX contains a
 good, generic, reliable LyX to ePub or even LyX to "ePub friendly" Xhtml
 conversion, I can switch back. If you do it soon enough, I won't even
 have to write an Xhtml to 

Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Alex Fernandez
Hi Steve,

On Fri, Jan 31, 2014 at 5:47 PM, Steve Litt sl...@troubleshooters.com wrote:

 Ladies and gentlemen, if the preceding paragraph doesn't convince us we
 need a good, solid, LyX to ePub and LyX to Mobi conversion (do ePub
 first, you can convert ePub into Mobi), then nothing will. Instead of
 slamming out his book in LyX, Anthony must use an outside service
 (smashwords), meaning a two word modification is, as Joe Biden would
 say, a Big Friggin Deal. Further, to satisfy Smashwords he must write it
 in LibreOffice to simulate MS Word. Or, if he's just doing Kindle, he
 can submit rtf (what could *possibly* go wrong).


Good luck with the convincing part.


 You can't base a LyX to ePub converter off either LyX2Xhtml or Alex's
 eLyXer: Those produce great (X)html for stuff like footnotes and
 bibliographies, but they discard semantic tags (h1-h6) for variously
 named divs (yeah, <div>, not even <p>), as I remember they still use
 outdated <a name="whatever"/> instead of giving an ID to a tag. One or
 both of them does you the "favor" of renaming all graphic files to a
 numerical sequence: I guess this is to prevent identically named
 graphics in different directories from clobbering each other, but there
 are better ways of doing this that don't have the anti-debugging
 baggage of removing all meaning from graphic names.  Current
 LyX exported (X)html files just generally require *huge*
 postprocessing, with zillions of special cases, to get them in
 reasonable shape to make an ePub. If that were not the case, somebody
 would have made a LyX2ePub a long, long time ago, because the demand is
 there, and a lot of people have that itch, and I'm not the only one
 who has tried to do it.


eLyXer does not rename any graphic files AFAICR. Also, eLyXer should
provide semantic tags like h1-h6, as you can see here:
  http://elyxer.nongnu.org/math-unicode.html

 Shamefully, because I need to be able to have my books available as
 ePubs, after 13 years using LyX to write my books, I'm now using the
 Bluefish editor to write my future books. I've written an Xhtml to ePub
 converter in Python, and I can write an Xhtml to LaTeX converter just
 as easily. But let me ask you something: Have you ever tried to slam
 out 2500 words a day in Bluefish? Bluefish will never have the
 authoring speed of LyX. But then again, as things stand now, a LyX
 authored document will never be convertible to ePub.


I feel your pain. Not with Bluefish but with other, also less powerful
editors. For good or for bad I have not needed translation to ePub yet.

 The shame is, in theory, LyX to ePub is simple. Every environment
 becomes <p class="environmentname">, every character style becomes
 <span class="charstylename">. Leave <div> out of it except for very
 special cases. Even lyx-code should become <pre class="lyxcode">, not
 <div class="lyxcode">.


It should not be difficult to change eLyXer output to be as you desire;
just take a look at base.cfg
  https://github.com/alexfernandez/elyxer/blob/master/src/conf/base.cfg
where most elements of the output can be configured. There are many special
cases and a few ugly-ish tricks (such as h? to denote h1-h6), but it is
mostly there. Or should be.


 A special one-per-book configuration file (I did mine in YAML) defines
 the assignment of Part, Chapter, Section, Subsection etc to <h1>, <h2>,
 <h3>, <h4> etc, and defines which go in the table of contents, and
 which get numbers and what prefix the number gets (Part, Chapter, etc).
 I've already done this: It works. Don't worry about converting LyX
 environment and char style defs to CSS, just list all paragraph and
 char styles, so that the author can make the necessary CSS. CSS is
 *much* easier to define than LaTeX environments and commands. And yes,
 let the author know that this export requires the author use only a
 subset of LyX's capabilities.


With luck, LyX devs will propose a native export which is neither here nor
there. Uphill battle in any case.

 I briefly considered writing Yet Another LyX to HTML Exporter, but
 found out that in spite of LyX's native format being Non Human Friendly
 XML, it's not *well formed* XML, so I can't use Python's lxml.etree,
 let alone Python's xml.etree.ElementTree, to parse it. Perhaps if LyX
 offered an export to well-formed XML, hopefully with a DTD, I could
 parse that to produce ePub-friendly Xhtml, but as far as I know that
 doesn't exist either.


With eLyXer I have already done all the heavy work myself of converting LyX
documents to an in-memory structure of containers and insets. In theory you
might just tweak the configuration file base.cfg and generate a completely
different document structure such as ePub. In practice, and as far as I
know, it works: I was able to make the transition from LyX 1.x to 2.0 just
by adding a few insets and containers to the configuration file.

Sadly, after 7 years, not using LyX anymore and with little help from LyX
developers (with some *laudable exceptions*) I can no longer justify 

Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Scott Kostyshak
I'm CC'ing Josh Hieronymus, who has worked on some of this for GSoC.
He is probably busy with other things, but maybe he is still
interested.

Scott

On Fri, Jan 31, 2014 at 1:25 PM, Alan L Tyree  wrote:
> Sorry for the top posting, but this is short. My own view is that an
> ePub exporter for LyX would make it a killer application.
>
> Thanks for your comments, Steve. Have you looked at Pandoc?
>
> Cheers,
> Alan
>
>
> Steve Litt writes:
>
>> On Fri, 31 Jan 2014 09:35:46 +
>> Anthony Campbell  wrote:
>>
>>
>>> This is for printed books. As regards conversion to ebook format, I've
>>> done this for several books on Smashwords, but that is quite a
>>> long-winded process because it has to be Word.doc format, which I do
>>> in LibreOffice (not much fun). Kindle does accept rtf, which would
>>> help, but as I'd already made Word.doc files I just used those.
>>>
>>> Anthony
>>
>> Ladies and gentlemen, if the preceding paragraph doesn't convince us we
>> need a good, solid, LyX to ePub and LyX to Mobi conversion (do ePub
>> first, you can convert ePub into Mobi), then nothing will. Instead of
>> slamming out his book in LyX, Anthony must use an outside service
>> (smashwords), meaning a two word modification is, as Joe Biden would
>> say, a Big Friggin Deal. Further, to satisfy Smashwords he must write it
>> in LibreOffice to simulate MS Word. Or, if he's just doing Kindle, he
>> can submit rtf (what could *possibly* go wrong).
>>
>> You can't base a LyX to ePub converter off either LyX2Xhtml or Alex's
>> eLyXer: Those produce great (X)html for stuff like footnotes and
>> bibliographies, but they discard semantic tags (h1-h6) for variously
>> named divs (yeah, <div>, not even <p>), as I remember they still use
>> outdated <a name="whatever"/> instead of giving an ID to a tag. One or
>> both of them does you the "favor" of renaming all graphic files to a
>> numerical sequence: I guess this is to prevent identically named
>> graphics in different directories from clobbering each other, but there
>> are better ways of doing this that don't have the anti-debugging
>> baggage of removing all meaning from graphic names.  Current
>> LyX exported (X)html files just generally require *huge*
>> postprocessing, with zillions of special cases, to get them in
>> reasonable shape to make an ePub. If that were not the case, somebody
>> would have made a LyX2ePub a long, long time ago, because the demand is
>> there, and a lot of people have that itch, and I'm not the only one
>> who has tried to do it.
>>
>> Shamefully, because I need to be able to have my books available as
>> ePubs, after 13 years using LyX to write my books, I'm now using the
>> Bluefish editor to write my future books. I've written an Xhtml to ePub
>> converter in Python, and I can write an Xhtml to LaTeX converter just
>> as easily. But let me ask you something: Have you ever tried to slam
>> out 2500 words a day in Bluefish? Bluefish will never have the
>> authoring speed of LyX. But then again, as things stand now, a LyX
>> authored document will never be convertible to ePub.
>>
>> The shame is, in theory, LyX to ePub is simple. Every environment
>> becomes , every character style becomes
>> . Leave  out of it except for very
>> special cases. Even lyx-code should become , not
>> .
>>
>> A special one-per-book configuration file (I did mine in YAML) defines
>> the assignment of Part, Chapter, Section, Subsection etc to , ,
>> ,  etc, and defines which go in the table of contents, and
>> which get numbers and what prefix the number gets (Part, Chapter, etc).
>> I've already done this: It works. Don't worry about converting LyX
>> environment and char style defs to CSS, just list all paragraph and
>> char styles, so that the author can make the necessary CSS. CSS is
>> *much* easier to define than LaTeX environments and commands. And yes,
>> let the author know that this export requires the author use only a
>> subset of LyX's capabilities.
>>
>> I briefly considered writing Yet Another LyX to HTML Exporter, but
>> found out that in spite of LyX's native format being Non Human Friendly
>> XML, it's not *well formed* XML, so I can't use Python's lxml.etree,
>> let alone Python's xml.etree.ElementTree, to parse it. Perhaps if LyX
>> offered an export to well-formed XML, hopefully with a DTD, I could
>> parse that to produce ePub-friendly Xhtml, but as far as I know that
>> doesn't exist either.
>>
>> Anyway, I would suggest anyone who is working on any portion of a LyX
>> to ePub conversion talk with me. I'm pretty knowledgeable about ePub,
>> and I've already identified a lot of the dead ends and blind alleys in
>> ePub creation, and I know what parts of the LyX document should go into
>> the ePub, and which parts would be better re-done as either config or
>> CSS.
>>
>> My switch to Bluefish isn't cast in concrete: Once LyX contains a
>> good, generic, reliable LyX to ePub or even LyX to "ePub friendly" Xhtml
>> conversion, I can 

Re: We need ePub/Mobi conversion: was: Book Frontmatter

2014-01-31 Thread Alex Fernandez
Hi Steve,

On Fri, Jan 31, 2014 at 5:47 PM, Steve Litt wrote:

> Ladies and gentlemen, if the preceding paragraph doesn't convince us we
> need a good, solid, LyX to ePub and LyX to Mobi conversion (do ePub
> first, you can convert ePub into Mobi), then nothing will. Instead of
> slamming out his book in LyX, Anthony must use an outside service
> (smashwords), meaning a two word modification is, as Joe Biden would
> say, a Big Friggin Deal. Further, to satisfy Smashwords he must write it
> in LibreOffice to simulate MS Word. Or, if he's just doing Kindle, he
> can submit rtf (what could *possibly* go wrong).
>

Good luck with the convincing part.


> You can't base a LyX to ePub converter off either LyX2Xhtml or Alex's
> eLyXer: Those produce great (X)html for stuff like footnotes and
> bibliographies, but they discard semantic tags (h1-h6) for variously
> named divs (yeah, , not even ), as I remember they still use
> outdated  instead of giving an ID to a tag. One or
> both of them does you the "favor" of renaming all graphic files to a
> numerical sequence: I guess this is to prevent identically named
> graphics in different directories from clobbering each other, but there
> are better ways of doing this that don't have the anti-debugging
> baggage of removing all meaning from graphic names.  Current
> LyX exported (X)html files just generally require *huge*
> postprocessing, with zillions of special cases, to get them in
> reasonable shape to make an ePub. If that were not the case, somebody
> would have made a LyX2ePub a long, long time ago, because the demand is
> there, and a lot of people have that itch, and I'm not the only one
> who has tried to do it.
>

eLyXer does not rename any graphic files AFAICR. Also, eLyXer should
provide semantic tags like h1-h6, as you can see here:
  http://elyxer.nongnu.org/math-unicode.html

> Shamefully, because I need to be able to have my books available as
> ePubs, after 13 years using LyX to write my books, I'm now using the
> Bluefish editor to write my future books. I've written an Xhtml to ePub
> converter in Python, and I can write an Xhtml to LaTeX converter just
> as easily. But let me ask you something: Have you ever tried to slam
> out 2500 words a day in Bluefish? Bluefish will never have the
> authoring speed of LyX. But then again, as things stand now, a LyX
> authored document will never be convertible to ePub.
>

I feel your pain. Not with Bluefish but with other, also less powerful
editors. For good or for bad I have not needed translation to ePub yet.

> The shame is, in theory, LyX to ePub is simple. Every environment
> becomes , every character style becomes
> . Leave  out of it except for very
> special cases. Even lyx-code should become , not
> .
>

It should not be difficult to change eLyXer output to be as you desire;
just take a look at base.cfg
  https://github.com/alexfernandez/elyxer/blob/master/src/conf/base.cfg
where most elements of the output can be configured. There are many special
cases and a few ugly-ish tricks (such as h? to denote h1-h6), but it is
mostly there. Or should be.


> A special one-per-book configuration file (I did mine in YAML) defines
> the assignment of Part, Chapter, Section, Subsection etc to , ,
> ,  etc, and defines which go in the table of contents, and
> which get numbers and what prefix the number gets (Part, Chapter, etc).
> I've already done this: It works. Don't worry about converting LyX
> environment and char style defs to CSS, just list all paragraph and
> char styles, so that the author can make the necessary CSS. CSS is
> *much* easier to define than LaTeX environments and commands. And yes,
> let the author know that this export requires the author use only a
> subset of LyX's capabilities.
>

With luck, LyX devs will propose a native export which is neither here nor
there. Uphill battle in any case.
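
For anyone following along, the per-book configuration Steve describes could
be sketched roughly like this. It is only an illustration under stated
assumptions: the tag names did not survive in the quoted message, so the
h1-h4 assignment, the key names (tag, in_toc, numbered, prefix) and the
render_heading helper are all hypothetical, and a Python dict stands in for
the YAML file so the sketch needs no third-party parser.

```python
from html import escape

# Hypothetical per-book configuration. Steve's was in YAML; a dict is used
# here only to keep the sketch self-contained. The heading levels are an
# assumption, not taken from his converter.
BOOK_CONFIG = {
    "Part":       {"tag": "h1", "in_toc": True,  "numbered": True,  "prefix": "Part"},
    "Chapter":    {"tag": "h2", "in_toc": True,  "numbered": True,  "prefix": "Chapter"},
    "Section":    {"tag": "h3", "in_toc": True,  "numbered": False, "prefix": ""},
    "Subsection": {"tag": "h4", "in_toc": False, "numbered": False, "prefix": ""},
}


def render_heading(layout, title, counters, config=BOOK_CONFIG):
    """Return (xhtml, toc_entry) for one heading; toc_entry is None if hidden."""
    rule = config[layout]
    label = title
    if rule["numbered"]:
        # One running counter per layout name, e.g. "Chapter 1", "Chapter 2".
        counters[layout] = counters.get(layout, 0) + 1
        label = f'{rule["prefix"]} {counters[layout]}: {title}'
    tag = rule["tag"]
    # The layout name becomes a CSS class so the author can style it by hand.
    xhtml = f'<{tag} class="{layout.lower()}">{escape(label)}</{tag}>'
    return xhtml, (label if rule["in_toc"] else None)
```

The point of keeping the mapping in data rather than code is that the author
can change which environment maps to which heading level, or what shows up in
the table of contents, without touching the converter.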

I briefly considered writing Yet Another LyX to HTML Exporter, but
> found out that in spite of LyX's native format being Non Human Friendly
> XML, it's not *well formed* XML, so I can't use Python's lxml.etree,
> let alone Python's xml.etree.ElementTree, to parse it. Perhaps if LyX
> offered an export to well-formed XML, hopefully with a DTD, I could
> parse that to produce ePub-friendly Xhtml, but as far as I know that
> doesn't exist either.
>

With eLyXer I have already done all the heavy work myself of converting LyX
documents to an in-memory structure of containers and insets. In theory you
might just tweak the configuration file base.cfg and generate a completely
different document structure such as ePub. In practice, and as far as I
know, it works: I was able to make the transition from LyX 1.x to 2.0 just
by adding a few insets and containers to the configuration file.
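
To give a flavour of what reading the native format involves, here is a
minimal sketch of pulling top-level paragraphs out of a .lyx file. It is not
eLyXer's code and makes simplifying assumptions: it relies only on the
\begin_layout / \end_layout and \begin_inset / \end_inset markers of the
LyX 2.x format, drops everything inside insets (footnotes, figures and so
on), and ignores all other backslash commands. The function name
iter_layouts is made up for the example.

```python
def iter_layouts(lines):
    """Yield (layout_name, text) for each top-level paragraph in a LyX body."""
    inset_depth = 0          # > 0 while we are inside one or more insets
    name, buf = None, []     # current layout name and its collected text
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("\\begin_inset"):
            inset_depth += 1
        elif line.startswith("\\end_inset"):
            inset_depth -= 1
        elif inset_depth > 0:
            continue         # inset content is skipped in this sketch
        elif line.startswith("\\begin_layout "):
            # Layout names may contain spaces, so keep the rest of the line.
            name, buf = line[len("\\begin_layout "):].strip(), []
        elif line.startswith("\\end_layout"):
            if name is not None:
                yield name, " ".join(buf)
            name = None
        elif name is not None and line and not line.startswith("\\"):
            buf.append(line.strip())
```

Each (layout_name, text) pair could then be handed to whatever turns
environments into tags; a real converter would also need to handle insets,
character styles and nested environments, which is where most of the special
cases live.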

Sadly, after 7 years, no longer using LyX myself and with little help from LyX
developers (with some *laudable exceptions*), I can no longer justify
devoting any effort to supporting eLyXer, not