Re: We need ePub/Mobi conversion: was: Book Frontmatter

Steve Litt Tue, 04 Feb 2014 08:25:07 -0800

On Tue, 4 Feb 2014 07:34:24 +0100
Liviu Andronic <[email protected]> wrote:

> On Tue, Feb 4, 2014 at 2:38 AM, Steve Litt
> <[email protected]> wrote:

> > I just created a brand new LyX file in 2.0.6, which is the packaged
> > LyX for Ubuntu 13.10, and there wasn't a bit of XML in it, well
> > formed or otherwise. It was basically the same format human
> > parseable format it's been for 10 years, but with a lot more insets
> > and options. But as far
> >
> Quick question: What are your reasons / needs for being able to
> "humanly" parse the file format? Would a tool like pLyX address these
> needs: http://wiki.lyx.org/Examples/FindAndReplaceLyXFormatElements ?
> 
> Liviu

Thanks for asking, Liviu. Let me answer your second question first:
From the URL you gave, I can't determine how much or how little pLyX
would help me. I can't even tell whether pLyX is runnable as a
command: a must for putting it into shellscripts. But either way, it's
not a substitute for parsability...

Which brings us to your first question: Why do I need a parsable native
format? The best answer I can give you is "general principle". Over
many years with many different programs on many different operating
systems, every time one of those programs emitted a "magic" native
format, I eventually wished I could parse it.

I have tons of files in Micrografx Windows Draw format: best vector
graphics program ever. But it doesn't work in Linux, and in fact the
latest version came out last century. If I could parse it, I could
convert it to SVG and continue with Inkscape. But Noooooo, it's a trade
secret binary.

I had a Clarion (Rapid Application Development) app whose .app file
suffered a little bit of corruption. It would have been pretty
easy to tweak it back to life with a text editor, but it was a binary
file. I had to drop back to a three day old backup: I lost three days
of work.

I needed an automated tweak of an OpenOffice doc, so I could watermark
it before sending it to customers. But OpenOffice/LibreOffice docs are
groups of XML files that, if they were a database, would be considered
horribly denormalized. For all practical purposes, OpenOffice format
might as well be considered secret binary: I had to do it in (gulp) MS
PowerPoint binary.

A few weeks ago I upgraded my daily driver computer to Ubuntu 13.10, and
a few of my Gnumeric files wouldn't open in Gnumeric. The file format
was very complex XML I couldn't understand. I ended up taking the file
to a computer with a different Gnumeric version, exporting to MS Excel,
and then importing that into my daily driver's Gnumeric. If the XML had
been easy enough to understand, I could have used the old "repeatedly
remove half, see if the symptom went away" technique. But with XML,
especially the kind where a single fact is represented in several
branches and they all must agree (denormalized), you're likely to bust
it further, so it's not good for troubleshooting.

Now let's look at the other side of the equation: Times I've had an
accessible and parsable native format...

VimOutliner's (VO) native format is tab-indented ASCII. So far I've
created converters from VO to HTML, VO to Easy Menu Definition Language,
VO to Troubleshooters.Com Linux Library web page, and many, many more.
Others have created all sorts of extensions for VO. One guy made it into
a calendaring program, and a lot of these guys weren't professional
programmers.
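
To illustrate why a tab-indented format is so easy to convert, here is a
minimal sketch of an outline-to-HTML converter. It is not the actual VO
to HTML tool: the function name outline_to_html is hypothetical, and it
assumes one heading per line with leading tabs giving the nesting level.

```python
import html


def outline_to_html(text):
    """Convert a tab-indented outline into nested HTML <ul> lists.

    Hypothetical helper, not part of VimOutliner. Assumes each non-blank
    line is one heading and leading tabs give its depth.
    """
    out = []
    depth = -1  # -1 means no list has been opened yet
    for raw in text.splitlines():
        if not raw.strip():
            continue  # ignore blank lines
        level = len(raw) - len(raw.lstrip("\t"))
        level = min(level, depth + 1)  # never descend more than one level at once
        if level > depth:
            out.append("<ul>")  # first child of the previous item
        else:
            out.append("</li>")  # close the previous item
            for _ in range(depth - level):
                out.append("</ul></li>")  # climb back up to this level
        out.append("<li>" + html.escape(raw.strip()))
        depth = level
    for _ in range(depth + 1):
        out.append("</li></ul>")  # close everything still open
    return "\n".join(out)


if __name__ == "__main__":
    print(outline_to_html("Chapter 1\n\tSection 1.1\nChapter 2"))
```

The whole parser is a tab count per line, which is the kind of job a
non-programmer can manage with a scripting language.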

XHTML is easily parsable with Python's lxml.etree parser. So I used it
to convert a Bluefish-created XHTML file into ePub. The hardest part of
the job was understanding the ePub spec, complete with device
idiosyncrasies.
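
As a rough illustration of that kind of parsing, the sketch below pulls
the headings out of an XHTML document, which is roughly the raw material
an ePub table of contents needs. It is not the conversion script
described above. To stay dependency free it uses the standard library's
xml.etree.ElementTree rather than lxml.etree (the calls used here exist
in both), and the function name list_headings is hypothetical.

```python
import xml.etree.ElementTree as ET

# Well-formed XHTML puts every element in this namespace.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def list_headings(xhtml_text):
    """Return (level, title) pairs for every h1 and h2, in document order."""
    root = ET.fromstring(xhtml_text)
    wanted = (XHTML_NS + "h1", XHTML_NS + "h2")
    headings = []
    for elem in root.iter():
        if elem.tag in wanted:
            level = int(elem.tag[-1])  # "h1" -> 1, "h2" -> 2
            # itertext() also collects text inside inline tags such as <em>
            title = "".join(elem.itertext()).strip()
            headings.append((level, title))
    return headings


if __name__ == "__main__":
    sample = (
        '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        "<h1>Intro</h1><p>Text</p><h2>Why <em>parse</em>?</h2>"
        "</body></html>"
    )
    print(list_headings(sample))  # [(1, 'Intro'), (2, 'Why parse?')]
```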

About 5 to 10 times in the 12 years I've used LyX, there have been LyX
files that couldn't be opened in LyX. No problem: I edited them in Vim
and used the old "repeatedly remove half, see if the symptom went away"
technique, eventually finding the one factor that prevented opening.
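
The "remove half" technique is a binary search, and can be sketched as
follows. This is an illustration under stated assumptions, not a tool
that exists: find_culprit and still_fails are hypothetical names, the
file is assumed to be split into chunks (paragraphs, say) that can be
removed without breaking the rest, and exactly one chunk is assumed to
cause the symptom. In real life still_fails means "save the trial file
and see whether the program still refuses to open it".

```python
def find_culprit(chunks, still_fails):
    """Return the index of the single chunk that triggers the failure.

    chunks      -- list of file pieces (assumed independently removable)
    still_fails -- callable taking a list of chunks, True if symptom persists
    """
    lo, hi = 0, len(chunks)  # the culprit is somewhere in chunks[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Remove the second half of the suspect range, keep everything else.
        trial = chunks[:mid] + chunks[hi:]
        if still_fails(trial):
            hi = mid  # symptom persists: culprit is in the half we kept
        else:
            lo = mid  # symptom went away: culprit was in the removed half
    return lo


if __name__ == "__main__":
    parts = ["a", "b", "BAD", "c", "d"]
    print(find_culprit(parts, lambda trial: "BAD" in trial))  # 2
```

Each round halves the suspect range, so even a file with a thousand
chunks takes about ten trials.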

My (LyX created) instructor notes have a slide-by-slide explanation of
the accompanying (Powerpoint) presentation. From time to time I change
the slides or slide order, so I really don't want to hard-code numbers
into the Instructor Notes: I want the slides to be numbered on the fly
in the Instructor Notes. At the time I first made this, I didn't know
enough about LaTeX counters and LaTeX commands (and LyX didn't have
char styles with which to implement the commands anyway) to do it that
way. So instead, I built a preprocessor that went through my file, and
every time it found blue text, it assigned an incremented number within
LyX itself.
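
A preprocessor of that sort might look roughly like the sketch below.
This is a guess at the general shape, not the original program. It
assumes blue text is introduced in the .lyx file by a line reading
"\color blue", and the "Slide N: " prefix it inserts is purely
illustrative.

```python
def number_blue_text(lyx_lines, marker="\\color blue"):
    """Insert an incrementing slide number after each blue-text marker.

    Assumption: each run of blue text starts with a line equal to `marker`.
    The "Slide N: " text is a hypothetical choice of label.
    """
    numbered = []
    count = 0
    for line in lyx_lines:
        numbered.append(line)
        if line.strip() == marker:
            count += 1
            numbered.append("Slide %d: " % count)
    return numbered


if __name__ == "__main__":
    src = ["\\color blue", "Introduction", "\\color inherit", "Notes here"]
    print("\n".join(number_blue_text(src)))
```

Because the numbers are assigned on every run, reordering the slides
only means reordering the blue headings in the source.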

Most of my books are written with LyX. With eBooks, I hate DRM but want
my recipients to understand it's not cool to make unauthorized copies
of my books, because I feed my family by selling those books. So I
personalize every book with the person's name: The footer says "This
book created expressly for John Q Public" or whatever the guy's name
is. Doing this was trivial: The LyX file has a command called Licensee,
and I have a script that copies the LyX file to a temporary file,
changes that command's value to the buyer's name, and compiles the book
to PDF.
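
The copy, substitute, compile steps can be sketched like this. It is not
the actual script: personalize and build_pdf are hypothetical names, and
it assumes the Licensee command is defined in the preamble on a single
line of the form \newcommand{\licensee}{...}. It also assumes the lyx
executable is on the PATH and uses its --export option with the pdf2
(pdflatex) format.

```python
import re
import subprocess
import tempfile
from pathlib import Path

# Assumed form of the definition; the real file may spell it differently.
LICENSEE_RE = re.compile(r"(\\newcommand\{\\licensee\}\{)[^}]*(\})")


def personalize(lyx_text, buyer):
    """Return lyx_text with the licensee definition set to buyer."""
    # A function replacement avoids backslash-escape surprises in buyer.
    new_text, count = LICENSEE_RE.subn(
        lambda m: m.group(1) + buyer + m.group(2), lyx_text
    )
    if count != 1:
        raise ValueError("expected exactly one licensee definition, found %d" % count)
    return new_text


def build_pdf(source, buyer):
    """Personalize a temporary copy of source and export it to PDF."""
    source = Path(source)
    workdir = Path(tempfile.mkdtemp())
    copy = workdir / source.name
    text = source.read_text(encoding="utf-8")
    copy.write_text(personalize(text, buyer), encoding="utf-8")
    # Requires LyX to be installed; not exercised by the example below.
    subprocess.run(["lyx", "--export", "pdf2", str(copy)], check=True)
    return copy.with_suffix(".pdf")


if __name__ == "__main__":
    demo = "\\begin_preamble\n\\newcommand{\\licensee}{PLACEHOLDER}\n\\end_preamble\n"
    print(personalize(demo, "John Q Public"))
```

A book that pulls in images or child documents by relative path would
need those copied alongside the temporary file, and a buyer name
containing LaTeX special characters would need escaping first.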

One of my books, "Troubleshooting Techniques of the Successful
Technologist", is available both as a print book and as a PDF eBook. By
parsing and changing a few commands in the preamble, I can do either,
while maintaining only one source file.

The bottom line is this: With a parsable native format, you know
for sure you'll never get painted into a corner. That's a *powerful*
feeling of security.

Thanks,

SteveT

Steve Litt                *  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance
