[sphinx-dev] using sphinx latex with scientific journal templates

foobaron Sun, 30 Jan 2011 00:11:02 -0800

Hi,
I've been using Sphinx a long time for software projects and more
recently for writing all my scientific papers.  I love being able to
run it right on my iPad to view and edit papers with lots of equations
(HTML output with jsMath), and at the same time generate latex & PDFs
I can circulate to people for comments.


I've found that the problems start when I want to submit the paper to
a journal, because each journal requires the latex source of the
manuscript to conform exactly to their custom latex template, which of
course Sphinx's latex output does not.  After studying how Sphinx /
docutils generate latex output, I decided not to mess with that code
at all but instead just wrote a separate Python script to extract the
relevant sections of Sphinx's latex output and insert them into the
latex template supplied by the journal.  This is pretty easy, works
great, and is easy to adapt to different journals (so far I've done it
for PNAS, PLOS, and Information; if people want me to post an example
rewriter script I can do that).  But it feels like a kluge; it seems
like lots of people want this kind of latex output customization and
we should instead all be using docutils latex templates etc.  For
example, if Sphinx alters little details of the latex it outputs, that
might make my re-writer script stop working (because it has to search
for specific strings in Sphinx's output, and transform them).  An
example of a latex paper produced this way is viewable here:
http://www.mdpi.com/2078-2489/2/1/17/
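
A minimal sketch of this kind of rewriter, for illustration only (it is
not the actual script described above).  It assumes the content of
interest is everything between \begin{document} and \end{document} in
Sphinx's output, and that the journal template has been marked with a
placeholder line.  The placeholder "%%BODY%%" and the function names are
hypothetical.

```python
import re

# Assumption: the manuscript content is whatever Sphinx writes between
# \begin{document} and \end{document}; the preamble is discarded.
DOCUMENT_RE = re.compile(
    r"\\begin\{document\}(.*?)\\end\{document\}", re.DOTALL
)


def extract_body(sphinx_tex):
    """Return the text inside the document environment of Sphinx's output."""
    match = DOCUMENT_RE.search(sphinx_tex)
    if match is None:
        raise ValueError("no document environment found in Sphinx output")
    return match.group(1).strip()


def fill_template(template, body, placeholder="%%BODY%%"):
    """Insert the extracted body into a journal template.

    The placeholder is a hypothetical marker added by hand to the
    journal's template file; it is not something journals supply.
    """
    if placeholder not in template:
        raise ValueError("placeholder %r not found in template" % placeholder)
    return template.replace(placeholder, body)
```

Because the search depends on specific strings in Sphinx's output, a
change in that output would raise an error here rather than silently
produce a broken manuscript.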

I'm totally sold on Sphinx as my long-term solution for being able to
"cross-compile" my content to many different outputs.  I would now
like to work on this latex output template customization problem in a
general way that would be usable, extensible and customizable by
others.  My question is what approach Sphinx developers would
recommend.  I'll quickly note a few requirements:

- this is not just a matter of changing a template file.  The Sphinx
code itself contains many fragments of latex that must be customized
for each possible output.  For example,
sphinx.ext.mathbase.wrap_displaymath() outputs displaymath using the
non-standard environments "gather" and "split".  This does not match
any scientific journal's template, so either this Sphinx code itself
must change, or we are stuck with my klugey approach (external scripts
that rewrite the latex output by Sphinx, inserting it into a journal's
template).

- scientific journals supply precise templates and demand that authors
follow them exactly.  Nowadays they input the latex file directly into
their typesetting production, so to ensure a uniform appearance and
standard across all the papers they publish, they *require* that the
manuscript follow the template.  As an author, I cannot deviate from
their template at all.  For example, they won't permit the inclusion
of any packages other than a specified list that is used by their
template.  Unfortunately, much of the Sphinx code for latex support
assumes the use of many custom packages.  Again either that code has
to change, or we're stuck with the external re-writer script approach.

- this will require user-settable options for what to do with figures
and tables.  For example, during the initial submission / review
phase, PLOS journals want the figures and tables included at the end
of the manuscript (not in the middle of the text where Sphinx inserts
them).  However, once the paper is accepted for production, they want
*only* the figure legends included at the end of the manuscript (i.e.
do not include the figure images in the manuscript at all; they must
be submitted separately).  With my re-writer script this is easy; it
just takes an optional argument that controls whether it includes the
figures in the output or not.
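
The figure handling in the last point could look roughly like the sketch
below.  It assumes figures appear in the latex as plain
\begin{figure} ... \end{figure} blocks containing \includegraphics and
\caption commands (starred or nested variants are not handled), and the
function name and the include_images option are hypothetical.

```python
import re

# Assumption: each figure is a non-nested figure environment.
FIGURE_RE = re.compile(r"\\begin\{figure\}.*?\\end\{figure\}\n?", re.DOTALL)
# Matches \includegraphics with an optional [..] argument.
GRAPHICS_RE = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{[^}]*\}\n?")


def relocate_figures(body, include_images=True):
    """Move all figure blocks to the end of the manuscript body.

    With include_images=False the \\includegraphics commands are dropped,
    so only the legends (captions) remain in the relocated blocks.
    """
    figures = FIGURE_RE.findall(body)
    if not figures:
        return body
    text = FIGURE_RE.sub("", body)
    if not include_images:
        figures = [GRAPHICS_RE.sub("", fig) for fig in figures]
    tail = "\n".join(fig.rstrip("\n") for fig in figures)
    return text.rstrip("\n") + "\n\n" + tail + "\n"
```

For a submission-phase build one would call relocate_figures(body), and
for a production build relocate_figures(body, include_images=False).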

I'd like to get some advice about what approach people think would be
best.  A few options come to mind:

- external rewriter scripts:  the rewriter takes a Sphinx latex output
file and a latex template file, and inserts the relevant pieces of
content into the template.  This could be designed in a relatively
modular way.  I.e. a parser that extracts relevant sections from the
Sphinx latex output; a "standardizer" that removes non-standard things
like "gather" and "split".  Then for each output target there could be
a very small amount of code that processes journal-specific options
like "submission format" vs. "production format".  While in my
experience such "re-writers" are compact and easy to write, there is
clearly a disadvantage that if Sphinx changes its latex output, that
could break the parser or standardizer.

- for this reason, it might make sense to make the "parser" and
"standardizer" components of this actually part of the sphinx
codebase, along with a bunch of automated tests that ensure they are
working.  Since these pieces must be kept in sync with the Sphinx
code, that argues that they should be part of the Sphinx mercurial
tree.  Then the set of journal-specific "writer" scripts (which will
be *very* simple, since all they have to do is process various little
options) could either also be included with Sphinx, or distributed as
a separate project.

- "the full Monty": instead of using an external re-writer script, we
modify the Sphinx latex code (e.g. sphinx.ext.mathbase,
sphinx.writers.latex) to make it easy to customize the latex output in
a truly general way (i.e. to produce output that does *not* assume
non-standard packages, that inserts directly into any template file the
user specifies, etc.).  Having browsed the sphinx code a bit, this
seems like a fair amount of work, as it requires understanding what
both docutils and sphinx are doing to produce the latex output, and a
fair amount of code is involved...
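
To make the "standardizer" idea from the first option concrete, here is
a rough sketch.  It assumes the display math arrives as a single "split"
block wrapped in "gather", followed by either \notag or a \label; that
shape is an assumption about Sphinx's output, not something guaranteed.
Anything else (several equations in one gather, or alignment with & or
\\) is left untouched.  The function names are hypothetical.

```python
import re

# Assumed input shape:
#   \begin{gather}
#   \begin{split}MATH\end{split}\notag      (or \label{NAME})
#   \end{gather}
SINGLE_RE = re.compile(
    r"\\begin\{gather\}\s*"
    r"\\begin\{split\}(?P<math>.*?)\\end\{split\}"
    r"(?:\\label\{(?P<label>[^}]*)\}|\\notag)?\s*"
    r"\\end\{gather\}",
    re.DOTALL,
)


def _rewrite(match):
    math = match.group("math").strip()
    # Leave multi-part or aligned equations alone; they need real care.
    if "\\begin{split}" in math or "&" in math or "\\\\" in math:
        return match.group(0)
    label = match.group("label")
    if label:
        return "\\begin{equation}\n%s\n\\label{%s}\n\\end{equation}" % (math, label)
    return "\\[\n%s\n\\]" % math


def standardize_math(tex):
    """Replace simple gather/split wrappers with plain latex display math."""
    return SINGLE_RE.sub(_rewrite, tex)
```

A handful of assertions on inputs like these is the sort of automated
test that could live alongside the Sphinx code, so that a change in the
latex output shows up as a test failure.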

All comments and suggestions welcome.  Sorry for the long post!

-- Chris Lee
