#10637: Implement sage -sws2rst
----------------------------+-----------------------------------------------
Reporter: nthiery | Owner: jason, mpatel, was
Type: enhancement | Status: new
Priority: major | Milestone: sage-4.6.2
Component: notebook | Keywords: ReST, worksheet
Author: Pablo Angulo | Upstream: N/A
Reviewer: | Merged:
Work_issues: |
----------------------------+-----------------------------------------------
Changes (by pang):
* cc: nthiery, hivert (added)
Comment:
== Install instructions ==
* The beautifulsoup spkg must be installed.
* The first patch adds the sage-sws2rst script; it must be imported in
the local/bin dir.
* The second patch adds three files to sagenb/misc; it must be imported
in the devel/sagenb dir, and then followed by
{{{
sage -python setup.py install && sage -python setup.py develop
}}}
== Comments ==
The code above is preliminary and undocumented, but I am posting it to
guide the discussion. I'll write comments and make improvements once we
decide on some issues.
I should warn that the task is not trivial: html is more relaxed than rst,
so some editing is required after the conversion. The rst files produced
by the script compile correctly only on occasion, but the modifications
required are usually small.
== How it works ==
The sws file is uncompressed, and the images inside the 'data' and the
'cell/xx' dirs are copied to a '_media' dir (this is done by the
sage-sws2rst script).
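A minimal sketch of this unpacking step, not the code from the patch. It assumes the sws file is a tar archive (compression is auto-detected) and that cell images live under a 'cells/xx' or 'cell/xx' dir. The names `media_name` and `copy_images`, the extension list and the `cell_xx_` prefix used to avoid name clashes between cells are all hypothetical.

```python
import os
import tarfile

# Assumption: these extensions cover the images a worksheet may contain.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".svg"}


def media_name(member_path):
    """Return a flat file name for an image inside the archive, or None."""
    parts = member_path.split("/")
    ext = os.path.splitext(parts[-1])[1].lower()
    if ext not in IMAGE_EXTS:
        return None
    dirs = parts[:-1]
    if "data" in dirs:
        return parts[-1]
    for marker in ("cells", "cell"):
        if marker in dirs:
            idx = parts.index(marker)
            # The component after the marker must be the cell number dir.
            if idx + 1 < len(parts) - 1:
                # Hypothetical naming: prefix with the cell number so that
                # equally named images from different cells do not collide.
                return "cell_%s_%s" % (parts[idx + 1], parts[-1])
    return None


def copy_images(sws_path, out_dir):
    """Copy worksheet images into out_dir/_media; return the names copied."""
    media_dir = os.path.join(out_dir, "_media")
    os.makedirs(media_dir, exist_ok=True)
    copied = []
    with tarfile.open(sws_path, "r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue
            name = media_name(member.name)
            if name is None:
                continue
            source = archive.extractfile(member)
            with open(os.path.join(media_dir, name), "wb") as target:
                target.write(source.read())
            copied.append(name)
    return sorted(copied)
```

Flattening the names keeps every image directly under '_media', which is why the cell number has to move into the file name.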
Then the file worksheet.html is parsed and split into comments, source
code and output. Comments are handled using the BeautifulSoup library.
Source code is left mostly untouched, and outputs are parsed to find
patterns such as "image", "latex", "docstring", "other html patterns" and
"plain text". Images and latex are displayed. Docstrings and any
unrecognized html in the output cells are discarded. Plain text is
indented, and is recognized as the output of the previous cell.
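The handling of output cells can be sketched as follows. This is an illustration under assumptions, not the code from the patch. The regular expressions are guesses at how images, latex and docstrings appear in the output html, and `classify_output` and `output_to_rst` are hypothetical names. It uses only the standard library, while the patch relies on BeautifulSoup for the comment cells.

```python
import re
import textwrap

# Assumed patterns; the real worksheet html may mark these up differently.
IMG_RE = re.compile(r'<img[^>]*\bsrc=["\']([^"\']+)["\']', re.IGNORECASE)
LATEX_RE = re.compile(r'<span class="math">(.*?)</span>', re.DOTALL)
DOCSTRING_RE = re.compile(r'<div class="docstring">', re.IGNORECASE)
TAG_RE = re.compile(r'<[a-zA-Z/][^>]*>')


def classify_output(output):
    """Label an output cell as image, latex, docstring, html or text."""
    if IMG_RE.search(output):
        return "image"
    if LATEX_RE.search(output):
        return "latex"
    if DOCSTRING_RE.search(output):
        return "docstring"
    if TAG_RE.search(output):
        return "html"
    return "text"


def output_to_rst(output, indent="    "):
    """Render one output cell as rst; discarded kinds give an empty string."""
    kind = classify_output(output)
    if kind == "image":
        src = IMG_RE.search(output).group(1)
        # Point at the copy of the image stored in the '_media' dir.
        return ".. image:: _media/%s\n" % src.split("/")[-1]
    if kind == "latex":
        formula = LATEX_RE.search(output).group(1).strip()
        return ".. math::\n\n%s\n" % textwrap.indent(formula, indent)
    if kind in ("docstring", "html"):
        # Docstrings and unrecognized html are dropped.
        return ""
    # Plain text is indented so it reads as output of the previous cell.
    return textwrap.indent(output.strip("\n"), indent) + "\n"
```

The order of the checks matters: an image or formula is itself html, so the generic tag test has to come last. Plain text that happens to contain something shaped like a tag would be misclassified by this sketch.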
== Issues ==
* ReST does not understand that images or latex are the output of some
code cells.
* Docstrings are ignored. I felt stupid trying to parse html that comes
from an rst file. Maybe I can grab the rst from the source, but only if
it belongs to the library, so is it worth it?
Please add your concerns here.
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/10637#comment:4>