BTW, I just made some progress on this, in a bit of a roundabout way: I wrote a script that extracts the indented code blocks from the Docs to docx conversion and inserts them into the output of the docx to rst conversion (done with Pandoc, which strips the indentation on those blockquotes). It's not perfect, but it's definitely readable; there are occasional mis-indented lines, but only in a few of the code blocks.
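For anyone who wants to try the same thing, here is a minimal sketch of what the extraction step could look like. This is not the actual script; the function names and the sample XML are made up for illustration. It assumes the code snippets live in `w:t` runs separated by `w:br` elements inside `word/document.xml`, as in the docx excerpt quoted further down, and it only recovers the text; splicing the result into Pandoc's rst output is left out.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part of a docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{%s}" % W_NS


def paragraph_text(paragraph):
    """Rebuild one paragraph's text, keeping spaces exactly as stored.

    <w:t> contributes its text verbatim, <w:br/> becomes a newline and
    <w:tab/> becomes a tab character.
    """
    parts = []
    for node in paragraph.iter():
        if node.tag == W + "t":
            parts.append(node.text or "")
        elif node.tag == W + "br":
            parts.append("\n")
        elif node.tag == W + "tab":
            parts.append("\t")
    return "".join(parts)


def extract_paragraphs(document_xml):
    """Return the text of every <w:p> in a document.xml string."""
    root = ET.fromstring(document_xml)
    return [paragraph_text(p) for p in root.iter(W + "p")]


def read_docx_paragraphs(path):
    """Hypothetical helper: open a .docx (a zip archive) and extract paragraphs."""
    with zipfile.ZipFile(path) as archive:
        return extract_paragraphs(archive.read("word/document.xml"))


# Tiny hand-written sample, not taken from a real export.
SAMPLE = (
    '<w:document xmlns:w="%s"><w:body><w:p><w:r>'
    '<w:t xml:space="preserve">2014-05-06 * "Buy mutual fund"</w:t>'
    '<w:br w:type="textWrapping"/>'
    '<w:t xml:space="preserve">  Assets:Investments:Cash  -1000 USD</w:t>'
    '</w:r></w:p></w:body></w:document>' % W_NS
)

if __name__ == "__main__":
    for text in extract_paragraphs(SAMPLE):
        print(text)
```

Paragraphs whose continuation lines start with whitespace are the likely candidates for Beancount snippets, and those are the ones that would need to be matched against the de-indented text in the rst output.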
(The right way to do all this would be to change Pandoc so that it doesn't strip the whitespace on those blocks. I started that way but ended up going around in circles; my knowledge of Haskell isn't too amazing.)

Anyhow, I'll try to finish this and convert all the docs to rst at some point.

On Mon, Nov 6, 2017 at 1:19 AM, Martin Blais wrote:
> On Mon, Oct 30, 2017 at 9:38 AM, Stefano Zacchiroli wrote:
>
>> Heya,
>>
>> On Sun, Sep 24, 2017 at 03:33:03AM -0400, Martin Blais wrote:
>> > here's a snapshot of all the documents exported to all the available
>> > formats: http://furius.ca/tmp/beancount/beancount-docs-exported.tar.gz
>>
>> Thanks for this, very useful.
>>
>> I've started looking into automatically converting to something that
>> Sphinx would like to produce good output. Question about your findings:
>>
>> > I took some notes on the conversion in the past:
>> > https://bitbucket.org/blais/beancount/src/22be0f233d079c14dc727d54378d21164db04cdf/experiments/docs/convert/compare_download_formats.txt
>> >
>> > It seems like one may have to source from more than one export in order to
>> > obtain all the necessary bits to make a nice conversion.
>> > Most difficult is that all the formats seem to lose the indentation of the
>> > "code" (Beancount source) examples.
>> > (I think it should be possible to automatically reindent it with code.)
>>
>> Docx output seems indeed to be the most rich output option from Google
>> Docs. But AFAICT it does retain the needed spacing in Beancount code
>> snippets.
>> Here's an example from the rounding precision proposal document:
>>
>> <w:t xml:space="preserve">2014-05-06 * “Buy mutual fund”</w:t>
>> <w:br w:type="textWrapping"/>
>> <w:t xml:space="preserve"> Assets:Investments:RGXGX 4.278 RGAGX {53.21 USD} </w:t>
>> <w:br w:type="textWrapping"/>
>> <w:t xml:space="preserve"> Assets:Investments:Cash -227.6324 USD</w:t>
>> <w:br w:type="textWrapping"/>
>> <w:t xml:space="preserve"> Expenses:Commissions 9.95 USD</w:t>
>>
>> which corresponds to:
>>
>> 2014-05-06 * “Buy mutual fund”
>>   Assets:Investments:RGXGX 23.45 RGAGX {42.6439 USD}
>>   Assets:Investments:Cash -1000 USD
>>
>> When reading docx *pandoc* does indeed strip those spaces,
>> unfortunately, so one can't simply rely on pandoc for a docx -> rst
>> conversion, but the information seems to be there in the docx.
>>
>> I'll give this path a try, but please let me know if you're aware that
>> I'm missing something here!
>
> This is a good observation, I hadn't noticed this myself. This is great.
> I think it should be possible to combine the output from pandoc and splice
> in the code examples extracted from a custom (Python) script.
> Thank you for this,
