I would certainly address the problem using Marpa, but I'm kinda biased. :-)

From what I read, you do *not* need to capture any nesting of blocks.
RNS's parser doesn't do that so it might make a fine start.  (I've never
used it.)

You don't need to produce a full AST, it seems, so the following is not
directly relevant to your question, but let me talk about how I might go
about parsing Markdown for display purposes, in which case the block
nesting is essential.  I would probably try a two-layer approach -- one
level of parser to capture the line-by-line directives and individual
pieces, and an upper level which captures the block structure.  The upper
level would certainly be in Marpa, and probably the lower level as well.
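
To make the two-layer idea a bit more concrete, here is a minimal sketch in
plain Perl.  It is only an illustration under assumptions of mine: the names
classify_line and group_blocks and the token types (BLANK, UNDERLINE, ITEM,
TEXT) are made up for this example, and the hand-written loop in the upper
layer merely stands in for what would be a Marpa grammar over the line
tokens.  It does not use any Marpa API and handles only a tiny subset of
Markdown.

```perl
use strict;
use warnings;

# Lower layer: classify one input line into a [type, text] token.
# The token type names are hypothetical, chosen for this sketch only.
sub classify_line {
    my ($line) = @_;
    return [ BLANK     => '' ]    if $line =~ /^\s*$/;
    return [ UNDERLINE => $line ] if $line =~ /^(?:=+|-+)\s*$/;
    return [ ITEM      => $1 ]    if $line =~ /^\s*[*+-]\s+(.*)$/;
    return [ TEXT      => $line ];
}

# Upper layer: group the token stream into blocks.
# A plain loop stands in for a grammar; each text line that is not a
# heading is treated as its own paragraph to keep the sketch short.
sub group_blocks {
    my @tokens = @_;
    my @blocks;
    while (@tokens) {
        my ( $type, $text ) = @{ shift @tokens };
        if ( $type eq 'BLANK' ) {
            next;    # blank lines only separate blocks
        }
        elsif ( $type eq 'TEXT' && @tokens && $tokens[0][0] eq 'UNDERLINE' ) {
            shift @tokens;    # consume the underline of a setext heading
            push @blocks, { kind => 'heading', text => $text };
        }
        elsif ( $type eq 'ITEM' ) {
            # An item extends the list block before it, if there is one
            # (blank lines in between are ignored); otherwise it starts one.
            if ( @blocks && $blocks[-1]{kind} eq 'list' ) {
                push @{ $blocks[-1]{items} }, $text;
            }
            else {
                push @blocks, { kind => 'list', items => [$text] };
            }
        }
        else {
            push @blocks, { kind => 'paragraph', text => $text };
        }
    }
    return @blocks;
}

# Usage: the small Markdown example from the original question.
my @lines = (
    'A nice title', '============', '',
    'The first paragraph.', '', ' * Item 1', ' * Item 2',
);
my @blocks = group_blocks( map { classify_line($_) } @lines );
print "$_->{kind}\n" for @blocks;    # prints: heading, paragraph, list
```

The point of the split is that the lower layer only ever looks at one line,
while everything about how lines combine into blocks lives in the upper
layer, which is the part where a real grammar pays off.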

I hope this is helpful.



On Wed, Apr 22, 2020 at 8:38 PM Martin Quinson <[email protected]>
wrote:

> Hello,
>
> I'm one of the authors of the po4a project (https://po4a.org), that
> helps translating the documentation.
>
> The idea is to extract the translatable content of the documents into
> PO files that are commonly used by the translators of open-source
> programs, let the translators do their job, and then reinject the
> translated content into the structure of the original document.
>
> We have parsers for many formats, such as POD, manpages, asciidoc,
> markdown, xml, and some others. The project has existed for almost two
> decades and is now used in production for the translation of many
> manpages in all major distributions, for the translation of the
> manpages documenting the git project, for the f-droid web pages, for
> the whole fedora documentation, etc.
>
> My problem is that our parsers are currently written as an ugly bunch
> of regexps that are hard to work with, and I am considering converting
> to something more robust.
>
> Our parsers don't really need to access the AST; they are more of
> a filter calling the translate() function on the parts that need it.
>
> Every parser takes a document to analyse + a translation catalog
> (called PO file) associating a set of strings to their translation in a
> given language.
>
> This produces an output document where the content of the input doc is
> replaced by the translations found in the catalog + a list of strings
> that the input doc contains. This list is used to update the
> translation catalogs when the input document changes.
>
>  Input document --\                             /---> Output document
>                    \      TransTractor::       /       (translated)
>                     +-->--   parse()  --------+
>                    /                           \
>  Input PO --------/                             \---> Output PO
>                                                       (extracted)
>
>
> Let's take a little Markdown example:
> | A nice title
> | ============
> |
> | The first paragraph.
> |
> |  * Item 1
> |  * Item 2
>
> I need the following calls to be issued during the parsing:
> |  pushline( translate ("A nice title", "input:1") );
> |  pushline("============");
> |  pushline("");
> |  pushline( translate("The first paragraph.", "input:4") );
> |  pushline("");
> |  pushline(" * " . translate("Item 1", "input:6") );
> |  pushline(" * " . translate("Item 2", "input:7") );
> |  pushline("");
>
> All the po4a magic lies in the translate() function, which adds its
> parameters to the output PO file while returning the translation found
> in the input PO file for that string (or the string itself if no
> translation was found). The second parameter of translate is the
> location in the input file.
>
>
> So, after this long context, I guess that my question would simply be:
> how would you address this problem with Marpa?
>
> I found [1], which provides a Marpa parser for the Markdown format.
> First subquestion: does this parser follow the latest Marpa
> recommendations (right input language and such), as I think it does?
>
>   [1] https://github.com/rns/MarpaX-Languages-CommonMark-AST/
>
> In some sense I feel that this example is too complex for what I need
> because it seems difficult to dump a Markdown file from the AST. Am I
> wrong here? If I'm correct so far, what would be the easiest way to dump
> the parsed file with no modification, e.g. using actions? Or maybe I'm
> misled and Marpa is not exactly the tool I'm looking for?
>
> I have the feeling that what I need is very simple, but I fail to nail
> it down, so I'd really appreciate any idea or insight that you could
> provide.
>
> Thanks in advance,
> Mt.
>
> --
> Fear is no philosophy of life.          -- Kurt Von Hammerstein.
>
> --
> You received this message because you are subscribed to the Google Groups
> "marpa parser" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/marpa-parser/20200423003604.GS8215%40cafuron
> .
>

