While parsing wiki code without specific Python tools, I ran into a major
problem with template code, since regexes cannot handle nested
structures well. I solved the issue with a layman's approach: a parseTemplate
routine, in both Python and JavaScript, which converts a template into a
simple object (a dictionary plus a list), coupled with another simple routine
that rebuilds the template code from the original, or edited, object. The
whole thing is, as I said, very rough and was written for personal use
only; but if anyone is interested, please ask.
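To illustrate the idea, here is a minimal sketch (hypothetical names, not the actual routine) of parsing a single template into a dictionary of named parameters plus a list of positional ones, and rebuilding wikitext from them:

```python
def parse_template(text):
    """Parse one {{...}} template into (name, positional list, named dict).

    Splits on '|' only at top brace level, so nested {{...}} templates
    are kept verbatim inside parameter values.
    """
    assert text.startswith("{{") and text.endswith("}}")
    parts, depth, buf = [], 0, []
    for ch in text[2:-2]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "|" and depth == 0:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
    parts.append("".join(buf))

    name, positional, named = parts[0].strip(), [], {}
    for part in parts[1:]:
        # A '=' before any nested '{{' marks a named key=value parameter.
        if "=" in part.split("{{")[0]:
            key, _, value = part.partition("=")
            named[key.strip()] = value.strip()
        else:
            positional.append(part.strip())
    return name, positional, named


def build_template(name, positional, named):
    """Rebuild template wikitext from the parsed pieces."""
    fields = [name] + positional + ["%s=%s" % (k, v) for k, v in named.items()]
    return "{{" + "|".join(fields) + "}}"
```

Note that rebuilding normalizes parameter order (positional first, then named), so the round trip is equivalent wikitext but not always byte-identical to the input.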

Alex brollo


2014-06-08 23:47 GMT+02:00 Merlijn van Deen <[email protected]>:

> On 1 June 2014 01:57, Ricordisamoa <[email protected]> wrote:
>
>> Since gerrit:131263 <https://gerrit.wikimedia.org/r/131263/>, it seems
>> to me that the excellent mwpfh is going to be used more and more
>> extensively within our framework.
>> Am I right? For example, the DuplicateReferences detection and fix in
>> reflinks.py could be neatly refactored without regular expressions.
>> Or are we supposed to do the opposite conversion, where possible?
>>
>
> My preference is to depend on mwpfh where possible - their parser support
> is much better than ours, and it makes much more sense to concentrate
> efforts in one place. However, there's one blocker for this: the Windows
> support of mwpfh. It uses a C extension, and it's hard to build C
> extensions under Windows -- so we'd need to help Windows users install
> it in some way. I've updated the issue at
> https://github.com/earwig/mwparserfromhell/issues/68 with some notes for
> that.
>
> Merlijn
>
>
_______________________________________________
Pywikipedia-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
