-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Hannes

Just wondering: is your text parser able to correctly find all headings
(e.g. '== bla ==' as well as '<h2>bla</h2>') and distinguish real headings
from similar-looking text inside a paragraph? And can it finally return
the byte offset of those headings?

I am using such a piece of code, written with the help of difflib; maybe
it could be useful here as well? (I did not have much time to write a
unit test with full coverage, but a simple one is there ;)
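
To make the question concrete, here is a rough sketch of the kind of
result meant: a list of headings with their byte offsets. It is only an
illustration under assumptions. It uses plain regular expressions (it is
neither the difflib-based code mentioned above nor Hannes' parser), the
name find_headings is hypothetical, and UTF-8 is assumed for the byte
offsets.

```python
import re

# Wiki-style heading: a whole line like '== title ==' (1 to 6 '=' signs).
# Anchoring at line start/end skips '==' that merely occurs inside a paragraph.
WIKI_HEADING = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)

# HTML-style heading: '<h2>title</h2>' (levels 1 to 6).
HTML_HEADING = re.compile(r"<h([1-6])\b[^>]*>(.*?)</h\1\s*>",
                          re.IGNORECASE | re.DOTALL)


def find_headings(text, encoding="utf-8"):
    """Return a list of (byte_offset, level, title), sorted by offset.

    Hypothetical helper for illustration only; the encoding is an assumption.
    """
    found = []
    for m in WIKI_HEADING.finditer(text):
        found.append((m.start(), len(m.group(1)), m.group(2)))
    for m in HTML_HEADING.finditer(text):
        found.append((m.start(), int(m.group(1)), m.group(2).strip()))
    found.sort()
    # Regex positions count characters; encode the prefix to count bytes.
    return [(len(text[:pos].encode(encoding)), level, title)
            for pos, level, title in found]


if __name__ == "__main__":
    sample = ("Röst wrote a == fake == heading inline.\n"
              "== bla ==\n"
              "More text\n"
              "<h2>blub</h2>\n")
    for offset, level, title in find_headings(sample):
        print(offset, level, title)
```

Note that the byte offset differs from the character offset as soon as
non-ASCII text (like the 'ö' in the sample) precedes the heading, which is
why the prefix is encoded before measuring. A regex like this will still be
fooled by headings inside <nowiki> or comments, so a real parser needs more.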

Greetings
DrTrigon


On 23.01.2012 23:34, Hannes Röst wrote:
> Hello all
> 
> From one of my assignments as a bot operator I have some code which
> does template parsing and general text parsing (e.g. Image/File
> tags). It is not using regex and thus able to correctly parse
> nested templates and other such nasty things. I have written those
> as library classes and written tests for them which cover almost
> all of the code. I would now really like to contribute that code
> back to the community.
> 
> Would you be interested in adding this code to the pywikibot 
> framework? If yes, can I send the code to someone for code review
> or how do you usually operate?
> 
> Greetings
> 
> Hannes
> 
> PS: wiki userpage is
> http://en.wikipedia.org/wiki/User:Hannes_R%C3%B6st
> 
> _______________________________________________ Pywikipedia-l
> mailing list [email protected] 
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8eceUACgkQAXWvBxzBrDBmJQCePmfUbs4Y8HNN18UT6vMFYo5r
N1AAoLuN1VLpZQOrwegmkKWl08Te0Rxp
=HXai
-----END PGP SIGNATURE-----
