Hello Hannes,

Just wondering: is your text parser able to correctly find all headings (e.g. '== bla ==' as well as '<h2>bla</h2>') and distinguish them from similar-looking text inside a paragraph? And can it finally return the byte offset of those headings?

I am using such a piece of code, written with the help of difflib, and maybe it is useful here as well? (Even though I did not have much time to write a unit test with full coverage... but a simple one is there ;)

Greetings
DrTrigon

On 23.01.2012 23:34, Hannes Röst wrote:
> Hello all
>
> From one of my assignments as a bot operator I have some code which
> does template parsing and general text parsing (e.g. Image/File
> tags). It is not using regex and thus able to correctly parse
> nested templates and other such nasty things. I have written those
> as library classes and written tests for them which cover almost
> all of the code. I would now really like to contribute that code
> back to the community.
>
> Would you be interested in adding this code to the pywikibot
> framework? If yes, can I send the code to someone for code review
> or how do you usually operate?
>
> Greetings
>
> Hannes
>
> PS: wiki userpage is
> http://en.wikipedia.org/wiki/User:Hannes_R%C3%B6st
>
> _______________________________________________
> Pywikipedia-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
