Hi,

Over the last few years, I've refactored the basis of the IMDbPY HTML parsers into a separate package called "piculet" that can, hopefully, be used with any HTML markup. It has no required external dependencies, supports py2/py3/pypy, and improves on the current IMDbPY parsers with some extra features and a more consistent interface.
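For anyone unfamiliar with this style of scraping, here is a rough sketch of the general idea: a declarative spec maps output keys to path expressions, and a generic function applies it to the markup. This is not piculet's actual API or spec format. The `SPEC` layout, the `extract` function, and the sample markup are all made up for illustration, and the code uses only the standard library's ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical spec: output key -> ElementTree path expression.
# NOT piculet's spec format; it only illustrates the declarative idea.
SPEC = {
    "title": ".//h1[@class='title']",
    "year": ".//span[@class='year']",
}


def extract(markup, spec):
    """Apply every path in the spec and collect the text of the first match."""
    root = ET.fromstring(markup)
    data = {}
    for key, path in spec.items():
        node = root.find(path)
        # Skip keys whose path matches nothing or whose node has no text.
        if node is not None and node.text:
            data[key] = node.text.strip()
    return data


# Made-up, well-formed sample markup (ElementTree needs valid XML).
SAMPLE = """<html><body>
<h1 class="title">Example Movie</h1>
<span class="year">2001</span>
</body></html>"""

result = extract(SAMPLE, SPEC)
```

The point of keeping the rules in data rather than code is that the same `extract` function can serve every page type; only the spec changes.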
The idea was, and still is, that at some point we can reimplement the IMDbPY parsers using piculet. This shouldn't be too hard since the syntax is quite similar. I've attempted this a few times already and made some headway, but trying to fit things into the current codebase kept distracting me from the actual job of dealing with the parsers. So I decided to develop a parser generator that reads a specification for a parser and generates the necessary code. I hope this will make the transition easier.

My not-so-preliminary work is here:

https://github.com/uyar/piculet_imdb

Note that this project is not a full package like IMDbPY. It doesn't have the Movie/Person/etc. classes. It doesn't even have the code to fetch the IMDb pages (except for the simple retrievers in the tests). If we decide that this approach makes sense, we could create a template suitable for IMDbPY.

If anyone's interested, I'd be happy to hear thoughts, suggestions, and of course pull requests.

Have a nice day,

-- 
Turgut Uyar

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel