Hello, I recently wrote a Python package and was wondering if anyone might have time to review it.
I'm fairly new to Python - it's been about half of my workload at work for the past year. Any suggestions would be super appreciated.

https://github.com/tiffon/take
https://pypi.python.org/pypi/take

The package implements a DSL that is intended to make web scraping a bit more maintainable :)

I generally find my scraping code ends up being rather chaotic: the querying, regex manipulations, conditional processing, conversions, etc., end up too close together and sometimes interwoven. It's stressful. The DSL attempts to mitigate this by doing only two things: finding stuff and saving it as a string. The post-processing is left to be done down the pipeline. It's almost just a configuration file.

Here is an example that would get the text and URL for every link in a page:

    $ a
        save each: links
            | [href]
                save: url
            | text
                save: link_text

The result would be something along these lines:

    {
        'links': [
            {
                'url': 'http://www.something.com/hm',
                'link_text': 'The text in the link'
            },
            # etc... another dict for each <a> tag
        ]
    }

The hope is that having all the selectors in one place will make them more manageable and possibly simplify the post-processing.

This is my first go at something along these lines, so any feedback is welcome.

Thanks!

Joe
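P.S. In case a usage sketch helps: the snippet below shows roughly how the template above would be applied to a document. It's a minimal sketch assuming the TakeTemplate entry point from the project's README, so treat it as illustrative rather than definitive.

    from take import TakeTemplate

    # The template from above as a Python string. Indentation is
    # significant: the sub-sections under 'save each: links' run
    # against each element matched by '$ a'.
    TMPL = """
    $ a
        save each: links
            | [href]
                save: url
            | text
                save: link_text
    """

    # Compile the template once (TakeTemplate is the entry point
    # assumed from the README), then apply it to any HTML document.
    tt = TakeTemplate(TMPL)

    html = '<a href="http://www.something.com/hm">The text in the link</a>'
    data = tt(html)
    # data == {'links': [{'url': 'http://www.something.com/hm',
    #                     'link_text': 'The text in the link'}]}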