Just two things regarding the Python scraping and parsing code in svn:

1) Mixture of tabs and spaces: quite a few of the files have inconsistent indentation. I didn't check everything, but for example:

  python -tt pyscraper/createhansardindex.py
  python -tt pyscraper/miscfuncs.py
  python -tt pyscraper/patchtool.py

all give a TabError.

There is a reindent script here, which rewrites .py files to use 4-space indents and no hard tabs:
http://svn.python.org/view/*checkout*/python/trunk/Tools/scripts/reindent.py?revision=66903&content-type=text%2Fplain
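
To get a list of candidate files in one go rather than running python -tt on each, something like the rough sketch below might help. It is only a heuristic and stricter than -tt: it flags any .py file in which some lines are indented with tabs and others with spaces, including continuation lines and lines inside multi-line strings. The function names are made up for illustration.

```python
import os


def indentation_kinds(source):
    """Return the set of whitespace characters (' ' and/or '\t') used to indent lines."""
    kinds = set()
    for line in source.splitlines():
        stripped = line.lstrip(" \t")
        if not stripped:
            continue  # skip blank or whitespace-only lines
        indent = line[:len(line) - len(stripped)]
        kinds.update(indent)  # adds each indentation character individually
    return kinds


def files_mixing_tabs_and_spaces(root):
    """Walk root and return paths of .py files that use both tabs and spaces to indent."""
    mixed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(".py"):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as handle:
                # latin-1 never fails to decode, which is enough for a whitespace check
                source = handle.read().decode("latin-1")
            if indentation_kinds(source) == {" ", "\t"}:
                mixed.append(path)
    return mixed


if __name__ == "__main__":
    for path in files_mixing_tabs_and_spaces("pyscraper"):
        print(path)
```

Anything it reports would still want a look by eye (or a run through python -tt) before reindenting.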

2) Trying to understand the sequencing of the scripts, I ended up playing about with this Dispatcher:

  http://pastebin.com/m7e8b0b3d

the idea being to try and avoid those nested ifs and elifs in lazyrunall.py

Not fully thought out, but (fwiw) you would instead end up with something like:


dispatcher = Dispatcher()

dispatcher.on__scrape__hansard = UpdateHansardIndex
dispatcher.on__scrape__lords = UpdateLordsHansardIndex
dispatcher.on__scrape__standing = UpdateStandingHansardIndex
dispatcher.on__scrape__chgpages = (GrabWatchCopies, (datetime.date.today().isoformat(),), None)
dispatcher.on__scrape__force_scrape__regmem = RegMemPullGluePages

etc.
etc.

options, args = parser.parse_args()


dispatcher.run(args)
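
In case the pastebin link goes away, here is a rough sketch of how a Dispatcher along these lines could work. It is not the pastebin code, just a guess at the shape under some assumptions of mine: handlers are plain attributes named on__<word>__<word>..., a handler is either a callable or a (callable, args, kwargs) tuple, and run() calls every handler whose words all appear in the command-line arguments. The handlers in the usage part are stand-ins, not the real functions.

```python
class Dispatcher(object):
    """Hypothetical sketch: call handlers whose name words all appear in the arguments."""

    PREFIX = "on__"

    def _handlers(self):
        # Yield (words, handler) for every attribute named on__a__b...,
        # in sorted name order so that runs are repeatable.
        for name, value in sorted(vars(self).items()):
            if name.startswith(self.PREFIX):
                yield name[len(self.PREFIX):].split("__"), value

    def run(self, args):
        wanted = set(args)
        results = []
        for words, handler in self._handlers():
            if not set(words) <= wanted:
                continue  # not every word of this handler was requested
            if isinstance(handler, tuple):
                func, positional, keywords = handler
            else:
                func, positional, keywords = handler, (), None
            results.append(func(*positional, **(keywords or {})))
        return results


if __name__ == "__main__":
    def update_index():  # stand-in for something like UpdateHansardIndex
        return "index updated"

    def grab_copies(day):  # stand-in for something like GrabWatchCopies
        return "copies grabbed for " + day

    dispatcher = Dispatcher()
    dispatcher.on__scrape__hansard = update_index
    dispatcher.on__scrape__chgpages = (grab_copies, ("2008-11-01",), None)
    print(dispatcher.run(["scrape", "hansard"]))
```

Whether a name like on__scrape__force_scrape__regmem should mean "all of these words" or "any of them" is exactly the sort of thing I haven't thought through.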

Just an idea.


