I've taken the first steps toward supporting the new design of the IMDb site.

[CVS]
In CVS there's a branch called "newdesign" which includes the first changes. You can check it out with:

  cvs -z3 -d:pserver:[EMAIL PROTECTED]:/cvsroot/imdbpy co -r newdesign imdbpy

(replace anonymous with your account, if you're an already authorized developer)

This branch accesses the IMDb web site through the new 'IMDbPYweb' account, which is set to use the new layout. I'm no CVS expert, so I hope I've set this up properly.

[test-suite]
There is a slightly modified test-suite to help the development here:
  http://imdbpy.sourceforge.net/imdbpy-3.0-testsuite.tar.gz

The main useful (?) tools are in the 'http' directory.

The first one is the test_parser.py testsuite. Run it with the -f option and it will start downloading (a lot of) pages from imdb.com. Then run it with the -p option: it will read every downloaded HTML page and parse it with the appropriate parser, saving the result in a ".p" file. Now you can check, with the -t option, that the output of a second run of the parser on the various files matches the saved results (it's assumed that between the first and the second run you've modified the parser, otherwise there's nothing to test!); if the output is different, the test fails and the differences are, more or less, nicely printed.

Another way to use test_parser.py is this: you have a parser that you know is working, and you've downloaded the HTML pages and created the ".p" files; after some time (weeks or months) you want to check that the parser is still working, so you move the ".html" files elsewhere and fetch the whole data again with the -f option _without_ recreating the ".p" files. Then you can run the -t test again to see if the output from the fresh ".html" pages is significantly different from the old ".p" files (some differences are unavoidable: think about the number of votes for a movie).

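The -p / -t workflow above amounts to a save-then-compare regression check. What follows is only a minimal sketch of that idea, not the actual test_parser.py code: the function names save_baseline and check_against_baseline, and the "parse" callable that turns an HTML string into a Python object, are hypothetical and were made up for this illustration.

```python
import difflib
import pickle
import pprint
from pathlib import Path


def save_baseline(html_path, parse):
    """Parse one saved HTML page and pickle the result next to it as ".p".

    'parse' is a hypothetical callable: HTML string in, Python object out.
    """
    html_path = Path(html_path)
    result = parse(html_path.read_text(encoding="utf-8"))
    with open(html_path.with_suffix(".p"), "wb") as fh:
        pickle.dump(result, fh)
    return result


def check_against_baseline(html_path, parse):
    """Re-parse the page and compare with the pickled baseline.

    Returns a list of unified-diff lines; the list is empty when the
    new output equals the saved one.
    """
    html_path = Path(html_path)
    with open(html_path.with_suffix(".p"), "rb") as fh:
        old = pickle.load(fh)
    new = parse(html_path.read_text(encoding="utf-8"))
    if new == old:
        return []
    # Pretty-print both results so the diff is readable line by line.
    old_lines = pprint.pformat(old).splitlines()
    new_lines = pprint.pformat(new).splitlines()
    return list(difflib.unified_diff(old_lines, new_lines,
                                     "baseline", "current", lineterm=""))
```

Call save_baseline once with a parser you trust, then call check_against_baseline after changing the parser (or after re-fetching the page); any returned lines show what changed.
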
The other tool is the build_tests.py script: it takes every test made by test_parser.py and creates another script (based on the skel_test.py skeleton) for each one; these tests just read the corresponding ".html" file and output the result of the parser.

E.g.: run build_tests.py and your current directory will be full of "test_*" scripts. Fetch the HTML pages with test_parser.py -f and try to run one of the created "test_*" scripts. As an example, if you run ./test_airing_parser_m37.py you will see that no titlesRefs and namesRefs are collected (correct!) and that the only data parsed is "airing", followed by its value. Then you _manually_ check whether the parsed data are correct: you open the "m37.html" file with a browser and see if every piece of information that the airing_parser is supposed to parse is correct. In this specific case the parser already works perfectly; other parsers may be completely (or partially) broken.

Run these scripts one by one, and see what's not working; if you're interested in fixing something, let me know on this list, to avoid two people working on the same bug. Currently I'm looking at test_guests_parser_m33.py (heavily broken).

Enjoy,
--
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel