I am interested in developing a cross-platform batch web download application with a GUI that implements recursion and filters similar to the Wget utility. An important feature (not supported by Wget in an automated fashion) is the ability to log in to a web site via an HTML form.
The GUI would include edit boxes for a user name and password. The program would then log in and begin downloading files according to the specified criteria, using twill and other Python modules. Since web forms are not standardized in the HTML they use to prompt for a user name and password, I would like the program to use heuristics that work in most cases (knowing that perfection is elusive, especially if the form uses JavaScript in a way meant to thwart bots). I am seeking suggestions on how to implement such heuristics. For example, the code might look for words like "Name" and "Password" in the id or name attributes of text input fields. Naturally, if anyone has code they can share, that would be appreciated.

Jamal

_______________________________________________
twill mailing list
[email protected]
http://lists.idyll.org/listinfo/twill
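
The attribute-matching heuristic described above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard library's html.parser rather than twill, and the names USER_HINTS, PASS_HINTS, LoginFormFinder, and guess_login_fields are hypothetical, as are the keyword lists themselves. It treats any input of type "password" as the password field and otherwise falls back to keyword matches in the name and id attributes.

```python
from html.parser import HTMLParser

# Hypothetical keyword lists; real sites vary, so these are only a starting point.
USER_HINTS = ("user", "login", "email", "name")
PASS_HINTS = ("pass", "pwd")


class LoginFormFinder(HTMLParser):
    """Collect <input> fields per <form> and guess which hold credentials."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per <form> seen
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action"),
                             "user": None, "password": None}
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            self._classify(attrs)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None

    def _classify(self, attrs):
        field_type = (attrs.get("type") or "text").lower()
        # Search both the name and id attributes, case-insensitively.
        label = " ".join(filter(None, (attrs.get("name"), attrs.get("id")))).lower()
        field_name = attrs.get("name") or attrs.get("id")
        form = self._current
        looks_like_password = field_type == "password" or (
            field_type == "text" and any(h in label for h in PASS_HINTS))
        if looks_like_password:
            # Keep the first match in each form.
            form["password"] = form["password"] or field_name
        elif field_type in ("text", "email") and any(h in label for h in USER_HINTS):
            form["user"] = form["user"] or field_name


def guess_login_fields(html):
    """Return (action, user_field, password_field) for the first form
    that appears to have both fields, or None if no form qualifies."""
    finder = LoginFormFinder()
    finder.feed(html)
    for form in finder.forms:
        if form["user"] and form["password"]:
            return form["action"], form["user"], form["password"]
    return None
```

The returned field names could then be handed to whatever fills in and submits the form, for instance twill's formvalue and submit commands. Forms whose fields are created by JavaScript would not be seen by a static parse like this one.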
