Steve Perkins wrote:
> Hello -
> I have just discovered the whole Simile project, and let me say that I
> am blown away. I think this is amazing stuff.
>
> Anyway, I am interested in using crowbar as a crawling agent. I wrote
> a crawler and parser in python that uses urllib2 (and lxml with
> xpath), but I'm getting problems. It seems I can't just send a post
> request down the line to this server... the form has some javascript
> that executes prior to submitting, so my encoded POST request is not
> jiving with the cgi. I'd like to use Crowbar as my crawling agent
> because it's a full fledged browser and I can simply load and submit
> the form as a browser would.
>
> The problem I'm having, though, is that crowbar doesn't seem to handle
> HTTPS: When I type in a URL with https: to the crowbar interface, I
> get:
>
> undefined is not a valid HTTP URL
>
> Any suggestions?

Good catch; it was a bug. I just submitted a patch to the code
repository; run 'svn update' to get it.

Keep in mind that crowbar might require user interaction (such as
clicking through 'accept this certificate' dialogs), so keep an eye on
the small crowbar window.

Let me know if you have any other problems with it.

--
Stefano Mazzocchi
Digital Libraries Research Group                 Research Scientist
Massachusetts Institute of Technology
E25-131, 77 Massachusetts Ave               skype: stefanomazzocchi
Cambridge, MA 02139-4307, USA          email: stefanom at mit . edu
-------------------------------------------------------------------
_______________________________________________
General mailing list
[email protected]
http://simile.mit.edu/mailman/listinfo/general
