Re: Using crowbar with https pages

Stefano Mazzocchi Fri, 20 Apr 2007 10:48:51 -0700

Steve Perkins wrote:
> Hello -
> I have just discovered the whole Simile project, and let me say that I
> am blown away.  I think this is amazing stuff.
> 
> Anyway, I am interested in using crowbar as a crawling agent.  I wrote
> a crawler and parser in python that uses urllib2 (and lxml with
> xpath), but I'm getting problems.  It seems I can't just send a post
> request down the line to this server... the form has some javascript
> that executes prior to submitting, so my encoded POST request is not
> jibing with the cgi.  I'd like to use Crowbar as my crawling agent
> because it's a full-fledged browser and I can simply load and submit
> the form as a browser would.
> 
> The problem I'm having, though, is that crowbar doesn't seem to handle
> HTTPS:  When I type in a URL with https: to the crowbar interface, I
> get:
> 
> undefined is not a valid HTTP URL
> 
> Any suggestions?


Good catch, it was a bug. I just submitted a patch to the code
repository; do an 'svn update' to get it.
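
Once you have the updated code, driving crowbar from Python might look
roughly like the sketch below. It assumes crowbar is listening on
http://127.0.0.1:10000/ and takes the target page in a 'url' form field
plus a 'delay' in milliseconds; the port, the field names, and the
helpers build_crowbar_request and fetch_via_crowbar are illustrative
assumptions, so check them against your checkout. It uses
urllib.request, the Python 3 successor to urllib2.

```python
from urllib.parse import urlencode, urlparse
from urllib.request import Request, urlopen

# Assumption: crowbar is listening locally at this address; adjust as needed.
CROWBAR_ENDPOINT = "http://127.0.0.1:10000/"


def build_crowbar_request(target_url, delay_ms=3000, endpoint=CROWBAR_ENDPOINT):
    """Build a POST request asking crowbar to load target_url.

    The form field names 'url' and 'delay' are assumptions, not a
    confirmed part of crowbar's interface.
    """
    # Reject anything that is not http or https before sending it along.
    scheme = urlparse(target_url).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError("%r is not a valid HTTP(S) URL" % (target_url,))
    body = urlencode({"url": target_url, "delay": delay_ms}).encode("ascii")
    # Supplying data makes urllib send this as a POST.
    return Request(endpoint, data=body)


def fetch_via_crowbar(target_url, delay_ms=3000, timeout=60):
    """Send the request to crowbar and return the response body as text."""
    request = build_crowbar_request(target_url, delay_ms)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling fetch_via_crowbar with an https URL should then return whatever
crowbar sends back, which you can hand to lxml as before.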

Now, keep in mind that crowbar might require user interaction (such as
clicking through 'accept this certificate' dialogs), so keep an eye on
the little crowbar window.

Let me know if you run into any other problems with it.

-- 
Stefano Mazzocchi
Digital Libraries Research Group                 Research Scientist
Massachusetts Institute of Technology
E25-131, 77 Massachusetts Ave               skype: stefanomazzocchi
Cambridge, MA  02139-4307, USA         email: stefanom at mit . edu
-------------------------------------------------------------------

