Re: Looking for browser emulator

2011-10-13 Thread Roy Smith
Gary Herron wrote:
 
> Try mechanize
>http://wwwsearch.sourceforge.net/mechanize/
> billed as
>Stateful programmatic web browsing in Python.

Wow, this is cool, thanks!  It even does cookies!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Looking for browser emulator

2011-10-13 Thread Gary Herron

On 10/13/2011 07:19 PM, Roy Smith wrote:
> I've got to write some tests in python which simulate getting a page of
> HTML from an http server, finding a link, clicking on it, and then
> examining the HTML on the next page to make sure it has certain features.
>
> I can use urllib to do the basic fetching, and lxml gives me the tools
> to find the link I want and extract its href attribute.  What's missing
> is dealing with turning the href into an absolute URL that I can give to
> urlopen().  Browsers implement all sorts of stateful logic such as "if
> the URL has no hostname, use the same hostname as the current page".
> I'm talking about something where I can execute this sequence of calls:
>
> urlopen("http://foo.com:/bar")
> urlopen("/baz")
>
> and have the second one know that it needs to get
> "http://foo.com:/baz".  Does anything like that exist?
>
> I'm really trying to stay away from Selenium and go strictly with
> something I can run under unittest.



Try mechanize
  http://wwwsearch.sourceforge.net/mechanize/
billed as
  Stateful programmatic web browsing in Python.


It handles clicking on links, cookies, logging in/out, and filling in
forms in the same way as a "real" browser, but it's all under
programmatic control from Python.



In Ubuntu, it's the python-mechanize package.


Re: Looking for browser emulator

2011-10-13 Thread Miki Tebeka
IIRC mechanize can do that.


Re: Looking for browser emulator

2011-10-13 Thread Roy Smith
In article 
<2323f3d7-42ff-4de5-9006-4741e865f...@a9g2000yqo.googlegroups.com>,
 Jon Clements  wrote:

> On Oct 14, 3:19 am, Roy Smith  wrote:
> > I've got to write some tests in python which simulate getting a page of
> > HTML from an http server, finding a link, clicking on it, and then
> > examining the HTML on the next page to make sure it has certain features.
> >
> > I can use urllib to do the basic fetching, and lxml gives me the tools
> > to find the link I want and extract its href attribute.  What's missing
> > is dealing with turning the href into an absolute URL that I can give to
> > urlopen().  Browsers implement all sorts of stateful logic such as "if
> > the URL has no hostname, use the same hostname as the current page".  
> > I'm talking about something where I can execute this sequence of calls:
> >
> > urlopen("http://foo.com:/bar")
> > urlopen("/baz")
> >
> > and have the second one know that it needs to get
> > "http://foo.com:/baz".  Does anything like that exist?
> >
> > I'm really trying to stay away from Selenium and go strictly with
> > something I can run under unittest.
> 
> lxml.html.make_links_absolute() ?

Interesting.  That might be exactly what I'm looking for.  Thanks!


Re: Looking for browser emulator

2011-10-13 Thread Jon Clements
On Oct 14, 3:19 am, Roy Smith  wrote:
> I've got to write some tests in python which simulate getting a page of
> HTML from an http server, finding a link, clicking on it, and then
> examining the HTML on the next page to make sure it has certain features.
>
> I can use urllib to do the basic fetching, and lxml gives me the tools
> to find the link I want and extract its href attribute.  What's missing
> is dealing with turning the href into an absolute URL that I can give to
> urlopen().  Browsers implement all sorts of stateful logic such as "if
> the URL has no hostname, use the same hostname as the current page".  
> I'm talking about something where I can execute this sequence of calls:
>
> urlopen("http://foo.com:/bar")
> urlopen("/baz")
>
> and have the second one know that it needs to get
> "http://foo.com:/baz".  Does anything like that exist?
>
> I'm really trying to stay away from Selenium and go strictly with
> something I can run under unittest.

lxml.html.make_links_absolute() ?
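
A minimal sketch of that suggestion, assuming lxml is installed. The
page URL (including port 8080), the sample HTML, and the helper name
absolute_hrefs are made up for illustration; only fromstring(),
make_links_absolute() and xpath() come from lxml itself.

```python
import lxml.html

# Hypothetical page address; the port is invented for this example.
PAGE_URL = "http://foo.com:8080/bar"

# Hypothetical page body with one host-relative and one path-relative link.
HTML = (
    "<html><body>"
    '<a href="/baz">next</a>'
    '<a href="qux">other</a>'
    "</body></html>"
)


def absolute_hrefs(html_text, page_url):
    """Parse html_text and return every <a href> resolved against page_url."""
    doc = lxml.html.fromstring(html_text)
    # Rewrites the link attributes of the parsed tree in place.
    doc.make_links_absolute(page_url)
    return [str(href) for href in doc.xpath("//a/@href")]


if __name__ == "__main__":
    for href in absolute_hrefs(HTML, PAGE_URL):
        print(href)
```

The resolved hrefs can then be handed straight to urlopen(). Note that
this only covers the URL resolution part of the question; it does not
track cookies or any other browser state.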


Looking for browser emulator

2011-10-13 Thread Roy Smith
I've got to write some tests in python which simulate getting a page of 
HTML from an http server, finding a link, clicking on it, and then 
examining the HTML on the next page to make sure it has certain features.

I can use urllib to do the basic fetching, and lxml gives me the tools 
to find the link I want and extract its href attribute.  What's missing 
is dealing with turning the href into an absolute URL that I can give to 
urlopen().  Browsers implement all sorts of stateful logic such as "if 
the URL has no hostname, use the same hostname as the current page".  
I'm talking about something where I can execute this sequence of calls:

urlopen("http://foo.com:/bar")
urlopen("/baz")

and have the second one know that it needs to get 
"http://foo.com:/baz".  Does anything like that exist?

I'm really trying to stay away from Selenium and go strictly with 
something I can run under unittest.
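
The behaviour asked for above can be roughly approximated with the
standard library alone. The following is a sketch only, not a browser
emulator: the class name StatefulFetcher and its methods are
hypothetical, and it handles nothing beyond resolving each href against
the previously fetched URL (no cookies, forms, or JavaScript). It is
written for Python 3, where urljoin lives in urllib.parse; in the
Python 2 of this thread's era it was urlparse.urljoin.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


class StatefulFetcher:
    """Remember the last URL opened and resolve later hrefs against it."""

    def __init__(self, opener=urlopen):
        # The opener is injectable so a unittest can pass in a fake
        # instead of touching the network.
        self._opener = opener
        self.current_url = None

    def resolve(self, href):
        """Return href as an absolute URL relative to the current page."""
        if self.current_url is None:
            return href
        return urljoin(self.current_url, href)

    def open(self, href):
        """Resolve href, fetch it, and remember where we ended up."""
        url = self.resolve(href)
        response = self._opener(url)
        # Real urlopen responses report the final URL after redirects;
        # fall back to the requested URL for simple fakes.
        geturl = getattr(response, "geturl", None)
        self.current_url = geturl() if callable(geturl) else url
        return response
```

Under unittest, a fake opener that records the URLs it is given is
enough to check the sequence from the question: opening an absolute URL
such as "http://foo.com:8080/bar" (port chosen arbitrarily here) and
then "/baz" should request "http://foo.com:8080/baz".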