[issue33342] urllib IPv6 parsing fails with special characters in passwords

2019-10-15 Thread STINNER Victor


STINNER Victor added the comment:

I modified my PR 16780, originally written for bpo-36338, to also fix this issue.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33342] urllib IPv6 parsing fails with special characters in passwords

2019-10-15 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan:


--
nosy: +vstinner




[issue33342] urllib IPv6 parsing fails with special characters in passwords

2019-01-24 Thread Terrence Brannon


Terrence Brannon added the comment:

If SQLAlchemy is any guide, note that it unquotes both the username and 
the password of the URL:

https://github.com/sqlalchemy/sqlalchemy/blob/master/lib/sqlalchemy/engine/url.py#L274




[issue33342] urllib IPv6 parsing fails with special characters in passwords

2019-01-24 Thread Terrence Brannon


Terrence Brannon added the comment:

Regarding "RFC 2396 explicitly excludes the use of [ and ] in URLs. RFC 2732 
defines the syntax for IPv6 URLs, and allows [ and ] ONLY in the host part.

So I'd say that the behaviour is arguably correct (if somewhat unfortunate)"

I would say that a square bracket CAN be used in the password, but that it 
should be urlencoded and that this library should perform a urldecode for both 
username and password, just as SQLAlchemy does.
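A minimal sketch of the decoding step proposed here, using only urllib.parse. The helper name decoded_credentials and the example URL are made up for illustration; this is not what urlsplit itself does, since it returns the username and password still percent-encoded.

```python
from urllib.parse import unquote, urlsplit


def decoded_credentials(url):
    """Hypothetical helper: return (username, password) with %XX escapes decoded."""
    parts = urlsplit(url)
    # urlsplit leaves the userinfo percent-encoded, so decode both fields here.
    username = unquote(parts.username) if parts.username is not None else None
    password = unquote(parts.password) if parts.password is not None else None
    return username, password


# '@' is written as %40, '[' as %5B and ']' as %5D in the userinfo part.
print(decoded_credentials("postgresql://us%40er:p%5Bss%5D@db.example.org/app"))
# ('us@er', 'p[ss]')
```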




[issue33342] urllib IPv6 parsing fails with special characters in passwords

2019-01-23 Thread Terrence Brannon


Terrence Brannon added the comment:

I would like to add to this bug: the password field of the URL cannot contain 
a pound sign or a question mark, or the parser parses the URL incorrectly, as 
this gist demonstrates:
https://gist.github.com/metaperl/fc6f43bf6b9a9f874b8f27e29695e68c
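As a workaround sketch (not taken from the gist), the password can be percent-encoded before it is placed in the URL, so that '#' and '?' are no longer seen as the start of the fragment or query. The password and host below are made-up values.

```python
from urllib.parse import quote, unquote, urlsplit

password = "se#cr?et"  # made-up password containing '#' and '?'

# safe="" makes quote() escape every reserved character, including '/'.
escaped = quote(password, safe="")
url = "http://user:{}@example.org/path".format(escaped)

parts = urlsplit(url)
print(escaped)                  # se%23cr%3Fet
print(parts.hostname)           # example.org
print(unquote(parts.password))  # se#cr?et
```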

--
nosy: +metaperl




[issue33342] urllib IPv6 parsing fails with special characters in passwords

2018-10-06 Thread Thomas Jollans


Thomas Jollans added the comment:

RFC 2396 explicitly excludes the use of [ and ] in URLs. RFC 2732 
defines the syntax for IPv6 URLs, and allows [ and ] ONLY in the host part.

So I'd say that the behaviour is arguably correct (if somewhat unfortunate)

--
nosy: +tjollans




[issue33342] urllib IPv6 parsing fails with special characters in passwords

2018-05-19 Thread Martin Panter

Martin Panter added the comment:

I presume this is about parsing a URL like

>>> urlsplit("//user:[@host")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/proj/python/cpython/Lib/urllib/parse.py", line 431, in urlsplit
    raise ValueError("Invalid IPv6 URL")
ValueError: Invalid IPv6 URL

Ideally the square bracket should be escaped as %5B. Related reports about 
parsing unescaped delimiters in a URL password are Issue 18140 (fragment #, 
query ?) and Issue 23328 (slash /).
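For illustration, the escaped form of the same URL can be built with urllib.parse.quote and then splits without error. This is only a small sketch of the escaping suggested above; note that urlsplit returns the password still percent-encoded.

```python
from urllib.parse import quote, urlsplit

# quote() turns '[' into %5B; safe="" avoids leaving any reserved character as-is.
url = "//user:{}@host".format(quote("[", safe=""))
parts = urlsplit(url)

print(url)             # //user:%5B@host
print(parts.hostname)  # host
print(parts.password)  # %5B
```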

--
nosy: +martin.panter




[issue33342] urllib IPv6 parsing fails with special characters in passwords

2018-04-23 Thread benaryorg

Change by benaryorg:


--
type:  -> behavior




[issue33342] urllib IPv6 parsing fails with special characters in passwords

2018-04-23 Thread benaryorg

New submission from benaryorg <bin...@benary.org>:

The documentation says that urllib.parse follows RFC 2396 
(https://tools.ietf.org/html/rfc2396.html), but urllib.parse.urlsplit 
(https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlsplit) 
fails to parse a user:password@host URL when the password contains a '[' 
character.
This is because the urlsplit code does not strip the userinfo part of the 
netloc (everything from index 0 up to and including the last '@') before 
checking whether the hostname contains '[' to detect an IPv6 address 
(https://github.com/python/cpython/blob/8a6f4b4bba950fb8eead1b176c58202d773f2f70/Lib/urllib/parse.py#L416-L418).
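The idea can be sketched as follows. This is a simplified illustration, not the CPython code and not the eventual fix: has_unbalanced_brackets is a hypothetical helper that drops everything up to the last '@' before looking for '[' and ']'.

```python
def has_unbalanced_brackets(netloc):
    """Hypothetical helper: True if the host part has '[' without ']' or vice versa."""
    # Drop the userinfo (everything up to and including the last '@').
    hostinfo = netloc.rpartition("@")[2]
    return ("[" in hostinfo) != ("]" in hostinfo)


print(has_unbalanced_brackets("user:[@host"))       # False
print(has_unbalanced_brackets("user:pw@[::1]:80"))  # False
print(has_unbalanced_brackets("[::1"))              # True
```

With a check like this, a '[' in the password no longer affects the IPv6 detection, while a malformed bracketed host is still reported.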

--
components: Library (Lib)
messages: 315668
nosy: benaryorg
priority: normal
severity: normal
status: open
title: urllib IPv6 parsing fails with special characters in passwords
versions: Python 2.7, Python 3.6

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33342>
___



Re: urllib and parsing

2011-10-06 Thread Tim Roberts
luca72 lucabe...@libero.it wrote:

Hello, I have a simple question:
Up to now, when I have to parse a page, I do as follows:
...
Now I have a site that is opened by an HTML file like this:
...
How can I open it with urllib? Please note that I don't have to parse this
file; I have to parse the site it points to.

Well, you can use htmllib to parse the HTML, look for the form tag, and
extract the action verb.  Or, if you really just want this one site, you
can use urllib2 to provide POST parameters:

  import urllib
  import urllib2

  url = 'http://lalal.hhdik/'
  values = {'password' : 'password',
            'Entra' : 'Entra' }

  data = urllib.urlencode(values)
  req = urllib2.Request(url, data)
  response = urllib2.urlopen(req)
  the_page = response.read()
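
For readers on Python 3, where urllib2 no longer exists, a rough equivalent is sketched below. It is an adaptation, not part of the original reply: it assumes the same URL and form fields as above, and it only builds the request so that it can be inspected without touching the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

url = 'http://lalal.hhdik/'
values = {'password': 'password', 'Entra': 'Entra'}

# In Python 3 the POST body must be bytes, so encode the form data.
data = urlencode(values).encode('ascii')
req = Request(url, data=data)

print(req.get_method())  # POST
print(req.data)          # b'password=password&Entra=Entra'

# Sending it would then be: urllib.request.urlopen(req).read()
```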
-- 
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.
-- 
http://mail.python.org/mailman/listinfo/python-list


question about urllib and parsing a page

2005-11-02 Thread nephish
hey there,
i am using beautiful soup to parse a few pages (screen scraping)
easy stuff.
the issue i am having is with one particular web page that uses a
javascript to display some numbers in tables.

now if i open the file in mozilla and save as, i get the numbers in
the source. cool. but if i click on view source or download the url
with urlretrieve, i get the source, but not the numbers.

is there a way around this ?

thanks



Re: question about urllib and parsing a page

2005-11-02 Thread matt
Yeah, this tends to be silly, but a workaround (for firefox at least)
is to select the content and rather than saying view source, right
click and click View Selection Source...



Re: question about urllib and parsing a page

2005-11-02 Thread nephish
that's cool, but i want to do this automatically with python.
what can i do to have urllib download the source with the numbers in
it?

ok, not necessarily urllib, whatever one is best for the occasion
thanks 
shawn



Re: question about urllib and parsing a page

2005-11-02 Thread nephish
well, i think thats the case, looking at the code, there is a long
string of math functions in page, java math functions. h. i guess
i'm up that famous creek.
thanks for the info, though
shawn
