I've had much success doing round trips through the lxml.html parser.
https://lxml.de/lxmlhtml.html
I ditched bs for lxml long ago and never regretted it.
If you find that you have a bunch of invalid html that lxml inadvertently
"fixes", I would recommend adding a stutter-step to your project:
On Thu, Feb 11, 2021 at 1:49 PM dn via Python-list
wrote:
> When I first met it, one of the concepts I found difficult to 'wrap my
> head around' was the idea that "open software" allowed folk to fork the
> original work and 'do their own thing'. My thinking was (probably)
> "surely, the
Received?
On Sun, Sep 16, 2018 at 3:39 PM Buck Evan wrote:
> I started to send this to python-ideas, but I'm having second thoughts.
> Does this have merit?
>
> ---
> I stumble on this a lot, and I see it in many python libraries:
>
> def f(*args, **kwargs):
> ...
>
I started to send this to python-ideas, but I'm having second thoughts.
Does this have merit?
---
I stumble on this a lot, and I see it in many python libraries:
def f(*args, **kwargs):
...
f(*[list comprehension])
f(**mydict)
It always seems a shame to carefully build up an object in
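For reference, the call pattern described above can be sketched as follows. The names `squares` and `options` are made up for illustration; the point is only that the callee receives a fresh tuple and a fresh dict, not the objects the caller built.

```python
def f(*args, **kwargs):
    """Return what was received so the effect of unpacking is visible."""
    return args, kwargs


# Hypothetical inputs, built up by the caller.
squares = [n * n for n in range(3)]
options = {"sep": ", ", "end": "\n"}

# The list is unpacked into positional arguments, the dict into keywords.
positional, keyword = f(*squares, **options)

# Inside f, args is a new tuple and kwargs is a new dict.
same_dict = keyword is options
```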
Change by Buck Evan :
--
type: -> behavior
___
Python tracker
<https://bugs.python.org/issue34706>
___
___
Python-bugs-list mailing list
Unsubscrib
New submission from Buck Evan :
Specifically in the case of a class that does not override its constructor
signature inherited from object.
Github PR incoming shortly.
--
components: Library (Lib)
messages: 325501
nosy: bukzor
priority: normal
severity: normal
status: open
title
Buck Evan added the comment:
@serhiy.storchaka This is a very stable piece of a legacy code base, so we're
not keen to refactor it so dramatically, although we could.
We've worked around this issue by compiling pyc files ahead of time and taking
extra care that they're preserved through
Buck Evan added the comment:
New data: The memory consumption seems to be in the compiler rather than the
marshaller:
```
$ PYTHONDONTWRITEBYTECODE=1 python -c 'import repro'
16032
$ PYTHONDONTWRITEBYTECODE=1 python -c 'import repro'
16032
$ PYTHONDONTWRITEBYTECODE=1 python -c 'import repro
```
New submission from Buck Evan:
In the attached example I show that there's a significant memory overhead
present whenever a pre-compiled pyc is not present.
This only occurs with more than 5225 objects (dictionaries in this case)
allocated. At 13756 objects, the mysterious pyc overhead is 50
Buck Evan added the comment:
Also, we've reproduced this in both linux and osx.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24085
Buck Golemon added the comment:
We've hit this problem today.
What are we supposed to do in the meantime?
--
nosy: +bukzor
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5945
New submission from Buck Golemon:
In order to make an inheritable pipe, the code is quite a bit different between
posixes that implement pipe2 and those that don't (osx, mainly). I believe the
officially-supported path is to call os.pipe() then os.set_inheritable(). This
seems objectionable
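A minimal sketch of the two-step path mentioned above, using the standard-library names `os.pipe()` and `os.set_inheritable()`. The helper name `inheritable_pipe` is hypothetical; since Python 3.4 the descriptors returned by `os.pipe()` are non-inheritable by default, so each end has to be flagged separately.

```python
import os


def inheritable_pipe():
    """Create a pipe and mark both ends as inheritable by child processes."""
    read_fd, write_fd = os.pipe()         # non-inheritable by default (3.4+)
    os.set_inheritable(read_fd, True)     # second step, one call per descriptor
    os.set_inheritable(write_fd, True)
    return read_fd, write_fd


r, w = inheritable_pipe()
try:
    flags = (os.get_inheritable(r), os.get_inheritable(w))
finally:
    # Close both ends so the example does not leak descriptors.
    os.close(r)
    os.close(w)
```

Note that there is a window between the two calls in which the descriptors exist but are not yet inheritable, so this is not atomic the way a single pipe2 call would be.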
New submission from Buck Golemon:
The color needs adjusted such that it has at least 3:1 luminance contrast
versus the surrounding non-link text. (See non-inheritable
https://docs.python.org/3/library/os.html#os.dup)
See also:
* http://www.w3.org/TR/WCAG20/#visual-audio-contrast-without
Buck Golemon added the comment:
I notice that dup2 grew an `inheritable=True` argument in 3.4.
This might be a good precedent to use here, as a third option.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22722
Buck Golemon added the comment:
Proposed patch attached.
--
keywords: +patch
Added file: http://bugs.python.org/file37006/link-color.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22723
New submission from Buck Golemon:
I have fixed the issue in my branch here:
https://github.com/bukzor/cpython/commit/013e689731ba32319f05a62a602f01dd7d7f2e83
I don't propose it as a patch, but as a proof of concept and point of
discussion.
If there's no chance of shipping a fix in 2.7.9, feel
It used to be that the best way to compare floating point numbers while
disregarding the inherent epsilon was to use `str(x) == str(y)`. It looks like
that workaround doesn't work anymore in 3.4.
What's the recommended way to do this now?
format(.01 + .01 + .01 + .01 + .01 + .01, 'g') ==
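One possible approach, offered as a sketch rather than as the answer the thread settled on: compare with an explicit tolerance instead of going through string formatting. `math.isclose` was added in Python 3.5, so it is not available on 3.4 itself, and the tolerance chosen here is an assumption.

```python
import math

# The same sum as in the question above.
total = .01 + .01 + .01 + .01 + .01 + .01

# Exact equality depends on accumulated rounding error.
exact = (total == .06)

# Tolerance-based comparison; rel_tol=1e-9 is an assumed, adjustable value.
close = math.isclose(total, .06, rel_tol=1e-9)
```

On 3.4, `abs(x - y) <= tol * max(abs(x), abs(y))` gives a similar relative test without the import.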
Buck Golemon added the comment:
I believe this issue is still extant.
The tip httplib client neither sends accept-encoding gzip nor supports
content-encoding gzip.
http://hg.python.org/cpython/file/tip/Lib/http/client.py#l1012
There is a diff to httplib in this attached patch, where
On Sunday, January 19, 2014 12:19:29 AM UTC-8, Ian wrote:
On Sat, Jan 18, 2014 at 10:40 PM, buck w***@gmail.com wrote:
I'm trying to work through Skiena's algorithms handbook, and note that the
author often uses graphical representations of the diagrams to help
understand (and even
I'm trying to work through Skiena's algorithms handbook, and note that the
author often uses graphical representations of the diagrams to help understand
(and even debug) the algorithms. I'd like to reproduce this in python.
How would you go about this? pyQt, pygame and pyglet immediately come
On Friday, November 16, 2012 4:33:14 PM UTC-8, Nobody wrote:
On Fri, 16 Nov 2012 13:44:03 -0800, buck wrote:
IOW: Microsoft's embrace, extend, extinguish strategy has been too
successful and now we have to deal with it. If HTML content is tagged as
using ISO-8859-1, it's more likely that it's
Latin1 has a block of 32 undefined characters.
Windows-1252 (aka cp1252) fills in 27 of these characters but leaves five
undefined: 0x81, 0x8D, 0x8F, 0x90, 0x9D
The byte 0x81 decoded with latin1 gives the Unicode code point 0x81.
Decoding the same byte with windows-1252 yields a stack trace with
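The difference described above can be reproduced in a few lines. This is a small sketch; the helper name `try_decode` is made up for the example.

```python
def try_decode(raw, encoding):
    """Decode raw bytes, returning None where the codec has no mapping."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# latin-1 maps every byte to the code point with the same number.
latin1_result = try_decode(b"\x81", "latin-1")

# cp1252 leaves 0x81 unmapped, so decoding raises and the helper returns None.
cp1252_result = try_decode(b"\x81", "cp1252")

# A byte that cp1252 does define in that block: 0x80 is the euro sign.
euro = try_decode(b"\x80", "cp1252")
```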
On Friday, November 16, 2012 2:34:32 PM UTC-8, Ian wrote:
On Fri, Nov 16, 2012 at 2:44 PM, buck wrote:
Latin1 has a block of 32 undefined characters.
These characters are not undefined. 0x80-0x9f are the C1 control
codes in Latin-1, much as 0x00-0x1f are the C0 control codes
Buck Golemon buck.gole...@amd.com added the comment:
Let's examine x://
absolute-URI = scheme : hier-part [ ? query ]
hier-part = // authority path-abempty
So this is okay if authority and path-abempty can both be empty strings.
authority = [ userinfo @ ] host [ : port ]
host
Buck Golemon b...@yelp.com added the comment:
Well, I think the real issue is that you can't enumerate the protocols that use
netloc. All protocols are allowed to have a netloc. The smb: protocol
certainly does, but it's not in the list.
The core issue is that smb:/foo and smb:///foo
New submission from Buck Golemon b...@yelp.com:
1) As long as x is valid, I expect that urlunsplit(urlsplit(x)) == x
2) yelp:///foo is a well-formed (albeit odd) url. It is similar to file:///tmp:
it specifies the /foo resource, on the current host, using the yelp protocol
(defined on mobile
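The expectation in point 1 can be written as a small check. This is a sketch using the Python 3 module path `urllib.parse`; the helper name `round_trips` is hypothetical. What comes back for yelp:///foo depends on the Python version, so it is computed here without asserting a particular value.

```python
from urllib.parse import urlsplit, urlunsplit


def round_trips(url):
    """Return True when splitting and re-joining reproduces the input."""
    return urlunsplit(urlsplit(url)) == url


# Schemes that urllib knows to use a netloc round-trip as expected.
http_ok = round_trips("http://example.com/a/b?q=1#frag")
file_ok = round_trips("file:///tmp")

# The case from the report; inspect the value on your own interpreter.
yelp_result = urlunsplit(urlsplit("yelp:///foo"))
```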
I feel like the design of sum() is inconsistent with other language
features of python. Often python doesn't require a specific type, only
that the type implement certain methods.
Given a class that implements __add__ why should sum() not be able to
operate on that class?
We can fix this in a
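For context, here is a sketch of how sum() behaves today with a class that only implements `__add__`. The `Money` class is invented for illustration. The default start value of 0 is what trips it up; passing an explicit start of the right type works.

```python
class Money:
    """Toy value type that supports + only with other Money objects."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self.cents + other.cents)


prices = [Money(150), Money(250)]

# With an explicit start value, sum() only ever calls Money.__add__.
total = sum(prices, Money(0))

# Without one, sum() begins with the int 0, and 0 + Money is unsupported.
try:
    sum(prices)
    default_start_works = True
except TypeError:
    default_start_works = False
```

Adding `__radd__` that accepts 0 is another common workaround, at the cost of special-casing an int inside the class.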
On Feb 23, 1:19 pm, Buck Golemon b...@yelp.com wrote:
I feel like the design of sum() is inconsistent with other language
features of python. Often python doesn't require a specific type, only
that the type implement certain methods.
Given a class that implements __add__ why should sum
On Feb 23, 1:32 pm, Chris Rebert c...@rebertia.com wrote:
On Thu, Feb 23, 2012 at 1:19 PM, Buck Golemon b...@yelp.com wrote:
I feel like the design of sum() is inconsistent with other language
features of python. Often python doesn't require a specific type, only
that the type implement
This is what I came up with:
https://gist.github.com/1496028
We'll see if it helps, tomorrow.
On Sunday, December 18, 2011 6:01:50 PM UTC-8, buck wrote:
Thanks Jack. I think printf is what it will come down to. I plan to put a
little code into PyDict_New to print the id and the line at which
On Saturday, December 17, 2011 11:55:13 PM UTC-8, Paul Rubin wrote:
buck workit...@gmail.com writes:
I tried to pinpoint this intermediate allocation with a similar
PyDict_New/LD_PRELOAD interposer, but that isn't working for me[2].
Did you try a gdb watchpoint?
I didn't try that, since
on and off to pinpoint where
the refcounts are getting messed up. It also causes python to use
plain malloc()s so valgrind becomes useful. Worst case add assertions
and printf()s in the places you think are most janky.
-Jack
On Sat, Dec 17, 2011 at 11:17 PM, buck workit...@gmail.com wrote:
I'm
I'm getting a fatal python error Fatal Python error: GC object already
tracked[1].
Using gdb, I've pinpointed the place where the error is detected. It is an
empty dictionary which is marked as in-use. This is somewhat helpful since I
can reliably find the memory address of the dict, but it
, @kwargs)
For backward compatibility, we could say that the unary * is identical to @list
and unary ** is identical to @dict.
-buck
--
http://mail.python.org/mailman/listinfo/python-list
it to bitbucket and share with the world if you like, almost as
easily.
--Buck
This is what made me choose Mercurial in my recent search.
http://www.python.org/dev/peps/pep-0374/
There is a tremendous amount of detail there. In summary, hg and git are both
very good, and essentially equal in features. The only salient difference is
that hg is implemented in python, so
I've been having issues with getting a file-like object to work with
multiprocessing. Since the details are quite lengthy, I've posted them on
stackoverflow here:
http://stackoverflow.com/questions/5821880/python-multiprocessing-synchronizing-file-like-object
I hope I'm not being super rude by
I'm not not touching you!
Buck Golemon buck.gole...@amd.com added the comment:
@Barry: Yes, it's still a problem.
The ubuntu 10.10 python2.7 still has no multiprocessing.
Since the EOL is April 2012, it needs fixed.
It may be considered an invalid python bug, since it seems to be strictly
related to Ubuntu packaging
Buck Golemon buck.gole...@amd.com added the comment:
python2.7.1+ from mercurial supports sem_open (and multiprocessing) just fine.
doko: Could you help us figure out why the ubuntu 10.10 python2.7 build has
this issue? I believe this issue should be assigned to you?
Relevant lines from
Buck Golemon buck.gole...@amd.com added the comment:
Isn't this an Ubuntu problem if sem_open only works with some specific
kernels?
sem_open works fine (python2.6 is using it), but the python2.7 build process
didn't detect it properly. This is either a bug with Ubuntu's python2.7 build
Changes by Buck Golemon buck.gole...@amd.com:
--
title: Cannot import name SemLock on Ubuntu lucid - Cannot import name SemLock
on Ubuntu
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8326
Buck Golemon buck.gole...@amd.com added the comment:
I suggest that you try to build from the above mercurial repository and see
if the problem persists.
How do I know the configuration options that the Ubuntu packager used?
--
___
Python tracker
Buck Golemon buck.gole...@amd.com added the comment:
On Ubuntu 10.10 (maverick), python2.6 is functioning correctly, but python2.7
is giving this error again.
$ /usr/bin/python2.7
from multiprocessing.synchronize import Semaphore
ImportError: This platform lacks a functioning sem_open
Buck Golemon buck.gole...@amd.com added the comment:
Minimal demo:
$ setenv PYTHONOPTIMIZE 0
$ python3.1 -OO -c print(__debug__)
False
I've used this code to get the desired functionality:
if [[ $TESTING == 1 || ${PYTHONOPTIMIZE-2} =~ '^(0*|)$' ]]; then
#someone is requesting
Buck Golemon buck.gole...@amd.com added the comment:
If I understand this code, it means that PYTHONOPTIMIZE set to 1 or 2 works as
expected, but set to 0, gives a flag value of 1.
static int
add_flag(int flag, const char *envs)
{
int env = atoi(envs);
if (flag < env
Buck Golemon buck.gole...@amd.com added the comment:
that number of times isn't exactly accurate either, since 0 is effectively
interpreted as 1.
This change would only adversely affect people who use no -O option, set
PYTHONOPTIMIZE to '0', and need optimization.
I feel like that falls
Buck Golemon buck.gole...@amd.com added the comment:
The file is here:
http://svn.python.org/view/python/trunk/Python/pythonrun.c?view=markup
The second if statement is doing exactly what I find troubling: set the flag
even if the incoming value is 0.
I guess this is to handle the empty
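To make the behaviour under discussion concrete, here is a rough Python model of it. This is an assumption-laden sketch, not the C source: the first branch follows the add_flag snippet quoted in this thread, the second branch is inferred from the description "set the flag even if the incoming value is 0", and `int()` only approximates C's `atoi`.

```python
def add_flag(flag, envs):
    """Model of how an environment value is folded into an option flag."""
    try:
        env = int(envs)
    except ValueError:
        env = 0              # atoi() yields 0 for an empty or non-numeric string
    if flag < env:
        flag = env           # a larger environment value wins
    if flag < 1:
        flag = 1             # the troubling part: 0 is promoted to 1
    return flag


# PYTHONOPTIMIZE=0 with no -O on the command line still ends up as 1.
from_zero = add_flag(0, "0")
from_two = add_flag(0, "2")
from_empty = add_flag(0, "")
```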
New submission from Buck Golemon buck.gole...@amd.com:
In our environment, we have a wrapper which enables optimization by default
(-OO). Most commandline tools which have a mode-changing flag such as this,
also have a flag to do the opposite ( see: ls -t -U, wget -nv -v, ).
I'd like
On Oct 12, 4:30 pm, Carl Banks pavlovevide...@gmail.com wrote:
On Oct 12, 11:24 am, Buck workithar...@gmail.com wrote:
On Oct 10, 9:44 am, Gabriel Genellina gagsl-...@yahoo.com.ar
wrote:
The good thing is that, if the backend package is properly installed
somewhere in the Python
On Oct 12, 3:34 pm, Gabriel Genellina gagsl-...@yahoo.com.ar
wrote:
En Mon, 12 Oct 2009 15:24:34 -0300, Buck workithar...@gmail.com escribió:
On Oct 10, 9:44 am, Gabriel Genellina gagsl-...@yahoo.com.ar
wrote:
The good thing is that, if the backend package is properly installed
On Oct 13, 9:37 am, Ethan Furman et...@stoneleaf.us wrote:
Buck wrote:
I'd like to get to zero-installation if possible. It's easy with
simple python scripts, why not packages too? I know the technical
reasons, but I haven't heard any practical reasons.
I don't think we mean the same
On Oct 10, 9:44 am, Gabriel Genellina gagsl-...@yahoo.com.ar
wrote:
The good thing is that, if the backend package is properly installed
somewhere in the Python path ... it still works with no modifications.
I'd like to get to zero-installation if possible. It's easy with
simple python
On Oct 5, 11:29 am, Robert Kern robert.k...@gmail.com wrote:
On 2009-10-05 12:42 PM, Buck wrote:
With the package layout, you would just do:
from parrot.sleeping import sleeping_in_a_bed
from parrot.feeding.eating import eat_cracker
This is really much more straightforward
might as well re-write the above
boilerplate code.
I'm overstating my case here for emphasis, but it's essentially true.
--Buck
I use MySQLdb quite a bit in my work. I could volunteer to help update
it. Are there any particular bugs we're talking about or just a
straight port to 3.0?
--Buck
On Jul 31, 6:32 pm, John Nagle na...@animats.com wrote:
Any progress on updating feedparser and MySQLdb for Python 3.x
Buck Golemon [EMAIL PROTECTED] added the comment:
/agree
___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2613
___
Buck Golemon [EMAIL PROTECTED] added the comment:
If there's no difference then they should work the same?
I agree there's probably little value in 'fixing' it.
__
Tracker [EMAIL PROTECTED]
http://bugs.python.org/issue2613
Buck Golemon [EMAIL PROTECTED] added the comment:
I'm not sure what your problem is, but comp.lang.python might be a
better place to ask. It's not clear that this is a bug yet.
http://groups.google.com/group/comp.lang.python/topics
--
nosy: +bgolemon
I've been trying to install Mailman, which requires a newer version
of the Python language compiler (p-code generator?) than the one I
currently have on my linux webserver/gateway box.
It's running a ClarkConnect 2.01 package based on Red Hat 7.2 linux.
I downloaded the zipped tarball
or Model 204?
buck
eventually be a credible product. But right now it has a
wide range of inexcusable problems.
More info at http://sql-info.de/mysql/gotchas.html
buck
-info.de/mysql/gotchas.html
BTW, you should upgrade, they're now on 5.0.3. Their support site
appears to be down right now (timeouts) so I can't check the new bug
list, but since 5.0.2 is beta, it may have introduced more problems
than it solved.
buck