Re: Progress on the Gilectomy

2018-01-03 Thread harindudilshan95
Why not make the garbage collector check the reference count before freeing
objects? Only C extensions would increment the ref count, while Python code
would just use the garbage collector, leaving the ref count at 0. That way even
the existing C extensions would continue to work.


Regarding Java using all the memory, that's not really true. It has a default
heap size, which may exceed the total memory in a particular environment
(Android).
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-23 Thread Ethan Furman

On 06/22/2017 10:26 PM, Rustom Mody wrote:


Lawrence d'Oliveiro was banned on 30th Sept 2016 till end-of-year
https://mail.python.org/pipermail/python-list/2016-September/714725.html

Is there still a ban?


My apologies to Lawrence, I completely forgot.

The ban is now lifted.

--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy (Posting On Python-List Prohibited)

2017-06-23 Thread Marko Rauhamaa
Gregory Ewing :

> Lawrence D’Oliveiro wrote:
>> what WOULD you consider to be so “representative”?
>
> I don't claim any of them to be representative. Different GC
> strategies have different characteristics.

My experiences with Hotspot were a bit disheartening. GC is a winning
concept provided that you don't have to strategize too much. In
practice, it seems tweaking the GC parameters is a frequent necessity.

On the other hand, I believe much of the trouble comes from storing too
much information in the heap. Applications shouldn't have semipersistent
multigigabyte lookup structures kept in RAM, at least not in numerous
small objects.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy (Posting On Python-List Prohibited)

2017-06-23 Thread Gregory Ewing

Lawrence D’Oliveiro wrote:

what WOULD you consider to be so “representative”?


I don't claim any of them to be representative. Different GC
strategies have different characteristics.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread Gregory Ewing

Marko Rauhamaa wrote:

And, BTW, my rule of thumb came from experiences with the Hotspot JRE.


I wouldn't take a Java implementation to be representative of
the behaviour of GC systems in general.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread Rustom Mody
On Thursday, June 22, 2017 at 4:28:03 AM UTC+5:30, Steve D'Aprano wrote:
> On Thu, 22 Jun 2017 08:23 am, breamoreboy wrote:
> 
> > Don't you know that Lawrence D’Oliveiro has been banned from the mailing 
> > list
> > as he hasn't got a clue what he's talking about, 
> 
> That's not why he was given a ban. Being ignorant is not a crime -- if it 
> were,
> a lot more of us would be banned, including all newbies.

Lawrence d'Oliveiro was banned on 30th Sept 2016 till end-of-year
https://mail.python.org/pipermail/python-list/2016-September/714725.html

Is there still a ban?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread Steve D'Aprano
On Fri, 23 Jun 2017 01:07 am, breamore...@gmail.com wrote:

> 11 comments on the thread "Instagram: 40% Py3 to 99% Py3 in 10 months" showing
> that he knows as much about Unicode as LDO knows about garbage collection.


Who cares? Every time he opens his mouth to write absolute rubbish he just makes
a fool of himself. Why do you let it upset you?



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy (Posting On Python-List Prohibited)

2017-06-22 Thread CFK
On Jun 22, 2017 4:03 PM, "Chris Angelico"  wrote:

On Fri, Jun 23, 2017 at 5:22 AM, CFK  wrote:
> On Jun 22, 2017 9:32 AM, "Chris Angelico"  wrote:
>
> On Thu, Jun 22, 2017 at 11:24 PM, CFK  wrote:
>> When
>> I draw memory usage graphs, I see sawtooth waves to the memory usage which
>> suggest that the garbage builds up until the GC kicks in and reaps the
>> garbage.
>
> Interesting. How do you actually measure this memory usage? Often,
> when a GC frees up memory, it's merely made available for subsequent
> allocations, rather than actually given back to the system - all it
> takes is one still-used object on a page and the whole page has to be
> retained.
>
> As such, a "create and drop" usage model would tend to result in
> memory usage going up for a while, but then remaining stable, as all
> allocations are being fulfilled from previously-released memory that's
> still owned by the process.
>
>
> I'm measuring it using a bit of a hack; I use psutil.Popen
> (https://pypi.python.org/pypi/psutil) to open a simulation as a child
> process, and in a tight loop gather the size of the resident set and the
> number of virtual pages currently in use of the child. The sawtooths are
> about 10% (and decreasing) of the size of the overall memory usage, and are
> probably due to different stages of the simulation doing different things.
> That is an educated guess though, I don't have strong evidence to back it
> up.
>
> And, yes, what you describe is pretty close to what I'm seeing. The longer
> the simulation has been running, the smoother the memory usage gets.

Ah, I think I understand. So the code would be something like this:

Phase one:
Create a bunch of objects
Do a bunch of simulation
Destroy a bunch of objects
Simulate more
Destroy all the objects used in this phase, other than the result

Phase two:
Like phase one

In that case, yes, it's entirely possible that the end of a phase
could signal a complete cleanup of intermediate state, with the
consequent release of memory to the system. (Or, more likely, a
near-complete cleanup, with release of MOST of memory.)

Very cool bit of analysis you've done there.


Thank you! And, yes, that is essentially what is going on (or was in that
version of the simulator; I'm in the middle of a big refactor to speed
things up and expect the memory usage patterns to change)

Thanks,
Cem Karan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy (Posting On Python-List Prohibited)

2017-06-22 Thread Chris Angelico
On Fri, Jun 23, 2017 at 5:22 AM, CFK  wrote:
> On Jun 22, 2017 9:32 AM, "Chris Angelico"  wrote:
>
> On Thu, Jun 22, 2017 at 11:24 PM, CFK  wrote:
>> When
>> I draw memory usage graphs, I see sawtooth waves to the memory usage which
>> suggest that the garbage builds up until the GC kicks in and reaps the
>> garbage.
>
> Interesting. How do you actually measure this memory usage? Often,
> when a GC frees up memory, it's merely made available for subsequent
> allocations, rather than actually given back to the system - all it
> takes is one still-used object on a page and the whole page has to be
> retained.
>
> As such, a "create and drop" usage model would tend to result in
> memory usage going up for a while, but then remaining stable, as all
> allocations are being fulfilled from previously-released memory that's
> still owned by the process.
>
>
> I'm measuring it using a bit of a hack; I use psutil.Popen
> (https://pypi.python.org/pypi/psutil) to open a simulation as a child
> process, and in a tight loop gather the size of the resident set and the
> number of virtual pages currently in use of the child. The sawtooths are
> about 10% (and decreasing) of the size of the overall memory usage, and are
> probably due to different stages of the simulation doing different things.
> That is an educated guess though, I don't have strong evidence to back it
> up.
>
> And, yes, what you describe is pretty close to what I'm seeing. The longer
> the simulation has been running, the smoother the memory usage gets.

Ah, I think I understand. So the code would be something like this:

Phase one:
Create a bunch of objects
Do a bunch of simulation
Destroy a bunch of objects
Simulate more
Destroy all the objects used in this phase, other than the result

Phase two:
Like phase one

In that case, yes, it's entirely possible that the end of a phase
could signal a complete cleanup of intermediate state, with the
consequent release of memory to the system. (Or, more likely, a
near-complete cleanup, with release of MOST of memory.)
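
A minimal runnable sketch of that phase pattern in Python (the names are
purely illustrative, and it assumes the intermediate objects are referenced
only from inside the phase):

    import gc

    def run_phase(inputs):
        # Scratch state that exists only for the duration of the phase.
        intermediates = [object() for _ in range(1000000)]
        result = len(intermediates) + len(inputs)   # keep only the result
        return result    # intermediates become unreachable here

    r1 = run_phase([])
    gc.collect()         # reap any leftover cycles before the next phase
    r2 = run_phase([r1])

When each call returns, most of the memory used for the intermediates can be
released back to the allocator, matching the near-complete cleanup described
above.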

Very cool bit of analysis you've done there.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread Ned Batchelder
On Thursday, June 22, 2017 at 11:07:36 AM UTC-4, bream...@gmail.com wrote:
> On Wednesday, June 21, 2017 at 11:58:03 PM UTC+1, Steve D'Aprano wrote:
> > On Thu, 22 Jun 2017 08:23 am, breamoreboy wrote:
> > 
> > > Don't you know that Lawrence D’Oliveiro has been banned from the mailing 
> > > list
> > > as he hasn't got a clue what he's talking about, 
> > 
> > That's not why he was given a ban. Being ignorant is not a crime -- if it 
> > were,
> > a lot more of us would be banned, including all newbies.
> > 
> > > just like the RUE? 
> > 
> > What is your obsession with wxjmfauth? You repeatedly mention him in 
> > unrelated
> > discussions.
> > 
> 
> 11 comments on the thread "Instagram: 40% Py3 to 99% Py3 in 10 months" 
> showing that he knows as much about Unicode as LDO knows about garbage 
> collection.

You've been asked to stop making personal attacks before. Please stop.

--Ned.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy (Posting On Python-List Prohibited)

2017-06-22 Thread CFK
On Jun 22, 2017 9:32 AM, "Chris Angelico"  wrote:

On Thu, Jun 22, 2017 at 11:24 PM, CFK  wrote:
> When
> I draw memory usage graphs, I see sawtooth waves to the memory usage which
> suggest that the garbage builds up until the GC kicks in and reaps the
> garbage.

Interesting. How do you actually measure this memory usage? Often,
when a GC frees up memory, it's merely made available for subsequent
allocations, rather than actually given back to the system - all it
takes is one still-used object on a page and the whole page has to be
retained.

As such, a "create and drop" usage model would tend to result in
memory usage going up for a while, but then remaining stable, as all
allocations are being fulfilled from previously-released memory that's
still owned by the process.


I'm measuring it using a bit of a hack; I use psutil.Popen (
https://pypi.python.org/pypi/psutil) to open a simulation as a child
process, and in a tight loop gather the size of the resident set and the
number of virtual pages currently in use of the child. The sawtooths are
about 10% (and decreasing) of the size of the overall memory usage, and are
probably due to different stages of the simulation doing different things.
That is an educated guess though, I don't have strong evidence to back it
up.
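
A rough sketch of that measurement loop (psutil's Popen and memory_info are
real APIs; the child command line here is just a placeholder):

    import time
    import psutil

    # Launch the simulation as a child process we can inspect.
    child = psutil.Popen(["python", "run_simulation.py"])

    samples = []
    while child.poll() is None:
        mem = child.memory_info()
        # rss = resident set size, vms = size of the virtual address space
        samples.append((time.time(), mem.rss, mem.vms))
        time.sleep(0.1)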

And, yes, what you describe is pretty close to what I'm seeing. The longer
the simulation has been running, the smoother the memory usage gets.

Thanks,
Cem Karan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread Chris Angelico
On Fri, Jun 23, 2017 at 1:48 AM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> not "aim for 400MB because the garbage collector is only 10%
>> efficient". Get yourself a better garbage collector. Employ Veolia or
>> something.
>
> It's about giving GC room (space- and timewise) to operate. Also, you
> don't want your memory consumption to hit the RAM ceiling even for a
> moment.

Again, if you'd said to *leave 10% room*, I would be inclined to
believe you (eg to use no more than 3.5ish gig when you have four
available), but not to leave 90% room.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread Marko Rauhamaa
Marko Rauhamaa :

> Chris Angelico :
>
>> not "aim for 400MB because the garbage collector is only 10%
>> efficient". Get yourself a better garbage collector. Employ Veolia or
>> something.
>
> It's about giving GC room (space- and timewise) to operate. Also, you
> don't want your memory consumption to hit the RAM ceiling even for a
> moment.

And, BTW, my rule of thumb came from experiences with the Hotspot JRE.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread Marko Rauhamaa
Chris Angelico :

> not "aim for 400MB because the garbage collector is only 10%
> efficient". Get yourself a better garbage collector. Employ Veolia or
> something.

It's about giving GC room (space- and timewise) to operate. Also, you
don't want your memory consumption to hit the RAM ceiling even for a
moment.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread Chris Angelico
On Thu, Jun 22, 2017 at 11:27 PM, Marko Rauhamaa  wrote:
> CFK :
>
>> Yes, and this is why I suspect CPython would work well too.  My usage
>> pattern may be similar to Python usage patterns. The only way to know for
>> sure is to try it and see what happens.
>
> I have a rule of thumb that your application should not need more than
> 10% of the available RAM. If your server has 4 GB of RAM, your
> application should only need 400 MB. The 90% buffer should be left for
> the GC to maneuver.

*BOGGLE*

I could see a justification in saying "aim for 400MB, because then
unexpected spikes won't kill you", or "aim for 400MB to ensure that
you can run multiple instances of the app for load balancing", or "aim
for 400MB because you don't want to crowd out the database and the
disk cache", but not "aim for 400MB because the garbage collector is
only 10% efficient". Get yourself a better garbage collector. Employ
Veolia or something.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread Marko Rauhamaa
CFK :

> Yes, and this is why I suspect CPython would work well too.  My usage
> pattern may be similar to Python usage patterns. The only way to know for
> sure is to try it and see what happens.

I have a rule of thumb that your application should not need more than
10% of the available RAM. If your server has 4 GB of RAM, your
application should only need 400 MB. The 90% buffer should be left for
the GC to maneuver.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy (Posting On Python-List Prohibited)

2017-06-22 Thread Chris Angelico
On Thu, Jun 22, 2017 at 11:24 PM, CFK  wrote:
> When
> I draw memory usage graphs, I see sawtooth waves to the memory usage which
> suggest that the garbage builds up until the GC kicks in and reaps the
> garbage.

Interesting. How do you actually measure this memory usage? Often,
when a GC frees up memory, it's merely made available for subsequent
allocations, rather than actually given back to the system - all it
takes is one still-used object on a page and the whole page has to be
retained.

As such, a "create and drop" usage model would tend to result in
memory usage going up for a while, but then remaining stable, as all
allocations are being fulfilled from previously-released memory that's
still owned by the process.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy (Posting On Python-List Prohibited)

2017-06-22 Thread CFK
On Jun 22, 2017 12:38 AM, "Paul Rubin"  wrote:

Lawrence D’Oliveiro  writes:
> while “memory footprint” depends on how much memory is actually being
> retained in accessible objects.

If the object won't be re-accessed but is still retained by gc, then
refcounting won't free it either.

> Once again: The trouble with GC is, it doesn’t know when to kick in:
> it just keeps on allocating memory until it runs out.

When was the last time you encountered a problem like that in practice?
It's almost never an issue.  "Runs out" means reached an allocation
threshold that's usually much smaller than the program's memory region.
And as you say, you can always manually trigger a gc if the need arises.


I'm with Paul and Steve on this. I've had to do a **lot** of profiling on
my simulator to get it to run at a reasonable speed. Memory usage seems to
follow an exponential decay curve, hitting a strict maximum that strongly
correlates with the number of live objects in a given simulation run. When
I draw memory usage graphs, I see sawtooth waves to the memory usage which
suggest that the garbage builds up until the GC kicks in and reaps the
garbage.  In short, only an exceptionally poorly written GC would exhaust
memory before reaping garbage.

Thanks,
Cem Karan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-22 Thread CFK
On Jun 21, 2017 1:38 AM, "Paul Rubin"  wrote:

Cem Karan  writes:
> I'm not too sure how much of performance impact that will have.  My
> code generates a very large number of tiny, short-lived objects at a
> fairly high rate of speed throughout its lifetime.  At least in the
> last iteration of the code, garbage collection consumed less than 1%
> of the total runtime.  Maybe this is something that needs to be done
> and profiled to see how well it works?

If the gc uses that little runtime and your app isn't suffering from the
added memory fragmentation, then it sounds like you're doing fine.


Yes, and this is why I suspect CPython would work well too.  My usage
pattern may be similar to Python usage patterns. The only way to know for
sure is to try it and see what happens.

> I **still** can't figure out how they managed to do it,

How it works (i.e. what the implementation does) is quite simple and
understandable.  The amazing thing is that it doesn't leak memory
catastrophically.


I'll have to read through the code then, just to see what they are doing.

Thanks,
Cem Karan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy (Posting On Python-List Prohibited)

2017-06-21 Thread Paul Rubin
Lawrence D’Oliveiro  writes:
> while “memory footprint” depends on how much memory is actually being
> retained in accessible objects.

If the object won't be re-accessed but is still retained by gc, then
refcounting won't free it either.

> Once again: The trouble with GC is, it doesn’t know when to kick in:
> it just keeps on allocating memory until it runs out.

When was the last time you encountered a problem like that in practice?
It's almost never an issue.  "Runs out" means reached an allocation
threshold that's usually much smaller than the program's memory region.
And as you say, you can always manually trigger a gc if the need arises.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy (Posting On Python-List Prohibited)

2017-06-21 Thread Steve D'Aprano
On Thu, 22 Jun 2017 10:30 am, Lawrence D’Oliveiro wrote:

> Once again: The trouble with GC is, it doesn’t know when to kick in: it just
> keeps on allocating memory until it runs out.

Once again: no it doesn't.


Are you aware that CPython has a GC? (Or rather, a *second* GC, apart from the
reference counter.) It runs periodically to reclaim dead objects in cycles that
the reference counter won't free. It runs whenever the number of allocations
minus the number of deallocations exceeds certain thresholds, and you can set
and query the thresholds using:

gc.set_threshold

gc.get_threshold


CPython alone disproves your assertion that GCs "keep on allocating memory until
it runs out". Are you aware that there are more than one garbage collection
algorithm? Apart from reference-counting GC, there are also "mark and sweep"
GCs, generational GCs (like CPython's), real-time algorithms, and more.

One real-time algorithm implicitly divides memory into two halves. When one half
is half-full, it moves all the live objects into the other half, freeing up the
first half.
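
A toy Python illustration of that two-space copying idea (not a real
collector; the class and field names are made up for the example):

    class Obj:
        def __init__(self, *refs):
            self.refs = list(refs)   # references to other Obj instances
            self.forward = None      # set once the object has been evacuated

    def collect(roots):
        to_space = []                # the "other half" live objects move into

        def copy(obj):
            if obj.forward is None:
                clone = Obj(*obj.refs)   # refs are fixed up during the scan
                obj.forward = clone
                to_space.append(clone)
            return obj.forward

        new_roots = [copy(r) for r in roots]
        scan = 0
        while scan < len(to_space):      # Cheney-style breadth-first scan
            obj = to_space[scan]
            obj.refs = [copy(r) for r in obj.refs]
            scan += 1
        return new_roots, to_space       # anything not copied is garbage

    c = Obj()
    a = Obj(c)
    b = Obj()                    # unreachable from the root a
    roots, live = collect([a])
    print(len(live))             # 2 -- only a and c were copied; b is garbage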

The Mercury programming language even has a *compile time* garbage collector
that can determine when an object can be freed during compilation -- no sweeps
or reference counting required.

It may be that *some* (possibly toy) GC algorithms behave as you say, only
running when memory is completely full. But your belief that *all* GC
algorithms behave this way is simply wrong.



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-21 Thread Steve D'Aprano
On Thu, 22 Jun 2017 08:23 am, breamore...@gmail.com wrote:

> Don't you know that Lawrence D’Oliveiro has been banned from the mailing list
> as he hasn't got a clue what he's talking about, 

That's not why he was given a ban. Being ignorant is not a crime -- if it were,
a lot more of us would be banned, including all newbies.


> just like the RUE? 

What is your obsession with wxjmfauth? You repeatedly mention him in unrelated
discussions.


-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-21 Thread Paul Rubin
Lawrence D’Oliveiro  writes:
> The trouble with GC is, it doesn’t know when to kick in: it just keeps
> on allocating memory until it runs out.

That's not how GC works, geez.  Typically it would run after every N
bytes of memory allocated, for N chosen to balance memory footprint
with cpu overhead.  
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-20 Thread Marko Rauhamaa
Paul Rubin :

> How it works (i.e. what the implementation does) is quite simple and
> understandable. The amazing thing is that it doesn't leak memory
> catastrophically.

If I understand it correctly, the 32-bit Go language runtime
implementation suffered "catastrophically" at one point. The reason was
that modern programs can actually use 2GB of RAM. That being the case,
there is a 50% chance for any random 4-byte combination to look like a
valid pointer into the heap.
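
The arithmetic behind that 50% figure, roughly:

    # On a 32-bit machine the address space is 2**32 bytes; a 2 GB heap
    # covers half of it, so a uniformly random 4-byte value has roughly a
    # 50% chance of landing inside the heap.
    address_space = 2 ** 32
    heap_size = 2 * 2 ** 30
    print(heap_size / address_space)   # 0.5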


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-20 Thread Paul Rubin
Cem Karan  writes:
> I'm not too sure how much of performance impact that will have.  My
> code generates a very large number of tiny, short-lived objects at a
> fairly high rate of speed throughout its lifetime.  At least in the
> last iteration of the code, garbage collection consumed less than 1%
> of the total runtime.  Maybe this is something that needs to be done
> and profiled to see how well it works?

If the gc uses that little runtime and your app isn't suffering from the
added memory fragmentation, then it sounds like you're doing fine.

> I **still** can't figure out how they managed to do it,

How it works (i.e. what the implementation does) is quite simple and
understandable.  The amazing thing is that it doesn't leak memory
catastrophically.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-20 Thread Cem Karan
On Jun 20, 2017, at 1:19 AM, Paul Rubin  wrote:

> Cem Karan  writes:
>> Can you give examples of how it's not reliable?
> 
> Basically there's a chance of it leaking memory by mistaking a data word
> for a pointer.  This is unlikely to happen by accident and usually
> inconsequential if it does happen, but maybe there could be malicious
> data that makes it happen

Got it, thank you.  My processes will run for 1-2 weeks at a time, so I can 
handle minor memory leaks over that time without too much trouble.

> Also, it's a non-compacting gc that has to touch all the garbage as it
> sweeps, not a reliability issue per se, but not great for performance
> especially in large, long-running systems.

I'm not too sure how much of performance impact that will have.  My code 
generates a very large number of tiny, short-lived objects at a fairly high 
rate of speed throughout its lifetime.  At least in the last iteration of the 
code, garbage collection consumed less than 1% of the total runtime.  Maybe 
this is something that needs to be done and profiled to see how well it works?

> It's brilliant though.  It's one of those things that seemingly can't
> possibly work, but it turns out to be quite effective.

Agreed!  I **still** can't figure out how they managed to do it, it really does 
look like it shouldn't work at all!

Thanks,
Cem Karan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-20 Thread Marko Rauhamaa
Paul Rubin :

> The simplest way to start experimenting with GC in Python might be to
> redefine the refcount macros to do nothing, connect the allocator to
> the Boehm GC, and stop all the threads when GC time comes. I don't
> know if Guile has threads at all, but I know it uses the Boehm GC and
> it's quite effective.

Guile requires careful programming practices in the C extension code:

   https://www.gnu.org/software/guile/manual/html_node/Foreign-Object-Memory-Management.html#Foreign-Object-Memory-Management


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Paul Rubin
Cem Karan  writes:
> Can you give examples of how it's not reliable?

Basically there's a chance of it leaking memory by mistaking a data word
for a pointer.  This is unlikely to happen by accident and usually
inconsequential if it does happen, but maybe there could be malicious
data that makes it happen

Also, it's a non-compacting gc that has to touch all the garbage as it
sweeps, not a reliability issue per se, but not great for performance
especially in large, long-running systems.

It's brilliant though.  It's one of those things that seemingly can't
possibly work, but it turns out to be quite effective.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Paul Rubin
Chris Angelico  writes:
> Or let's look at it a different way. Instead of using a PyObject* in C
> code, you could write C++ code that uses a trivial wrapper class that
> holds the pointer, increments its refcount on construction, and
> decrements that refcount on destruction.

That's the C++ STL shared_ptr template.  Unfortunately it has the same
problem as Python refcounts, i.e. it has to use locks to maintain thread
safety, which slows it down significantly.

The simplest way to start experimenting with GC in Python might be to
redefine the refcount macros to do nothing, connect the allocator to the
Boehm GC, and stop all the threads when GC time comes.  I don't know if
Guile has threads at all, but I know it uses the Boehm GC and it's quite
effective.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Chris Angelico
On Tue, Jun 20, 2017 at 1:52 PM, Rustom Mody  wrote:
> Saw this this morning
> https://medium.com/@alexdixon/functional-programming-in-javascript-is-an-antipattern-58526819f21e
>
> May seem irrelevant to this, but if JS and FP are replaced by Python and GC,
> it becomes more topical

https://rhettinger.wordpress.com/2011/05/26/super-considered-super/

If super() is replaced with GC, it also becomes on-topic.

I'm sure all this has some deep existential meaning about how easily
blog posts can be transplanted into utterly unrelated conversations,
but at the moment, it eludes me.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Rustom Mody
On Tuesday, June 20, 2017 at 5:53:00 AM UTC+5:30, Cem Karan wrote:
> On Jun 19, 2017, at 6:19 PM, Gregory Ewing wrote:
> 
> > Ethan Furman wrote:
> >> Let me ask a different question:  How much effort is required at the C 
> >> level when using tracing garbage collection?
> > 
> > That depends on the details of the GC implementation, but often
> > you end up swapping one form of boilerplate (maintaining ref
> > counts) for another (such as making sure the GC system knows
> > about all the temporary references you're using).
> > 
> > Some, such as the Boehm collector, try to figure it all out
> > automagically, but they rely on non-portable tricks and aren't
> > totally reliable.
> 
> Can you give examples of how it's not reliable?  I'm currently using it in 
> one of my projects, so if it has problems, I need to know about them.

Saw this this morning
https://medium.com/@alexdixon/functional-programming-in-javascript-is-an-antipattern-58526819f21e

May seem irrelevant to this, but if JS and FP are replaced by Python and GC, it
becomes more topical
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Cem Karan

On Jun 19, 2017, at 6:19 PM, Gregory Ewing  wrote:

> Ethan Furman wrote:
>> Let me ask a different question:  How much effort is required at the C level 
>> when using tracing garbage collection?
> 
> That depends on the details of the GC implementation, but often
> you end up swapping one form of boilerplate (maintaining ref
> counts) for another (such as making sure the GC system knows
> about all the temporary references you're using).
> 
> Some, such as the Boehm collector, try to figure it all out
> automagically, but they rely on non-portable tricks and aren't
> totally reliable.

Can you give examples of how it's not reliable?  I'm currently using it in one 
of my projects, so if it has problems, I need to know about them.

On the main topic: I think that a good tracing garbage collector would probably 
be a good idea.  I've been having a real headache binding Python to my C 
library via ctypes, and a large part of that problem is that I've got two 
different garbage collectors (Python and bdwgc).  I think I've got it worked 
out at this point, but it would have been convenient to get memory allocated 
from Python's garbage-collected heap on the C side.  A lot fewer headaches.

Thanks,
Cem Karan
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Gregory Ewing

Ethan Furman wrote:
Let me ask a different question:  How much effort is required at the C 
level when using tracing garbage collection?


That depends on the details of the GC implementation, but often
you end up swapping one form of boilerplate (maintaining ref
counts) for another (such as making sure the GC system knows
about all the temporary references you're using).

Some, such as the Boehm collector, try to figure it all out
automagically, but they rely on non-portable tricks and aren't
totally reliable.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Chris Angelico
On Tue, Jun 20, 2017 at 1:44 AM, Skip Montanaro
 wrote:
> On Mon, Jun 19, 2017 at 10:20 AM, Ethan Furman  wrote:
>
>> Programming at the C level is not working in Python, and many Python
>> niceties simply don't exist there.
>
>
> True, but a lot of functionality available to Python programmers exists at
> the extension module level, whether delivered as part of the core
> distribution or from third-party sources. (The core CPython test suite
> spends a fair amount of effort on leak detection, one side effect of
> incorrect reference counting.) While programming in Python you don't need
> to worry about reference counting errors; when they slip through from the C
> level, they affect you.

High level languages mean that you don't have to write C code. Does
the presence of core code and/or extension modules written in C mean
that Python isn't a high level language? No. And nor does that code
mean Python isn't garbage-collected. Everything has to have an
implementation somewhere.

Or let's look at it a different way. Instead of using a PyObject* in C
code, you could write C++ code that uses a trivial wrapper class that
holds the pointer, increments its refcount on construction, and
decrements that refcount on destruction. That way, you can simply
declare these PyObjectWrappers and let them expire. Does that mean
that suddenly the refcounting isn't your responsibility, ergo it's now
a garbage collector? Because the transformation is trivially easy.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Ethan Furman

On 06/19/2017 08:44 AM, Skip Montanaro wrote:

On Mon, Jun 19, 2017 at 10:20 AM, Ethan Furman wrote:



Programming at the C level is not working in Python, and many Python niceties 
simply don't exist there.


True, but a lot of functionality available to Python programmers exists at the 
extension module level, whether delivered
as part of the core distribution or from third-party sources. (The core CPython 
test suite spends a fair amount of
effort on leak detection, one side effect of incorrect reference counting.) 
While programming in Python you don't need
to worry about reference counting errors; when they slip through from the C 
level, they affect you.


Let me ask a different question:  How much effort is required at the C level 
when using tracing garbage collection?

--
~Ethan~

--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Skip Montanaro
On Mon, Jun 19, 2017 at 10:20 AM, Ethan Furman  wrote:

> Programming at the C level is not working in Python, and many Python
> niceties simply don't exist there.


True, but a lot of functionality available to Python programmers exists at
the extension module level, whether delivered as part of the core
distribution or from third-party sources. (The core CPython test suite
spends a fair amount of effort on leak detection, one side effect of
incorrect reference counting.) While programming in Python you don't need
to worry about reference counting errors, when they slip through from the C
level, they affect you.

Skip
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Ethan Furman

On 06/19/2017 08:06 AM, Skip Montanaro wrote:

On Mon, Jun 19, 2017 at 9:20 AM, Ethan Furman wrote:



Reference counting is a valid garbage collecting mechanism, therefore Python is 
also a GC language.


Garbage collection is usually thought of as a way to remove responsibility for 
tracking of live data from the user.
Reference counting doesn't do that.


Caveat:  I'm not a CS major.

Question: In the same way that Object Orientation is usually thought of as data 
hiding?

Comment:  Except in rare cases (e.g. messing with __del__), the Python user does not have to think about nor manage live 
data, so reference counting seems to meet that requirement.  Programming at the C level is not working in Python, and 
many Python niceties simply don't exist there.
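
A small illustration of that point (assumes CPython; the class is made up for
the example):

    import gc

    class Noisy:
        def __del__(self):
            print("finalized")

    a = Noisy()
    del a           # last reference gone -> finalized immediately by refcounting

    b = Noisy()
    b.cycle = b     # a reference cycle keeps the refcount above zero
    del b           # nothing printed yet...
    gc.collect()    # ...until the cyclic collector runs and frees the cycle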


--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Skip Montanaro
On Mon, Jun 19, 2017 at 9:20 AM, Ethan Furman  wrote:

> Reference counting is a valid garbage collecting mechanism, therefore
> Python is also a GC language.


Garbage collection is usually thought of as a way to remove responsibility
for tracking of live data from the user. Reference counting doesn't do that.

Skip
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Rustom Mody
On Monday, June 19, 2017 at 7:40:49 PM UTC+5:30, Robin Becker wrote:
> On 19/06/2017 01:20, Paul Rubin wrote:
> ...
> > the existing C API quite seriously.  Reworking the C modules in the
> > stdlib would be a large but not impossible undertaking.  The many
> > external C modules out there would be more of an issue.
> > 
> I have always found the management of reference counts to be one of the hardest
> things about the C API.  I'm not sure exactly how C extensions would/should
> interact with a GC Python. There seem to be different approaches, e.g. Lua and Go
> are both GC languages but seem different in how C/GC memory should interact.

Worth reading for chances Python missed:

https://stackoverflow.com/questions/588958/what-are-the-drawbacks-of-stackless-python

To be fair also this:
https://stackoverflow.com/questions/377254/stackless-python-and-multicores
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Ethan Furman

On 06/19/2017 07:10 AM, Robin Becker wrote:


I have always found the management of reference counts to be one of the hardest
things about the C API.  I'm not sure exactly how C extensions would/should
interact with a GC Python. There seem to be different approaches, e.g. Lua and Go
are both GC languages but seem different in how C/GC memory should interact.


The conversation would be easier if the proper terms were used.  Reference counting is a valid garbage collecting 
mechanism, therefore Python is also a GC language.


--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-19 Thread Robin Becker

On 19/06/2017 01:20, Paul Rubin wrote:
...

the existing C API quite seriously.  Reworking the C modules in the
stdlib would be a large but not impossible undertaking.  The many
external C modules out there would be more of an issue.

I have always found the management of reference counts to be one of the hardest 
things about the C API.  I'm not sure exactly how C extensions would/should 
interact with a GC Python. There seem to be different approaches, e.g. Lua and Go 
are both GC languages but seem different in how C/GC memory should interact.

--
Robin Becker

--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-18 Thread Paul Rubin
I always thought the GIL removal obstacle was the need to put locks
around every refcount adjustment, and the only real cure for that is to
use a tracing GC.  That is a good idea in many ways, but it would break
the existing C API quite seriously.  Reworking the C modules in the
stdlib would be a large but not impossible undertaking.  The many
external C modules out there would be more of an issue.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-13 Thread Skip Montanaro
On Tue, Jun 13, 2017 at 1:53 PM, Terry Reedy  wrote:

> This was tried at least once, perhaps 15 years ago.


Yes, I believe Greg Stein implemented a proof-of-concept in about the
Python 1.4 timeframe. The observation at the time was that it slowed down
single-threaded programs too much to be accepted as it existed then. That
remains the primary bugaboo as I understand it. It seems Larry has pushed
the envelope a fair bit farther, but there are still problems.

I don't know if the Gilectomy code changes are too great to live alongside the
mainline branches, but I wonder if having a bleeding-edge-gilectomy branch
in Git (maintained alongside the regular stuff, but not formally released)
would

a) help it stay in sync better with CPython
b) expose the changes to more people, especially extension module authors

Combined, the two might make it so the GIL-free branch isn't always playing
catchup (because of 'a') and more extension modules get tweaked to work
properly in a GIL-free world (because of 'b'). I imagine Larry Hastings has
given the idea some consideration.

Skip
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-13 Thread Terry Reedy

On 6/13/2017 12:09 PM, Robin Becker wrote:

On 11/06/2017 07:27, Steve D'Aprano wrote:



I'm tired of people complaining about the GIL as a "mistake" without
acknowledging that it exists for a reason.

I thought we were also consenting adults about problems arising from bad 
extensions. The GIL is a blocker for CPython's ability to use multi-core 
CPUs.


When using threads, not when using multiple processes.

> The contention issues all arise from reference counting. Newer
> languages like Go seem to prefer the garbage collection approach.
> Perhaps someone should try a reference-countectomy,

This was tried at least once, perhaps 15 years ago.

--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-13 Thread Skip Montanaro
On Tue, Jun 13, 2017 at 11:09 AM, Robin Becker  wrote:

> I looked at Larry's talk with interest. The GIL is not a requirement as he
> pointed out at the end, both IronPython and Jython don't need it.


But they don't support CPython's extension module API either, I don't
think. (I imagine that might have been the point of your reference.)

Skip
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-13 Thread Robin Becker

On 11/06/2017 07:27, Steve D'Aprano wrote:



I'm tired of people complaining about the GIL as a "mistake" without
acknowledging that it exists for a reason.



I thought we were also consenting adults about problems arising from bad 
extensions. The GIL is a blocker for CPython's ability to use multi-core CPUs.


I looked at Larry's talk with interest. The GIL is not a requirement, as he 
pointed out at the end; both IronPython and Jython don't need it.


That said I think the approach he outlined is probably wrong unless we attach a 
very high weight to preserving the current extension interface. C extensions are 
a real nuisance.


The contention issues all arise from reference counting. Newer languages like Go 
seem to prefer the garbage collection approach. Perhaps someone should try a 
reference-countectomy, but then they already have with other Python implementations.

--
Robin Becker

--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-11 Thread Steve D'Aprano
On Sun, 11 Jun 2017 04:21 pm, Stefan Behnel wrote:

> Serhiy Storchaka schrieb am 11.06.2017 um 07:11:
 
>> And also the GIL is used for guaranteeing atomicity of many operations and
>> consistency of internal structures without using additional locks. Many
>> parts of the core and the stdlib would just not work correctly in a
>> multithreaded environment without the GIL.
> 
> And the same applies to external extension modules. The GIL is really handy
> when it comes to reasoning about safety and correctness of algorithms under
> the threat of thread concurrency. Especially in native code, where the
> result of an unanticipated race condition is usually a crash rather than an
> exception.


Thank you Stefan and Serhiy!

I'm tired of people complaining about the GIL as a "mistake" without
acknowledging that it exists for a reason.



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-11 Thread Stefan Behnel
Serhiy Storchaka schrieb am 11.06.2017 um 07:11:
> 10.06.17 15:54, Steve D'Aprano пише:
>> Larry Hastings is working on removing the GIL from CPython:
>>
>> https://lwn.net/Articles/723949/
>>
>> For those who don't know the background:
>>
>> - The GIL (Global Interpreter Lock) is used to ensure that only one piece of
>> code can update references to an object at a time.
>>
>> - The downside of the GIL is that CPython cannot take advantage of
>> multiple CPU
>> cores effectively. Hence multi-threaded code is not as fast as it could be.
>>
>> - Past attempts to remove the GIL caused unacceptable slow-downs for
>> single-threaded programs and code run on single-core CPUs.
>>
>> - And also failed to show the expected performance gains for multi-threaded
>> programs on multi-core CPUs. (There was some gain, but not much.)
>>
>>
>> Thanks Larry for your experiments on this!
> 
> And also the GIL is used for guaranteeing atomicity of many operations and
> consistency of internal structures without using additional locks. Many
> parts of the core and the stdlib would just not work correctly in a
> multithreaded environment without the GIL.

And the same applies to external extension modules. The GIL is really handy
when it comes to reasoning about safety and correctness of algorithms under
the threat of thread concurrency. Especially in native code, where the
result of an unanticipated race condition is usually a crash rather than an
exception.

Stefan

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-10 Thread Serhiy Storchaka

10.06.17 15:54, Steve D'Aprano пише:

Larry Hastings is working on removing the GIL from CPython:

https://lwn.net/Articles/723949/


For those who don't know the background:

- The GIL (Global Interpreter Lock) is used to ensure that only one piece of
code can update references to an object at a time.

- The downside of the GIL is that CPython cannot take advantage of multiple CPU
cores effectively. Hence multi-threaded code is not as fast as it could be.

- Past attempts to remove the GIL caused unacceptable slow-downs for
single-threaded programs and code run on single-core CPUs.

- And also failed to show the expected performance gains for multi-threaded
programs on multi-core CPUs. (There was some gain, but not much.)


Thanks Larry for your experiments on this!


And also the GIL is used for guaranteeing atomicity of many operations and 
consistency of internal structures without using additional locks. 
Many parts of the core and the stdlib would just not work correctly in a 
multithreaded environment without the GIL.
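
A small Python-level illustration of that atomicity point (a sketch, not an
exhaustive statement of what the GIL guarantees):

    import threading

    items = []
    counter = 0
    lock = threading.Lock()

    def work():
        global counter
        for _ in range(100000):
            items.append(1)     # a single built-in call: safe under the GIL
            with lock:          # "counter += 1" is a read, an add and a write,
                counter += 1    # so it still needs an explicit lock

    threads = [threading.Thread(target=work) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(len(items), counter)  # 400000 400000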


--
https://mail.python.org/mailman/listinfo/python-list


Re: Progress on the Gilectomy

2017-06-10 Thread Irmen de Jong
On 10-6-2017 14:54, Steve D'Aprano wrote:
> Larry Hastings is working on removing the GIL from CPython:
> 
> https://lwn.net/Articles/723949/


Here is Larry's "How's it going" presentation from Pycon 2017 on this subject
https://www.youtube.com/watch?v=pLqv11ScGsQ

-irmen
-- 
https://mail.python.org/mailman/listinfo/python-list


Progress on the Gilectomy

2017-06-10 Thread Steve D'Aprano
Larry Hastings is working on removing the GIL from CPython:

https://lwn.net/Articles/723949/


For those who don't know the background:

- The GIL (Global Interpreter Lock) is used to ensure that only one piece of
code can update references to an object at a time.

- The downside of the GIL is that CPython cannot take advantage of multiple CPU
cores effectively. Hence multi-threaded code is not as fast as it could be.

- Past attempts to remove the GIL caused unacceptable slow-downs for
single-threaded programs and code run on single-core CPUs.

- And also failed to show the expected performance gains for multi-threaded
programs on multi-core CPUs. (There was some gain, but not much.)


Thanks Larry for your experiments on this!





-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

-- 
https://mail.python.org/mailman/listinfo/python-list