Re: Garbage collection (was Re: JWZ on s/Java/Perl/)
At 11:36 PM 2/11/2001 -0500, Sam Tregar wrote:
>On Sun, 11 Feb 2001, Jan Dubois wrote:
>
> > However, I couldn't solve the problem of "deterministic destruction
> > behavior": Currently Perl will call DESTROY on any object as soon as the
> > last reference to it goes out of scope. This becomes important if the
> > object owns scarce external resources (e.g. file handles or database
> > connections) that are only freed during DESTROY. Postponing DESTROY until
> > an indeterminate time in the future can lead to program failures due to
> > resource exhaustion.
>
>Well put.  Can we finally admit that if we want Perl to DWIM with respect
>to DESTROY that we need to keep counting references?

Perl needs some level of tracking for objects with finalization attached to
them. Full refcounting isn't required, however. Also, the vast majority of
perl variables have no finalization attached to them.

I do wish people would get garbage collection and finalization split in
their minds. They are two separate things which can, and will, be dealt
with separately.

For the record: THE GARBAGE COLLECTOR WILL HAVE NOTHING TO DO WITH
FINALIZATION, AND NO PERL OBJECT CODE WILL BE CALLED FOR VARIABLES
UNDERGOING GARBAGE COLLECTION.

Thank you. I do wish this stuff would flare up during the week...

>Speaking of which, do any of the high priests know when Larry might come
>down off the mountain?  Any day now the true believers are going to melt
>down their copies of Camel III and cast themselves a golden Python.

The Cabal Magic 5-ball says "Outlook cloudy, try again later".

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: Garbage collection (was Re: JWZ on s/Java/Perl/)
On Sun, 11 Feb 2001, Jan Dubois wrote:

> However, I couldn't solve the problem of "deterministic destruction
> behavior": Currently Perl will call DESTROY on any object as soon as the
> last reference to it goes out of scope. This becomes important if the
> object owns scarce external resources (e.g. file handles or database
> connections) that are only freed during DESTROY. Postponing DESTROY until
> an indeterminate time in the future can lead to program failures due to
> resource exhaustion.

Well put.  Can we finally admit that if we want Perl to DWIM with respect
to DESTROY that we need to keep counting references?

I certainly hope so.  I think this research project has gone on long
enough.  If Larry comes back and says that DESTROYs can be called in a
non-deterministic fashion I'll be very surprised, and we can certainly
revive the GC debate then.  Until then I think we might as well accept
that ref-counting is here to stay.

Speaking of which, do any of the high priests know when Larry might come
down off the mountain?  Any day now the true believers are going to melt
down their copies of Camel III and cast themselves a golden Python.

-sam
Re: Garbage collection
[crossed to -internals]

Jan Dubois:
> Not necessarily; you would have to implement it that way: When you try to
> open a file and you don't succeed, you run the garbage collector and try
> again. But what happens in the case of XS code: some external library
> tries to open a file and gets a failure. How would it trigger a GC in the
> Perl internals? It wouldn't know a thing that it had been embedded in a
> Perl app.

But that would be the point of the API, no?  Even in XS, you'd interface
through perl for memory or file management, so the core would still be
able to invoke the GC.

Granted, these are last-ditch efforts anyway.  What would really need to
trigger a pass?  E[MN]FILE?  ENOMEM?  Weird cases of ENOSPC?  If you
happen to hit one, force a GC pass, and retry whatever the call was.
Even if the GC is unsuccessful (at resource reclamation), wouldn't you
still want Perl to panic, rather than the XS code, anyway?

> This scheme would only work if *all* resources including memory and
> garbage collection are handled by the OS (or at least by a virtual machine
> like JVM or .NET runtime). But this still doesn't solve the destruction
> order problem.

Well, no.  My thought would be that if A needed to be destroyed before B,
then B wouldn't/shouldn't be marked for GC until after A was destroyed.
It might take several sweeps to clean an entire dependency tree,
unfortunately.

-- 
Bryan C. Warnock
bwarnock@(gtemail.net|capita.com)
Re: Garbage collection (was Re: JWZ on s/Java/Perl/)
On Sun, 11 Feb 2001 21:11:09 -0500, "Bryan C. Warnock"
<[EMAIL PROTECTED]> wrote:

>On Sunday 11 February 2001 19:08, Jan Dubois wrote:
>> However, I couldn't solve the problem of "deterministic destruction
>> behavior": Currently Perl will call DESTROY on any object as soon as the
>> last reference to it goes out of scope. This becomes important if the
>> object owns scarce external resources (e.g. file handles or database
>> connections) that are only freed during DESTROY. Postponing DESTROY until
>> an indeterminate time in the future can lead to program failures due to
>> resource exhaustion.
>
>But doesn't resource exhaustion usually trigger garbage collection and
>resource reallocation?  (Not that this addresses the remainder of your
>post.)

Not necessarily; you would have to implement it that way: when you try to
open a file and you don't succeed, you run the garbage collector and try
again.  But what happens in the case of XS code?  Some external library
tries to open a file and gets a failure.  How would it trigger a GC in the
Perl internals?  It wouldn't know a thing about having been embedded in a
Perl app.

This scheme would only work if *all* resources including memory and
garbage collection are handled by the OS (or at least by a virtual machine
like the JVM or .NET runtime).  But this still doesn't solve the
destruction order problem.

-Jan
Re: Garbage collection (was Re: JWZ on s/Java/Perl/)
On Sunday 11 February 2001 19:08, Jan Dubois wrote:
> However, I couldn't solve the problem of "deterministic destruction
> behavior": Currently Perl will call DESTROY on any object as soon as the
> last reference to it goes out of scope. This becomes important if the
> object owns scarce external resources (e.g. file handles or database
> connections) that are only freed during DESTROY. Postponing DESTROY until
> an indeterminate time in the future can lead to program failures due to
> resource exhaustion.

But doesn't resource exhaustion usually trigger garbage collection and
resource reallocation?  (Not that this addresses the remainder of your
post.)

-- 
Bryan C. Warnock
bwarnock@(gtemail.net|capita.com)
Re: Garbage collection (was Re: JWZ on s/Java/Perl/)
On Fri, 09 Feb 2001 13:19:36 -0500, Dan Sugalski <[EMAIL PROTECTED]> wrote:

>Almost all refcounting schemes are messy. That's one of its problems. A
>mark and sweep GC system tends to be less prone to leaks because of program
>bugs, and when it *does* leak, the leaks tend to be large. Plus the code to
>do the GC work is very localized, which tends not to be the case in
>refcounting schemes.
>
>Going to a more advanced garbage collection scheme certainly isn't a
>universal panacea--mark and sweep in perl 6 will *not* bring about world
>peace or anything. It will (hopefully) make our lives easier, though.

I currently don't have much time to follow the perl6 discussions, so I
might have missed this, but I have some questions about abandoning
reference counts in the Perl internals.

When I reimplemented some of the Perl guts in C# last year for the "Perl
for .NET" research project, I tried to get rid of reference counting
because the runtime already provides a generational garbage collection
scheme.

However, I couldn't solve the problem of "deterministic destruction
behavior": currently Perl will call DESTROY on any object as soon as the
last reference to it goes out of scope.  This becomes important if the
object owns scarce external resources (e.g. file handles or database
connections) that are only freed during DESTROY.  Postponing DESTROY
until an indeterminate time in the future can lead to program failures
due to resource exhaustion.

The second problem is destruction order: with reference counts you can
honor a dependency graph between objects.  Without them, destruction can
only happen in random order, which is sometimes a problem: you may have a
database connection and a recordset, and the recordset may need to be
DESTROYed first because it may contain unsaved data that still needs to
be written back to the database.
I've been discussing this with Sarathy multiple times over the last year,
and he insists that relying on DESTROY for resource cleanup is bad style
and shouldn't be done anyway.  But always explicitly calling e.g. Close()
or whatever is pretty messy at the application level: you have to use
eval{} blocks all over the place to guarantee calling Close() even when
something else blows up.

As an implementer I most definitely see the advantages of giving up
deterministic destruction behavior for random sequences of finalizer
calls.  But as a Perl programmer I loathe the additional complexity it
adds to my Perl programs to make them robust.  There is a reason memory
allocation isn't exposed to the user either. :-)

Have these issues been discussed somewhere for Perl 6?  If yes, could you
point me to that discussion?

-Jan
Re: JWZ on s/Java/Perl/
[Please be careful with attributions -- I didn't write any of the quoted
material...]

Russ Allbery wrote:
> >> sub test {
> >>     my($foo, $bar, %baz);
> >>     ...
> >>     return \%baz;
> >> }
>
> That's a pretty fundamental aspect of the Perl language; I use that sort
> of construct all over the place.  We don't want to turn Perl into C, where
> if you want to return anything non-trivial without allocation you have to
> pass in somewhere to put it.

There's no problem at all with that code.  It's not going to break under
Perl 6.  It's not going to be deprecated -- this is one of the ultimate
Keep Perl Perl language features!

I think there's a lot of concern and confusion about what it means to
replace perl's current memory manager (aka garbage collector) with
something else.  The short-term survival guide for dealing with this is
"only believe what Dan says."  The longer-term guide is "only believe
what Benchmark says."

There are only three Perl-visible features of a collector that I can
think of (besides the obvious "does it work?"):

  1. How fast does it run?
  2. How efficient is it?  (i.e. what's the overhead?)
  3. When does it call object destructors?

It's too early to talk about the first two, but if Perl 6 is worse than
Perl 5, something is seriously wrong.  The last has never been defined in
Perl, but it's definitely something to discuss before the internals are
written.  Changing it later could be a *major* job.

- Ken
Re: JWZ on s/Java/Perl/
Bart Lateur wrote:
> On Fri, 09 Feb 2001 12:06:12 -0500, Ken Fox wrote:
> > 1. Cheap allocations. Most fast collectors have a one or two
> > instruction malloc. In C it looks like this:
> >
> >   void *malloc(size) { void *obj = heap; heap += size; return obj; }
> ...
> That is not a garbage collector.

I said it was an allocator, not a garbage collector.  An advanced garbage
collector just makes very simple/fast allocators possible.

> That is "drop everything you don't need, and we'll never use it
> again."  Oh, sure, not doing garbage collection at all is faster than
> doing reference counting.

You don't have a clue.  The allocator I posted is a very common allocator
used with copying garbage collectors.  This is *not* a "pool" allocator
like Apache uses.

What happens is that when the heap fills up (probably on a seg fault
triggered by using an object outside the current address space), the
collector is triggered.  It traverses live data and copies it into a new
space (in a simple copying collector these are called the "from" and "to"
spaces).  Generational collectors often work similarly, but they have
more than two spaces and special rules for references between spaces.

> > 2. Work proportional to live data, not total data. This is hard to
> > believe for a C programmer, but good garbage collectors don't have
> > to "free" every allocation -- they just have to preserve the live,
> > or reachable, data. Some researchers have estimated that 90% or
> > more of all allocated data dies (becomes unreachable) before the
> > next collection. A ref count system has to work on every object,
> > but smarter collectors only work on 10% of the objects.
>
> That may work for C, but not for Perl.

Um, no.  It works pretty well for Lisp, ML, Prolog, etc.  I'm positive
that it would work fine for Perl too.

> sub test {
>     my($foo, $bar, %baz);
>     ...
>     return \%baz;
> }
>
> You may notice that only PART of the locally malloced memory gets
> freed.  The memory of %baz may well be in the middle of that pool.
> You're making a huge mistake if you simply declare the whole block
> dead weight.

You don't understand how collectors work.  You can't think about
individual allocations anymore -- that's a fundamental and severe
restriction of malloc().  What happens is that the garbage accumulates
until a collection happens.  When the collection happens, live data is
saved and the garbage is over-written.

In your example above, the memory for $foo and $bar is not reclaimed
until a collection occurs.  %baz is live data and will be saved when the
collection occurs (often by copying it to a new heap space).

Yes, this means it is *totally* unsafe to hold pointers to objects in
places the garbage collector doesn't know about.  It also means that
memory working-set sizes may be larger than with a malloc-style system.
There are lots of advantages though -- re-read my previous note.

The one big down-side to non-ref-count GC is that finalization is delayed
until collection -- which may be relatively infrequent when there's lots
of memory.  Data flow analysis can allow us to trigger finalizers
earlier, but that's a lot harder than just watching a ref count.

- Ken
Re: Auto-install (was autoloaded...)
You should probably also take a look at Debian's packaging, the .deb.  It
consists of an ar archive containing three files: one for the magic (named
debian-binary, containing "2.0"), one for the filesystem image
(filesystem.tar.gz), and one for the control metadata.

On Fri, Feb 09, 2001 at 06:17:34PM -0200, Branden wrote:
> | Platform independent  |    Yes     |     Yes     |    Yes     |

Yes.

> | Available in a wide   |    Yes     |     No      |    Yes     |
> | range of platforms    |            | (Win32 +/-, |            |
> |                       |            | MacOS, VMS) |            |

No -- only Debian, but that includes several HW archs, and both Linux and
the Hurd.  But source should be portable and abstracted decently.

> | Allow platform        |    Yes     |     Yes     |     No     |
> | dependent deployment  |            |             |            |

Yes.

> | Supports binary,      |    Yes     |     Yes     |     No     |
> | source and bytecode   |            |             | (source?)  |

Yes.  Source format is .dsc (metadata) + .tar.gz (upstream) + .patch.gz
(debian patches).  Keeping that would allow many CPAN packages to be used
unmodified.  Not keeping it would allow for single-file distribution.

> | Install archive       |    Yes     |     Yes     |     No     |
> | automatically         |            |             | (manually) |

Yes.

> | Uninstall and         |    Yes     |     Yes     |     No     |
> | upgrade archive       |            |             |            |

Yes.

> | Install, uninstall    |     No     |     Yes     |     No     |
> | and upgrade scripts   | (possibly) |             |            |

Yes.

> | Run from archive      |    Yes     |     No      |    Yes     |

No, but certainly possible.  (Replace the .tar.gz files with .zips.
Worse compression but easier to use individual files.  We could do
.bz2.tar or somesuch, but that's nastier for others to deal with.)

> | Resources             |    Yes     |     Yes     |    Yes     |

Yes, I think.  (Not certain what you mean by this.)

> | Documentation         |    Yes     |     Yes     |     No     |

Yes.

> | Supports various      |    Yes     |     No      |    Yes     |
> | modules per archive   |            |    (yes)    | (packages) |

No (one file, one package).  This could easily be changed, though.

> | Merge many archives   |    Yes     |     No      |    Yes     |
> | in one                |            |             |            |

No, but wouldn't be hard with a small extension.

> | Usable with external  |    Yes     |     No      |    Yes     |
> | tools (e.g. WinZip)   |            |             |            |

Yes, with a little pain (ar archive + .tar.gz files).
> | Dependencies of       |    Yes     |     Yes     |     No     |
> | the archive           | (included) |             |            |

Yes.  Complex dependencies are supported (versions between A and B), with
support for autogeneration of dependencies in many cases.

> | Build archive from    |    Yes     |     Yes     |     No     |
> | source tree           |            | (external)  |            |

Yes.

> | Could be bundled      |    Yes     |  Probably   | Maybe (if  |
> | with Perl 6?          |            |     No      | we bundle  |
> |                       |            |  (too big)  | a JVM too) |

Yes.  (I think.)  (The binary of the format-handling program is 110,288
bytes -- a bit on the big side, but not too bad.)

> | Signed archives       |     No     |     No      |    Yes     |

No.  (Source packages are signed, though.)  (At present; the feature is
planned for the future, and shouldn't be all that hard.)

-=- James Mastros
-- 
"All I really want is somebody to curl up with and pretend the world is a
safe place."
AIM: theorbtwo    homepage: http://www.rtweb.net/theorb/
Re: JWZ on s/Java/Perl/
On Fri, 9 Feb 2001 16:14:34 -0800, Mark Koopman wrote:

>but is this an example of the way people SHOULD code, or simply are ABLE
>to code this?  are we considering deprecating this type of bad style, and
>forcing a programmer to, in this case, supply a ref to %baz in the
>arguments to this sub?

I think you're trying too hard to turn Perl into just another C clone.
Dynamic variable allocation and freeing, like this, is one of the main
selling points for Perl as a language.

Note that %baz can also contain, as values, references to other lexically
scoped variables, like \$foo and \$bar.  No prototyping around that.

>> sub test {
>>     my($foo, $bar, %baz);
>>     ...
>>     return \%baz;
>> }

You could, theoretically, create special versions of "my", or a "my" with
an attribute, so that these declared variables are kept out of the normal
lexical pool and garbage-collected in a more elaborate way, perhaps even
with reference counting.

-- 
	Bart.