Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Mon, Jun 20, 2016 at 12:40 PM, Craig Ringer wrote:
> On 18 June 2016 at 02:42, Robert Haas wrote:
>>
>> On Fri, Jun 17, 2016 at 2:23 PM, Aleksey Demakov wrote:
>> > Essentially this is pessimizing for the lowest common denominator
>> > among OSes.
>>
>> I totally agree. That's how we make the server portable.
>>
>> > Having a contiguous address space makes things so
>> > much simpler that considering this case, IMHO, is well worth of it.
>>
>> I think that would be great if you could make it work, but it has to
>> support Linux, Windows (all supported versions), MacOS X, all the
>> various BSD flavors for which we have buildfarm animals, and other
>> platforms that we currently run on like HP-UX. If you come up with a
>> solution that works for this on all of those platforms, I will shake
>> your hand. But I think that's probably impossible, or at least
>> really, really hard.
>
>
> Indeed. In particular, ASLR on Windows or anywhere we EXEC_BACKEND will
> cause difficulties attaching to those segments.

ASLR is something we currently disable in the build, because Win8/2k12
and newer versions behave differently from past OSes in their address
mapping, which would make the problem even harder if we wanted to have
both working.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On 18 June 2016 at 02:42, Robert Haas wrote:

> On Fri, Jun 17, 2016 at 2:23 PM, Aleksey Demakov wrote:
> > Essentially this is pessimizing for the lowest common denominator
> > among OSes.
>
> I totally agree. That's how we make the server portable.
>
> > Having a contiguous address space makes things so
> > much simpler that considering this case, IMHO, is well worth of it.
>
> I think that would be great if you could make it work, but it has to
> support Linux, Windows (all supported versions), MacOS X, all the
> various BSD flavors for which we have buildfarm animals, and other
> platforms that we currently run on like HP-UX. If you come up with a
> solution that works for this on all of those platforms, I will shake
> your hand. But I think that's probably impossible, or at least
> really, really hard.
>

Indeed. In particular, ASLR on Windows or anywhere we EXEC_BACKEND will
cause difficulties attaching to those segments.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Sat, Jun 18, 2016 at 3:43 AM, Tom Lane wrote:
> DSM already exists, and for many purposes its lack of a
> within-a-shmem-segment dynamic allocator is irrelevant; the same purpose
> is served (with more speed, more reliability, and less code) by releasing
> the whole DSM segment when no longer needed. The DSM segment effectively
> acts like a memory context, saving code from having to account precisely
> for every single allocation it makes.
>
> I grant that having a dynamic allocator added to DSM will support even
> more use-cases. What I'm not convinced of is that we need a dynamic
> allocator within the fixed-size shmem segment. Robert already listed some
> reasons why that's rather dubious, but I'll add one more: any leak becomes
> a really serious bug, because the only way to recover the space is to
> restart the whole database instance.
>

Okay, if you say that DSM segments work best for accumulating transient
data that can all be freed at once when it is no longer needed, then I
agree with that.

My code is for long-lived data that is allocated and freed chunk by
chunk, as when an extension wants to store more data, and in a more
complicated fashion, than fits in an ordinary dynahash with the
HASH_SHARED_MEM flag.

Regards,
Aleksey
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
Aleksey Demakov writes:
> On Sat, Jun 18, 2016 at 12:45 AM, Tom Lane wrote:
>> You're right, but that doesn't mean that the community is going to take
>> much interest in an unportable replacement for code that already exists.

> Excuse me, what code already exists? As far as I understand, we
> compare the approach taken in my code against Robert's code that
> is not yet available to the community.

DSM already exists, and for many purposes its lack of a
within-a-shmem-segment dynamic allocator is irrelevant; the same purpose
is served (with more speed, more reliability, and less code) by releasing
the whole DSM segment when no longer needed. The DSM segment effectively
acts like a memory context, saving code from having to account precisely
for every single allocation it makes.

I grant that having a dynamic allocator added to DSM will support even
more use-cases. What I'm not convinced of is that we need a dynamic
allocator within the fixed-size shmem segment. Robert already listed some
reasons why that's rather dubious, but I'll add one more: any leak becomes
a really serious bug, because the only way to recover the space is to
restart the whole database instance.

regards, tom lane
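The point above, that a segment released as a whole behaves like a memory context, can be illustrated with a minimal sketch. It is not PostgreSQL's DSM or MemoryContext code: the names Arena, arena_create, arena_alloc and arena_destroy are hypothetical, and plain malloc() stands in for the segment purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical arena; malloc() stands in for one shared segment. */
typedef struct Arena {
    unsigned char *base;   /* start of the region */
    size_t size;           /* total bytes in the region */
    size_t used;           /* bytes handed out so far */
} Arena;

/* Acquire the whole region up front. Returns 0 on success, -1 on failure. */
int arena_create(Arena *a, size_t size)
{
    assert(a != NULL);
    a->base = malloc(size);
    a->size = a->base ? size : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Bump allocation, rounded up to 8 bytes. There is no per-chunk free. */
void *arena_alloc(Arena *a, size_t n)
{
    size_t aligned = (n + 7u) & ~(size_t) 7u;

    if (aligned < n || aligned > a->size - a->used)
        return NULL;               /* overflow or out of space */
    void *p = a->base + a->used;
    a->used += aligned;
    return p;
}

/* Releasing the region frees every allocation at once, so nothing leaks. */
void arena_destroy(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

A caller allocates freely with arena_alloc() and calls arena_destroy() once at the end; no individual allocation has to be tracked, which is the property attributed to DSM segments above.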
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Sat, Jun 18, 2016 at 12:45 AM, Tom Lane wrote:
> Aleksey Demakov writes:
>> On Fri, Jun 17, 2016 at 10:54 PM, Robert Haas wrote:
>>> In my opinion, that's not going to fly. If I thought otherwise, I
>>> would not have developed the DSM facility in the first place.
>
>> Essentially this is pessimizing for the lowest common denominator
>> among OSes.
>
> You're right, but that doesn't mean that the community is going to take
> much interest in an unportable replacement for code that already exists.

Excuse me, what code already exists? As far as I understand, we are
comparing the approach taken in my code against Robert's code, which is
not yet available to the community. Discussing DSM is beside the point.

My code can be hooked smoothly into the existing system from an
extension module with just a couple of calls: RequestAddinShmemSpace()
and ShmemInitStruct(). After that, the extension can use my concurrent
memory allocator and safe memory reclamation to implement highly
optimized concurrent data structures of its choice, e.g. the concurrent
data structures that I am going to add to the package in the future.

All in all, this is currently not a replacement for anything. It is an
experimental add-on and food for thought for interested people.
Integrating my code into the core to replace anything there is a very
remote possibility. I understand that if it ever happens it will take
very serious work and multiple iterations.

> Especially not an unportable replacement that also needs sweeping
> assumptions like "disciplined use of mmap in postgresql core and
> extensions". You don't have to look further than the availability of
> mmap to plperlu programmers to realize that that won't fly. (Even if
> we threw all the untrusted PLs overboard, I believe plain old stdio
> is willing to use mmap in many versions of libc.)
>

Sorry. I made a sloppy statement about mmap/munmap use. As Andres Freund
correctly pointed out, it is problematic. So the whole line about
"disciplined use of mmap in postgresql core and extensions" goes away.
Forget it.

But the other techniques that I mentioned do not require such special
discipline. The corrected statement is that a single contiguous shared
space is practically doable on many platforms with some effort, and
that this approach would make the implementation of many shared data
structures more efficient.

Furthermore, I'd guess there is not much point in enabling parallel
query execution on a MacBook, or at least one wouldn't expect superb
results from it anyway. I'd make the wild claim that the users who would
benefit from parallel queries or from my concurrency work are, most of
the time, the same users who run platforms that can support a single
address space.

Thus if there is a solution that benefits e.g. 95% of target users, why
refrain from it in the name of the other 5%? Should support for those 5%
not be treated as a lower-priority fallback, with the main effort put
into optimizing for the 95-percenters?

Regards,
Aleksey
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
Aleksey Demakov writes:
> On Fri, Jun 17, 2016 at 10:54 PM, Robert Haas wrote:
>> On Fri, Jun 17, 2016 at 12:34 PM, Aleksey Demakov wrote:
>>> I believe it would be perfectly okay to allocate huge amount of address
>>> space with mmap on startup. If the pages are not touched, the OS VM
>>> subsystem will not commit them.

>> In my opinion, that's not going to fly. If I thought otherwise, I
>> would not have developed the DSM facility in the first place.

> Essentially this is pessimizing for the lowest common denominator
> among OSes.

You're right, but that doesn't mean that the community is going to take
much interest in an unportable replacement for code that already exists.
Especially not an unportable replacement that also needs sweeping
assumptions like "disciplined use of mmap in postgresql core and
extensions". You don't have to look further than the availability of
mmap to plperlu programmers to realize that that won't fly. (Even if
we threw all the untrusted PLs overboard, I believe plain old stdio
is willing to use mmap in many versions of libc.)

regards, tom lane
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Fri, Jun 17, 2016 at 2:23 PM, Aleksey Demakov wrote:
> Essentially this is pessimizing for the lowest common denominator
> among OSes.

I totally agree. That's how we make the server portable.

> Having a contiguous address space makes things so
> much simpler that considering this case, IMHO, is well worth of it.

I think that would be great if you could make it work, but it has to
support Linux, Windows (all supported versions), MacOS X, all the
various BSD flavors for which we have buildfarm animals, and other
platforms that we currently run on like HP-UX. If you come up with a
solution that works for this on all of those platforms, I will shake
your hand. But I think that's probably impossible, or at least
really, really hard.

> You are right that this might highly depend on the OS. But you are
> only partially right that it's impossible to give the memory back once
> you touched it. It is possible in many cases with additional measures.
> That is with additional control over memory mapping. Surprisingly, in
> this case windows has the most straightforward solution. VirtualAlloc
> has separate MEM_RESERVE and MEM_COMMIT flags. On various
> Unix flavours it is possible to play with mmap MAP_NORESERVE
> flag and madvise syscall. Finally, it's possible to repeatedly mmap
> and munmap on portions of a contiguous address space providing
> a given addr argument for both of them. The last option might, of
> course, is susceptible to hijacking this portion of the address by an
> inadvertent caller of mmap with NULL addr argument. But probably
> this could be avoided by imposing a disciplined use of mmap in
> postgresql core and extensions.

I have never understood how mmap() with a non-NULL argument could be
anything but a giant foot-gun. If the operating system positions a
shared library or your process stack or anything else in the chosen
address range, you are dead. I do agree that there are a bunch of other
tools that could be used on various platforms, but the need to have a
cross-platform solution for anything that goes into core makes this
very hard.

> Thus providing a single contiguous shared address space is doable.

Not convinced.

> The other question is how much it would buy. As for development
> time of an allocator it is a clear win. In terms of easy passing direct
> memory pointers between backends this a clear win again.

I agree it would be a huge win if it could be done.

> In terms of resulting performance, I don't know. This would take
> a few cycles on every step. You have a shared hash table. You
> cannot keep pointers there. You need to store offsets against the
> base address. Any reference would involve additional arithmetics.
> When these things add up, the net effect might become noticeable.

I'm sure it's going to be somewhat slower, but I think that's just a
tax that we have to pay for using processes rather than threads. I
think it's still going to be fast enough to do plenty of cool stuff.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
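The "offsets against the base address" idea discussed above can be sketched as follows. This is only an illustration, not the relative-pointer code referred to in the thread: the type relptr and the functions relptr_deref, node_push, list_sum and relptr_demo are hypothetical names, and two malloc() buffers stand in for the same segment mapped at two different addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical relative pointer: byte offset from the segment base, 0 = null. */
typedef size_t relptr;

typedef struct Node {
    int    value;
    relptr next;    /* an offset, never a raw address */
} Node;

/* Resolving a reference costs one addition against the local base. */
void *relptr_deref(void *base, relptr off)
{
    return off == 0 ? NULL : (char *) base + off;
}

/* Place a node at offset *used inside the segment and link it to next. */
relptr node_push(void *base, size_t *used, int value, relptr next)
{
    relptr off = *used;

    assert(off != 0);              /* offset 0 is reserved for "null" */
    Node *n = relptr_deref(base, off);
    n->value = value;
    n->next = next;
    *used += sizeof(Node);
    return off;
}

int list_sum(void *base, relptr head)
{
    int sum = 0;

    for (Node *n = relptr_deref(base, head); n != NULL;
         n = relptr_deref(base, n->next))
        sum += n->value;
    return sum;
}

/* Build a list in one buffer, copy the bytes elsewhere, walk the copy. */
int relptr_demo(void)
{
    size_t size = 256;
    void *seg1 = calloc(1, size);
    void *seg2 = malloc(size);
    int sum = -1;

    if (seg1 != NULL && seg2 != NULL) {
        size_t used = sizeof(Node);    /* skip offset 0 */
        relptr head = 0;

        head = node_push(seg1, &used, 1, head);
        head = node_push(seg1, &used, 2, head);
        head = node_push(seg1, &used, 3, head);
        memcpy(seg2, seg1, size);      /* "same segment, different address" */
        sum = list_sum(seg2, head);    /* the offsets remain valid */
    }
    free(seg1);
    free(seg2);
    return sum;
}
```

The list stays valid after the copy because no raw address is stored in it; the price is the extra addition in relptr_deref() on every reference, which is the per-step cost the thread is weighing.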
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Fri, Jun 17, 2016 at 2:23 PM, Aleksey Demakov wrote:

> On Fri, Jun 17, 2016 at 10:54 PM, Robert Haas wrote:
> > On Fri, Jun 17, 2016 at 12:34 PM, Aleksey Demakov wrote:
> >>> I expect that to be useful for parallel query and anything else where
> >>> processes need to share variable-size data. However, that's different
> >>> from this because ours can grown to arbitrary size and shrink again by
> >>> allocating and freeing with DSM segments. We also do everything with
> >>> relative pointers since DSM segments can be mapped at different
> >>> addresses in different processes, whereas this would only work with
> >>> memory carved out of the main shared memory segment (or some new DSM
> >>> facility that guaranteed identical placement in every address space).
> >>>
> >>
> >> I believe it would be perfectly okay to allocate huge amount of address
> >> space with mmap on startup. If the pages are not touched, the OS VM
> >> subsystem will not commit them.
> >
> > In my opinion, that's not going to fly. If I thought otherwise, I
> > would not have developed the DSM facility in the first place.
> >
> > First, the behavior in this area is highly dependent on choice of
> > operating system and configuration parameters. We've had plenty of
> > experience with requiring non-default configuration parameters to run
> > PostgreSQL, and it's all bad. I don't really want to have to tell
> > users that they must run with a particular value of
> > vm.overcommit_memory in order to run the server. Nor do I want to
> > tell users of other operating systems that their ability to run
> > PostgreSQL is dependent on the behavior their OS has in this area. I
> > had a MacBook Pro up until a year or two ago where a sufficiently
> > shared memory request would cause a kernel panic. That bug will
> > probably be fixed at some point if it hasn't been already, but
> > probably by returning an error rather than making it work.
> >
> > Second, there's no way to give memory back once you've touched it. If
> > you decide to do a hash join on a 250GB inner table using a shared
> > hash table, you're going to have 250GB in swap-backed pages floating
> > around when you're done. If the user has swap configured (and more
> > and more people don't), the operating system will eventually page
> > those out, but until that happens those pages are reducing the amount
> > of page cache that's available, and after it happens they're using up
> > swap. In either case, the space consumed is consumed to no purpose.
> > You don't care about that hash table any more once the query
> > completes; there's just no way to tell the operating system that. If
> > your workload follows an entirely predictable pattern and you always
> > have about the same amount of usage of this facility then you can just
> > reuse the same pages and everything is fine. But if your usage
> > fluctuates I believe it will be a big problem. With DSM, we can and
> > do explicitly free the memory back to the OS as soon as we don't need
> > it any more - and that's a big benefit.
> >
>
> Essentially this is pessimizing for the lowest common denominator
> among OSes. Having a contiguous address space makes things so
> much simpler that considering this case, IMHO, is well worth of it.
>
>

Given PostgreSQL's goals regarding multi-platform operation, it would
seem that, at a minimum, there needs to be an implementation available
that does have these properties. Improving our current base
implementation within these guidelines would be nice, since everyone
would benefit from the work, and the net amount of code is going to be
reasonable since the old stuff will likely be removed as the new stuff
is added.

While platform-dependent default configuration parameters are
undesirable, enabling better but less widely usable algorithms seems to
be one use for compile-time options.

Is this arena amenable to such swapping out of behavior at compile time?

David J.
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Sat, Jun 18, 2016 at 12:31 AM, Andres Freund wrote:
> On 2016-06-18 00:23:14 +0600, Aleksey Demakov wrote:
>> Finally, it's possible to repeatedly mmap
>> and munmap on portions of a contiguous address space providing
>> a given addr argument for both of them. The last option might, of
>> course, is susceptible to hijacking this portion of the address by an
>> inadvertent caller of mmap with NULL addr argument. But probably
>> this could be avoided by imposing a disciplined use of mmap in
>> postgresql core and extensions.
>
> I don't think that's particularly realistic. malloc() uses mmap(NULL)
> internally. And you can't portably mmap non-file backed memory from
> different processes; you need something like tmpfs backed / posix shared
> memory / for it. On linux you can do stuff like madvise(MADV_FREE),
> which kinda helps.

Oops. Agreed.
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
Sorry for the unclear language. A late Friday evening here is to blame.

On Sat, Jun 18, 2016 at 12:23 AM, Aleksey Demakov wrote:
> On Fri, Jun 17, 2016 at 10:54 PM, Robert Haas wrote:
>> On Fri, Jun 17, 2016 at 12:34 PM, Aleksey Demakov wrote:
>>>> I expect that to be useful for parallel query and anything else where
>>>> processes need to share variable-size data. However, that's different
>>>> from this because ours can grown to arbitrary size and shrink again by
>>>> allocating and freeing with DSM segments. We also do everything with
>>>> relative pointers since DSM segments can be mapped at different
>>>> addresses in different processes, whereas this would only work with
>>>> memory carved out of the main shared memory segment (or some new DSM
>>>> facility that guaranteed identical placement in every address space).
>>>
>>> I believe it would be perfectly okay to allocate huge amount of address
>>> space with mmap on startup. If the pages are not touched, the OS VM
>>> subsystem will not commit them.
>>
>> In my opinion, that's not going to fly. If I thought otherwise, I
>> would not have developed the DSM facility in the first place.
>>
>> First, the behavior in this area is highly dependent on choice of
>> operating system and configuration parameters. We've had plenty of
>> experience with requiring non-default configuration parameters to run
>> PostgreSQL, and it's all bad. I don't really want to have to tell
>> users that they must run with a particular value of
>> vm.overcommit_memory in order to run the server. Nor do I want to
>> tell users of other operating systems that their ability to run
>> PostgreSQL is dependent on the behavior their OS has in this area. I
>> had a MacBook Pro up until a year or two ago where a sufficiently
>> shared memory request would cause a kernel panic. That bug will
>> probably be fixed at some point if it hasn't been already, but
>> probably by returning an error rather than making it work.
>>
>> Second, there's no way to give memory back once you've touched it. If
>> you decide to do a hash join on a 250GB inner table using a shared
>> hash table, you're going to have 250GB in swap-backed pages floating
>> around when you're done. If the user has swap configured (and more
>> and more people don't), the operating system will eventually page
>> those out, but until that happens those pages are reducing the amount
>> of page cache that's available, and after it happens they're using up
>> swap. In either case, the space consumed is consumed to no purpose.
>> You don't care about that hash table any more once the query
>> completes; there's just no way to tell the operating system that. If
>> your workload follows an entirely predictable pattern and you always
>> have about the same amount of usage of this facility then you can just
>> reuse the same pages and everything is fine. But if your usage
>> fluctuates I believe it will be a big problem. With DSM, we can and
>> do explicitly free the memory back to the OS as soon as we don't need
>> it any more - and that's a big benefit.
>>
>
> Essentially this is pessimizing for the lowest common denominator
> among OSes. Having a contiguous address space makes things so
> much simpler that considering this case, IMHO, is well worth of it.
>
> You are right that this might highly depend on the OS. But you are
> only partially right that it's impossible to give the memory back once
> you touched it. It is possible in many cases with additional measures.
> That is with additional control over memory mapping. Surprisingly, in
> this case windows has the most straightforward solution. VirtualAlloc
> has separate MEM_RESERVE and MEM_COMMIT flags. On various
> Unix flavours it is possible to play with mmap MAP_NORESERVE
> flag and madvise syscall. Finally, it's possible to repeatedly mmap
> and munmap on portions of a contiguous address space providing
> a given addr argument for both of them. The last option might, of
> course, is susceptible to hijacking this portion of the address by an
> inadvertent caller of mmap with NULL addr argument. But probably
> this could be avoided by imposing a disciplined use of mmap in
> postgresql core and extensions.
>
> Thus providing a single contiguous shared address space is doable.
> The other question is how much it would buy. As for development
> time of an allocator it is a clear win. In terms of easy passing direct
> memory pointers between backends this a clear win again.
>
> In terms of resulting performance, I don't know. This would take
> a few cycles on every step. You have a shared hash table. You
> cannot keep pointers there. You need to store offsets against the
> base address. Any reference would involve additional arithmetics.
> When these things add up, the net effect might become noticeable.
>
> Regards,
> Aleksey
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On 2016-06-18 00:23:14 +0600, Aleksey Demakov wrote:
> Finally, it's possible to repeatedly mmap
> and munmap on portions of a contiguous address space providing
> a given addr argument for both of them. The last option might, of
> course, is susceptible to hijacking this portion of the address by an
> inadvertent caller of mmap with NULL addr argument. But probably
> this could be avoided by imposing a disciplined use of mmap in
> postgresql core and extensions.

I don't think that's particularly realistic. malloc() uses mmap(NULL)
internally. And you can't portably mmap non-file-backed memory from
different processes; you need something like tmpfs-backed or POSIX
shared memory for it. On Linux you can do stuff like
madvise(MADV_FREE), which kinda helps.
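To make the point about non-file-backed memory concrete, here is a small sketch, assuming a Unix-like system that provides MAP_ANONYMOUS (widely available, but not part of POSIX). The function name shared_anon_demo is hypothetical. An anonymous MAP_SHARED mapping has no name, so only processes that inherit it across fork() can see it; a process started separately, or re-executed as under EXEC_BACKEND, would have to attach to a named object such as POSIX shared memory instead.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one shared anonymous int, let a forked child write to it, and
 * return what the parent then reads. Returns -1 on any error.
 */
int shared_anon_demo(void)
{
    int *cell = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (cell == MAP_FAILED)
        return -1;
    *cell = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(cell, sizeof(int));
        return -1;
    }
    if (pid == 0) {
        /* Child: the mapping was inherited, at the same address. */
        *cell = 42;
        _exit(0);
    }

    int status;
    int seen = -1;

    if (waitpid(pid, &status, 0) == pid)
        seen = *cell;              /* parent observes the child's write */
    munmap(cell, sizeof(int));
    return seen;
}
```

The child's write is visible to the parent only because the mapping was created before fork(); there is no way for an unrelated process to open the same memory later.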
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Fri, Jun 17, 2016 at 10:54 PM, Robert Haas wrote:
> On Fri, Jun 17, 2016 at 12:34 PM, Aleksey Demakov wrote:
>>> I expect that to be useful for parallel query and anything else where
>>> processes need to share variable-size data. However, that's different
>>> from this because ours can grown to arbitrary size and shrink again by
>>> allocating and freeing with DSM segments. We also do everything with
>>> relative pointers since DSM segments can be mapped at different
>>> addresses in different processes, whereas this would only work with
>>> memory carved out of the main shared memory segment (or some new DSM
>>> facility that guaranteed identical placement in every address space).
>>>
>>
>> I believe it would be perfectly okay to allocate huge amount of address
>> space with mmap on startup. If the pages are not touched, the OS VM
>> subsystem will not commit them.
>
> In my opinion, that's not going to fly. If I thought otherwise, I
> would not have developed the DSM facility in the first place.
>
> First, the behavior in this area is highly dependent on choice of
> operating system and configuration parameters. We've had plenty of
> experience with requiring non-default configuration parameters to run
> PostgreSQL, and it's all bad. I don't really want to have to tell
> users that they must run with a particular value of
> vm.overcommit_memory in order to run the server. Nor do I want to
> tell users of other operating systems that their ability to run
> PostgreSQL is dependent on the behavior their OS has in this area. I
> had a MacBook Pro up until a year or two ago where a sufficiently
> shared memory request would cause a kernel panic. That bug will
> probably be fixed at some point if it hasn't been already, but
> probably by returning an error rather than making it work.
>
> Second, there's no way to give memory back once you've touched it. If
> you decide to do a hash join on a 250GB inner table using a shared
> hash table, you're going to have 250GB in swap-backed pages floating
> around when you're done. If the user has swap configured (and more
> and more people don't), the operating system will eventually page
> those out, but until that happens those pages are reducing the amount
> of page cache that's available, and after it happens they're using up
> swap. In either case, the space consumed is consumed to no purpose.
> You don't care about that hash table any more once the query
> completes; there's just no way to tell the operating system that. If
> your workload follows an entirely predictable pattern and you always
> have about the same amount of usage of this facility then you can just
> reuse the same pages and everything is fine. But if your usage
> fluctuates I believe it will be a big problem. With DSM, we can and
> do explicitly free the memory back to the OS as soon as we don't need
> it any more - and that's a big benefit.
>

Essentially this is pessimizing for the lowest common denominator
among OSes. Having a contiguous address space makes things so much
simpler that considering this case is, IMHO, well worth it.

You are right that this might depend heavily on the OS. But you are
only partially right that it's impossible to give the memory back once
you have touched it. It is possible in many cases with additional
measures, that is, with additional control over memory mapping.
Surprisingly, in this case Windows has the most straightforward
solution: VirtualAlloc has separate MEM_RESERVE and MEM_COMMIT flags.
On various Unix flavours it is possible to play with the mmap
MAP_NORESERVE flag and the madvise syscall. Finally, it's possible to
repeatedly mmap and munmap portions of a contiguous address space,
providing a given addr argument for both of them. The last option is,
of course, susceptible to hijacking of this portion of the address
space by an inadvertent caller of mmap with a NULL addr argument. But
this could probably be avoided by imposing a disciplined use of mmap in
postgresql core and extensions.

Thus providing a single contiguous shared address space is doable.
The other question is how much it would buy. In terms of development
time for an allocator it is a clear win. In terms of easily passing
direct memory pointers between backends it is a clear win again.

In terms of resulting performance, I don't know. Without a single
address space, this would take a few cycles on every step. You have a
shared hash table. You cannot keep pointers there. You need to store
offsets against the base address. Any reference would involve
additional arithmetic. When these things add up, the net effect might
become noticeable.

Regards,
Aleksey
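The reserve-then-commit approach mentioned above for Unix flavours (mmap with MAP_NORESERVE plus madvise) might look roughly like the sketch below. It assumes Linux semantics for MAP_NORESERVE and MADV_DONTNEED; the function names region_reserve, region_commit, region_decommit and region_demo are hypothetical. It also uses a private mapping, so it shows only the address-space bookkeeping within one process and does nothing to share the region between backends.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS, MAP_NORESERVE, madvise */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve address space only: no access rights, no swap reservation. */
void *region_reserve(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Make a page-aligned sub-range usable; pages appear on first touch. */
int region_commit(void *addr, size_t len)
{
    return mprotect(addr, len, PROT_READ | PROT_WRITE);
}

/* Hand the pages back to the OS but keep the address range reserved. */
int region_decommit(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_DONTNEED) != 0)
        return -1;
    return mprotect(addr, len, PROT_NONE);
}

/* Returns 0 if reserve, commit, use, and decommit all succeed. */
int region_demo(void)
{
    size_t total = (size_t) 64 << 20;              /* 64 MiB of address space */
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    unsigned char *base = region_reserve(total);
    int rc = -1;

    if (base == NULL)
        return -1;
    if (region_commit(base, page) == 0) {
        base[0] = 7;                               /* touching commits the page */
        if (base[0] == 7 && region_decommit(base, page) == 0)
            rc = 0;
    }
    munmap(base, total);
    return rc;
}
```

Because region_reserve() lets the kernel choose the address, it avoids the fixed-address hazard raised elsewhere in the thread, but it also gives no guarantee that a second process could map anything at the same location.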
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Fri, Jun 17, 2016 at 12:34 PM, Aleksey Demakov wrote:
>> I expect that to be useful for parallel query and anything else where
>> processes need to share variable-size data. However, that's different
>> from this because ours can grown to arbitrary size and shrink again by
>> allocating and freeing with DSM segments. We also do everything with
>> relative pointers since DSM segments can be mapped at different
>> addresses in different processes, whereas this would only work with
>> memory carved out of the main shared memory segment (or some new DSM
>> facility that guaranteed identical placement in every address space).
>>
>
> I believe it would be perfectly okay to allocate huge amount of address
> space with mmap on startup. If the pages are not touched, the OS VM
> subsystem will not commit them.

In my opinion, that's not going to fly. If I thought otherwise, I
would not have developed the DSM facility in the first place.

First, the behavior in this area is highly dependent on choice of
operating system and configuration parameters. We've had plenty of
experience with requiring non-default configuration parameters to run
PostgreSQL, and it's all bad. I don't really want to have to tell
users that they must run with a particular value of
vm.overcommit_memory in order to run the server. Nor do I want to
tell users of other operating systems that their ability to run
PostgreSQL is dependent on the behavior their OS has in this area. I
had a MacBook Pro up until a year or two ago where a sufficiently
large shared memory request would cause a kernel panic. That bug will
probably be fixed at some point if it hasn't been already, but
probably by returning an error rather than making it work.

Second, there's no way to give memory back once you've touched it. If
you decide to do a hash join on a 250GB inner table using a shared
hash table, you're going to have 250GB in swap-backed pages floating
around when you're done. If the user has swap configured (and more
and more people don't), the operating system will eventually page
those out, but until that happens those pages are reducing the amount
of page cache that's available, and after it happens they're using up
swap. In either case, the space consumed is consumed to no purpose.
You don't care about that hash table any more once the query
completes; there's just no way to tell the operating system that. If
your workload follows an entirely predictable pattern and you always
have about the same amount of usage of this facility then you can just
reuse the same pages and everything is fine. But if your usage
fluctuates I believe it will be a big problem. With DSM, we can and
do explicitly free the memory back to the OS as soon as we don't need
it any more - and that's a big benefit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Fri, Jun 17, 2016 at 10:18 PM, Robert Haas wrote:
> On Fri, Jun 17, 2016 at 11:30 AM, Tom Lane wrote:
> But I'm a bit confused about where it gets the bytes it wants to
> manage. There's no call to dsm_create() or ShmemAlloc() anywhere in
> the code, at least not that I could find quickly. The only way to get
> shar_base set to a non-NULL value seems to be to call SharAttach(),
> and if there's no SharCreate() where would we get that non-NULL value?

You are right, I just have to tidy up the initialisation code before publishing it.

> I expect that to be useful for parallel query and anything else where
> processes need to share variable-size data. However, that's different
> from this because ours can grow to arbitrary size and shrink again by
> allocating and freeing with DSM segments. We also do everything with
> relative pointers since DSM segments can be mapped at different
> addresses in different processes, whereas this would only work with
> memory carved out of the main shared memory segment (or some new DSM
> facility that guaranteed identical placement in every address space).

I believe it would be perfectly okay to allocate a huge amount of address space with mmap on startup. If the pages are not touched, the OS VM subsystem will not commit them.

> I've been a bit reluctant to put it out there
> until we have a tangible application of the allocator working, for
> fear people will say "that's not good for anything!". I'm confident
> it's good for lots of things, but other people have been known not to
> share my confidence.

This is what I've been told by Postgres Pro folks too. But I felt that this thing deserves to be shown to the community sooner rather than later.

Regards,
Aleksey
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Fri, Jun 17, 2016 at 9:30 PM, Tom Lane wrote:
> Aleksey Demakov writes:
>> I have some very experimental code to enable dynamic memory allocation
>> of shared memory for postgresql backend processes.
>
> Um ... what's this do that the existing DSM stuff doesn't do?

It operates over a single large shared memory segment. Within this segment it lets you allocate and free small chunks of memory, from 16 bytes to 16 kilobytes. Chunks are carved out of fixed-size 32k blocks. Each block is used to allocate chunks of a single size class. When a block is full, another block for the given size class is taken from the top-level shared segment.

The goal is to support high levels of concurrency for alloc / free calls. Therefore the allocator is mostly non-blocking. Currently it uses Heller's lazy list algorithm to maintain the block lists of a given size class, so it takes spinlocks once in a while, when a block is added or removed. If this proves to cause scalability problems, Heller's list might be replaced with Maged Michael's lock-free list to make the whole allocator completely lock-free.

Additionally, it provides an epoch-based memory reclamation facility that solves the ABA problem for lock-free algorithms.

I am going to implement some lock-free algorithms (extendable hash tables and probably skip lists) on top of this facility.

Regards,
Aleksey
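The block and size-class layout described above can be sketched as follows. The 16-byte minimum, 16 KB maximum, and 32k block size come from the message. The power-of-two spacing of size classes, the absence of any per-block header, and all identifiers are assumptions made for illustration, not details taken from the sharena code.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_CHUNK   16u             /* smallest chunk size (from the message) */
#define MAX_CHUNK   (16u * 1024u)   /* largest chunk size (from the message) */
#define BLOCK_SIZE  (32u * 1024u)   /* fixed block size (from the message) */
#define NUM_CLASSES 11              /* assumption: classes 2^4 .. 2^14 */

/* Map a request size to a size-class index, rounding up to the next power
 * of two. Returns -1 if the request is zero or too large. */
static int size_class_index(size_t request)
{
    if (request == 0 || request > MAX_CHUNK)
        return -1;
    size_t chunk = MIN_CHUNK;
    int idx = 0;
    while (chunk < request) {
        chunk <<= 1;
        idx++;
    }
    return idx;
}

/* Chunk size in bytes served by a given size class. */
static size_t size_class_bytes(int idx)
{
    assert(idx >= 0 && idx < NUM_CLASSES);
    return (size_t) MIN_CHUNK << idx;
}

/* Number of chunks one block holds, ignoring any block header. */
static size_t chunks_per_block(int idx)
{
    return BLOCK_SIZE / size_class_bytes(idx);
}
```

Under these assumptions a 17-byte request lands in the 32-byte class, a block of the smallest class holds 2048 chunks, and a block of the largest class holds two.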
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
On Fri, Jun 17, 2016 at 11:30 AM, Tom Lane wrote:
> Aleksey Demakov writes:
>> I have some very experimental code to enable dynamic memory allocation
>> of shared memory for postgresql backend processes.
>
> Um ... what's this do that the existing DSM stuff doesn't do?

It seems to be a full-fledged allocator, rather than just a way of getting a slab of bytes from the operating system. Think malloc() rather than sbrk().

But I'm a bit confused about where it gets the bytes it wants to manage. There's no call to dsm_create() or ShmemAlloc() anywhere in the code, at least not that I could find quickly. The only way to get shar_base set to a non-NULL value seems to be to call SharAttach(), and if there's no SharCreate() where would we get that non-NULL value?

EnterpriseDB is working on a memory allocator which will manage chunks of dynamic shared memory and provide an allocate/free interface to allow small allocations to be carved out of large DSM segments:

https://wiki.postgresql.org/wiki/EnterpriseDB_database_server_roadmap

I expect that to be useful for parallel query and anything else where processes need to share variable-size data. However, that's different from this because ours can grow to arbitrary size and shrink again by allocating and freeing DSM segments. We also do everything with relative pointers since DSM segments can be mapped at different addresses in different processes, whereas this would only work with memory carved out of the main shared memory segment (or some new DSM facility that guaranteed identical placement in every address space).

I expect we'll probably post our implementation of this shortly after 9.7 development opens. I've been a bit reluctant to put it out there until we have a tangible application of the allocator working, for fear people will say "that's not good for anything!". I'm confident it's good for lots of things, but other people have been known not to share my confidence.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
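The relative-pointer approach mentioned in the message above can be illustrated with a small sketch. Links are stored as byte offsets from the segment base, so the same bytes stay valid at whatever address a process maps them. The `relptr` type, the helper functions, and the use of offset 0 as the null value are assumptions for illustration, not the EnterpriseDB implementation. Two heap buffers stand in for one segment mapped at two different addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical relative pointer: byte offset from the segment base.
 * Offset 0 is reserved to mean "null", so nothing useful lives there. */
typedef size_t relptr;

static relptr relptr_store(const void *base, const void *ptr)
{
    if (ptr == NULL)
        return 0;
    return (relptr) ((const char *) ptr - (const char *) base);
}

static void *relptr_access(void *base, relptr off)
{
    return off == 0 ? NULL : (char *) base + off;
}

/* A list node that lives inside the segment and links by offset. */
typedef struct {
    int    value;
    relptr next;
} node;

/* Build a two-node list in one buffer, copy the bytes to a second buffer
 * (a different base address), and walk the list through the second base.
 * Returns the sum of the node values, or -1 on allocation failure. */
static int sum_list_via_second_mapping(void)
{
    enum { SEG_SIZE = 256 };
    char *seg_a = malloc(SEG_SIZE);
    char *seg_b = malloc(SEG_SIZE);
    if (seg_a == NULL || seg_b == NULL) {
        free(seg_a);
        free(seg_b);
        return -1;
    }
    memset(seg_a, 0, SEG_SIZE);

    node *n1 = (node *) (seg_a + 16);
    node *n2 = (node *) (seg_a + 64);
    n1->value = 40;
    n1->next = relptr_store(seg_a, n2);
    n2->value = 2;
    n2->next = 0;

    memcpy(seg_b, seg_a, SEG_SIZE);   /* same contents, different address */

    int sum = 0;
    for (node *n = relptr_access(seg_b, 16); n != NULL;
         n = relptr_access(seg_b, n->next))
        sum += n->value;

    free(seg_a);
    free(seg_b);
    return sum;
}
```

Had the nodes stored raw `node *` links, the walk through the second buffer would have followed pointers back into the first one. With offsets, the walk stays inside whichever base it is given.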
Re: [HACKERS] Experimental dynamic memory allocation of postgresql shared memory
Aleksey Demakov writes:
> I have some very experimental code to enable dynamic memory allocation
> of shared memory for postgresql backend processes.

Um ... what's this do that the existing DSM stuff doesn't do?

regards, tom lane
[HACKERS] Experimental dynamic memory allocation of postgresql shared memory
Hi all,

I have some very experimental code to enable dynamic memory allocation of shared memory for postgresql backend processes. The source code in the repository is not complete yet. Moreover, it is not immediately useful by itself. However, it might serve as a basis for implementing higher-level features, such as expanding hash tables or other data structures for sharing data between backends. Ultimately it might be used for an in-memory data store usable via the FDW interface.

Although such higher-level features are not available yet, the code might still be interesting for curious eyes.

https://github.com/ademakov/sharena

The first stage of this project was funded by Postgres Pro. Many thanks to this wonderful team.

Regards,
Aleksey