Re: PostgreSQL performance on FreeBSD

2016-06-25 Thread Maxim Sobolev
Sean, the issue you are describing could also be solved another way. One
approach, perhaps more portable, is to share a connected socketpair between
the two communicating processes, so that you can do a non-blocking read on
one end from time to time and check whether it returns EOF. That would be
the case if the process holding the other end is no longer there. So instead
of a shared memory segment, you can have a pool of descriptors, one for each
worker that you care about. Polling those would be trivial with regular
poll(2). The only issue might be that postgres forks a lot, so we would
probably need to implement FD_CLOFORK to avoid copying those extra fds into
every child.

This is akin to a solution that I recently posted to work around the problem
that you cannot really waitpid() on a grandchild; see PG BUG #14199 for
details and the patch.

But yes, it would be really nice to get rid of SYSV shared memory use in PG
completely at some point, one way or another.

-Max

On Thu, Jun 23, 2016 at 3:42 PM, Sean Chittenden 
wrote:

> Small nit:
>
> PostgreSQL used SYSV because it allowed for the detection of dead
> processes.  If you `kill -9`’ed a process, PostgreSQL can detect that and
> then shut down and perform an automatic recovery.  In this regard, sysv is
> pretty clever.  The move to POSIX shared mem was done for a host of
> reasons, but it means that you don’t have to adjust your SYSV limits.  My
> understanding from a few years ago is that there is still a ~64KB SYSV
> memory segment that is still used to act as the latch to signal if a
> process was killed, but all of the shared buffers are stored in posix
> mmap’ed regions.
>
> At this point in time this could be replaced with kqueue(2) EVFILT_PROC,
> but no one has done that yet.
>
> -sc
>
>
>
> --
> Sean Chittenden
> s...@chittenden.org
>
> > On Jun 22, 2016, at 07:26 , Maxim Sobolev  wrote:
> >
> > Konstantin,
> >
> > Not if you do sem_unlink() immediately, AFAIK. And that's what PG does. So
> > the window of opportunity for the leakage is quite small, much smaller than
> > for SYSV primitives. Sorry for missing your status update message, I've
> > missed it somehow.
> >
> > 
> >         mySem = sem_open(semname, O_CREAT | O_EXCL,
> >                          (mode_t) IPCProtection,
> >                          (unsigned) 1);
> >
> > #ifdef SEM_FAILED
> >         if (mySem != (sem_t *) SEM_FAILED)
> >             break;
> > #else
> >         if (mySem != (sem_t *) (-1))
> >             break;
> > #endif
> >
> >         /* Loop if error indicates a collision */
> >         if (errno == EEXIST || errno == EACCES || errno == EINTR)
> >             continue;
> >
> >         /*
> >          * Else complain and abort
> >          */
> >         elog(FATAL, "sem_open(\"%s\") failed: %m", semname);
> >     }
> >
> >     /*
> >      * Unlink the semaphore immediately, so it can't be accessed externally.
> >      * This also ensures that it will go away if we crash.
> >      */
> >     sem_unlink(semname);
> >
> >     return mySem;
> > 
> >
> > -Max
> >
> > On Wed, Jun 22, 2016 at 3:02 AM, Konstantin Belousov <kostik...@gmail.com>
> > wrote:
> >
> >> On Tue, Jun 21, 2016 at 12:48:00PM -0700, Maxim Sobolev wrote:
> >>> Thanks, Konstantin for the great work, we are definitely looking forward to
> >>> get all those improvements to be part of the default FreeBSD kernel/port.
> >>> Would be nice if you can post an update some day later as to what's
> >>> integrated and what's not.
> >> I did posted the update several days earlier.  Since you replying to this
> >> thread, it would be not unreasonable to read recent messages that were
> >> sent.
> >>
> >>>
> >>> Just in case, I've opened #14206 with PG to switch us to using POSIX
> >>> semaphores by default. Apart from the mentioned performance benefits, SYSV
> >>> semaphores are PITA to deal with as they come in very limited quantities by
> >>> default. Also they might stay around if PG dies/gets nuked and prevent it
> >>> from starting again due to overflow. We've got some quite ugly code to
> >>> clean up those using ipcrm(1) in our build scripts to deal with just that.
> >>> I am happy that code could be retired now.
> >>
> >> Named semaphores also stuck around if processes are killed without cleanup.
> >>
> >>
> > ___
> > freebsd-performa...@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-performance
> > To unsubscribe, send any mail to "
> freebsd-performance-unsubscr...@freebsd.org"
>
>
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: PostgreSQL performance on FreeBSD

2016-06-23 Thread Sean Chittenden
Small nit:

PostgreSQL used SYSV because it allowed for the detection of dead processes.
If you `kill -9` a process, PostgreSQL can detect that and then shut down
and perform an automatic recovery.  In this regard, SYSV is pretty clever.  The
move to POSIX shared mem was done for a host of reasons, but it means that you
don’t have to adjust your SYSV limits.  My understanding from a few years ago
is that there is still a ~64KB SYSV memory segment that acts as the latch to
signal if a process was killed, but all of the shared buffers are stored in
POSIX mmap’ed regions.

At this point in time this could be replaced with kqueue(2) EVFILT_PROC, but no
one has done that yet.

-sc



--
Sean Chittenden
s...@chittenden.org

> On Jun 22, 2016, at 07:26 , Maxim Sobolev  wrote:
> 
> Konstantin,
> 
> Not if you do sem_unlink() immediately, AFAIK. And that's what PG does. So
> the window of opportunity for the leakage is quite small, much smaller than
> for SYSV primitives. Sorry for missing your status update message, I've
> missed it somehow.
> 
> 
>         mySem = sem_open(semname, O_CREAT | O_EXCL,
>                          (mode_t) IPCProtection,
>                          (unsigned) 1);
>
> #ifdef SEM_FAILED
>         if (mySem != (sem_t *) SEM_FAILED)
>             break;
> #else
>         if (mySem != (sem_t *) (-1))
>             break;
> #endif
>
>         /* Loop if error indicates a collision */
>         if (errno == EEXIST || errno == EACCES || errno == EINTR)
>             continue;
>
>         /*
>          * Else complain and abort
>          */
>         elog(FATAL, "sem_open(\"%s\") failed: %m", semname);
>     }
>
>     /*
>      * Unlink the semaphore immediately, so it can't be accessed externally.
>      * This also ensures that it will go away if we crash.
>      */
>     sem_unlink(semname);
>
>     return mySem;
> 
> 
> -Max
> 
> On Wed, Jun 22, 2016 at 3:02 AM, Konstantin Belousov 
> wrote:
> 
>> On Tue, Jun 21, 2016 at 12:48:00PM -0700, Maxim Sobolev wrote:
>>> Thanks, Konstantin for the great work, we are definitely looking forward to
>>> get all those improvements to be part of the default FreeBSD kernel/port.
>>> Would be nice if you can post an update some day later as to what's
>>> integrated and what's not.
>> I did posted the update several days earlier.  Since you replying to this
>> thread, it would be not unreasonable to read recent messages that were
>> sent.
>>
>>>
>>> Just in case, I've opened #14206 with PG to switch us to using POSIX
>>> semaphores by default. Apart from the mentioned performance benefits, SYSV
>>> semaphores are PITA to deal with as they come in very limited quantities by
>>> default. Also they might stay around if PG dies/gets nuked and prevent it
>>> from starting again due to overflow. We've got some quite ugly code to
>>> clean up those using ipcrm(1) in our build scripts to deal with just that.
>>> I am happy that code could be retired now.
>>
>> Named semaphores also stuck around if processes are killed without cleanup.
>> 
>> 


Re: PostgreSQL performance on FreeBSD

2016-06-23 Thread Jilles Tjoelker
On Wed, Jun 22, 2016 at 07:26:52AM -0700, Maxim Sobolev wrote:
> Konstantin,

> Not if you do sem_unlink() immediately, AFAIK. And that's what PG does. So
> the window of opportunity for the leakage is quite small, much smaller than
> for SYSV primitives. Sorry for missing your status update message, I've
> missed it somehow.

True, but if your process architecture supports that, you can also put
unnamed process-shared semaphores into a piece of shared memory (pad
sem_t to a cache line if contention is expected). This is more natural
API use (fully avoiding the leak) and uses less memory. It has been
supported for a long time, at least since FreeBSD 9.0.

Process-shared mutexes, condition variables, reader/writer locks, etc.
are available in FreeBSD 11 but use more memory (a 1-page object per
synchronization object), somewhat like named semaphores.

-- 
Jilles Tjoelker


Re: PostgreSQL performance on FreeBSD

2016-06-22 Thread Maxim Sobolev
Konstantin,

Not if you do sem_unlink() immediately, AFAIK, and that's what PG does. So
the window of opportunity for leakage is quite small, much smaller than
for SYSV primitives. Sorry for missing your status update message; I
overlooked it somehow.


        mySem = sem_open(semname, O_CREAT | O_EXCL,
                         (mode_t) IPCProtection,
                         (unsigned) 1);

#ifdef SEM_FAILED
        if (mySem != (sem_t *) SEM_FAILED)
            break;
#else
        if (mySem != (sem_t *) (-1))
            break;
#endif

        /* Loop if error indicates a collision */
        if (errno == EEXIST || errno == EACCES || errno == EINTR)
            continue;

        /*
         * Else complain and abort
         */
        elog(FATAL, "sem_open(\"%s\") failed: %m", semname);
    }

    /*
     * Unlink the semaphore immediately, so it can't be accessed externally.
     * This also ensures that it will go away if we crash.
     */
    sem_unlink(semname);

    return mySem;
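
The create-then-unlink pattern in the excerpt above can be tried outside
PostgreSQL with a minimal standalone sketch. The function name, the
"/demo-sem-..." naming scheme and the mode 0600 are assumptions made for
illustration. Unlike the excerpt, it does not retry on EACCES, and it reports
failure by returning NULL instead of calling elog():

```c
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a named semaphore under a fresh name and unlink the name right
 * away.  The semaphore stays usable through the returned handle, but it can
 * no longer be opened by name and is destroyed once every process holding
 * it has closed it or exited.  If name_out is not NULL, the name that was
 * used is copied there (useful only for checking that it is gone).
 */
sem_t *
create_unlinked_sem(unsigned initial, char *name_out, size_t name_len)
{
    static int next_key = 0;    /* hypothetical key counter */
    char       semname[64];
    sem_t     *sem;

    for (;;) {
        snprintf(semname, sizeof(semname), "/demo-sem-%ld-%d",
                 (long) getpid(), next_key++);

        sem = sem_open(semname, O_CREAT | O_EXCL, (mode_t) 0600, initial);
        if (sem != SEM_FAILED)
            break;

        /* Try the next name on a collision, retry on an interrupted call */
        if (errno == EEXIST || errno == EINTR)
            continue;

        return NULL;
    }

    /* The leak window is the time between sem_open() above and this call */
    sem_unlink(semname);

    if (name_out != NULL)
        snprintf(name_out, name_len, "%s", semname);
    return sem;
}
```

After the call, a sem_open() on the same name without O_CREAT fails with
ENOENT, while the returned handle keeps working.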


-Max

On Wed, Jun 22, 2016 at 3:02 AM, Konstantin Belousov 
wrote:

> On Tue, Jun 21, 2016 at 12:48:00PM -0700, Maxim Sobolev wrote:
> > Thanks, Konstantin for the great work, we are definitely looking forward to
> > get all those improvements to be part of the default FreeBSD kernel/port.
> > Would be nice if you can post an update some day later as to what's
> > integrated and what's not.
> I did posted the update several days earlier.  Since you replying to this
> thread, it would be not unreasonable to read recent messages that were
> sent.
>
> >
> > Just in case, I've opened #14206 with PG to switch us to using POSIX
> > semaphores by default. Apart from the mentioned performance benefits, SYSV
> > semaphores are PITA to deal with as they come in very limited quantities by
> > default. Also they might stay around if PG dies/gets nuked and prevent it
> > from starting again due to overflow. We've got some quite ugly code to
> > clean up those using ipcrm(1) in our build scripts to deal with just that.
> > I am happy that code could be retired now.
>
> Named semaphores also stuck around if processes are killed without cleanup.
>
>


Re: PostgreSQL performance on FreeBSD

2016-06-22 Thread Konstantin Belousov
On Tue, Jun 21, 2016 at 12:48:00PM -0700, Maxim Sobolev wrote:
> Thanks, Konstantin for the great work, we are definitely looking forward to
> get all those improvements to be part of the default FreeBSD kernel/port.
> Would be nice if you can post an update some day later as to what's
> integrated and what's not.
I did post the update several days earlier.  Since you are replying to this
thread, it would not be unreasonable to read the recent messages that were
sent.

> 
> Just in case, I've opened #14206 with PG to switch us to using POSIX
> semaphores by default. Apart from the mentioned performance benefits, SYSV
> semaphores are PITA to deal with as they come in very limited quantities by
> default. Also they might stay around if PG dies/gets nuked and prevent it
> from starting again due to overflow. We've got some quite ugly code to
> clean up those using ipcrm(1) in our build scripts to deal with just that.
> I am happy that code could be retired now.

Named semaphores also stick around if processes are killed without cleanup.


Re: PostgreSQL performance on FreeBSD

2016-06-21 Thread Maxim Sobolev
Thanks, Konstantin, for the great work. We are definitely looking forward to
having all those improvements become part of the default FreeBSD kernel/port.
It would be nice if you could post an update at some later point as to what's
integrated and what's not.

Just in case, I've opened #14206 with PG to switch us to using POSIX
semaphores by default. Apart from the mentioned performance benefits, SYSV
semaphores are a PITA to deal with, as they come in very limited quantities by
default. They might also stay around if PG dies/gets nuked and prevent it
from starting again due to overflow. We've got some quite ugly code to
clean those up using ipcrm(1) in our build scripts to deal with just that.
I am happy that this code can be retired now.

-Maxim

On Fri, Jun 3, 2016 at 11:53 AM, John Baldwin  wrote:

> On Friday, June 03, 2016 11:29:03 AM Adrian Chadd wrote:
> > On 3 June 2016 at 11:27, Adrian Chadd  wrote:
> >
> > > That and the other NUMA stuff is something to address in -12.
> >
> > And, I completely welcome continued development in NUMA scaling in
> > combination with discussion. The iterator changes I committed are a
> > more generic version of a patch people were applying on top of -10 and
> > -head for at least what, three years now? Maybe more if -9 also just
> > did round-robin and not first-touch?
>
> 8 and 9 did first-touch.  Only 10 did round-robin.
>
> --
> John Baldwin
>
>


Re: PostgreSQL performance on FreeBSD

2016-06-04 Thread John Baldwin
On Friday, June 03, 2016 11:29:03 AM Adrian Chadd wrote:
> On 3 June 2016 at 11:27, Adrian Chadd  wrote:
> 
> > That and the other NUMA stuff is something to address in -12.
> 
> And, I completely welcome continued development in NUMA scaling in
> combination with discussion. The iterator changes I committed are a
> more generic version of a patch people were applying on top of -10 and
> -head for at least what, three years now? Maybe more if -9 also just
> did round-robin and not first-touch?

8 and 9 did first-touch.  Only 10 did round-robin.

-- 
John Baldwin


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Adrian Chadd
On 3 June 2016 at 11:27, Adrian Chadd  wrote:

> That and the other NUMA stuff is something to address in -12.

And, I completely welcome continued development in NUMA scaling in
combination with discussion. The iterator changes I committed are a
more generic version of a patch people were applying on top of -10 and
-head for at least what, three years now? Maybe more if -9 also just
did round-robin and not first-touch?



-adrian


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Adrian Chadd
On 3 June 2016 at 10:55, Konstantin Belousov  wrote:
> On Fri, Jun 03, 2016 at 11:29:13AM -0600, Alan Somers wrote:
>> On Fri, Jun 3, 2016 at 11:26 AM, Konstantin Belousov
>>  wrote:
>> > On Fri, Jun 03, 2016 at 09:29:16AM -0600, Alan Somers wrote:
>> >> I notice that, with the exception of the VM_PHYSSEG_MAX change, these
>> >> patches never made it into head or ports.  Are they unsuitable for low
>> >> core-count machines, or is there some other reason not to commit them?
>> >>  If not, what would it take to get these into 11.0 or 11.1 ?
>> >
>> > The fast page fault handler was redesigned and committed in r269728
>> > and r270011 (with several follow-ups).
>> > Instead of lock-less buffer queues iterators, Jeff changed buffer allocator
>> > to use uma, see r289279.  Other improvement to the buffer cache was
>> > committed as r267255.
>> >
>> > What was not committed is the aggressive pre-population of the phys objects
>> > mem queue, and a knob to further split NUMA domains into smaller domains.
>> > The later change is rotten.
>> >
>> > In fact, I think that with that load, what you would see right now on
>> > HEAD, is the contention on vm_page_queue_free_mtx.  There are plans to
>> > handle it.
>>
>> Thanks for the update.  Is it still recommended to enable the
>> multithreaded pagedaemon?
>
> Single-threaded pagedaemon cannot maintain the good system state even
> on non-NUMA systems, if machine has large memory.  This was the motivation
> for the NUMA domain split patch.  So yes, to get better performance you
> should enable VM_NUMA_ALLOC option.
>
> Unfortunately, there were some code changes of quite low quality which
> resulted in the NUMA-enabled system to randomly fail with NULL pointer
> deref in the vm page alloc path.  Supposedly that was fixed, but you
> should try that yourself.  One result of the mentioned changes was that
> nobody used/tested NUMA-enabled systems under any significant load, for
> quite long time.

The iterator bug was fixed, so it still behaves like it used to if
NUMA is enabled circa what, freebsd-9? If you'd like that older
behavior, you can totally flip back to the global policy being
round-robin only, and it's then a glorified, configurable-at-runtime
no-op.

The difference now is that you can tickle imbalances if you have too
many processes that need pages from a specific domain instead of round
robin, because the underlying tracking mechanisms still assume a
single global pool and global method of cleaning things.

That and the other NUMA stuff is something to address in -12.


-adrian


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Konstantin Belousov
On Fri, Jun 03, 2016 at 11:29:13AM -0600, Alan Somers wrote:
> On Fri, Jun 3, 2016 at 11:26 AM, Konstantin Belousov
>  wrote:
> > On Fri, Jun 03, 2016 at 09:29:16AM -0600, Alan Somers wrote:
> >> I notice that, with the exception of the VM_PHYSSEG_MAX change, these
> >> patches never made it into head or ports.  Are they unsuitable for low
> >> core-count machines, or is there some other reason not to commit them?
> >>  If not, what would it take to get these into 11.0 or 11.1 ?
> >
> > The fast page fault handler was redesigned and committed in r269728
> > and r270011 (with several follow-ups).
> > Instead of lock-less buffer queues iterators, Jeff changed buffer allocator
> > to use uma, see r289279.  Other improvement to the buffer cache was
> > committed as r267255.
> >
> > What was not committed is the aggressive pre-population of the phys objects
> > mem queue, and a knob to further split NUMA domains into smaller domains.
> > The later change is rotten.
> >
> > In fact, I think that with that load, what you would see right now on
> > HEAD, is the contention on vm_page_queue_free_mtx.  There are plans to
> > handle it.
> 
> Thanks for the update.  Is it still recommended to enable the
> multithreaded pagedaemon?

A single-threaded pagedaemon cannot keep the system in a good state, even
on non-NUMA systems, if the machine has a lot of memory.  This was the
motivation for the NUMA domain split patch.  So yes, to get better performance
you should enable the VM_NUMA_ALLOC option.

Unfortunately, there were some code changes of quite low quality which
caused NUMA-enabled systems to fail randomly with a NULL pointer
dereference in the vm page alloc path.  Supposedly that was fixed, but you
should try it yourself.  One result of the mentioned changes was that
nobody used or tested NUMA-enabled systems under any significant load for
quite a long time.


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Alan Somers
On Fri, Jun 3, 2016 at 11:26 AM, Konstantin Belousov
 wrote:
> On Fri, Jun 03, 2016 at 09:29:16AM -0600, Alan Somers wrote:
>> I notice that, with the exception of the VM_PHYSSEG_MAX change, these
>> patches never made it into head or ports.  Are they unsuitable for low
>> core-count machines, or is there some other reason not to commit them?
>>  If not, what would it take to get these into 11.0 or 11.1 ?
>
> The fast page fault handler was redesigned and committed in r269728
> and r270011 (with several follow-ups).
> Instead of lock-less buffer queues iterators, Jeff changed buffer allocator
> to use uma, see r289279.  Other improvement to the buffer cache was
> committed as r267255.
>
> What was not committed is the aggressive pre-population of the phys objects
> mem queue, and a knob to further split NUMA domains into smaller domains.
> The later change is rotten.
>
> In fact, I think that with that load, what you would see right now on
> HEAD, is the contention on vm_page_queue_free_mtx.  There are plans to
> handle it.

Thanks for the update.  Is it still recommended to enable the
multithreaded pagedaemon?


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Konstantin Belousov
On Fri, Jun 03, 2016 at 09:29:16AM -0600, Alan Somers wrote:
> I notice that, with the exception of the VM_PHYSSEG_MAX change, these
> patches never made it into head or ports.  Are they unsuitable for low
> core-count machines, or is there some other reason not to commit them?
>  If not, what would it take to get these into 11.0 or 11.1 ?

The fast page fault handler was redesigned and committed in r269728
and r270011 (with several follow-ups).
Instead of lock-less buffer queue iterators, Jeff changed the buffer allocator
to use UMA, see r289279.  Another improvement to the buffer cache was
committed as r267255.

What was not committed is the aggressive pre-population of the phys objects
mem queue, and a knob to further split NUMA domains into smaller domains.
The latter change is rotten.

In fact, I think that with that load, what you would see right now on
HEAD is contention on vm_page_queue_free_mtx.  There are plans to
handle it.


Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Matthew Macy

> >>> A couple small steps have been taken toward eliminating the need for this
> >>> hack: the addition of the "page size index" field to struct vm_page and the
> >>> addition of a similarly named parameter to pmap_enter().  However, at the
> >>> moment, the only tangible effect is in the automatic prefaulting by
> >>> mmap(2).  Instead of establishing 96 4KB page mappings, the automatic
> >>> prefaulting establishes 96 page mappings whose size is determined by the
> >>> size of the physical pages that it finds in the vm object.  So, the
> >>> prefaulting overhead remains constant, but the coverage provided by the
> >>> automatic prefaulting will vary with the underlying page size.
> >> Yes, I think what we might actually want is what I mentioned in person at
> >> BSDCan: some sort of flag to mmap() that malloc() could use to assume that any
> >> reservations are fully used when they are reserved.  This would avoid the need
> >> to wait for all pages to be dirtied before promotion provides a superpage
> >> mapping and would avoid demotions while still allowing the kernel to gracefully
> >> fall back to regular pages if a reservation can't be made.
> >>
> >
> > I agree.
>
> I notice that, with the exception of the VM_PHYSSEG_MAX change, these
> patches never made it into head or ports.  Are they unsuitable for low
> core-count machines, or is there some other reason not to commit them?
> If not, what would it take to get these into 11.0 or 11.1?
>

I think the two big issues are: a) there's a lot more work that needs to be 
done b) Adrian has had a lot of other things on his plate in the meantime. 
Adrian is hoping to get back to it post 11.0-RELEASE.



Re: PostgreSQL performance on FreeBSD

2016-06-03 Thread Alan Somers
On Thu, Aug 14, 2014 at 12:19 PM, Alan Cox  wrote:
> On 08/14/2014 10:47, John Baldwin wrote:
>> On Wednesday, August 13, 2014 1:00:22 pm Alan Cox wrote:
>>> On Tue, Aug 12, 2014 at 1:09 PM, John Baldwin  wrote:
>>>
>>>> On Wednesday, July 16, 2014 1:52:45 pm Adrian Chadd wrote:
>>>>> Hi!
>>>>>
>>>>>
>>>>> On 16 July 2014 06:29, Konstantin Belousov wrote:
>>>>>> On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
>>>>>>> Hi,
>>>>>>> I did some measurements and hacks to see about the performance and
>>>>>>> scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
>>>>>>> Foundation.
>>>>>>>
>>>>>>> The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
>>>>>>> The uncommitted patches, referenced in the article, are available as
>>>>>>> https://kib.kiev.ua/kib/pig1.patch.txt
>>>>>>> https://kib.kiev.ua/kib/patch-2
>>>>>> A followup to the original paper.
>>>>>>
>>>>>> Most importantly, I identified the cause for the drop on the graph
>>>>>> after the 30 clients, which appeared to be the debugging version
>>>>>> of malloc(3) in libc.
>>>>>>
>>>>>> Also there are some updates on the patches.
>>>>>>
>>>>>> New version of the paper is available at
>>>>>> https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
>>>>>> The changes are marked as 'update for version 2.0'.
>>>>> Would you mind trying a default (non-PRODUCTION) build, but with junk
>>>>> filling turned off?
>>>>>
>>>>> adrian@adrian-hackbox:~ % ls -l /etc/malloc.conf
>>>>>
>>>>> lrwxr-xr-x  1 root  wheel  10 Jun 24 04:37 /etc/malloc.conf -> junk:false
>>>>>
>>>>> That fixes almost all of the malloc debug performance issues that I
>>>>> see without having to recompile.
>>>>>
>>>>> I'd like to know if you see any after that.
>>>> OTOH, I have actually seen junk profiling _improve_ performance in certain
>>>> cases as it forces promotion of allocated pages to superpages since all pages
>>>> are dirtied.  (I have a local hack that adds a new malloc option to explicitly
>>>> memset() new pages allocated via mmap() that gives the same benefit without
>>>> the junking overhead on each malloc() / free(), but it does increase physical
>>>> RAM usage.)


>>> John,
>>>
>>> A couple small steps have been taken toward eliminating the need for this
>>> hack: the addition of the "page size index" field to struct vm_page and the
>>> addition of a similarly named parameter to pmap_enter().  However, at the
>>> moment, the only tangible effect is in the automatic prefaulting by
>>> mmap(2).  Instead of establishing 96 4KB page mappings, the automatic
>>> prefaulting establishes 96 page mappings whose size is determined by the
>>> size of the physical pages that it finds in the vm object.  So, the
>>> prefaulting overhead remains constant, but the coverage provided by the
>>> automatic prefaulting will vary with the underlying page size.
>> Yes, I think what we might actually want is what I mentioned in person at
>> BSDCan: some sort of flag to mmap() that malloc() could use to assume that any
>> reservations are fully used when they are reserved.  This would avoid the need
>> to wait for all pages to be dirtied before promotion provides a superpage
>> mapping and would avoid demotions while still allowing the kernel to gracefully
>> fall back to regular pages if a reservation can't be made.
>>
>
> I agree.

I notice that, with the exception of the VM_PHYSSEG_MAX change, these
patches never made it into head or ports.  Are they unsuitable for low
core-count machines, or is there some other reason not to commit them?
If not, what would it take to get these into 11.0 or 11.1?

-Alan


Re: PostgreSQL performance on FreeBSD

2014-08-14 Thread John Baldwin
On Tuesday, August 12, 2014 5:36:26 pm Adrian Chadd wrote:
> On 12 August 2014 11:09, John Baldwin j...@freebsd.org wrote:
>> On Wednesday, July 16, 2014 1:52:45 pm Adrian Chadd wrote:
>>> Hi!
>>>
>>>
>>> On 16 July 2014 06:29, Konstantin Belousov kostik...@gmail.com wrote:
>>>> On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
>>>>> Hi,
>>>>> I did some measurements and hacks to see about the performance and
>>>>> scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
>>>>> Foundation.
>>>>>
>>>>> The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
>>>>> The uncommitted patches, referenced in the article, are available as
>>>>> https://kib.kiev.ua/kib/pig1.patch.txt
>>>>> https://kib.kiev.ua/kib/patch-2
>>>>
>>>> A followup to the original paper.
>>>>
>>>> Most importantly, I identified the cause for the drop on the graph
>>>> after the 30 clients, which appeared to be the debugging version
>>>> of malloc(3) in libc.
>>>>
>>>> Also there are some updates on the patches.
>>>>
>>>> New version of the paper is available at
>>>> https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
>>>> The changes are marked as 'update for version 2.0'.
>>>
>>> Would you mind trying a default (non-PRODUCTION) build, but with junk
>>> filling turned off?
>>>
>>> adrian@adrian-hackbox:~ % ls -l /etc/malloc.conf
>>>
>>> lrwxr-xr-x  1 root  wheel  10 Jun 24 04:37 /etc/malloc.conf -> junk:false
>>>
>>> That fixes almost all of the malloc debug performance issues that I
>>> see without having to recompile.
>>>
>>> I'd like to know if you see any after that.
>>
>> OTOH, I have actually seen junk profiling _improve_ performance in certain
>> cases as it forces promotion of allocated pages to superpages since all pages
>> are dirtied.  (I have a local hack that adds a new malloc option to explicitly
>> memset() new pages allocated via mmap() that gives the same benefit without
>> the junking overhead on each malloc() / free(), but it does increase physical
>> RAM usage.)
>
> Hm. this isn't a jemalloc config option?

No, jemalloc does have a 'zero fill' option, but that runs on every malloc so
it is just as expensive as junking.  What my hack does is only zero the pages
when they are first mmap'd, so subsequent free() / malloc() cycles that reuse
the pages do not do any zeroing.

-- 
John Baldwin


Re: PostgreSQL performance on FreeBSD

2014-08-14 Thread John Baldwin
On Wednesday, August 13, 2014 1:00:22 pm Alan Cox wrote:
 On Tue, Aug 12, 2014 at 1:09 PM, John Baldwin j...@freebsd.org wrote:
 
  On Wednesday, July 16, 2014 1:52:45 pm Adrian Chadd wrote:
   Hi!
  
  
   On 16 July 2014 06:29, Konstantin Belousov kostik...@gmail.com wrote:
On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
Hi,
I did some measurements and hacks to see about the performance and
scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
Foundation.
   
The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
The uncommitted patches, referenced in the article, are available as
https://kib.kiev.ua/kib/pig1.patch.txt
https://kib.kiev.ua/kib/patch-2
   
A followup to the original paper.
   
Most importantly, I identified the cause for the drop on the graph
after the 30 clients, which appeared to be the debugging version
of malloc(3) in libc.
   
Also there are some updates on the patches.
   
New version of the paper is available at
https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
The changes are marked as 'update for version 2.0'.
  
   Would you mind trying a default (non-PRODUCTION) build, but with junk
   filling turned off?
  
   adrian@adrian-hackbox:~ % ls -l /etc/malloc.conf
  
   lrwxr-xr-x  1 root  wheel  10 Jun 24 04:37 /etc/malloc.conf -> junk:false
  
   That fixes almost all of the malloc debug performance issues that I
   see without having to recompile.
  
   I'd like to know if you see any after that.
 
  OTOH, I have actually seen junk filling _improve_ performance in certain
  cases as it forces promotion of allocated pages to superpages since all
  pages are dirtied.  (I have a local hack that adds a new malloc option to
  explicitly memset() new pages allocated via mmap() that gives the same
  benefit without the junking overhead on each malloc() / free(), but it
  does increase physical RAM usage.)
 
 
 
 John,
 
 A couple small steps have been taken toward eliminating the need for this
 hack: the addition of the page size index field to struct vm_page and the
 addition of a similarly named parameter to pmap_enter().  However, at the
 moment, the only tangible effect is in the automatic prefaulting by
 mmap(2).  Instead of establishing 96 4KB page mappings, the automatic
 prefaulting establishes 96 page mappings whose size is determined by the
 size of the physical pages that it finds in the vm object.  So, the
 prefaulting overhead remains constant, but the coverage provided by the
 automatic prefaulting will vary with the underlying page size.

Yes, I think what we might actually want is what I mentioned in person at
BSDCan: some sort of flag to mmap() that malloc() could use to assume that any
reservations are fully used when they are reserved.  This would avoid the need
to wait for all pages to be dirtied before promotion provides a superpage
mapping and would avoid demotions while still allowing the kernel to gracefully
fall back to regular pages if a reservation can't be made.

-- 
John Baldwin


Re: PostgreSQL performance on FreeBSD

2014-08-14 Thread Alan Cox
On 08/14/2014 10:47, John Baldwin wrote:
 On Wednesday, August 13, 2014 1:00:22 pm Alan Cox wrote:
 On Tue, Aug 12, 2014 at 1:09 PM, John Baldwin j...@freebsd.org wrote:

 On Wednesday, July 16, 2014 1:52:45 pm Adrian Chadd wrote:
 Hi!


 On 16 July 2014 06:29, Konstantin Belousov kostik...@gmail.com wrote:
 On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
 Hi,
 I did some measurements and hacks to see about the performance and
 scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
 Foundation.

 The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
 The uncommitted patches, referenced in the article, are available as
 https://kib.kiev.ua/kib/pig1.patch.txt
 https://kib.kiev.ua/kib/patch-2
 A followup to the original paper.

 Most importantly, I identified the cause for the drop on the graph
 after the 30 clients, which appeared to be the debugging version
 of malloc(3) in libc.

 Also there are some updates on the patches.

 New version of the paper is available at
 https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
 The changes are marked as 'update for version 2.0'.
 Would you mind trying a default (non-PRODUCTION) build, but with junk
 filling turned off?

 adrian@adrian-hackbox:~ % ls -l /etc/malloc.conf

 lrwxr-xr-x  1 root  wheel  10 Jun 24 04:37 /etc/malloc.conf -> junk:false

 That fixes almost all of the malloc debug performance issues that I
 see without having to recompile.

 I'd like to know if you see any after that.
 OTOH, I have actually seen junk filling _improve_ performance in certain
 cases as it forces promotion of allocated pages to superpages since all
 pages are dirtied.  (I have a local hack that adds a new malloc option to
 explicitly memset() new pages allocated via mmap() that gives the same
 benefit without the junking overhead on each malloc() / free(), but it
 does increase physical RAM usage.)


 John,

 A couple small steps have been taken toward eliminating the need for this
 hack: the addition of the page size index field to struct vm_page and the
 addition of a similarly named parameter to pmap_enter().  However, at the
 moment, the only tangible effect is in the automatic prefaulting by
 mmap(2).  Instead of establishing 96 4KB page mappings, the automatic
 prefaulting establishes 96 page mappings whose size is determined by the
 size of the physical pages that it finds in the vm object.  So, the
 prefaulting overhead remains constant, but the coverage provided by the
 automatic prefaulting will vary with the underlying page size.
 Yes, I think what we might actually want is what I mentioned in person at
 BSDCan: some sort of flag to mmap() that malloc() could use to assume that any
 reservations are fully used when they are reserved.  This would avoid the need
 to wait for all pages to be dirtied before promotion provides a superpage
 mapping and would avoid demotions while still allowing the kernel to
 gracefully fall back to regular pages if a reservation can't be made.


I agree.



Re: PostgreSQL performance on FreeBSD

2014-08-13 Thread David Chisnall
On 12 Aug 2014, at 19:09, John Baldwin j...@freebsd.org wrote:

 OTOH, I have actually seen junk filling _improve_ performance in certain
 cases as it forces promotion of allocated pages to superpages since all
 pages are dirtied.  (I have a local hack that adds a new malloc option to
 explicitly memset() new pages allocated via mmap() that gives the same
 benefit without the junking overhead on each malloc() / free(), but it
 does increase physical RAM usage.)

Do you get the same effect by adding MAP_ALIGNED_SUPER | MAP_PREFAULT_READ to 
the mmap() call in jemalloc?  I've been meaning to try the latter on BERI, as 
we spend a lot of time bouncing back and forth between user code and the TLB 
miss handlers.  Given that jemalloc asks for memory in 8MB chunks (I think via 
a single mmap call, although I'm not 100% certain), MAP_ALIGNED_SUPER should 
have little impact on any platform.  MAP_PREFAULT_READ may cause problems on 
machines with limited RAM and no swap (I don't know if the VM subsystem knows 
that it can safely discard a zero'd page that has been read but not written - 
I'd hope so, but it's been a while since I read that code).

It might be that we can make jemalloc autotune whether to use MAP_PREFAULT_READ 
depending on some heuristic.  I wonder if something as simple as 'turn it on 
after the first mmap call' would be enough: programs that don't use more than 
8MB of RAM won't prefault, but after that the wasted physical memory becomes an 
increasingly small percentage.
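
A minimal sketch of the 'turn it on after the first mmap call' heuristic,
under stated assumptions: chunk_map(), chunk_wants_prefault() and
chunks_mapped are hypothetical names, not jemalloc internals, and the counter
is not thread-safe.  MAP_ALIGNED_SUPER and MAP_PREFAULT_READ are FreeBSD
mmap(2) flags, so they are defined away on other systems.  Note that a reply
in this thread states that MAP_PREFAULT_READ only maps pages that are already
allocated, so this shows the flag-selection logic only, not a demonstrated
performance win.

```c
#define _DEFAULT_SOURCE 1	/* glibc only: exposes MAP_ANON; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Both flags are FreeBSD-specific; fall back to 0 so the sketch builds elsewhere. */
#ifndef MAP_ALIGNED_SUPER
#define MAP_ALIGNED_SUPER 0
#endif
#ifndef MAP_PREFAULT_READ
#define MAP_PREFAULT_READ 0
#endif

#define CHUNK_SIZE ((size_t)8 << 20)	/* the 8MB chunk size mentioned above */

static unsigned chunks_mapped;		/* successful chunk mappings so far */

/* Heuristic: prefault only once the program has already used one chunk. */
static int
chunk_wants_prefault(unsigned prior_chunks)
{
	return (prior_chunks > 0);
}

/* Hypothetical chunk allocator; returns NULL on failure. */
static void *
chunk_map(void)
{
	int flags = MAP_PRIVATE | MAP_ANON | MAP_ALIGNED_SUPER;
	void *p;

	if (chunk_wants_prefault(chunks_mapped))
		flags |= MAP_PREFAULT_READ;
	p = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE, flags, -1, 0);
	if (p == MAP_FAILED)
		return (NULL);
	chunks_mapped++;
	return (p);
}
```

A program that never needs a second chunk never passes MAP_PREFAULT_READ;
every chunk after the first does.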

David



Re: PostgreSQL performance on FreeBSD

2014-08-13 Thread Alan Cox
On Wed, Aug 13, 2014 at 4:58 AM, David Chisnall thera...@freebsd.org
wrote:

 On 12 Aug 2014, at 19:09, John Baldwin j...@freebsd.org wrote:

  OTOH, I have actually seen junk filling _improve_ performance in certain
  cases as it forces promotion of allocated pages to superpages since all
  pages are dirtied.  (I have a local hack that adds a new malloc option to
  explicitly memset() new pages allocated via mmap() that gives the same
  benefit without the junking overhead on each malloc() / free(), but it
  does increase physical RAM usage.)

 Do you get the same effect by adding MAP_ALIGNED_SUPER | MAP_PREFAULT_READ
 to the mmap() call in jemalloc?



No.  MAP_PREFAULT_READ does not allocate physical pages.  It establishes
mappings to pages that are already allocated.



 I've been meaning to try the latter on BERI, as we spend a lot of time
 bouncing back and forth between user code and the TLB miss handlers.  Given
 that jemalloc asks for memory in 8MB chunks (I think via a single mmap
 call, although I'm not 100% certain), MAP_ALIGNED_SUPER should have little
 impact on any platform.  MAP_PREFAULT_READ may cause problems on machines
 with limited RAM and no swap (I don't know if the VM subsystem knows that
 it can safely discard a zero'd page that has been read but not written -
 I'd hope so, but it's been a while since I read that code).

 It might be that we can make jemalloc autotune whether to use
 MAP_PREFAULT_READ depending on some heuristic.  I wonder if something as
 simple as 'turn it on after the first mmap call' would be enough: programs
 that don't use more than 8MB of RAM won't prefault, but after that the
 wasted physical memory becomes an increasingly small percentage.

 David



Re: PostgreSQL performance on FreeBSD

2014-08-13 Thread Alan Cox
On Tue, Aug 12, 2014 at 1:09 PM, John Baldwin j...@freebsd.org wrote:

 On Wednesday, July 16, 2014 1:52:45 pm Adrian Chadd wrote:
  Hi!
 
 
  On 16 July 2014 06:29, Konstantin Belousov kostik...@gmail.com wrote:
   On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
   Hi,
   I did some measurements and hacks to see about the performance and
   scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
   Foundation.
  
   The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
   The uncommitted patches, referenced in the article, are available as
   https://kib.kiev.ua/kib/pig1.patch.txt
   https://kib.kiev.ua/kib/patch-2
  
   A followup to the original paper.
  
   Most importantly, I identified the cause for the drop on the graph
   after the 30 clients, which appeared to be the debugging version
   of malloc(3) in libc.
  
   Also there are some updates on the patches.
  
   New version of the paper is available at
   https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
   The changes are marked as 'update for version 2.0'.
 
  Would you mind trying a default (non-PRODUCTION) build, but with junk
  filling turned off?
 
  adrian@adrian-hackbox:~ % ls -l /etc/malloc.conf
 
  lrwxr-xr-x  1 root  wheel  10 Jun 24 04:37 /etc/malloc.conf -> junk:false
 
  That fixes almost all of the malloc debug performance issues that I
  see without having to recompile.
 
  I'd like to know if you see any after that.

 OTOH, I have actually seen junk filling _improve_ performance in certain
 cases as it forces promotion of allocated pages to superpages since all
 pages are dirtied.  (I have a local hack that adds a new malloc option to
 explicitly memset() new pages allocated via mmap() that gives the same
 benefit without the junking overhead on each malloc() / free(), but it
 does increase physical RAM usage.)



John,

A couple small steps have been taken toward eliminating the need for this
hack: the addition of the page size index field to struct vm_page and the
addition of a similarly named parameter to pmap_enter().  However, at the
moment, the only tangible effect is in the automatic prefaulting by
mmap(2).  Instead of establishing 96 4KB page mappings, the automatic
prefaulting establishes 96 page mappings whose size is determined by the
size of the physical pages that it finds in the vm object.  So, the
prefaulting overhead remains constant, but the coverage provided by the
automatic prefaulting will vary with the underlying page size.
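
As a rough worked example of the last point: the count of 96 mappings comes
from the text above, while the 2MB superpage size is an assumption (it holds
on amd64) and prefault_coverage() is a hypothetical helper, not a kernel
function.

```c
#include <assert.h>
#include <stddef.h>

#define PREFAULT_MAPPINGS 96	/* number of mappings stated in the message above */

/* Bytes covered when every prefaulted mapping uses the given page size. */
static size_t
prefault_coverage(size_t page_size)
{
	assert(page_size > 0);
	return (PREFAULT_MAPPINGS * page_size);
}
```

With 4KB pages this gives 96 * 4096 = 393216 bytes (384KB); with 2MB
superpages it gives 96 * 2097152 = 201326592 bytes (192MB), for the same
number of mappings and hence the same prefaulting overhead.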


Re: PostgreSQL performance on FreeBSD

2014-08-12 Thread John Baldwin
On Wednesday, July 16, 2014 1:52:45 pm Adrian Chadd wrote:
 Hi!
 
 
 On 16 July 2014 06:29, Konstantin Belousov kostik...@gmail.com wrote:
  On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
  Hi,
  I did some measurements and hacks to see about the performance and
  scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
  Foundation.
 
  The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
  The uncommitted patches, referenced in the article, are available as
  https://kib.kiev.ua/kib/pig1.patch.txt
  https://kib.kiev.ua/kib/patch-2
 
  A followup to the original paper.
 
  Most importantly, I identified the cause for the drop on the graph
  after the 30 clients, which appeared to be the debugging version
  of malloc(3) in libc.
 
  Also there are some updates on the patches.
 
  New version of the paper is available at
  https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
  The changes are marked as 'update for version 2.0'.
 
 Would you mind trying a default (non-PRODUCTION) build, but with junk
 filling turned off?
 
 adrian@adrian-hackbox:~ % ls -l /etc/malloc.conf
 
 lrwxr-xr-x  1 root  wheel  10 Jun 24 04:37 /etc/malloc.conf -> junk:false
 
 That fixes almost all of the malloc debug performance issues that I
 see without having to recompile.
 
 I'd like to know if you see any after that.

OTOH, I have actually seen junk filling _improve_ performance in certain
cases as it forces promotion of allocated pages to superpages since all pages
are dirtied.  (I have a local hack that adds a new malloc option to explicitly
memset() new pages allocated via mmap() that gives the same benefit without
the junking overhead on each malloc() / free(), but it does increase physical
RAM usage.)

-- 
John Baldwin


Re: PostgreSQL performance on FreeBSD

2014-08-12 Thread Adrian Chadd
On 12 August 2014 11:09, John Baldwin j...@freebsd.org wrote:
 On Wednesday, July 16, 2014 1:52:45 pm Adrian Chadd wrote:
 Hi!


 On 16 July 2014 06:29, Konstantin Belousov kostik...@gmail.com wrote:
  On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
  Hi,
  I did some measurements and hacks to see about the performance and
  scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
  Foundation.
 
  The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
  The uncommitted patches, referenced in the article, are available as
  https://kib.kiev.ua/kib/pig1.patch.txt
  https://kib.kiev.ua/kib/patch-2
 
  A followup to the original paper.
 
  Most importantly, I identified the cause for the drop on the graph
  after the 30 clients, which appeared to be the debugging version
  of malloc(3) in libc.
 
  Also there are some updates on the patches.
 
  New version of the paper is available at
  https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
  The changes are marked as 'update for version 2.0'.

 Would you mind trying a default (non-PRODUCTION) build, but with junk
 filling turned off?

 adrian@adrian-hackbox:~ % ls -l /etc/malloc.conf

 lrwxr-xr-x  1 root  wheel  10 Jun 24 04:37 /etc/malloc.conf -> junk:false

 That fixes almost all of the malloc debug performance issues that I
 see without having to recompile.

 I'd like to know if you see any after that.

 OTOH, I have actually seen junk filling _improve_ performance in certain
 cases as it forces promotion of allocated pages to superpages since all pages
 are dirtied.  (I have a local hack that adds a new malloc option to explicitly
 memset() new pages allocated via mmap() that gives the same benefit without
 the junking overhead on each malloc() / free(), but it does increase physical
 RAM usage.)

Hm. this isn't a jemalloc config option?


-a


Re: PostgreSQL performance on FreeBSD

2014-07-17 Thread Hooman Fazaeli

On 7/16/2014 5:59 PM, Konstantin Belousov wrote:

On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:

Hi,
I did some measurements and hacks to see about the performance and
scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
Foundation.

The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
The uncommitted patches, referenced in the article, are available as
https://kib.kiev.ua/kib/pig1.patch.txt
https://kib.kiev.ua/kib/patch-2

A followup to the original paper.

Most importantly, I identified the cause for the drop on the graph
after the 30 clients, which appeared to be the debugging version
of malloc(3) in libc.

Also there are some updates on the patches.

New version of the paper is available at
https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
The changes are marked as 'update for version 2.0'.


Thanks for the great work!

Did you test the effect of hyper-threading (on or off) on the results?


--

Best regards.
Hooman Fazaeli



Re: PostgreSQL performance on FreeBSD

2014-07-17 Thread O. Hartmann
On Thu, 17 Jul 2014 20:15:39 +0430
Hooman Fazaeli hoomanfaza...@gmail.com wrote:

 On 7/16/2014 5:59 PM, Konstantin Belousov wrote:
  On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
  Hi,
  I did some measurements and hacks to see about the performance and
  scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
  Foundation.
 
  The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
  The uncommitted patches, referenced in the article, are available as
  https://kib.kiev.ua/kib/pig1.patch.txt
  https://kib.kiev.ua/kib/patch-2
  A followup to the original paper.
 
  Most importantly, I identified the cause for the drop on the graph
  after the 30 clients, which appeared to be the debugging version
  of malloc(3) in libc.
 
  Also there are some updates on the patches.
 
  New version of the paper is available at
  https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
  The changes are marked as 'update for version 2.0'.
 
 Thanks for the great work!
 
 Did you test the effect of hyper-threading (on or off) on the results?
 
 


A naive side question:

Does this work only affect PostgreSQL 9.3, with recent FreeBSD being
optimized only for this purpose, or does it also provide some benefits
for other high-performance scenarios?

Oliver 




Re: PostgreSQL performance on FreeBSD

2014-07-16 Thread Konstantin Belousov
On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
 Hi,
 I did some measurements and hacks to see about the performance and
 scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
 Foundation.
 
 The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
 The uncommitted patches, referenced in the article, are available as
 https://kib.kiev.ua/kib/pig1.patch.txt
 https://kib.kiev.ua/kib/patch-2

This is a follow-up to the original paper.

Most importantly, I identified the cause of the drop in the graph
beyond 30 clients, which turned out to be the debugging version
of malloc(3) in libc.

There are also some updates to the patches.

A new version of the paper is available at
https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
The changes are marked as 'update for version 2.0'.




Re: PostgreSQL performance on FreeBSD

2014-07-16 Thread Adrian Chadd
Hi!


On 16 July 2014 06:29, Konstantin Belousov kostik...@gmail.com wrote:
 On Fri, Jun 27, 2014 at 03:56:13PM +0300, Konstantin Belousov wrote:
 Hi,
 I did some measurements and hacks to see about the performance and
 scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
 Foundation.

 The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
 The uncommitted patches, referenced in the article, are available as
 https://kib.kiev.ua/kib/pig1.patch.txt
 https://kib.kiev.ua/kib/patch-2

 A followup to the original paper.

 Most importantly, I identified the cause for the drop on the graph
 after the 30 clients, which appeared to be the debugging version
 of malloc(3) in libc.

 Also there are some updates on the patches.

 New version of the paper is available at
 https://www.kib.kiev.ua/kib/pgsql_perf_v2.0.pdf
 The changes are marked as 'update for version 2.0'.

Would you mind trying a default (non-PRODUCTION) build, but with junk
filling turned off?

adrian@adrian-hackbox:~ % ls -l /etc/malloc.conf

lrwxr-xr-x  1 root  wheel  10 Jun 24 04:37 /etc/malloc.conf -> junk:false

That fixes almost all of the malloc debug performance issues that I
see without having to recompile.

I'd like to know if you see any after that.
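
For anyone who wants to check this setting from a program: /etc/malloc.conf
is a symbolic link, and the option string is the link target itself, not the
contents of a file (such a link can be made with
ln -s 'junk:false' /etc/malloc.conf).  A minimal sketch in C follows;
malloc_conf_has() is a hypothetical helper, not part of libc or jemalloc, and
it does a plain substring match rather than parsing the option list.

```c
#define _DEFAULT_SOURCE 1	/* glibc only: exposes readlink(); harmless elsewhere */
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Return 1 if the symlink at `path` has a target containing `opt`
 * (for example "junk:false"), 0 otherwise or if the link cannot be read.
 */
static int
malloc_conf_has(const char *path, const char *opt)
{
	char target[256];
	ssize_t n;

	assert(path != NULL && opt != NULL);
	n = readlink(path, target, sizeof(target) - 1);
	if (n < 0)
		return (0);
	target[n] = '\0';	/* readlink() does not NUL-terminate */
	return (strstr(target, opt) != NULL);
}
```

On a machine set up as shown above, malloc_conf_has("/etc/malloc.conf",
"junk:false") would return 1.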

Thanks!



-a


RE: PostgreSQL performance on FreeBSD

2014-06-30 Thread Rang, Anton
Thanks for this.

The cpu_search problem you reference came up here at Isilon as well.  Here's a 
patch which should get clang to do the right thing (inlining 3 specialized 
copies of cpu_search); I haven't checked to make sure it doesn't hurt gcc, 
though.

Anton

Index: sched_ule.c
===================================================================
--- sched_ule.c (revision 268043)
+++ sched_ule.c (working copy)
@@ -622,11 +622,11 @@
 	for ((cpu) = 0; (cpu) <= mp_maxid; (cpu)++) \
 		if (CPU_ISSET(cpu, mask))
 
-static __inline int cpu_search(const struct cpu_group *cg, struct cpu_search *low,
+static __always_inline int cpu_search(const struct cpu_group *cg, struct cpu_search *low,
  struct cpu_search *high, const int match);
-int cpu_search_lowest(const struct cpu_group *cg, struct cpu_search *low);
-int cpu_search_highest(const struct cpu_group *cg, struct cpu_search *high);
-int cpu_search_both(const struct cpu_group *cg, struct cpu_search *low,
+int __noinline cpu_search_lowest(const struct cpu_group *cg, struct cpu_search *low);
+int __noinline cpu_search_highest(const struct cpu_group *cg, struct cpu_search *high);
+int __noinline cpu_search_both(const struct cpu_group *cg, struct cpu_search *low,
  struct cpu_search *high);
 
 /*
@@ -640,7 +640,7 @@
  * match argument.  It is reduced to the minimum set for each case.  It is
  * also recursive to the depth of the tree.
  */
-static __inline int
+static __always_inline int
 cpu_search(const struct cpu_group *cg, struct cpu_search *low,
 struct cpu_search *high, const int match)
 {
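
The idea behind the patch can be illustrated outside the kernel.  The sketch
below is not the sched_ule.c code: the load_search* names and the flat load
array are hypothetical, the real cpu_search() is recursive over the CPU
topology while this loop is not, and the attributes are written in GCC/clang
syntax, which is what FreeBSD's __always_inline and __noinline macros expand
to.  One generic worker takes a constant match argument, and each
out-of-line wrapper gets its own inlined copy in which the branch for the
unused case can be removed by the compiler.

```c
#include <assert.h>
#include <stddef.h>

#define MATCH_LOW	0x1
#define MATCH_HIGH	0x2

/*
 * Generic worker.  Every caller passes a literal `match`, so after forced
 * inlining the compiler can drop the test that can never be true.
 * *low and *high must be valid start indexes for the cases that are enabled.
 */
static inline __attribute__((always_inline)) void
load_search(const int *load, int n, int *low, int *high, const int match)
{
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		if ((match & MATCH_LOW) != 0 && load[i] < load[*low])
			*low = i;
		if ((match & MATCH_HIGH) != 0 && load[i] > load[*high])
			*high = i;
	}
}

/* Out-of-line entry points, one specialized copy of the worker in each. */
static __attribute__((noinline)) int
load_search_lowest(const int *load, int n)
{
	int low = 0;

	/* high is never dereferenced because MATCH_HIGH is not set. */
	load_search(load, n, &low, NULL, MATCH_LOW);
	return (low);
}

static __attribute__((noinline)) int
load_search_highest(const int *load, int n)
{
	int high = 0;

	load_search(load, n, NULL, &high, MATCH_HIGH);
	return (high);
}

static __attribute__((noinline)) void
load_search_both(const int *load, int n, int *low, int *high)
{
	*low = *high = 0;
	load_search(load, n, low, high, MATCH_LOW | MATCH_HIGH);
}
```

For the loads { 5, 2, 9, 4 }, load_search_lowest() returns index 1 and
load_search_highest() returns index 2.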

-Original Message-
From: owner-freebsd-curr...@freebsd.org 
[mailto:owner-freebsd-curr...@freebsd.org] On Behalf Of Konstantin Belousov
Sent: Friday, June 27, 2014 7:56 AM
To: performa...@freebsd.org
Cc: curr...@freebsd.org
Subject: PostgreSQL performance on FreeBSD

Hi,
I did some measurements and hacks to see about the performance and scalability 
of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD Foundation.

The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
The uncommitted patches, referenced in the article, are available as 
https://kib.kiev.ua/kib/pig1.patch.txt
https://kib.kiev.ua/kib/patch-2


Re: PostgreSQL performance on FreeBSD

2014-06-28 Thread Palle Girgensohn


On 27 Jun 2014, at 18:34, Konstantin Belousov kostik...@gmail.com wrote:

 On Fri, Jun 27, 2014 at 10:57:53AM -0400, John Baldwin wrote:
 On Friday, June 27, 2014 8:56:13 am Konstantin Belousov wrote:
 Hi,
 I did some measurements and hacks to see about the performance and
 scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
 Foundation.
 
 The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
 The uncommitted patches, referenced in the article, are available as
 https://kib.kiev.ua/kib/pig1.patch.txt
 https://kib.kiev.ua/kib/patch-2
 
 Did you run the same benchmark on the same hardware with any other OS's to 
 compare results?
 
 No.
 
 FWIW, until the drop after 30 clients is fixed, I do not
 think such a comparison is very interesting.

This is great work!

Does anybody know how far back in FreeBSD versions using POSIX semaphores
instead of SYSV would make a difference?  It seems we need a rather current
version?  8.x did not support it at all, at some point at least, and in 9 it
was buggy.  I could add patch-2 to the port, but I reckon it needs a
conditional based on the FreeBSD version?

The clang bug should go upstream, right?

I have seen similar curves, presented by Greg Smith (PostgreSQL hacker), where
he concluded that there is no point in running more than 50 concurrent
connections.  This was for Linux.  In your measurements, the knee is at 30.
That said, FreeBSD could and should do better, but there is probably a limit
where there will be a knee in the graph and performance will drop.  It should
be more than 30, though, as you rightly commented.

Do you have any ideas for pursuing this further, apart from complicated
rewrites like DragonFly?

Palle


Re: PostgreSQL performance on FreeBSD

2014-06-28 Thread Konstantin Belousov
On Sat, Jun 28, 2014 at 12:08:39PM +0200, Palle Girgensohn wrote:
 
 
 On 27 Jun 2014, at 18:34, Konstantin Belousov kostik...@gmail.com wrote:
 
  On Fri, Jun 27, 2014 at 10:57:53AM -0400, John Baldwin wrote:
  On Friday, June 27, 2014 8:56:13 am Konstantin Belousov wrote:
  Hi,
  I did some measurements and hacks to see about the performance and
  scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
  Foundation.
  
  The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
  The uncommitted patches, referenced in the article, are available as
  https://kib.kiev.ua/kib/pig1.patch.txt
  https://kib.kiev.ua/kib/patch-2
  
  Did you run the same benchmark on the same hardware with any other OS's to 
  compare results?
  
  No.
  
  FWIW, until the drop after 30 clients is fixed, I do not
  think such a comparison is very interesting.
 
 This is great work!
 
 Does anybody know how far back in FreeBSD versions using POSIX semaphores
 instead of SYSV would make a difference?  It seems we need a rather current
 version?  8.x did not support it at all, at some point at least, and in 9 it
 was buggy.  I could add patch-2 to the port, but I reckon it needs a
 conditional based on the FreeBSD version?
 
I recommend adding it as an option.  The currently supported versions,
stable/9 and higher, have the new POSIX semaphore implementation.
stable/8 also has POSIX semaphores, but there it is a kernel-based
interface, and I do not plan to evaluate it in any way.


 The clang bug should go upstream, right?
I believe there is already some activity about it.  I do not follow
clang development.

 
 I have seen similar curves, presented by Greg Smith (PostgreSQL
 hacker), where he concluded that there is no point in running more
 than 50 concurrent connections.  This was for Linux.  In your
 measurements, the knee is at 30.  That said, FreeBSD could and should do
 better, but there is probably a limit where there will be a knee in the
 graph and performance will drop.  It should be more than 30, though, as
 you rightly commented.

 Do you have any ideas for pursuing this further, apart from complicated
 rewrites like DragonFly?

I do.

The purpose of the current work was to understand where we stand
and, if possible, to evaluate ideas, possibly in a hackish way.  I hope,
and am almost sure, that this will be continued, but I cannot provide any
time estimate.




Re: PostgreSQL performance on FreeBSD

2014-06-28 Thread Palle Girgensohn


 On 28 Jun 2014, at 12:21, Konstantin Belousov kostik...@gmail.com wrote:
 
 On Sat, Jun 28, 2014 at 12:08:39PM +0200, Palle Girgensohn wrote:
 
 
 On 27 Jun 2014, at 18:34, Konstantin Belousov kostik...@gmail.com wrote:
 
 On Fri, Jun 27, 2014 at 10:57:53AM -0400, John Baldwin wrote:
 On Friday, June 27, 2014 8:56:13 am Konstantin Belousov wrote:
 Hi,
 I did some measurements and hacks to see about the performance and
 scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
 Foundation.
 
 The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
 The uncommitted patches, referenced in the article, are available as
 https://kib.kiev.ua/kib/pig1.patch.txt
 https://kib.kiev.ua/kib/patch-2
 
 Did you run the same benchmark on the same hardware with any other OS's to 
 compare results?
 
 No.
 
 FWIW, until the drop after 30 clients is fixed, I do not
 think such a comparison is very interesting.
 
 This is great work!
 
 Does anybody know how far back in FreeBSD versions using POSIX semaphores
 instead of SYSV would make a difference?  It seems we need a rather current
 version?  8.x did not support it at all, at some point at least, and in 9 it
 was buggy.  I could add patch-2 to the port, but I reckon it needs a
 conditional based on the FreeBSD version?
 I recommend adding it as an option.  The currently supported versions,
 stable/9 and higher, have the new POSIX semaphore implementation.
 stable/8 also has POSIX semaphores, but there it is a kernel-based
 interface, and I do not plan to evaluate it in any way.

According to one source, POSIX semaphores use O(N^2) file descriptors, where N
is the number of connections.  Do you know if this is true?  (I'll try it,
naturally, just checking.)

 
 
 The clang bug should go upstream, right?
 I believe there is already some activity about it.  I do not follow
 clang development.

Sounds good enough. 

 
 
 I have seen similar curves, presented by Greg Smith (PostgreSQL
 hacker), where he concluded that there is no point in running more
 than 50 concurrent connections.  This was for Linux.  In your
 measurements, the knee is at 30.  That said, FreeBSD could and should do
 better, but there is probably a limit where there will be a knee in the
 graph and performance will drop.  It should be more than 30, though, as
 you rightly commented.
 
 Do you have any ideas for pursuing this further, apart from complicated
 rewrites like DragonFly?
 I do.
 
 The purpose of the current work was to understand where we stand
 and, if possible, to evaluate ideas, possibly in a hackish way.  I hope,
 and am almost sure, that this will be continued, but I cannot provide any
 time estimate.

Great. If you need help testing, I might be able to help. 

Cheers,
Palle


Re: PostgreSQL performance on FreeBSD

2014-06-28 Thread Konstantin Belousov
On Sat, Jun 28, 2014 at 01:37:20PM +0200, Palle Girgensohn wrote:
 
 
  On 28 Jun 2014, at 12:21, Konstantin Belousov kostik...@gmail.com wrote:
  
  On Sat, Jun 28, 2014 at 12:08:39PM +0200, Palle Girgensohn wrote:
  
  
  On 27 Jun 2014, at 18:34, Konstantin Belousov kostik...@gmail.com wrote:
  
  On Fri, Jun 27, 2014 at 10:57:53AM -0400, John Baldwin wrote:
  On Friday, June 27, 2014 8:56:13 am Konstantin Belousov wrote:
  Hi,
  I did some measurements and hacks to see about the performance and
  scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
  Foundation.
  
  The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
  The uncommitted patches, referenced in the article, are available as
  https://kib.kiev.ua/kib/pig1.patch.txt
  https://kib.kiev.ua/kib/patch-2
  
  Did you run the same benchmark on the same hardware with any other OS's to
  compare results?
  
  No.
  
  FWIW, until the failure after 30 clients is corrected, I do not
  think such a comparison is of much interest.
  
  This is great work!
  
  Does anybody know how far back in FreeBSD versions using POSIX semaphores
  instead of SysV would make a difference?  It seems we need a rather
  current version? 8.x did not support them at all, at some point at least,
  and in 9 they were buggy. I could add patch-2 to the port, but I reckon it
  needs a conditional based on FreeBSD version?
  I recommend adding it as an option.  The currently supported versions
  of stable/9 and higher have the new POSIX semaphore implementation.
  stable/8 also has POSIX semaphores, but there it is a kernel-based
  interface; I do not plan to evaluate it in any way.
 
 According to one source, POSIX semaphores use O(N^2) file descriptors, where
 N is the number of connections. Do you know if this is true? (I'll try it,
 naturally, just checking).
 
The new POSIX semaphore implementation, done by David Xu, opens a file
descriptor during sem_open(3). The descriptor is used to mmap the area
carrying the lock word, and is closed immediately afterward, still inside
sem_open().  In other words, if you have N semaphores and M processes, there
would be N*M open(2)/close(2) pairs, and N files, each mmaped into M processes.
The new implementation does not use a file descriptor while the semaphore is
in use, and does not keep the backing file open.
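
To make the counting above concrete, here is a toy Python model of the
open/mmap/close sequence. It is only a sketch of the accounting, not the
libc code: the function name simulate_sem_open and its counters are made up
for illustration, and the processes are modelled as running one after another.

```python
def simulate_sem_open(n_semaphores, n_processes):
    """Toy accounting model: every process sem_open()s every semaphore."""
    backing_files = set()   # one backing file per named semaphore
    mappings = 0            # mmap()ed lock-word regions across all processes
    open_close_pairs = 0    # open(2)/close(2) pairs issued in total
    open_fds = 0            # descriptors open at this moment
    peak_open_fds = 0       # highest value open_fds ever reached

    for _proc in range(n_processes):
        for sem in range(n_semaphores):
            backing_files.add(sem)   # the file exists once, then is reused
            open_fds += 1            # open(2) at the start of sem_open()
            peak_open_fds = max(peak_open_fds, open_fds)
            mappings += 1            # mmap() of the area with the lock word
            open_fds -= 1            # close(2) before sem_open() returns
            open_close_pairs += 1

    return {
        "open_close_pairs": open_close_pairs,
        "backing_files": len(backing_files),
        "mappings": mappings,
        "fds_left_open": open_fds,
        "peak_open_fds": peak_open_fds,
    }


if __name__ == "__main__":
    print(simulate_sem_open(n_semaphores=4, n_processes=3))
```

For N = 4 semaphores and M = 3 processes the model reports 12 open/close
pairs, 4 backing files, 12 mappings, and no descriptor left open. The N*M
term counts short-lived open/close pairs, not descriptors held at the same
time; the peak of one open descriptor here is a property of the sequential
model.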

  
  
  The clang bug should go upstream, right?
  I believe there is already some activity about it.  I do not follow
  clang development.
 
 Sounds good enough. 
 
  
  
  I have seen similar curves, presented by Greg Smith (PostgreSQL
  hacker) where he concluded that there is no point in running more
  than 50 concurrent connections. This was for Linux. In your measurements,
  the knee is at 30. That said, FreeBSD could and should do better,
  but probably there is a limit where there will be a knee in the graph
  and performance will drop. It should be more than 30, though, as you
  rightly commented.
  
  Do you have any ideas for pursuing this further, apart from complicated
  rewrites like DragonFly?
  I do.
  
  The scope of the current work was to understand where we stand and, if
  possible, to evaluate ideas, possibly in a hackish way. I hope, and am
  almost sure, that this will be continued, but I cannot provide any time
  estimate.
 
 Great. If you need help testing, I might be able to help. 
I have the test setup and the graphing mostly automated, although
repeating the configuration would be quite laborious.

On the other hand, if you get access to zoo, replication could be easier.




PostgreSQL performance on FreeBSD

2014-06-27 Thread Konstantin Belousov
Hi,
I did some measurements and hacks to see about the performance and
scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
Foundation.

The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
The uncommitted patches, referenced in the article, are available as
https://kib.kiev.ua/kib/pig1.patch.txt
https://kib.kiev.ua/kib/patch-2




Re: PostgreSQL performance on FreeBSD

2014-06-27 Thread Mark Felder
June 27 2014 7:56 AM, Konstantin Belousov  wrote: 

 Hi,
 I did some measurements and hacks to see about the performance and
 scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
 Foundation.
 
 The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
 The uncommitted patches, referenced in the article, are available as
 https://kib.kiev.ua/kib/pig1.patch.txt
 https://kib.kiev.ua/kib/patch-2
 

Thank you for taking the time to do this benchmark.


Re: PostgreSQL performance on FreeBSD

2014-06-27 Thread Tobias Grosser

On 27/06/2014 14:56, Konstantin Belousov wrote:

Hi,
I did some measurements and hacks to see about the performance and
scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
Foundation.

The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
The uncommitted patches, referenced in the article, are available as
https://kib.kiev.ua/kib/pig1.patch.txt
https://kib.kiev.ua/kib/patch-2


Interesting. Did you report the clang bug upstream?

Tobias



Re: PostgreSQL performance on FreeBSD

2014-06-27 Thread John Baldwin
On Friday, June 27, 2014 8:56:13 am Konstantin Belousov wrote:
 Hi,
 I did some measurements and hacks to see about the performance and
 scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
 Foundation.
 
 The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
 The uncommitted patches, referenced in the article, are available as
 https://kib.kiev.ua/kib/pig1.patch.txt
 https://kib.kiev.ua/kib/patch-2

Did you run the same benchmark on the same hardware with any other OS's to 
compare results?

-- 
John Baldwin


Re: PostgreSQL performance on FreeBSD

2014-06-27 Thread Konstantin Belousov
On Fri, Jun 27, 2014 at 10:57:53AM -0400, John Baldwin wrote:
 On Friday, June 27, 2014 8:56:13 am Konstantin Belousov wrote:
  Hi,
  I did some measurements and hacks to see about the performance and
  scalability of PostgreSQL 9.3 on FreeBSD, sponsored by The FreeBSD
  Foundation.
  
  The results are described in https://kib.kiev.ua/kib/pgsql_perf.pdf.
  The uncommitted patches, referenced in the article, are available as
  https://kib.kiev.ua/kib/pig1.patch.txt
  https://kib.kiev.ua/kib/patch-2
 
 Did you run the same benchmark on the same hardware with any other OS's to 
 compare results?

No.

FWIW, until the failure after 30 clients is corrected, I do not
think such a comparison is of much interest.

