Re: Is it possible to block pending queued RealTime signals (AIO originating)?

2013-01-09 Thread Richard Sharpe
On Tue, 2013-01-08 at 09:20 -0800, Adrian Chadd wrote:
 On 8 January 2013 08:15, Richard Sharpe rsha...@richardsharpe.com wrote:
  On Tue, 2013-01-08 at 07:36 -0800, Adrian Chadd wrote:
  .. or you could abstract it out a bit and use freebsd's
  aio_waitcomplete() or kqueue aio notification.
 
  It'll then behave much saner.
 
  Yes, going forward that is what I want to do ... this would work nicely
  with a kqueue back-end for Samba's tevent subsystem, and if someone has
  not already written such a back end, I will have to do so, I guess.
 
 Embrace FreeBSD's nice asynchronous APIs for doing things! You know you
 want to!
 
 (Then, convert parts of samba over to use grand central dispatch... :-)
 
 Seriously though - I was doing network/disk IO using real time signals
 what, 10+ years ago on Linux and it plain sucked. AIO + kqueue +
 waitcomplete is just brilliant. kqueue for signal delivery is also
 just brilliant. Just saying.

The problem with a fully event-driven approach is that it will not work,
it seems to me. Eventually, you find something that is not async and
then you have to go threaded. (Because handling multiple clients in one
process is very useful and you do not want client-A's long-running op
preventing client-B's short-running op from being serviced.)

Then, you run into problems like POSIX's insistence that all threads in
a process must use the same credentials (i.e., uids and gids must be the
same across all threads), although there is a hack on Linux to work
around this behind glibc's back.
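
As an illustration of the Linux workaround mentioned above, here is a
minimal sketch, assuming Linux with glibc. The Linux kernel tracks
credentials per thread, and glibc's setresuid() wrapper is what propagates
a change to all threads; calling the system call directly therefore affects
only the calling thread. The helper name set_thread_euid is hypothetical,
and this is not necessarily the exact code Samba uses.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: change the effective uid of the calling thread only.
 * Invoking the system call directly skips glibc's wrapper, which would
 * otherwise apply the change to every thread in the process.
 * Linux-specific; some 32-bit ABIs need SYS_setresuid32 instead.
 * Returns 0 on success, -1 with errno set on failure.
 */
int set_thread_euid(uid_t euid)
{
    /* (uid_t)-1 leaves the real and saved uids unchanged. */
    return (int)syscall(SYS_setresuid, (uid_t)-1, euid, (uid_t)-1);
}
```

A worker thread would call set_thread_euid() with the client's uid before
an operation and switch back afterwards, which only works if the thread has
kept a privileged real or saved uid.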

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Is it possible to block pending queued RealTime signals (AIO originating)?

2013-01-09 Thread Julian Elischer

On 1/9/13 7:21 AM, Richard Sharpe wrote:

[...]


The best implementation of an async framework I've seen is the one that
Alan Cox and friends wrote in the code they sold to IronPort/Cisco.

It'd be nice if we could get that extracted out and donated/included into
something generally available. It even had an #ifdef Linux code path.





Re: Is it possible to block pending queued RealTime signals (AIO originating)?

2013-01-09 Thread Richard Sharpe
On Wed, 2013-01-09 at 10:06 +0800, David Xu wrote:
 [...]
  This code won't work. As I said, after the signal handler returns, the
  kernel copies the signal mask contained in the ucontext into kernel
  space and uses it for future signal delivery.
 
  The code should be modified as follows:
 
  void handler(int signum, siginfo_t *info, ucontext_t *uap)
  {
      ...
 
      if (count + 1 == TEVENT_SA_INFO_QUEUE_COUNT) {
          sigaddset(&uap->uc_sigmask, signum);
 
  Hmmm, this seems unlikely because the signal handler is operating in
  user mode and has no access to kernel-mode variables.
 
  Well, it turns out that your suggestion was correct.
 
  I did some more searching and found another similar suggestion, so I
  gave it a whirl, and it works.
 
  Now, my problem is that Jeremy Allison thinks that it is a fugly hack.
  This means that I will probably have big problems getting a patch for
  this into Samba.
 
  I guess a couple of questions I have now are:
 
  1. Is this the same for all versions of FreeBSD since Posix RT Signals
  were introduced?
 
 
 I have checked the source code and found that RT signals have been
 supported since FreeBSD 7.0, and the aio code uses the signal queue.
 
  2. Which (interpretation of which) combination of standards require such
  an approach?
 
 
 The approach I described is standard:
 http://pubs.opengroup.org/onlinepubs/007904975/functions/sigaction.html
 
 I quoted some text here:
 
 When a signal is caught by a signal-catching function installed by 
 sigaction(), a new signal mask is calculated and installed for the 
 duration of the signal-catching function (or until a call to either 
 sigprocmask() or sigsuspend() is made). This mask is formed by taking 
 the union of the current signal mask and the value of the sa_mask for 
 the signal being delivered [XSI] [Option Start]  unless SA_NODEFER or 
 SA_RESETHAND is set, [Option End] and then including the signal being 
 delivered. If and when the user's signal handler returns normally, the 
 original signal mask is restored.
 
 ...
 
 When the signal handler returns, the receiving thread resumes execution 
 at the point it was interrupted unless the signal handler makes other 
 arrangements. If longjmp() or _longjmp() is used to leave the signal 
 handler, then the signal mask must be explicitly restored.
 
 This volume of IEEE Std 1003.1-2001 defines the third argument of a 
 signal handling function when SA_SIGINFO is set as a void * instead of a 
 ucontext_t *, but without requiring type checking. New applications 
 should explicitly cast the third argument of the signal handling
 function to ucontext_t *.
 
 ---
 
 The above means the third parameter points to a ucontext_t, which is
 used to restore the previously interrupted context. That context contains
 a signal mask, which is also restored.
 http://pubs.opengroup.org/onlinepubs/007904975/basedefs/ucontext.h.html

OK, thank you for that. Jeremy agrees that this is a portable approach,
at least across Linux, FreeBSD and Solaris. We will try to get a fix
into Samba to do it the correct way.
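
The approach agreed on above can be sketched as a small self-contained
example. It assumes a POSIX system that passes a ucontext_t as the third
handler argument (the thread names Linux, FreeBSD and Solaris). The
function names are made up for illustration, and this is not the Samba
patch itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

/*
 * SA_SIGINFO handler: add the delivered signal to the mask saved in the
 * interrupted context. That saved mask is the one installed when the
 * handler returns, so the signal stays blocked afterwards.
 */
void block_on_return(int signum, siginfo_t *info, void *ctx)
{
    ucontext_t *uap = (ucontext_t *)ctx;

    (void)info;
    sigaddset(&uap->uc_sigmask, signum);
}

/* Install the handler for signum; returns 0 on success, -1 on error. */
int install_blocking_handler(int signum)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = block_on_return;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL);
}

/* Report whether signum is blocked in the caller: 1 yes, 0 no, -1 error. */
int signal_is_blocked(int signum)
{
    sigset_t cur;

    /* A NULL new set only queries the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, signum);
}
```

Calling sigprocmask() inside the handler would not have this effect, because
the mask is reset from the saved context on return. Once the queued events
have been drained, the main loop can unblock the signal again with
sigprocmask(SIG_UNBLOCK, ...).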



Re: malloc+utrace, tracking memory leaks in a running program.

2013-01-09 Thread Alfred Perlstein

On 12/23/12 12:28 PM, Jason Evans wrote:

On Dec 21, 2012, at 7:37 PM, Alfred Perlstein bri...@mu.org wrote:

So the other day in an effort to debug a memory leak I decided to take a look 
at malloc+utrace(2) and decided to make a tool to debug where leaks are coming 
from.

A few hours later I have:
1) a new version of utrace(2) (utrace2(2)) that uses structured data to prevent 
overloading of data.   (utrace2.diff)
2) changes to ktrace and kdump to decode the new format. (also in utrace2.diff)
3) changes to jemalloc to include the new format AND the function caller so 
it's easy to get the source of the leaks. (also in utrace2.diff)
4) a program that can take a pipe of kdump(1) and figure out what memory has 
leaked. (alloctrace.py)
5) simple test program (test_utrace.c)
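
The core of a leak finder like the one in item 4 is matching allocations
against frees. The sketch below shows that matching step only, in C rather
than Python, and on an in-memory array rather than kdump(1) output. The
alloc_event layout and the name find_leaks are hypothetical; they do not
reflect the real utrace2 record format or alloctrace.py, and realloc is
not handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded trace event; not the real utrace2 record layout. */
struct alloc_event {
    int       is_free;  /* 0 = allocation, 1 = free */
    uintptr_t addr;     /* address returned by malloc or passed to free */
    size_t    size;     /* allocation size; ignored for frees */
    uintptr_t caller;   /* return address of the allocating call */
};

/*
 * Copy into 'leaks' every allocation in ev[0..n) that has no later free
 * of the same address. Returns the number of leaks found, which may
 * exceed max_leaks; only the first max_leaks are stored.
 * Quadratic in n, which is fine for a sketch but not for large traces.
 */
size_t find_leaks(const struct alloc_event *ev, size_t n,
                  struct alloc_event *leaks, size_t max_leaks)
{
    size_t found = 0;

    for (size_t i = 0; i < n; i++) {
        int freed = 0;

        if (ev[i].is_free)
            continue;
        /* The first later event on this address decides the outcome. */
        for (size_t j = i + 1; j < n; j++) {
            if (ev[j].addr != ev[i].addr)
                continue;
            freed = ev[j].is_free;
            break;
        }
        if (!freed) {
            if (found < max_leaks)
                leaks[found] = ev[i];
            found++;
        }
    }
    return found;
}
```

Each reported leak carries the caller address, which is what makes it
possible to point at the source of the leak.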

[…]

Have you looked at the heap profiling functionality built into jemalloc?  It's not 
currently enabled on FreeBSD, but as far as I know, the only issue keeping it from 
being useful is the absence of a Linux-compatible /proc/pid/maps (and the 
gperftools folks may already have a solution for that; I haven't looked).  I think it 
makes more sense to get that sorted out than to develop a separate trace-based leak 
checker.  The problem with tracing is that it doesn't scale beyond some relatively 
small number of allocator events.


I have looked at some of this functionality (heap profiling), but alas it 
is not implemented yet.  In addition, the dtrace work appears to be quite 
a way from a workable solution, with too many performance penalties until 
some serious hacking is done.


I am just not sure how to proceed.  I do not really have the skill to fix 
the /proc/pid/maps problem, nor to figure out how to get dtrace into the 
system in any reasonable time frame.


All a few of us need is the addition of the traceback to the existing 
utrace framework.



Is it time to start installing with some form of debug symbols? This would help 
us also with dtrace.

Re: debug symbols, frame pointers, etc. necessary to make userland dtrace work 
by default, IMO we should strongly prefer such defaults.  It's more reasonable 
to expect people who need every last bit of performance to remove functionality 
than to expect people who want to figure out what the system is doing to figure 
out what functionality to turn on.



This is very true.  I'm going to continue to work towards this end with 
a few people and get up to speed on it, so that hopefully we can get to 
this point in the next release cycle or two.


If you have a few moments, can you have a look at the utrace2 branches 
here:

https://github.com/alfredperlstein/freebsd/tree/utrace2

This branch contains the addition of the utrace2 system call, which is 
needed to structure data via utrace(2).  The point is to keep kdump(1) 
from having to discern the type of ktrace records based on arbitrary 
size or other parameters, and to introduce an extensible protocol for new 
types of utrace data.


The utrace2 branch here augments jemalloc to use utrace2 to pass the old 
utrace records and, in addition, the return address along with the type 
and size of the allocation:

https://github.com/alfredperlstein/jemalloc/tree/utrace2
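
For readers who have not captured a return address before, here is a
minimal sketch of the idea, assuming GCC or Clang. It is not the code from
the branch above: the alloc_record layout and the names traced_malloc and
last_record are hypothetical, and how the branch actually obtains the
address is not stated in this thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical trace record: what was allocated, how big, and by whom. */
struct alloc_record {
    void  *result;  /* pointer returned to the caller */
    size_t size;    /* requested size */
    void  *caller;  /* return address into the calling function */
};

/* Most recent record; a real tracer would emit it instead of storing it. */
struct alloc_record last_record;

/*
 * malloc wrapper that notes its caller. __builtin_return_address(0) is a
 * GCC/Clang builtin; noinline keeps the wrapper a real call so that the
 * address belongs to the calling function.
 */
__attribute__((noinline)) void *traced_malloc(size_t size)
{
    void *p = malloc(size);

    last_record.result = p;
    last_record.size = size;
    last_record.caller = __builtin_return_address(0);
    return p;
}
```

Turning the recorded address back into a function name and line number
needs symbol information, which ties in with the debug symbols question
discussed earlier in this message.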

-Alfred


Re: malloc+utrace, tracking memory leaks in a running program.

2013-01-09 Thread Alfred Perlstein

On 1/10/13 1:41 AM, Alfred Perlstein wrote:

[...]


Jason,

Here are more convenient links that give diffs against FreeBSD and 
jemalloc for the proposed changes:


FreeBSD:
https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2

jemalloc:
https://github.com/alfredperlstein/jemalloc/compare/master...utrace2

-Alfred



Re: malloc+utrace, tracking memory leaks in a running program.

2013-01-09 Thread Konstantin Belousov
On Thu, Jan 10, 2013 at 01:56:48AM -0500, Alfred Perlstein wrote:
 Here are more convenient links that give diffs against FreeBSD and 
 jemalloc for the proposed changes:
 
 FreeBSD:
 https://github.com/alfredperlstein/freebsd/compare/13e7228d5b83c8fcfc63a0803a374212018f6b68~1...utrace2
 
Why do you need to route the records through ktrace at all?
Wouldn't direct write(2)s to a file give better performance, since they
would not stress the kernel memory allocator and the single writing thread?
It would also avoid coupling malloc to a single-system interface.

I believe other usermode tracers behave in a similar way, using writes
rather than a private kernel interface.
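
A rough sketch of the direct-write approach suggested above, assuming
fixed-size binary records appended to a file. The record layout and the
names trace_open and trace_emit are hypothetical; nothing here is taken
from the utrace2 patch or from jemalloc.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical fixed-size record; all fields 64-bit to avoid padding. */
struct trace_record {
    uint64_t type;    /* 0 = malloc, 1 = free (made-up encoding) */
    uint64_t addr;    /* pointer involved */
    uint64_t size;    /* allocation size, 0 for free */
    uint64_t caller;  /* return address of the call site */
};

/* Open (or create) a trace file for appending; returns fd or -1. */
int trace_open(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
}

/*
 * Emit one record with a single write(2). With O_APPEND each write lands
 * at the current end of the file.
 * Returns 0 on success, -1 on error or short write.
 */
int trace_emit(int fd, const struct trace_record *rec)
{
    ssize_t n = write(fd, rec, sizeof(*rec));

    return n == (ssize_t)sizeof(*rec) ? 0 : -1;
}
```

Because the records go straight to a file descriptor, the same code would
build on any POSIX system, and a separate tool can read the file back with
the same struct definition.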

Also, what /proc issues did you mention? There is
sysctl kern.proc.vmmap, which is much more convenient than /proc/pid/map
and does not require /proc to be mounted.

 jemalloc:
 https://github.com/alfredperlstein/jemalloc/compare/master...utrace2
 
 -Alfred
 