Re: perf: behavior of poll() changed in 3.18

2015-01-22 Thread Vince Weaver
On Thu, 22 Jan 2015, Jiri Olsa wrote:

> > So what happens if you are using a signal handler to monitor a child and 
> > the child exits?
> 
> AFAICS wrt to SIGIO, we notify only with POLL_IN if there's new data
> and POLL_HUP if we reached the event_limit - the one you set with
> PERF_EVENT_IOC_REFRESH ioctl

I wrote a test for this: if you are monitoring another process with a 
SIGIO handler and the other process dies, then you get no signal for this 
at all, and in  fact depending on the timing the last valid SIGIO overflow 
event signal might get lost (in my test I usually get one fewer 
POLL_IN signal than I would get from the equivelent poll() code).

> I've actually never used the SIGIO interface in perf other than
> when I was checking your gi repo with test code ;-)

Yes, it seems that PAPI is really the only infrastructure that tries to do 
self-monitoring using the SIGIO interface.  Which shows, as it breaks now 
and then.  That was the whole reason I started writing my perf_event_test 
infrastructure in the first place.

Vince
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-22 Thread Jiri Olsa
On Thu, Jan 22, 2015 at 01:53:35AM -0500, Vince Weaver wrote:
> On Wed, 21 Jan 2015, Jiri Olsa wrote:
> > 
> > However if we revert this code, we'll loose nice (and standard) way
> > to check if the event is still valid.. not sure how to handle this.
> 
> there's likely no need to revert as my code wasn't really released and 
> I've already fixed it to work with the new interface.
> 
> I was mostly asking just so I could update the manpage to explain the new 
> behavior, as tools that expect to be backwards compatible will have to 
> handle both ways of detecting a process dieing.
> 
> > > Part of why my code doesn't just exit on POLLHUP is because you can
> > > get that result for reasons other than a process exit (for example,
> > > if you are using ioctl(PERF_EVENT_IOC_REFRESH)
> > 
> > Nope, this is related to POOL_HUP (notice the '_') which you'll get
> > accompanied with SIGIO if you setup this.
> 
> So what happens if you are using a signal handler to monitor a child and 
> the child exits?

AFAICS wrt to SIGIO, we notify only with POLL_IN if there's new data
and POLL_HUP if we reached the event_limit - the one you set with
PERF_EVENT_IOC_REFRESH ioctl

> 
> It's a shame the poll and signal handler interfaces are subtly different, 
> though I guess some of that is probably due to historical reasons.

I've actually never used the SIGIO interface in perf other than
when I was checking your gi repo with test code ;-)

so I'm not sure what the correct behaviour should be when monitored
process dies.. I'd say we should send SIGIO with POLL_HUP, but that
clashes with that 'event_limit' thing.. I'll check on that

thanks,
jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-22 Thread Jiri Olsa
On Thu, Jan 22, 2015 at 01:53:35AM -0500, Vince Weaver wrote:
 On Wed, 21 Jan 2015, Jiri Olsa wrote:
  
  However if we revert this code, we'll loose nice (and standard) way
  to check if the event is still valid.. not sure how to handle this.
 
 there's likely no need to revert as my code wasn't really released and 
 I've already fixed it to work with the new interface.
 
 I was mostly asking just so I could update the manpage to explain the new 
 behavior, as tools that expect to be backwards compatible will have to 
 handle both ways of detecting a process dieing.
 
   Part of why my code doesn't just exit on POLLHUP is because you can
   get that result for reasons other than a process exit (for example,
   if you are using ioctl(PERF_EVENT_IOC_REFRESH)
  
  Nope, this is related to POOL_HUP (notice the '_') which you'll get
  accompanied with SIGIO if you setup this.
 
 So what happens if you are using a signal handler to monitor a child and 
 the child exits?

AFAICS wrt to SIGIO, we notify only with POLL_IN if there's new data
and POLL_HUP if we reached the event_limit - the one you set with
PERF_EVENT_IOC_REFRESH ioctl

 
 It's a shame the poll and signal handler interfaces are subtly different, 
 though I guess some of that is probably due to historical reasons.

I've actually never used the SIGIO interface in perf other than
when I was checking your gi repo with test code ;-)

so I'm not sure what the correct behaviour should be when monitored
process dies.. I'd say we should send SIGIO with POLL_HUP, but that
clashes with that 'event_limit' thing.. I'll check on that

thanks,
jirka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-22 Thread Vince Weaver
On Thu, 22 Jan 2015, Jiri Olsa wrote:

  So what happens if you are using a signal handler to monitor a child and 
  the child exits?
 
 AFAICS wrt to SIGIO, we notify only with POLL_IN if there's new data
 and POLL_HUP if we reached the event_limit - the one you set with
 PERF_EVENT_IOC_REFRESH ioctl

I wrote a test for this: if you are monitoring another process with a 
SIGIO handler and the other process dies, then you get no signal for this 
at all, and in  fact depending on the timing the last valid SIGIO overflow 
event signal might get lost (in my test I usually get one fewer 
POLL_IN signal than I would get from the equivelent poll() code).

 I've actually never used the SIGIO interface in perf other than
 when I was checking your gi repo with test code ;-)

Yes, it seems that PAPI is really the only infrastructure that tries to do 
self-monitoring using the SIGIO interface.  Which shows, as it breaks now 
and then.  That was the whole reason I started writing my perf_event_test 
infrastructure in the first place.

Vince
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-21 Thread Vince Weaver
On Wed, 21 Jan 2015, Jiri Olsa wrote:
> 
> However if we revert this code, we'll loose nice (and standard) way
> to check if the event is still valid.. not sure how to handle this.

there's likely no need to revert as my code wasn't really released and 
I've already fixed it to work with the new interface.

I was mostly asking just so I could update the manpage to explain the new 
behavior, as tools that expect to be backwards compatible will have to 
handle both ways of detecting a process dieing.

> > Part of why my code doesn't just exit on POLLHUP is because you can
> > get that result for reasons other than a process exit (for example,
> > if you are using ioctl(PERF_EVENT_IOC_REFRESH)
> 
> Nope, this is related to POOL_HUP (notice the '_') which you'll get
> accompanied with SIGIO if you setup this.

So what happens if you are using a signal handler to monitor a child and 
the child exits?

It's a shame the poll and signal handler interfaces are subtly different, 
though I guess some of that is probably due to historical reasons.

Vince
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-21 Thread Jiri Olsa
On Wed, Jan 21, 2015 at 01:06:30AM -0500, Vince Weaver wrote:
> On Tue, 20 Jan 2015, Jiri Olsa wrote:
> 
> > I made this change to get notification that monitored
> > process exited.  We use it in 'perf record' to find out
> > that we have nothing more to monitor and quit.
> > 
> > The logic is to return POLLHUP (via poll syscall) when
> > the monitored process exits for the event.
> > 
> > Could you please share your test code?
> 
> My code does something like this to count the number of overflows
> happening in a child:
> 
>   /* fds contains one fd, which is a harware event */
>   /* attached to a child */
> 
>   while(1) {
> result=poll(fds,1,100);
> if (result==0) {
> waitpid(child,,WNOHANG);

unrelated, but I noticed, that this seems broken.. you 
shouldn't check
WIFEXITED(status) without waitpid(... WNOHANG) returning > 0

> if (WIFEXITED(status)) break;
> if (WIFSIGNALED(status)) {
> printf("Signalled %d!\n",WTERMSIG(status));
> break;
> }
> }
> 
> if (fds[0].revents) count.in++;
> if (fds[0].revents) count.hup++;
> if (fds[0].revents) break;
>   }
> 
> With the change you made this code hangs forever because poll no longer
> returns "0" when the child exits, but instead returns "1" with
> POLLHUP set.

right, once event is HUPed the poll will return POLLHUP in revents

> 
> I admit my code isn't the best out there, but it worked reliably up
> until 3.18 when the change was made.

I think that poll returning POLLHUP for an event which cease to work
due to its process died is correct.

On the other side we broke your code, which is against the policy.

However if we revert this code, we'll loose nice (and standard) way
to check if the event is still valid.. not sure how to handle this.

thoughts?

> 
> Part of why my code doesn't just exit on POLLHUP is because you can
> get that result for reasons other than a process exit (for example,
> if you are using ioctl(PERF_EVENT_IOC_REFRESH)

Nope, this is related to POOL_HUP (notice the '_') which you'll get
accompanied with SIGIO if you setup this.

Before you could get POLLHUP (through poll) only if you'd called
poll on unmapped event.

Now (after commit 179033b3e064) you'll get it also if the process
that the event monitored has died.


jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-21 Thread Jiri Olsa
On Wed, Jan 21, 2015 at 01:06:30AM -0500, Vince Weaver wrote:
 On Tue, 20 Jan 2015, Jiri Olsa wrote:
 
  I made this change to get notification that monitored
  process exited.  We use it in 'perf record' to find out
  that we have nothing more to monitor and quit.
  
  The logic is to return POLLHUP (via poll syscall) when
  the monitored process exits for the event.
  
  Could you please share your test code?
 
 My code does something like this to count the number of overflows
 happening in a child:
 
   /* fds contains one fd, which is a harware event */
   /* attached to a child */
 
   while(1) {
 result=poll(fds,1,100);
 if (result==0) {
 waitpid(child,status,WNOHANG);

unrelated, but I noticed, that this seems broken.. you 
shouldn't check
WIFEXITED(status) without waitpid(... WNOHANG) returning  0

 if (WIFEXITED(status)) break;
 if (WIFSIGNALED(status)) {
 printf(Signalled %d!\n,WTERMSIG(status));
 break;
 }
 }
 
 if (fds[0].reventsPOLLIN) count.in++;
 if (fds[0].reventsPOLLHUP) count.hup++;
 if (fds[0].reventsPOLLERR) break;
   }
 
 With the change you made this code hangs forever because poll no longer
 returns 0 when the child exits, but instead returns 1 with
 POLLHUP set.

right, once event is HUPed the poll will return POLLHUP in revents

 
 I admit my code isn't the best out there, but it worked reliably up
 until 3.18 when the change was made.

I think that poll returning POLLHUP for an event which cease to work
due to its process died is correct.

On the other side we broke your code, which is against the policy.

However if we revert this code, we'll loose nice (and standard) way
to check if the event is still valid.. not sure how to handle this.

thoughts?

 
 Part of why my code doesn't just exit on POLLHUP is because you can
 get that result for reasons other than a process exit (for example,
 if you are using ioctl(PERF_EVENT_IOC_REFRESH)

Nope, this is related to POOL_HUP (notice the '_') which you'll get
accompanied with SIGIO if you setup this.

Before you could get POLLHUP (through poll) only if you'd called
poll on unmapped event.

Now (after commit 179033b3e064) you'll get it also if the process
that the event monitored has died.


jirka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-21 Thread Vince Weaver
On Wed, 21 Jan 2015, Jiri Olsa wrote:
 
 However if we revert this code, we'll loose nice (and standard) way
 to check if the event is still valid.. not sure how to handle this.

there's likely no need to revert as my code wasn't really released and 
I've already fixed it to work with the new interface.

I was mostly asking just so I could update the manpage to explain the new 
behavior, as tools that expect to be backwards compatible will have to 
handle both ways of detecting a process dieing.

  Part of why my code doesn't just exit on POLLHUP is because you can
  get that result for reasons other than a process exit (for example,
  if you are using ioctl(PERF_EVENT_IOC_REFRESH)
 
 Nope, this is related to POOL_HUP (notice the '_') which you'll get
 accompanied with SIGIO if you setup this.

So what happens if you are using a signal handler to monitor a child and 
the child exits?

It's a shame the poll and signal handler interfaces are subtly different, 
though I guess some of that is probably due to historical reasons.

Vince
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-20 Thread Vince Weaver
On Tue, 20 Jan 2015, Jiri Olsa wrote:

> I made this change to get notification that monitored
> process exited.  We use it in 'perf record' to find out
> that we have nothing more to monitor and quit.
> 
> The logic is to return POLLHUP (via poll syscall) when
> the monitored process exits for the event.
> 
> Could you please share your test code?

My code does something like this to count the number of overflows
happening in a child:

/* fds contains one fd, which is a harware event */
/* attached to a child */

while(1) {
result=poll(fds,1,100);
if (result==0) {
waitpid(child,,WNOHANG);
if (WIFEXITED(status)) break;
if (WIFSIGNALED(status)) {
printf("Signalled %d!\n",WTERMSIG(status));
break;
}
}

if (fds[0].revents) count.in++;
if (fds[0].revents) count.hup++;
if (fds[0].revents) break;
}

With the change you made this code hangs forever because poll no longer
returns "0" when the child exits, but instead returns "1" with
POLLHUP set.

I admit my code isn't the best out there, but it worked reliably up
until 3.18 when the change was made.

Part of why my code doesn't just exit on POLLHUP is because you can
get that result for reasons other than a process exit (for example,
if you are using ioctl(PERF_EVENT_IOC_REFRESH)

Vince
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-20 Thread Jiri Olsa
On Tue, Jan 20, 2015 at 12:03:19PM -0500, Vince Weaver wrote:
> Hello
> 
> Some of my code that used poll() on a perf_event fd broke sometime between 
> 3.17 and current.  (My perf_event_test validation testsuite caught this
> too but I haven't been running it as regularly as I should).
> 
> I bisected this to:
> 
>   commit 179033b3e064d2cd3f5f9945e76b0a0f0fbf4883
>   Author: Jiri Olsa 
>   Date:   Thu Aug 7 11:48:26 2014 -0400
> 
>   perf: Add PERF_EVENT_STATE_EXIT state for events with exited task
> 
> 
> Before, when polling, my code would get a
>   result=0/WIFEXITED()
> return when a child with events being polled exited.
> 
> Now instead I get a 
>   result=1/POLLHUP
> return.
> 
> Needless to say this broke my code.  Is there a reason this change was 
> made?

hi,
I made this change to get notification that monitored
process exited.  We use it in 'perf record' to find out
that we have nothing more to monitor and quit.

The logic is to return POLLHUP (via poll syscall) when
the monitored process exits for the event.

Could you please share your test code?

thanks,
jirka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


perf: behavior of poll() changed in 3.18

2015-01-20 Thread Vince Weaver
Hello

Some of my code that used poll() on a perf_event fd broke sometime between 
3.17 and current.  (My perf_event_test validation testsuite caught this
too but I haven't been running it as regularly as I should).

I bisected this to:

commit 179033b3e064d2cd3f5f9945e76b0a0f0fbf4883
Author: Jiri Olsa 
Date:   Thu Aug 7 11:48:26 2014 -0400

perf: Add PERF_EVENT_STATE_EXIT state for events with exited task


Before, when polling, my code would get a
result=0/WIFEXITED()
return when a child with events being polled exited.

Now instead I get a 
result=1/POLLHUP
return.

Needless to say this broke my code.  Is there a reason this change was 
made?

Vince

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


perf: behavior of poll() changed in 3.18

2015-01-20 Thread Vince Weaver
Hello

Some of my code that used poll() on a perf_event fd broke sometime between 
3.17 and current.  (My perf_event_test validation testsuite caught this
too but I haven't been running it as regularly as I should).

I bisected this to:

commit 179033b3e064d2cd3f5f9945e76b0a0f0fbf4883
Author: Jiri Olsa jo...@kernel.org
Date:   Thu Aug 7 11:48:26 2014 -0400

perf: Add PERF_EVENT_STATE_EXIT state for events with exited task


Before, when polling, my code would get a
result=0/WIFEXITED()
return when a child with events being polled exited.

Now instead I get a 
result=1/POLLHUP
return.

Needless to say this broke my code.  Is there a reason this change was 
made?

Vince

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-20 Thread Jiri Olsa
On Tue, Jan 20, 2015 at 12:03:19PM -0500, Vince Weaver wrote:
 Hello
 
 Some of my code that used poll() on a perf_event fd broke sometime between 
 3.17 and current.  (My perf_event_test validation testsuite caught this
 too but I haven't been running it as regularly as I should).
 
 I bisected this to:
 
   commit 179033b3e064d2cd3f5f9945e76b0a0f0fbf4883
   Author: Jiri Olsa jo...@kernel.org
   Date:   Thu Aug 7 11:48:26 2014 -0400
 
   perf: Add PERF_EVENT_STATE_EXIT state for events with exited task
 
 
 Before, when polling, my code would get a
   result=0/WIFEXITED()
 return when a child with events being polled exited.
 
 Now instead I get a 
   result=1/POLLHUP
 return.
 
 Needless to say this broke my code.  Is there a reason this change was 
 made?

hi,
I made this change to get notification that monitored
process exited.  We use it in 'perf record' to find out
that we have nothing more to monitor and quit.

The logic is to return POLLHUP (via poll syscall) when
the monitored process exits for the event.

Could you please share your test code?

thanks,
jirka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: perf: behavior of poll() changed in 3.18

2015-01-20 Thread Vince Weaver
On Tue, 20 Jan 2015, Jiri Olsa wrote:

 I made this change to get notification that monitored
 process exited.  We use it in 'perf record' to find out
 that we have nothing more to monitor and quit.
 
 The logic is to return POLLHUP (via poll syscall) when
 the monitored process exits for the event.
 
 Could you please share your test code?

My code does something like this to count the number of overflows
happening in a child:

/* fds contains one fd, which is a harware event */
/* attached to a child */

while(1) {
result=poll(fds,1,100);
if (result==0) {
waitpid(child,status,WNOHANG);
if (WIFEXITED(status)) break;
if (WIFSIGNALED(status)) {
printf(Signalled %d!\n,WTERMSIG(status));
break;
}
}

if (fds[0].reventsPOLLIN) count.in++;
if (fds[0].reventsPOLLHUP) count.hup++;
if (fds[0].reventsPOLLERR) break;
}

With the change you made this code hangs forever because poll no longer
returns 0 when the child exits, but instead returns 1 with
POLLHUP set.

I admit my code isn't the best out there, but it worked reliably up
until 3.18 when the change was made.

Part of why my code doesn't just exit on POLLHUP is because you can
get that result for reasons other than a process exit (for example,
if you are using ioctl(PERF_EVENT_IOC_REFRESH)

Vince
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/