[PATCH RESEND] connector: add parent pid and tgid to coredump and exit events

2018-04-30 Thread Stefan Strogin
The intention is to get notified of process failures as soon
as possible, before a possible core dumping (which could be very long)
(e.g. in some process-manager). Coredump and exit process events
are perfect for such use cases (see 2b5faa4c553f "connector: Added
coredumping event to the process connector").

The problem is that for now the process-manager cannot know the parent
of a dying process using connectors. This could be useful if the
process-manager should monitor for failures only children of certain
parents, so we could filter the coredump and exit events by parent
process and/or thread ID.

Add parent pid and tgid to coredump and exit process connectors event
data.

Signed-off-by: Stefan Strogin 
Acked-by: Evgeniy Polyakov 
---
 drivers/connector/cn_proc.c  | 4 
 include/uapi/linux/cn_proc.h | 4 
 2 files changed, 8 insertions(+)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index a782ce87715c..ed5e42461094 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -262,6 +262,8 @@ void proc_coredump_connector(struct task_struct *task)
ev->what = PROC_EVENT_COREDUMP;
ev->event_data.coredump.process_pid = task->pid;
ev->event_data.coredump.process_tgid = task->tgid;
+   ev->event_data.coredump.parent_pid = task->real_parent->pid;
+   ev->event_data.coredump.parent_tgid = task->real_parent->tgid;
 
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
msg->ack = 0; /* not used */
@@ -288,6 +290,8 @@ void proc_exit_connector(struct task_struct *task)
ev->event_data.exit.process_tgid = task->tgid;
ev->event_data.exit.exit_code = task->exit_code;
ev->event_data.exit.exit_signal = task->exit_signal;
+   ev->event_data.exit.parent_pid = task->real_parent->pid;
+   ev->event_data.exit.parent_tgid = task->real_parent->tgid;
 
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
msg->ack = 0; /* not used */
diff --git a/include/uapi/linux/cn_proc.h b/include/uapi/linux/cn_proc.h
index 68ff25414700..db210625cee8 100644
--- a/include/uapi/linux/cn_proc.h
+++ b/include/uapi/linux/cn_proc.h
@@ -116,12 +116,16 @@ struct proc_event {
struct coredump_proc_event {
__kernel_pid_t process_pid;
__kernel_pid_t process_tgid;
+   __kernel_pid_t parent_pid;
+   __kernel_pid_t parent_tgid;
} coredump;
 
struct exit_proc_event {
__kernel_pid_t process_pid;
__kernel_pid_t process_tgid;
__u32 exit_code, exit_signal;
+   __kernel_pid_t parent_pid;
+   __kernel_pid_t parent_tgid;
} exit;
 
} event_data;
-- 
2.16.1



Re: [PATCH] connector: add parent pid and tgid to coredump and exit events

2018-04-26 Thread Stefan Strogin
Hi David, Evgeniy,

Sorry to bother you, but could you please comment about the UAPI change and the 
patch?

Thanks, Jesper.

--
Stefan

On 05/04/18 12:07, Jesper Derehag wrote:
> Unless David comes back with something I have (also) missed regarding uapi 
> breakage, this looks good to me.
> 
> /Jesper
> 
> ________
> Från: Stefan Strogin 
> Skickat: den 2 april 2018 17:18
> Till: David Miller
> Kopia: z...@ioremap.net; netdev@vger.kernel.org; 
> linux-ker...@vger.kernel.org; xe-linux-exter...@cisco.com; 
> jdere...@hotmail.com; matt.hels...@gmail.com; mini...@googlemail.com
> Ämne: Re: [PATCH] connector: add parent pid and tgid to coredump and exit 
> events
> 
> Hi David,
> 
> I don't see how it breaks UAPI. The point is that structures
> coredump_proc_event and exit_proc_event are members of *union*
> event_data, thus position of the existing data in the structure is
> unchanged. Furthermore, this change won't increase size of struct
> proc_event, because comm_proc_event (also a member of event_data) is
> of bigger size than the changed structures.
> 
> If I'm wrong, could you please explain what exactly will the change
> break in UAPI?
> 
> 
> On 30/03/18 19:59, David Miller wrote:
>> From: Stefan Strogin 
>> Date: Thu, 29 Mar 2018 17:12:47 +0300
>>
>>> diff --git a/include/uapi/linux/cn_proc.h b/include/uapi/linux/cn_proc.h
>>> index 68ff25414700..db210625cee8 100644
>>> --- a/include/uapi/linux/cn_proc.h
>>> +++ b/include/uapi/linux/cn_proc.h
>>> @@ -116,12 +116,16 @@ struct proc_event {
>>>  struct coredump_proc_event {
>>>  __kernel_pid_t process_pid;
>>>  __kernel_pid_t process_tgid;
>>> +__kernel_pid_t parent_pid;
>>> +__kernel_pid_t parent_tgid;
>>>  } coredump;
>>>
>>>  struct exit_proc_event {
>>>  __kernel_pid_t process_pid;
>>>  __kernel_pid_t process_tgid;
>>>  __u32 exit_code, exit_signal;
>>> +__kernel_pid_t parent_pid;
>>> +__kernel_pid_t parent_tgid;
>>>  } exit;
>>>
>>>  } event_data;
>>
>> I don't think you can add these members without breaking UAPI.
>>
> 



Re: [RFC] connector: add group_exit_code and signal_flags fields to exit_proc_event

2018-04-10 Thread Stefan Strogin
Hi Evgeniy,

On 09/04/18 02:49, Evgeniy Polyakov wrote:
> Hi everyone
> 
> Sorry for that late reply
> 
> 01.03.2018, 21:58, "Stefan Strogin" :
>> So I was thinking to add these two fields to union event_data:
>> task->signal->group_exit_code
>> task->signal->flags
>> This won't increase size of struct proc_event (because of comm_proc_event)
>> and shouldn't break backward compatibility for the user-space. But it will
>> add some useful information about what caused the process death.
>> What do you think, is it an acceptable approach?
> 
> As I saw in other discussion, doesn't it break userspace API, or you are sure 
> that no sizes has been increased?
> You are using the same structure as used for plain signals and add group 
> status there, how will userspace react,
> if it was compiled with older headers? What if it uses zero-field alignment, 
> i.e. allocating exactly the size of structure with byte precision?
> 

Please ignore this RFC, I was wrong about the fields I need for the problem.
I have sent this patch: https://lkml.org/lkml/2018/3/29/531,
would be grateful for a review.

As for breaking UAPI and structure sizes, look:

> struct proc_event {
...
>   union { /* must be last field of proc_event struct */
...
>   struct comm_proc_event {
>   __kernel_pid_t process_pid;
>   __kernel_pid_t process_tgid;
>   char   comm[16];
>   } comm;
__kernel_pid_t is int that is always 4 bytes in Linux, then
sizeof(event_data.comm) == 24 on all platforms.
> 
>   struct coredump_proc_event {
>   __kernel_pid_t process_pid;
>   __kernel_pid_t process_tgid;
> + __kernel_pid_t parent_pid;
> + __kernel_pid_t parent_tgid;
>   } coredump;
sizeof(event_data.coredump) == 16
> 
>   struct exit_proc_event {
>   __kernel_pid_t process_pid;
>   __kernel_pid_t process_tgid;
>   __u32 exit_code, exit_signal;
> + __kernel_pid_t parent_pid;
> + __kernel_pid_t parent_tgid;
>   } exit;
sizeof(event_data.exit) == 24
> 
>   } event_data;
> };

Therefore, sizeof(event_data) is always = 24 - with old headers and new ones.
sizeof(struct proc_event) is the same as well.
Hence user-space software with old and new headers will allocate the same size.

If the user-space program somehow allocates space only for an internal 
structure,
e.g. for event_data.exit, I still don't see any problems if it allocates and
handles only first 16 bytes of the structure using old headers.

--
Stefan


Re: [PATCH] connector: add parent pid and tgid to coredump and exit events

2018-04-02 Thread Stefan Strogin
Hi David,

I don't see how it breaks UAPI. The point is that structures
coredump_proc_event and exit_proc_event are members of *union*
event_data, thus position of the existing data in the structure is
unchanged. Furthermore, this change won't increase size of struct
proc_event, because comm_proc_event (also a member of event_data) is
of bigger size than the changed structures.

If I'm wrong, could you please explain what exactly will the change
break in UAPI?


On 30/03/18 19:59, David Miller wrote:
> From: Stefan Strogin 
> Date: Thu, 29 Mar 2018 17:12:47 +0300
> 
>> diff --git a/include/uapi/linux/cn_proc.h b/include/uapi/linux/cn_proc.h
>> index 68ff25414700..db210625cee8 100644
>> --- a/include/uapi/linux/cn_proc.h
>> +++ b/include/uapi/linux/cn_proc.h
>> @@ -116,12 +116,16 @@ struct proc_event {
>>  struct coredump_proc_event {
>>  __kernel_pid_t process_pid;
>>  __kernel_pid_t process_tgid;
>> +__kernel_pid_t parent_pid;
>> +__kernel_pid_t parent_tgid;
>>  } coredump;
>>  
>>  struct exit_proc_event {
>>  __kernel_pid_t process_pid;
>>  __kernel_pid_t process_tgid;
>>  __u32 exit_code, exit_signal;
>> +__kernel_pid_t parent_pid;
>> +__kernel_pid_t parent_tgid;
>>  } exit;
>>  
>>  } event_data;
> 
> I don't think you can add these members without breaking UAPI.
> 



[PATCH] connector: add parent pid and tgid to coredump and exit events

2018-03-29 Thread Stefan Strogin
The intention is to get notified of process failures as soon
as possible, before a possible core dumping (which could be very long)
(e.g. in some process-manager). Coredump and exit process events
are perfect for such use cases (see 2b5faa4c553f "connector: Added
coredumping event to the process connector").

The problem is that for now the process-manager cannot know the parent
of a dying process using connectors. This could be useful if the
process-manager should monitor for failures only children of certain
parents, so we could filter the coredump and exit events by parent
process and/or thread ID.

Add parent pid and tgid to coredump and exit process connectors event
data.

Signed-off-by: Stefan Strogin 
---
 drivers/connector/cn_proc.c  | 4 
 include/uapi/linux/cn_proc.h | 4 
 2 files changed, 8 insertions(+)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index a782ce87715c..ed5e42461094 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -262,6 +262,8 @@ void proc_coredump_connector(struct task_struct *task)
ev->what = PROC_EVENT_COREDUMP;
ev->event_data.coredump.process_pid = task->pid;
ev->event_data.coredump.process_tgid = task->tgid;
+   ev->event_data.coredump.parent_pid = task->real_parent->pid;
+   ev->event_data.coredump.parent_tgid = task->real_parent->tgid;
 
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
msg->ack = 0; /* not used */
@@ -288,6 +290,8 @@ void proc_exit_connector(struct task_struct *task)
ev->event_data.exit.process_tgid = task->tgid;
ev->event_data.exit.exit_code = task->exit_code;
ev->event_data.exit.exit_signal = task->exit_signal;
+   ev->event_data.exit.parent_pid = task->real_parent->pid;
+   ev->event_data.exit.parent_tgid = task->real_parent->tgid;
 
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
msg->ack = 0; /* not used */
diff --git a/include/uapi/linux/cn_proc.h b/include/uapi/linux/cn_proc.h
index 68ff25414700..db210625cee8 100644
--- a/include/uapi/linux/cn_proc.h
+++ b/include/uapi/linux/cn_proc.h
@@ -116,12 +116,16 @@ struct proc_event {
struct coredump_proc_event {
__kernel_pid_t process_pid;
__kernel_pid_t process_tgid;
+   __kernel_pid_t parent_pid;
+   __kernel_pid_t parent_tgid;
} coredump;
 
struct exit_proc_event {
__kernel_pid_t process_pid;
__kernel_pid_t process_tgid;
__u32 exit_code, exit_signal;
+   __kernel_pid_t parent_pid;
+   __kernel_pid_t parent_tgid;
} exit;
 
} event_data;
-- 
2.11.0



[RFC] connector: add group_exit_code and signal_flags fields to exit_proc_event

2018-03-01 Thread Stefan Strogin
Hello everyone,

I'm working on a user-space application that should handle events when some
processes (children of a certain process, "App1") are being killed with
SIGKILL. Also these notifications must be filtered somehow, e.g. we don't
need to handle cases when exit() generates SIGKILL to child threads
(group exit), for example an OOM-caused SIGKILL and a group exit-generated one
are totally different and we must somehow distinguish them.
Unfortunately we also cannot use SIGCHLD and waitid() in App1, there
should be another application that monitors children of App1.

My idea is to use PROC_EVENT_EXIT. But unfortunately it seems there is not
enough information in exit_proc_event structure to see if that was group
exit-generated SIGKILL or not. Tell me please if I'm wrong.

So I was thinking to add these two fields to union event_data:
task->signal->group_exit_code
task->signal->flags
This won't increase size of struct proc_event (because of comm_proc_event)
and shouldn't break backward compatibility for the user-space. But it will
add some useful information about what caused the process death.
What do you think, is it an acceptable approach?

Thanks

Signed-off-by: Stefan Strogin 
---
 drivers/connector/cn_proc.c  | 2 ++
 include/uapi/linux/cn_proc.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index a782ce87715c..1006cb506a52 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -288,6 +288,8 @@ void proc_exit_connector(struct task_struct *task)
ev->event_data.exit.process_tgid = task->tgid;
ev->event_data.exit.exit_code = task->exit_code;
ev->event_data.exit.exit_signal = task->exit_signal;
+   ev->event_data.exit.group_exit_code = task->signal->group_exit_code;
+   ev->event_data.exit.signal_flags = task->signal->flags;
 
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
msg->ack = 0; /* not used */
diff --git a/include/uapi/linux/cn_proc.h b/include/uapi/linux/cn_proc.h
index 68ff25414700..edf50f398116 100644
--- a/include/uapi/linux/cn_proc.h
+++ b/include/uapi/linux/cn_proc.h
@@ -122,6 +122,8 @@ struct proc_event {
__kernel_pid_t process_pid;
__kernel_pid_t process_tgid;
__u32 exit_code, exit_signal;
+   int group_exit_code;
+   unsigned int signal_flags;
} exit;
 
} event_data;
-- 
2.16.1