Re: [RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-09 Thread Pavel Emelyanov
On 10/08/2013 02:12 PM, Janani Venkataraman1 wrote:
> From: Pavel Emelyanov
> To: Janani Venkataraman1/India/IBM@IBMIN
> Date: 10/04/2013 04:08 PM
> Subject: Re: [RFC] [PATCH 00/19] Non disruptive application core dump
> infrastructure using task_work_add()
>
> On 10/04/2013 02:30 PM, Janani Venkataraman wrote:
>> Hi all,
>>
>> This series is based on the Task work add approach. We didn't adopt the
>> CRIU approach because of the following reasons:
>>
>> * It is not upstream yet.
> 
> It is, starting from criu-v0.7 + linux-3.11
> 
>> * There are concerns about the security of the dump.
> 
> Can you elaborate on this? Is it fixable in CRIU at all?
> 
>> * It involves a lot of changes and this approach provides a UNIX style
>>   interface.
> 
> Can you also shed more light on this -- what changes do you mean?
> 
> We had a prototype ready earlier using the freezer approach.
> http://lwn.net/Articles/419756/
> 
> We made a couple of minor changes to it and implemented it using
> task_work_add(). We wanted to know what the community felt about this
> approach.
>
> Also, in the previous RFD, Andi Kleen raised a security concern about the
> daemon approach for a self-dump in CRIU.

We have addressed this. When a task requests a self-dump from the criu daemon,
the daemon:

a) gets the pid to dump from SO_PEERCRED, so the requester cannot simply pass
   some other task's pid;
b) doesn't dump tasks that belong to a user other than the one who requested
   the dump.

What other security concerns do you have? We're also interested in addressing
them.
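
For illustration, the two checks above could look roughly like the sketch
below. It is not criu's actual code: peer_credentials() and may_dump() are
hypothetical helper names, and it assumes the request arrives over a connected
AF_UNIX socket. The only real API used is getsockopt() with SO_PEERCRED, which
has the kernel fill in the peer's pid, uid and gid.

```c
#define _GNU_SOURCE             /* for struct ucred */
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel who is on the other end of the
 * connected AF_UNIX socket `fd`. Returns 0 and fills *cred, or -1 on error.
 */
static int peer_credentials(int fd, struct ucred *cred)
{
    socklen_t len = sizeof(*cred);

    if (getsockopt(fd, SOL_SOCKET, SO_PEERCRED, cred, &len) == -1)
        return -1;
    return len == sizeof(*cred) ? 0 : -1;
}

/*
 * Hypothetical policy check for point (b): only dump a task that is owned
 * by the same user as the requester. Returns 1 if allowed, 0 otherwise.
 */
static int may_dump(const struct ucred *requester, uid_t task_owner)
{
    return requester->uid == task_owner;
}
```

The pid to dump is then cred.pid as reported by the kernel, never a value
taken from the request payload.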

> Thanks.
> Janani
> 
> .
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-08 Thread Janani Venkataraman1




From: Tejun Heo
To: Pavel Emelyanov
Cc: Janani Venkataraman1/India/IBM@IBMIN,
    linux-kernel@vger.kernel.org, amw...@redhat.com,
    rdun...@xenotime.net, a...@firstfloor.org,
    aravi...@linux.vnet.ibm.com, h...@lst.de, mhira...@redhat.com,
    jeremy.fitzhardi...@citrix.com, suz...@linux.vnet.ibm.com,
    kosaki.motoh...@jp.fujitsu.com, adobri...@gmail.com,
    tarun...@linux.vnet.ibm.com, vap...@gentoo.org,
    rol...@hack.frob.com, ana...@linux.vnet.ibm.com,
    gorcu...@openvz.org, ava...@openvz.org, o...@redhat.com,
    epa...@redhat.com, d.hatay...@jp.fujitsu.com,
    james.ho...@imgtec.com, a...@linux-foundation.org,
    torva...@linux-foundation.org
Date: 10/08/2013 12:26 AM
Subject: Re: [RFC] [PATCH 00/19] Non disruptive application core dump
infrastructure using task_work_add()
Sent by: Tejun Heo



Hello,

On Fri, Oct 04, 2013 at 02:38:43PM +0400, Pavel Emelyanov wrote:
> > * It is not upstream yet.
>
> It is, starting from criu-v0.7 + linux-3.11
>
> > * There are concerns about the security of the dump.
>
> Can you elaborate on this? Is it fixable in CRIU at all?
>
> > * It involves a lot of changes and this approach provides a UNIX style
> >   interface.
>
> Can you also shed more light on this -- what changes do you mean?

Yeah, I'd like to hear more too.  It doesn't make much sense to me to
add something completely new if it can be served mostly by the
existing infrastructure.  Also, what do you mean by "disruption"?  You
mentioned signal but PTRACE_SEIZE is completely transparent
w.r.t. signals. If you mean without stopping the target process's
execution, what are you trying to use the dumping for and how much
gain are we talking about?  Also, isn't it kinda mandatory to stop the
process to get a consistent dump?  What am I missing here?

By "non-disruptive" we do mean not using signals. PTRACE_SEIZE doesn't use
signals, but our concern is that a process cannot seize itself, and we also
want to support a self-dump.
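
The self-seize limitation can be checked from user space with a few lines.
This is only a sketch: seize_self() is a hypothetical helper name, and the
expectation that the call fails (with EPERM, to my knowledge, because a tracer
may not attach to a member of its own thread group) is stated as an
assumption, not as something measured for this series.

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to PTRACE_SEIZE the calling process.
 * Returns 0 if the kernel accepted the request, -1 (errno set) otherwise.
 */
static int seize_self(void)
{
    if (ptrace(PTRACE_SEIZE, getpid(), NULL, NULL) == -1)
        return -1;
    return 0;
}
```

A caller would inspect errno after a -1 return to see why the kernel refused.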

Thanks.
Janani

Thanks.

--
tejun





Re: [RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-08 Thread Janani Venkataraman1




From: Pavel Emelyanov
To: Janani Venkataraman1/India/IBM@IBMIN
Date: 10/04/2013 04:08 PM
Subject: Re: [RFC] [PATCH 00/19] Non disruptive application core dump
infrastructure using task_work_add()



On 10/04/2013 02:30 PM, Janani Venkataraman wrote:
> Hi all,
>

> This series is based on the Task work add approach. We didn't adopt the
> CRIU approach because of the following reasons:
>
> * It is not upstream yet.

It is, starting from criu-v0.7 + linux-3.11

> * There are concerns about the security of the dump.

Can you elaborate on this? Is it fixable in CRIU at all?

> * It involves a lot of changes and this approach provides a UNIX style
>   interface.

Can you also shed more light on this -- what changes do you mean?

We had a prototype ready earlier using the freezer approach.
http://lwn.net/Articles/419756/

We made a couple of minor changes to it and implemented it using
task_work_add(). We wanted to know what the community felt about this
approach.

Also, in the previous RFD, Andi Kleen raised a security concern about the
daemon approach for a self-dump in CRIU.

Thanks.
Janani



Re: [RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-07 Thread Tejun Heo
Hello,

On Fri, Oct 04, 2013 at 02:38:43PM +0400, Pavel Emelyanov wrote:
> > * It is not upstream yet.
> 
> It is, starting from criu-v0.7 + linux-3.11
> 
> > * There are concerns about the security of the dump.
> 
> Can you elaborate on this? Is it fixable in CRIU at all?
> 
> > * It involves a lot of changes and this approach provides a UNIX style 
> >   interface.
> 
> Can you also shed more light on this -- what changes do you mean?

Yeah, I'd like to hear more too.  It doesn't make much sense to me to
add something completely new if it can be served mostly by the
existing infrastructure.  Also, what do you mean by "disruption"?  You
mentioned signal but PTRACE_SEIZE is completely transparent
w.r.t. signals.  If you mean without stopping the target process's
execution, what are you trying to use the dumping for and how much
gain are we talking about?  Also, isn't it kinda mandatory to stop the
process to get a consistent dump?  What am I missing here?

Thanks.

-- 
tejun


Re: [RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-07 Thread Andi Kleen
> > Couldn't they just use the new process_vm_readv() syscalls instead?
> > AFAIK those do not require ptrace.
> > 
> We need the register set and hence would need a ptrace.

But the kernel also needs to stop the process to read the registers.

Do you have data on how large the latency difference is between
an optimized ptrace reader (using PTRACE_GETREGSET) and the kernel?
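
For reference, a ptrace reader of the kind meant here might look like the
sketch below. snapshot_regs() is a hypothetical helper name; the sketch
assumes an architecture where <sys/user.h> provides struct user_regs_struct
(x86-64, arm64) and a caller that is permitted to ptrace the target. It seizes
the task, stops it with PTRACE_INTERRUPT instead of a signal, reads the
NT_PRSTATUS register set with PTRACE_GETREGSET and detaches again.

```c
#define _GNU_SOURCE
#include <elf.h>            /* NT_PRSTATUS */
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>        /* struct iovec */
#include <sys/user.h>       /* struct user_regs_struct */
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: snapshot the general-purpose registers of `pid`
 * without sending it a signal. Returns the number of bytes the kernel
 * wrote into *regs, or -1 on any failure.
 */
static long snapshot_regs(pid_t pid, struct user_regs_struct *regs)
{
    struct iovec iov = { .iov_base = regs, .iov_len = sizeof(*regs) };
    int status;
    long ret = -1;

    /* Attach without stopping the task and without injecting SIGSTOP. */
    if (ptrace(PTRACE_SEIZE, pid, NULL, NULL) == -1)
        return -1;

    /* Stop it, wait for the stop to be reported, then read the registers. */
    if (ptrace(PTRACE_INTERRUPT, pid, NULL, NULL) == 0 &&
        waitpid(pid, &status, __WALL) == pid && WIFSTOPPED(status) &&
        ptrace(PTRACE_GETREGSET, pid, (void *)(long)NT_PRSTATUS, &iov) == 0)
        ret = (long)iov.iov_len;

    /* Let the task run again; the stop lasts only for the calls above. */
    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return ret;
}
```

Timing the call (for example with clock_gettime() around it) would give the
user-space side of the comparison asked about above.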

-Andi


Re: [RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-07 Thread Oleg Nesterov
On 10/07, Suzuki K. Poulose wrote:
>
> On 10/04/2013 07:14 PM, Andi Kleen wrote:
>
> > Couldn't they just use the new process_vm_readv() syscalls instead?
> > AFAIK those do not require ptrace.
> >
> We need the register set and hence would need a ptrace.

Or the task itself can dump its memory/registers/whatever in response
to the request from the dumper.

gencore_work() is only used to freeze the process, but couldn't it do more?

Oleg.



Re: [RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-07 Thread Suzuki K. Poulose
On 10/04/2013 07:14 PM, Andi Kleen wrote:
> On Fri, Oct 04, 2013 at 04:00:12PM +0530, Janani Venkataraman wrote:
>> Hi all,
>>
>> The following series implements an infrastructure for capturing the core of
>> an application without disrupting its process.
> 
> The problem is that gcore et al. have to stop the process briefly
> to attach and then use the pid mmap ptrace interfaces, right?
> 
Correct.

> Couldn't they just use the new process_vm_readv() syscalls instead?
> AFAIK those do not require ptrace.
> 
We need the register set and hence would need a ptrace.

> Then this could be all done in user space.
> 
> Or are there some specific races with this approach?
> 
Cheers
Suzuki



Re: [RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-04 Thread Andi Kleen
On Fri, Oct 04, 2013 at 04:00:12PM +0530, Janani Venkataraman wrote:
> Hi all,
> 
> The following series implements an infrastructure for capturing the core of
> an application without disrupting its process.

The problem is that gcore et al. have to stop the process briefly
to attach and then use the pid mmap ptrace interfaces, right?

Couldn't they just use the new process_vm_readv() syscalls instead?
AFAIK those do not require ptrace.

Then this could be all done in user space.

Or are there some specific races with this approach?
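
As a rough illustration of the user-space route, reading another process's
memory with process_vm_readv() could look like the sketch below. read_remote()
is a hypothetical wrapper name; the syscall itself needs no attach and does
not stop the target, but the caller still has to pass the kernel's
ptrace-style permission check for that process.

```c
#define _GNU_SOURCE             /* for process_vm_readv() */
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical wrapper: copy `len` bytes starting at `remote_addr` in the
 * address space of `pid` into the local buffer `local`.
 * Returns the number of bytes read, or -1 with errno set.
 */
static ssize_t read_remote(pid_t pid, const void *remote_addr,
                           void *local, size_t len)
{
    struct iovec local_iov  = { .iov_base = local, .iov_len = len };
    struct iovec remote_iov = { .iov_base = (void *)remote_addr,
                                .iov_len = len };

    return process_vm_readv(pid, &local_iov, 1, &remote_iov, 1, 0);
}
```

Because the target keeps running, the bytes read this way are not a
consistent snapshot, and the call gives no access to registers.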

-Andi



Re: [RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-04 Thread Pavel Emelyanov
On 10/04/2013 02:30 PM, Janani Venkataraman wrote:
> Hi all,
> 

> This series is based on the Task work add approach. We didn't adopt the CRIU
> approach because of the following reasons:
> 
> * It is not upstream yet.

It is, starting from criu-v0.7 + linux-3.11

> * There are concerns about the security of the dump.

Can you elaborate on this? Is it fixable in CRIU at all?

> * It involves a lot of changes and this approach provides a UNIX style 
>   interface.

Can you also shed more light on this -- what changes do you mean?


[RFC] [PATCH 00/19] Non disruptive application core dump infrastructure using task_work_add()

2013-10-04 Thread Janani Venkataraman
Hi all,

The following series implements an infrastructure for capturing the core of an 
application without disrupting its process.

The idea is to export the infrastructure through /proc/pid/core. Reading that
file would return an ELF-format core dump of the process as of that instant,
taken non-disruptively, i.e. without sending it any signals.

This would involve basically three operations:

1) Holding the threads of a process without sending a signal (SIGSTOP). At this
point we can snapshot the register set and collect the other information
required to create the ELF header. This step could be initiated by the open()
call.

2) Once the ELF header is created, read() can return the core dump data,
including the process memory page by page, based on the fpos (file position).

3) The threads could be released upon a close().
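
From user space, the three steps above would amount to an ordinary
open/read/close loop. The sketch below assumes the proposed /proc/<pid>/core
file exists, which is only true with this series applied; save_core() is a
hypothetical helper name, and nothing in it is specific to core files beyond
the comments.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy everything readable from `core_path`
 * (e.g. "/proc/1234/core" under this proposal) into `out_path`.
 * Returns the number of bytes copied, or -1 on error.
 */
static long save_core(const char *core_path, const char *out_path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    int in, out;

    in = open(core_path, O_RDONLY);     /* proposal: threads are held here */
    if (in < 0)
        return -1;
    out = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        close(in);
        return -1;
    }

    /* proposal: ELF header and notes first, then memory page by page */
    while ((n = read(in, buf, sizeof(buf))) > 0) {
        if (write(out, buf, (size_t)n) != n) {  /* short write: give up */
            total = -1;
            break;
        }
        total += n;
    }
    if (n < 0)
        total = -1;

    close(out);
    close(in);                          /* proposal: threads released here */
    return total;
}
```

A plain "cat /proc/<pid>/core > core.<pid>" would exercise the same path.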

We discussed various approaches for the implementation in the post below:
https://lkml.org/lkml/2013/9/3/122

This series is based on the Task work add approach. We didn't adopt the CRIU
approach for the following reasons:

* It is not upstream yet.

* There are concerns about the security of the dump.

* It involves a lot of changes and this approach provides a UNIX style 
  interface.

Task work add

task_work_add() queues a piece of work for a task. The kernel runs any queued
work before the task returns from kernel space to user space, so the work is
guaranteed to have been done before user space can run again.

* Exploit this function to hold the threads when they are returning to
  user space.

* Wait until all the threads of the process to be dumped reach the queued work.

* Once all the threads have arrived, the dump is taken and they are released.
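
The hold-and-release idea can be sketched roughly as follows. This is not the
code from the series: struct gencore_hold, struct gencore_work,
gencore_hold_thread() and gencore_queue_hold() are hypothetical names, and the
three-argument task_work_add(task, work, notify) form with a bool is the one
from kernels of this era (later kernels take a notify mode instead). It is
kernel code and is meant to be read, not built on its own.

```c
#include <linux/atomic.h>
#include <linux/completion.h>
#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/task_work.h>

/* Hypothetical per-dump state shared by the dumper and the held threads. */
struct gencore_hold {
	atomic_t pending;              /* threads that have not parked yet */
	struct completion all_parked;  /* signalled by the last thread to park */
	struct completion release;     /* signalled by the dumper when done */
};

/* Hypothetical per-thread work item. */
struct gencore_work {
	struct callback_head cb;
	struct gencore_hold *hold;
};

/* Runs in the target thread's context just before it returns to user space. */
static void gencore_hold_thread(struct callback_head *cb)
{
	struct gencore_work *w = container_of(cb, struct gencore_work, cb);

	if (atomic_dec_and_test(&w->hold->pending))
		complete(&w->hold->all_parked);
	wait_for_completion(&w->hold->release);  /* parked until the dump ends */
}

/* Queue the hold on one thread; the caller repeats this for every thread. */
static int gencore_queue_hold(struct task_struct *t, struct gencore_work *w,
			      struct gencore_hold *hold)
{
	w->hold = hold;
	init_task_work(&w->cb, gencore_hold_thread);
	return task_work_add(t, &w->cb, true);   /* true: notify the task */
}
```

In this sketch the dumper would set pending to the thread count, initialise
both completions, queue the work on every thread, wait on all_parked, take the
dump and then call complete_all() on release. A thread that is blocked inside
the kernel never reaches the callback, so the wait on all_parked would not
finish for it.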

TODO:

* A mechanism to know when all the threads have reached the queued work.

* A way to handle a case when one of the threads of the task to be dumped
  is blocked in the kernel.

* We could also add the infrastructure under a config option,
  say CONFIG_ELF_GENCORE.

* The current implementation doesn't wait for the threads to reach 
  wait_for_completion(). Hence there is no guarantee of collecting the
  'register set' reliably. We will address this issue in the next version.
  This is a prototype implementation to get reviews and comments.

Patches 1 to 8 rearrange the ELF code so that it can be reused by the
infrastructure.

Patches 9 to 19 implement the infrastructure.

Please let me know your reviews and comments.

Janani Venkataraman (19):
  Create elfcore-common.c for ELF class independent core generation helpers
  Make vma_dump_size() generic
  Make fill_psinfo generic
  Rename compat versions of the reusable core generation routines
  Export the reusable ELF core generation routines
  Define API for reading arch specif Program Headers for Core
  ia64 impelementation for elf_core_copy_extra_phdrs()
  elf_core_copy_extra_phdrs() for UML
  Create /proc/pid/core entry
  Track the core generation requests
  Check if the process is an ELF executable
  Hold the threads using task_work_add
  Create ELF Header
  Create ELF Core notes Data
  Calculate the size of the core file
  Generate the data sections for ELF Core
  Identify the ELF class of the process
  Adding support for compat ELF class data structures
  Compat ELF class core generation support


 arch/ia64/kernel/elfcore.c       |   34 +++
 arch/x86/um/elfcore.c            |   32 +++
 fs/Makefile                      |    1
 fs/binfmt_elf.c                  |  190 ++--
 fs/compat_binfmt_elf.c           |    7 +
 fs/elfcore-common.c              |  169 ++
 fs/proc/Makefile                 |    2
 fs/proc/base.c                   |    2
 fs/proc/gencore-compat-elf.c     |   62 +
 fs/proc/gencore-elf.c            |  458 ++
 fs/proc/gencore.c                |  262 ++
 fs/proc/gencore.h                |   74 ++
 fs/proc/internal.h               |    1
 include/linux/elfcore-internal.h |   72 ++
 include/linux/elfcore.h          |    3
 kernel/elfcore.c                 |    6
 16 files changed, 1209 insertions(+), 166 deletions(-)
 create mode 100644 fs/elfcore-common.c
 create mode 100644 fs/proc/gencore-compat-elf.c
 create mode 100644 fs/proc/gencore-elf.c
 create mode 100644 fs/proc/gencore.c
 create mode 100644 fs/proc/gencore.h
 create mode 100644 include/linux/elfcore-internal.h

-- 
Janani 


