Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-26 Thread Ian Jackson
Wei Liu writes ("Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static 
shared memory areas during domain destruction"):
> On Mon, Feb 12, 2018 at 03:24:26PM +, Julien Grall wrote:
> > In any case, the worst that could happen is the unmap is called twice on the
> > same region. So you would get spurious error message. Not that bad.
> 
> Yeah, not that bad. Not going to be a security issue, not going to leak
> resources in the end.
> 
> To avoid spurious unmap, can we maybe unmap the pages after the xenstore
> transaction is committed? In that case, only the successful one gets to
> unmap, the ones that aren't committed will bail.
> 
> (Just tossing around ideas)

It should be the other way around.  Because, your way, if your process
crashes for some reason between the xenstore commit and the unmap, the
memory is leaked.

Instead, do the unmap first.  Check the error code to see if it means
"this was already unmapped" and if so report that only via a debug log
message.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-14 Thread Wei Liu
On Mon, Feb 12, 2018 at 03:24:26PM +, Julien Grall wrote:
> 
> 
> On 12/02/18 15:17, Zhongze Liu wrote:
> > Hi Julien,
> 
> Hi,
> 
> > 
> > 2018-02-12 23:09 GMT+08:00 Julien Grall :
> > > Hi,
> > > 
> > > On 12/02/18 14:52, Zhongze Liu wrote:
> > > > 
> > > > 2018-02-08 0:54 GMT+08:00 Julien Grall :
> > > > > 
> > > > > On 07/02/18 16:27, Zhongze Liu wrote:
> > > > 
> > > > It seems that I mistakenly use transaction as a global lock. Now I don't
> > > > have
> > > > any reasons not putting the unmap out of the transaction, but this will
> > > > break
> > > > the original transaction into two, and I do think that we need some
> > > > explicit
> > > > locking here.
> > > 
> > > 
> > > Can you explain why you need specific locking here? What are you trying to
> > > protect? Are you trying to protect against two process doing the unmap?
> > > 
> > 
> > Yes.
> 
> I don't think you have to worry about the locking here. With the current
> interface, the regions cannot be modified after the guest has booted. So the
> addresses will always stay valid.
> 
> This code path should never be called concurrently, as a lot of code in
> libxl, so I think someone else will take care about that for you (I will let
> Wei confirm here).
> 
> In any case, the worst that could happen is the unmap is called twice on the
> same region. So you would get spurious error message. Not that bad.

Yeah, not that bad. Not going to be a security issue, not going to leak
resources in the end.

To avoid spurious unmap, can we maybe unmap the pages after the xenstore
transaction is committed? In that case, only the successful one gets to
unmap, the ones that aren't committed will bail.

(Just tossing around ideas)

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-14 Thread Wei Liu
On Mon, Feb 12, 2018 at 03:24:26PM +, Julien Grall wrote:
> 
> 
> On 12/02/18 15:17, Zhongze Liu wrote:
> > Hi Julien,
> 
> Hi,
> 
> > 
> > 2018-02-12 23:09 GMT+08:00 Julien Grall :
> > > Hi,
> > > 
> > > On 12/02/18 14:52, Zhongze Liu wrote:
> > > > 
> > > > 2018-02-08 0:54 GMT+08:00 Julien Grall :
> > > > > 
> > > > > On 07/02/18 16:27, Zhongze Liu wrote:
> > > > 
> > > > It seems that I mistakenly use transaction as a global lock. Now I don't
> > > > have
> > > > any reasons not putting the unmap out of the transaction, but this will
> > > > break
> > > > the original transaction into two, and I do think that we need some
> > > > explicit
> > > > locking here.
> > > 
> > > 
> > > Can you explain why you need specific locking here? What are you trying to
> > > protect? Are you trying to protect against two process doing the unmap?
> > > 
> > 
> > Yes.
> 
> I don't think you have to worry about the locking here. With the current
> interface, the regions cannot be modified after the guest has booted. So the
> addresses will always stay valid.
> 
> This code path should never be called concurrently, as a lot of code in
> libxl, so I think someone else will take care about that for you (I will let
> Wei confirm here).

I think you mean "can be called concurrently but locking is taken care
of".

If two processes / threads with different libxl ctx (not just xl. xl has
a global lock) try to manipulate the same domain, libxl APIs use file
locks. For the same same process, multiple threads but sharing one
libxl_ctx there is pthread mutex in libxl_ctx.

In a way, "someone else" (in libxl code) has already taken care of the
locking.

Fundamentally We need to assume the code can be called concurrently. It
would be good thing if this is taken into consideration if new code is
added.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-12 Thread Julien Grall



On 12/02/18 15:17, Zhongze Liu wrote:

Hi Julien,


Hi,



2018-02-12 23:09 GMT+08:00 Julien Grall :

Hi,

On 12/02/18 14:52, Zhongze Liu wrote:


2018-02-08 0:54 GMT+08:00 Julien Grall :


On 07/02/18 16:27, Zhongze Liu wrote:


It seems that I mistakenly use transaction as a global lock. Now I don't
have
any reasons not putting the unmap out of the transaction, but this will
break
the original transaction into two, and I do think that we need some
explicit
locking here.



Can you explain why you need specific locking here? What are you trying to
protect? Are you trying to protect against two process doing the unmap?



Yes.


I don't think you have to worry about the locking here. With the current 
interface, the regions cannot be modified after the guest has booted. So 
the addresses will always stay valid.


This code path should never be called concurrently, as a lot of code in 
libxl, so I think someone else will take care about that for you (I will 
let Wei confirm here).


In any case, the worst that could happen is the unmap is called twice on 
the same region. So you would get spurious error message. Not that bad.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-12 Thread Zhongze Liu
Hi Julien,

2018-02-12 23:09 GMT+08:00 Julien Grall :
> Hi,
>
> On 12/02/18 14:52, Zhongze Liu wrote:
>>
>> 2018-02-08 0:54 GMT+08:00 Julien Grall :
>>>
>>> On 07/02/18 16:27, Zhongze Liu wrote:
>>
>> It seems that I mistakenly use transaction as a global lock. Now I don't
>> have
>> any reasons not putting the unmap out of the transaction, but this will
>> break
>> the original transaction into two, and I do think that we need some
>> explicit
>> locking here.
>
>
> Can you explain why you need specific locking here? What are you trying to
> protect? Are you trying to protect against two process doing the unmap?
>

Yes.

Cheers,

Zhongze Liu

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-12 Thread Julien Grall

Hi,

On 12/02/18 14:52, Zhongze Liu wrote:

2018-02-08 0:54 GMT+08:00 Julien Grall :

On 07/02/18 16:27, Zhongze Liu wrote:

It seems that I mistakenly use transaction as a global lock. Now I don't have
any reasons not putting the unmap out of the transaction, but this will break
the original transaction into two, and I do think that we need some explicit
locking here.


Can you explain why you need specific locking here? What are you trying 
to protect? Are you trying to protect against two process doing the unmap?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-12 Thread Zhongze Liu
Hi Julien an Wei,

2018-02-08 0:54 GMT+08:00 Julien Grall :
> On 07/02/18 16:27, Zhongze Liu wrote:
>>
>> Hi Wei and Julien,
>
>
> Hi,
>
>
>> 2018-02-07 2:06 GMT+08:00 Wei Liu :
>>>
>>> On Tue, Feb 06, 2018 at 01:24:30PM +, Julien Grall wrote:
>
>if (libxl__device_pci_destroy_all(gc, domid) < 0)
>LOGD(ERROR, domid, "Pci shutdown failed");
>rc = xc_domain_pause(ctx->xch, domid);
> diff --git a/tools/libxl/libxl_internal.h
> b/tools/libxl/libxl_internal.h
> index 2cfe4c08a7..c398b6a6b8 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -4424,6 +4424,8 @@ static inline bool libxl__string_is_default(char
> **s)
>_hidden int libxl__sshm_add(libxl__gc *gc, uint32_t domid,
>libxl_static_shm *sshm, int len);
> +_hidden int libxl__sshm_del(libxl__gc *gc, uint32_t domid);
> +
>_hidden int libxl__sshm_check_overlap(libxl__gc *gc, uint32_t domid,
>  libxl_static_shm *sshms, int
> len);
>_hidden int libxl__sshm_setdefault(libxl__gc *gc, uint32_t domid,
> diff --git a/tools/libxl/libxl_sshm.c b/tools/libxl/libxl_sshm.c
> index 562f46f299..1bf4d4c2dc 100644
> --- a/tools/libxl/libxl_sshm.c
> +++ b/tools/libxl/libxl_sshm.c
> @@ -86,6 +86,112 @@ int libxl__sshm_check_overlap(libxl__gc *gc,
> uint32_t domid,
>return 0;
>}
> +/* Decrease the refcount of an sshm. When refcount reaches 0,


 NIT: Libxl coding style regarding the comment seems to be uncleared
 (Ian,
 Wei?). But I feel keep /* and */ in separate line is nicer.
>>>
>>>
>>> I don't have an opinion here.
>>>

> + * clean up the whole sshm path.
> + */
> +static void libxl__sshm_decref(libxl__gc *gc, xs_transaction_t xt,
> +   const char *sshm_path)
> +static void libxl__sshm_del_slave(libxl__gc *gc, xs_transaction_t xt,
> +  uint32_t domid, const char *id, bool
> isretry)
> +{
> +const char *slave_path, *begin_str, *end_str;
> +uint64_t begin, end;
> +
> +slave_path = GCSPRINTF("%s/slaves/%"PRIu32, SSHM_PATH(id), domid);
> +
> +begin_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/begin",
> slave_path));
> +end_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/end", slave_path));
> +begin = strtoull(begin_str, NULL, 16);
> +end = strtoull(end_str, NULL, 16);
> +
> +/* Avoid calling do_unmap many times in case of xs transaction
> retry */
> +if (!isretry)
> +libxl__sshm_do_unmap(gc, domid, id, begin, end);


 IHMO, by unmapping the regions in middle of the transaction, you
 increase
 the potential failure of it. I would move that out of the transaction
 path.

 I would be interested to hear the opinion of the tools maintainers here.

>>>
>>> If you move the unmap after the loop you create a window in which
>>> the pages are still mapped but the toolstack thinks they are unmapped.
>>>
>>> While the code as-is now makes sure (assuming no error in unmap) the
>>> pages are unmapped no later than the transaction is committed. I think
>>> this can be done by moving unmap before the transaction.
>>>
>>> Zhongze, do you think the unmap must be done inside the loop? What kind
>>> of invariants do you have in mind?
>>>
>>> Then there is the question of "what do we do if unmap fails". Honestly I
>>> don't have an answer. It seems rather screwed up in that case and I
>>> doubt there is much libxl can do to rectify things.
>>>
>>
>> I put the unmap inside the transaction because I want to make the whole
>> read_mapping_info->unmap->update_mapping_info process atomic. If
>> I put unmap outside the transaction:  after I read out the information
>> that I need to do the unmap, and before I do the unmap and decrease the
>> refcnt, there could be another instance of this code trying to do the same
>> thing, which might lead to race condition.
>
>
> AFAIU, the transaction is not a "global" lock. You will just not see the the
> change from the others during the transactions. Your changes will be only be
> visible at the end. So two transaction can be happily started concurrently,
> and try to do the unmap together. Not even your code will protect against
> that.
>
> So can you give a bit more details here?
>

It seems that I mistakenly use transaction as a global lock. Now I don't have
any reasons not putting the unmap out of the transaction, but this will break
the original transaction into two, and I do think that we need some explicit
locking here.

@Wei. Do you have any suggestions here?

Cheers,

Zhongze Liu

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-07 Thread Julien Grall

On 07/02/18 16:27, Zhongze Liu wrote:

Hi Wei and Julien,


Hi,


2018-02-07 2:06 GMT+08:00 Wei Liu :

On Tue, Feb 06, 2018 at 01:24:30PM +, Julien Grall wrote:

   if (libxl__device_pci_destroy_all(gc, domid) < 0)
   LOGD(ERROR, domid, "Pci shutdown failed");
   rc = xc_domain_pause(ctx->xch, domid);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 2cfe4c08a7..c398b6a6b8 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4424,6 +4424,8 @@ static inline bool libxl__string_is_default(char **s)
   _hidden int libxl__sshm_add(libxl__gc *gc, uint32_t domid,
   libxl_static_shm *sshm, int len);
+_hidden int libxl__sshm_del(libxl__gc *gc, uint32_t domid);
+
   _hidden int libxl__sshm_check_overlap(libxl__gc *gc, uint32_t domid,
 libxl_static_shm *sshms, int len);
   _hidden int libxl__sshm_setdefault(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_sshm.c b/tools/libxl/libxl_sshm.c
index 562f46f299..1bf4d4c2dc 100644
--- a/tools/libxl/libxl_sshm.c
+++ b/tools/libxl/libxl_sshm.c
@@ -86,6 +86,112 @@ int libxl__sshm_check_overlap(libxl__gc *gc, uint32_t domid,
   return 0;
   }
+/* Decrease the refcount of an sshm. When refcount reaches 0,


NIT: Libxl coding style regarding the comment seems to be uncleared (Ian,
Wei?). But I feel keep /* and */ in separate line is nicer.


I don't have an opinion here.




+ * clean up the whole sshm path.
+ */
+static void libxl__sshm_decref(libxl__gc *gc, xs_transaction_t xt,
+   const char *sshm_path)
+static void libxl__sshm_del_slave(libxl__gc *gc, xs_transaction_t xt,
+  uint32_t domid, const char *id, bool isretry)
+{
+const char *slave_path, *begin_str, *end_str;
+uint64_t begin, end;
+
+slave_path = GCSPRINTF("%s/slaves/%"PRIu32, SSHM_PATH(id), domid);
+
+begin_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/begin", slave_path));
+end_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/end", slave_path));
+begin = strtoull(begin_str, NULL, 16);
+end = strtoull(end_str, NULL, 16);
+
+/* Avoid calling do_unmap many times in case of xs transaction retry */
+if (!isretry)
+libxl__sshm_do_unmap(gc, domid, id, begin, end);


IHMO, by unmapping the regions in middle of the transaction, you increase
the potential failure of it. I would move that out of the transaction path.

I would be interested to hear the opinion of the tools maintainers here.



If you move the unmap after the loop you create a window in which
the pages are still mapped but the toolstack thinks they are unmapped.

While the code as-is now makes sure (assuming no error in unmap) the
pages are unmapped no later than the transaction is committed. I think
this can be done by moving unmap before the transaction.

Zhongze, do you think the unmap must be done inside the loop? What kind
of invariants do you have in mind?

Then there is the question of "what do we do if unmap fails". Honestly I
don't have an answer. It seems rather screwed up in that case and I
doubt there is much libxl can do to rectify things.



I put the unmap inside the transaction because I want to make the whole
read_mapping_info->unmap->update_mapping_info process atomic. If
I put unmap outside the transaction:  after I read out the information
that I need to do the unmap, and before I do the unmap and decrease the
refcnt, there could be another instance of this code trying to do the same
thing, which might lead to race condition.


AFAIU, the transaction is not a "global" lock. You will just not see the 
the change from the others during the transactions. Your changes will be 
only be visible at the end. So two transaction can be happily started 
concurrently, and try to do the unmap together. Not even your code will 
protect against that.


So can you give a bit more details here?

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-07 Thread Zhongze Liu
Hi Wei and Julien,

2018-02-07 2:06 GMT+08:00 Wei Liu :
> On Tue, Feb 06, 2018 at 01:24:30PM +, Julien Grall wrote:
>> >   if (libxl__device_pci_destroy_all(gc, domid) < 0)
>> >   LOGD(ERROR, domid, "Pci shutdown failed");
>> >   rc = xc_domain_pause(ctx->xch, domid);
>> > diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
>> > index 2cfe4c08a7..c398b6a6b8 100644
>> > --- a/tools/libxl/libxl_internal.h
>> > +++ b/tools/libxl/libxl_internal.h
>> > @@ -4424,6 +4424,8 @@ static inline bool libxl__string_is_default(char **s)
>> >   _hidden int libxl__sshm_add(libxl__gc *gc, uint32_t domid,
>> >   libxl_static_shm *sshm, int len);
>> > +_hidden int libxl__sshm_del(libxl__gc *gc, uint32_t domid);
>> > +
>> >   _hidden int libxl__sshm_check_overlap(libxl__gc *gc, uint32_t domid,
>> > libxl_static_shm *sshms, int len);
>> >   _hidden int libxl__sshm_setdefault(libxl__gc *gc, uint32_t domid,
>> > diff --git a/tools/libxl/libxl_sshm.c b/tools/libxl/libxl_sshm.c
>> > index 562f46f299..1bf4d4c2dc 100644
>> > --- a/tools/libxl/libxl_sshm.c
>> > +++ b/tools/libxl/libxl_sshm.c
>> > @@ -86,6 +86,112 @@ int libxl__sshm_check_overlap(libxl__gc *gc, uint32_t 
>> > domid,
>> >   return 0;
>> >   }
>> > +/* Decrease the refcount of an sshm. When refcount reaches 0,
>>
>> NIT: Libxl coding style regarding the comment seems to be uncleared (Ian,
>> Wei?). But I feel keep /* and */ in separate line is nicer.
>
> I don't have an opinion here.
>
>>
>> > + * clean up the whole sshm path.
>> > + */
>> > +static void libxl__sshm_decref(libxl__gc *gc, xs_transaction_t xt,
>> > +   const char *sshm_path)
>> > +static void libxl__sshm_del_slave(libxl__gc *gc, xs_transaction_t xt,
>> > +  uint32_t domid, const char *id, bool 
>> > isretry)
>> > +{
>> > +const char *slave_path, *begin_str, *end_str;
>> > +uint64_t begin, end;
>> > +
>> > +slave_path = GCSPRINTF("%s/slaves/%"PRIu32, SSHM_PATH(id), domid);
>> > +
>> > +begin_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/begin", slave_path));
>> > +end_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/end", slave_path));
>> > +begin = strtoull(begin_str, NULL, 16);
>> > +end = strtoull(end_str, NULL, 16);
>> > +
>> > +/* Avoid calling do_unmap many times in case of xs transaction retry 
>> > */
>> > +if (!isretry)
>> > +libxl__sshm_do_unmap(gc, domid, id, begin, end);
>>
>> IHMO, by unmapping the regions in middle of the transaction, you increase
>> the potential failure of it. I would move that out of the transaction path.
>>
>> I would be interested to hear the opinion of the tools maintainers here.
>>
>
> If you move the unmap after the loop you create a window in which
> the pages are still mapped but the toolstack thinks they are unmapped.
>
> While the code as-is now makes sure (assuming no error in unmap) the
> pages are unmapped no later than the transaction is committed. I think
> this can be done by moving unmap before the transaction.
>
> Zhongze, do you think the unmap must be done inside the loop? What kind
> of invariants do you have in mind?
>
> Then there is the question of "what do we do if unmap fails". Honestly I
> don't have an answer. It seems rather screwed up in that case and I
> doubt there is much libxl can do to rectify things.
>

I put the unmap inside the transaction because I want to make the whole
read_mapping_info->unmap->update_mapping_info process atomic. If
I put unmap outside the transaction:  after I read out the information
that I need to do the unmap, and before I do the unmap and decrease the
refcnt, there could be another instance of this code trying to do the same
thing, which might lead to race condition.

And yes, I don't think we can do much in case of something go wrong, so
what I'm doing here is just to do as best as I can -- unmapping all that pages
that can be unmapped and cleanup the xs entries.

Cheers,

Zhongze Liu

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-06 Thread Wei Liu
On Tue, Feb 06, 2018 at 01:24:30PM +, Julien Grall wrote:
> >   if (libxl__device_pci_destroy_all(gc, domid) < 0)
> >   LOGD(ERROR, domid, "Pci shutdown failed");
> >   rc = xc_domain_pause(ctx->xch, domid);
> > diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> > index 2cfe4c08a7..c398b6a6b8 100644
> > --- a/tools/libxl/libxl_internal.h
> > +++ b/tools/libxl/libxl_internal.h
> > @@ -4424,6 +4424,8 @@ static inline bool libxl__string_is_default(char **s)
> >   _hidden int libxl__sshm_add(libxl__gc *gc, uint32_t domid,
> >   libxl_static_shm *sshm, int len);
> > +_hidden int libxl__sshm_del(libxl__gc *gc, uint32_t domid);
> > +
> >   _hidden int libxl__sshm_check_overlap(libxl__gc *gc, uint32_t domid,
> > libxl_static_shm *sshms, int len);
> >   _hidden int libxl__sshm_setdefault(libxl__gc *gc, uint32_t domid,
> > diff --git a/tools/libxl/libxl_sshm.c b/tools/libxl/libxl_sshm.c
> > index 562f46f299..1bf4d4c2dc 100644
> > --- a/tools/libxl/libxl_sshm.c
> > +++ b/tools/libxl/libxl_sshm.c
> > @@ -86,6 +86,112 @@ int libxl__sshm_check_overlap(libxl__gc *gc, uint32_t 
> > domid,
> >   return 0;
> >   }
> > +/* Decrease the refcount of an sshm. When refcount reaches 0,
> 
> NIT: Libxl coding style regarding the comment seems to be uncleared (Ian,
> Wei?). But I feel keep /* and */ in separate line is nicer.

I don't have an opinion here.

> 
> > + * clean up the whole sshm path.
> > + */
> > +static void libxl__sshm_decref(libxl__gc *gc, xs_transaction_t xt,
> > +   const char *sshm_path)
> > +static void libxl__sshm_del_slave(libxl__gc *gc, xs_transaction_t xt,
> > +  uint32_t domid, const char *id, bool 
> > isretry)
> > +{
> > +const char *slave_path, *begin_str, *end_str;
> > +uint64_t begin, end;
> > +
> > +slave_path = GCSPRINTF("%s/slaves/%"PRIu32, SSHM_PATH(id), domid);
> > +
> > +begin_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/begin", slave_path));
> > +end_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/end", slave_path));
> > +begin = strtoull(begin_str, NULL, 16);
> > +end = strtoull(end_str, NULL, 16);
> > +
> > +/* Avoid calling do_unmap many times in case of xs transaction retry */
> > +if (!isretry)
> > +libxl__sshm_do_unmap(gc, domid, id, begin, end);
> 
> IHMO, by unmapping the regions in middle of the transaction, you increase
> the potential failure of it. I would move that out of the transaction path.
> 
> I would be interested to hear the opinion of the tools maintainers here.
> 

If you move the unmap after the loop you create a window in which
the pages are still mapped but the toolstack thinks they are unmapped.

While the code as-is now makes sure (assuming no error in unmap) the
pages are unmapped no later than the transaction is committed. I think
this can be done by moving unmap before the transaction.

Zhongze, do you think the unmap must be done inside the loop? What kind
of invariants do you have in mind?

Then there is the question of "what do we do if unmap fails". Honestly I
don't have an answer. It seems rather screwed up in that case and I
doubt there is much libxl can do to rectify things.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 5/7] libxl: support unmapping static shared memory areas during domain destruction

2018-02-06 Thread Julien Grall

Hi,

On 01/30/2018 05:50 PM, Zhongze Liu wrote:

Add libxl__sshm_del to unmap static shared memory areas mapped by
libxl__sshm_add during domain creation. The unmapping process is:

* For a master: decrease the refcount of the sshm region, if the refcount
   reaches 0, cleanup the whole sshm path.
* For a slave: 1) unmap the shared pages, and cleanup related xs entries. If
   the system works normally, all the shared pages will be unmapped, so there
   won't be page leaks. In case of errors, the unmapping process will go on and
   unmap all the other pages that can be unmapped, so the other pages won't
   be leaked, either. 2) Decrease the refcount of the sshm region, if the
   refcount reaches 0, cleanup the whole sshm path.


May I ask to add a newline line 1) and 2). This would make clearer the 2 
steps.




This is for the proposal "Allow setting up shared memory areas between VMs
from xl config file" (see [1]).

[1] https://lists.xen.org/archives/html/xen-devel/2017-08/msg03242.html

Signed-off-by: Zhongze Liu 

Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Stefano Stabellini 
Cc: Julien Grall 
Cc: xen-de...@lists.xen.org
---
  tools/libxl/libxl_domain.c   |   5 ++
  tools/libxl/libxl_internal.h |   2 +
  tools/libxl/libxl_sshm.c | 106 +++
  3 files changed, 113 insertions(+)

diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c
index 13b1c73d40..37f001554b 100644
--- a/tools/libxl/libxl_domain.c
+++ b/tools/libxl/libxl_domain.c
@@ -1026,6 +1026,11 @@ void libxl__destroy_domid(libxl__egc *egc, 
libxl__destroy_domid_state *dis)
  goto out;
  }
  
+rc = libxl__sshm_del(gc, domid);

+if (rc) {
+LOGD(ERROR, domid, "Deleting static shm failed.");
+}
+
  if (libxl__device_pci_destroy_all(gc, domid) < 0)
  LOGD(ERROR, domid, "Pci shutdown failed");
  rc = xc_domain_pause(ctx->xch, domid);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 2cfe4c08a7..c398b6a6b8 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -4424,6 +4424,8 @@ static inline bool libxl__string_is_default(char **s)
  _hidden int libxl__sshm_add(libxl__gc *gc, uint32_t domid,
  libxl_static_shm *sshm, int len);
  
+_hidden int libxl__sshm_del(libxl__gc *gc, uint32_t domid);

+
  _hidden int libxl__sshm_check_overlap(libxl__gc *gc, uint32_t domid,
libxl_static_shm *sshms, int len);
  _hidden int libxl__sshm_setdefault(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_sshm.c b/tools/libxl/libxl_sshm.c
index 562f46f299..1bf4d4c2dc 100644
--- a/tools/libxl/libxl_sshm.c
+++ b/tools/libxl/libxl_sshm.c
@@ -86,6 +86,112 @@ int libxl__sshm_check_overlap(libxl__gc *gc, uint32_t domid,
  return 0;
  }
  
+/* Decrease the refcount of an sshm. When refcount reaches 0,


NIT: Libxl coding style regarding the comment seems to be uncleared 
(Ian, Wei?). But I feel keep /* and */ in separate line is nicer.



+ * clean up the whole sshm path.
+ */
+static void libxl__sshm_decref(libxl__gc *gc, xs_transaction_t xt,
+   const char *sshm_path)
+{
+int count;
+const char *count_path, *count_string;
+
+count_path = GCSPRINTF("%s/usercnt", sshm_path);
+if (libxl__xs_read_checked(gc, xt, count_path, _string))
+return;
+count = atoi(count_string);
+
+if (--count == 0) {
+libxl__xs_path_cleanup(gc, xt, sshm_path);
+return;
+}
+
+count_string = GCSPRINTF("%d", count);
+libxl__xs_write_checked(gc, xt, count_path, count_string);


See my comment about incref in a patch #4.


+
+return;
+}
+
+static void libxl__sshm_do_unmap(libxl__gc *gc, uint32_t domid, const char *id,
+ uint64_t begin, uint64_t end)
+{
+begin >>= XC_PAGE_SHIFT;
+end >>= XC_PAGE_SHIFT;
+for (; begin < end; ++begin) {
+if (xc_domain_remove_from_physmap(CTX->xch, domid, begin)) {
+SSHM_ERROR(domid, id,
+   "unable to unmap shared page at 0x%"PRIx64".",
+   begin);
+}
+}
+}
+
+static void libxl__sshm_del_slave(libxl__gc *gc, xs_transaction_t xt,
+  uint32_t domid, const char *id, bool isretry)
+{
+const char *slave_path, *begin_str, *end_str;
+uint64_t begin, end;
+
+slave_path = GCSPRINTF("%s/slaves/%"PRIu32, SSHM_PATH(id), domid);
+
+begin_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/begin", slave_path));
+end_str = libxl__xs_read(gc, xt, GCSPRINTF("%s/end", slave_path));
+begin = strtoull(begin_str, NULL, 16);
+end = strtoull(end_str, NULL, 16);
+
+/* Avoid calling do_unmap many times in case of xs transaction retry */
+if (!isretry)
+libxl__sshm_do_unmap(gc, domid, id, begin,