Re: Problem with migration/rdma

2024-03-11 Thread Yu Zhang
>> I have reviewed and tested the change. Have tweaked the commit message
>> accordingly.
>> I hope that's okay with you Yu Zhang :)
it's okay for me. As it's a tiny fix, you may modify or include it in
your own commits.

Best regards,
Yu Zhang
11.03.2024

On Mon, Mar 11, 2024 at 3:57 PM Het Gala  wrote:
>
>
> On 11/03/24 8:16 pm, Peter Xu wrote:
> > On Mon, Mar 11, 2024 at 08:00:06PM +0530, Het Gala wrote:
> >> Let me send a proper patch to qemu devel mailing list and cc all the people
> >> involved.
> >>
> >> I have reviewed and tested the change. Have tweaked the commit message
> >> accordingly.
> >> I hope that's okay with you Yu Zhang :)
> > Het - don't worry, I've had it in my queue.  Thanks,
> Okay. Thanks
> >
> > =
> >  From 694451b89b21b3b67c404cbcfa2b84e3afae0c5d Mon Sep 17 00:00:00 2001
> > From: Yu Zhang 
> > Date: Wed, 6 Mar 2024 09:06:54 +0100
> > Subject: [PATCH] migration/rdma: Fix a memory issue for migration
> >
> > In commit 3fa9642ff7 change was made to convert the RDMA backend to
> > accept MigrateAddress struct. However, the assignment of "host" leads
> > to data corruption on the target host and the failure of migration.
> >
> >  isock->host = rdma->host;
>
> Regards,
>
> Het Gala
>



Re: Problem with migration/rdma

2024-03-11 Thread Het Gala



On 11/03/24 8:16 pm, Peter Xu wrote:

On Mon, Mar 11, 2024 at 08:00:06PM +0530, Het Gala wrote:

Let me send a proper patch to qemu devel mailing list and cc all the people
involved.

I have reviewed and tested the change. Have tweaked the commit message
accordingly.
I hope that's okay with you Yu Zhang :)

Het - don't worry, I've had it in my queue.  Thanks,

Okay. Thanks


=
 From 694451b89b21b3b67c404cbcfa2b84e3afae0c5d Mon Sep 17 00:00:00 2001
From: Yu Zhang 
Date: Wed, 6 Mar 2024 09:06:54 +0100
Subject: [PATCH] migration/rdma: Fix a memory issue for migration

In commit 3fa9642ff7 change was made to convert the RDMA backend to
accept MigrateAddress struct. However, the assignment of "host" leads
to data corruption on the target host and the failure of migration.

 isock->host = rdma->host;


Regards,

Het Gala




Re: Problem with migration/rdma

2024-03-11 Thread Peter Xu
On Mon, Mar 11, 2024 at 08:00:06PM +0530, Het Gala wrote:
> Let me send a proper patch to qemu devel mailing list and cc all the people
> involved.
> 
> I have reviewed and tested the change. Have tweaked the commit message
> accordingly.
> I hope that's okay with you Yu Zhang :)

Het - don't worry, I've had it in my queue.  Thanks,

=
>From 694451b89b21b3b67c404cbcfa2b84e3afae0c5d Mon Sep 17 00:00:00 2001
From: Yu Zhang 
Date: Wed, 6 Mar 2024 09:06:54 +0100
Subject: [PATCH] migration/rdma: Fix a memory issue for migration

In commit 3fa9642ff7 change was made to convert the RDMA backend to
accept MigrateAddress struct. However, the assignment of "host" leads
to data corruption on the target host and the failure of migration.

isock->host = rdma->host;

By allocating the memory explicitly for it with g_strdup_printf(), the
issue is fixed and the migration doesn't fail any more.

Fixes: 3fa9642ff7 ("migration: convert rdma backend to accept MigrateAddress")
Cc: qemu-stable 
Cc: Li Zhijian 
Link: 
https://lore.kernel.org/r/CAHEcVy4L_D6tuhJ8h=xlr4wapaprje3nnxzaeyunotrxq6c...@mail.gmail.com
Signed-off-by: Yu Zhang 
[peterx: use g_strdup() instead of g_strdup_printf(), per Zhijian]
Signed-off-by: Peter Xu 
---
 migration/rdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index a355dcea89..855753c671 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3357,7 +3357,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 goto err_rdma_dest_wait;
 }
 
-isock->host = rdma->host;
+isock->host = g_strdup(rdma->host);
 isock->port = g_strdup_printf("%d", rdma->port);
 
 /*
-- 
2.44.0


-- 
Peter Xu




Re: Problem with migration/rdma

2024-03-11 Thread Het Gala
Let me send a proper patch to qemu devel mailing list and cc all the 
people involved.


I have reviewed and tested the change. Have tweaked the commit message 
accordingly.

I hope that's okay with you Yu Zhang :)

On 11/03/24 4:44 pm, Yu Zhang wrote:

Hello Peter and Zhijian,

I created a MR in gitlab. You may have a look and let me know whether
it's fine for you.
 
https://urldefense.proofpoint.com/v2/url?u=https-3A__gitlab.com_qemu_qemu_-2D_merge-5Frequests_4=DwIFaQ=s883GpUCOChKOHiocYtGcg=-qwZZzrw4EKSsq0BK7MBd3wW1WEpXmJeng3ZUT5uBCg=BBbf8F-Z4S5Gg9uS4sGrM2VjND752LeFYFQa4wRG27sLihZCc3Pot0NzTBxor1IK=1GBQqozpModsL-VR_THJwca1SMdWBo8pznn1un5RsPo=

Best regards,
Yu Zhang @ IONOS Compute Platform
11.03.2024

On Fri, Mar 8, 2024 at 10:13 AM Yu Zhang  wrote:

Hallo Alexei,

vielen Dank! Habe probiert, aber auch an Permission Problem gestoßen.
Ich versuche, Google Application-specific password zu erstellen für Zukunft.

QEMU Upstream bietet diese Lösung für kleinen Patch an im Notfall.
Sie werden es behanden.

Viele Grüße
Yu

On Fri, Mar 8, 2024 at 8:09 AM Alexei Pastuchov
 wrote:



Regards,

Het Gala




Re: Problem with migration/rdma

2024-03-11 Thread Yu Zhang
Hello Peter and Zhijian,

I created a MR in gitlab. You may have a look and let me know whether
it's fine for you.
https://gitlab.com/qemu/qemu/-/merge_requests/4

Best regards,
Yu Zhang @ IONOS Compute Platform
11.03.2024

On Fri, Mar 8, 2024 at 10:13 AM Yu Zhang  wrote:
>
> Hallo Alexei,
>
> vielen Dank! Habe probiert, aber auch an Permission Problem gestoßen.
> Ich versuche, Google Application-specific password zu erstellen für Zukunft.
>
> QEMU Upstream bietet diese Lösung für kleinen Patch an im Notfall.
> Sie werden es behanden.
>
> Viele Grüße
> Yu
>
> On Fri, Mar 8, 2024 at 8:09 AM Alexei Pastuchov
>  wrote:
> >
> > Hi Yu, I think you could create a pull request to the project
> > https://github.com/qemu/qemu for the purpose.
> > Best,
> > Alexei
> >
> > On Fri, Mar 8, 2024 at 7:28 AM Yu Zhang  wrote:
> > >
> > > Hello Zhijian and Peter,
> > >
> > > Thank you so much for testing and confirming it.
> > > I created a patch in the email format, unfortunately got an issue for
> > > setting up the
> > > "Application-specific Password" in Gmail. It seems that in my gmail
> > > account there
> > > is no option at all for selecting "mail" before creating the
> > > application password.
> > >
> > > As it's tiny, I attached it in this email at this time (not elegant.),
> > > so that it can get
> > > included before the soft freezing.
> > >
> > > Really sorry for this inconvenience.
> > > --
> > > From c9fb6a6debfbd5e103aa90f30e9a028316449104 Mon Sep 17 00:00:00 2001
> > > From: Yu Zhang 
> > > Date: Wed, 6 Mar 2024 09:06:54 +0100
> > > Subject: [PATCH] migration/rdma: Fix a memory issue for migration
> > >
> > > In commit 3fa9642ff7 change was made to convert the RDMA backend to
> > > accept MigrateAddress struct. However, the assignment of "host" leads
> > > to data corruption on the target host and the failure of migration.
> > >
> > > isock->host = rdma->host;
> > >
> > > By allocating the memory explicitly for it with g_strdup_printf(), the
> > > issue is fixed and the migration doesn't fail any more.
> > >
> > > Fixes: 3fa9642ff7 ("migration: convert rdma backend to accept 
> > > MigrateAddress")
> > > Cc: qemu-stable 
> > > Cc: Li Zhijian 
> > > Cc: Peter Xu 
> > > Signed-off-by: Yu Zhang 
> > > ---
> > >  migration/rdma.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/migration/rdma.c b/migration/rdma.c
> > > index a355dcea89..d6abe718b5 100644
> > > --- a/migration/rdma.c
> > > +++ b/migration/rdma.c
> > > @@ -3357,7 +3357,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
> > >  goto err_rdma_dest_wait;
> > >  }
> > >
> > > -isock->host = rdma->host;
> > > +isock->host = g_strdup_printf("%s", rdma->host);
> > >  isock->port = g_strdup_printf("%d", rdma->port);
> > >
> > >  /*
> > > --
> > > 2.25.1
> > > --
> > >
> > > Best regards,
> > > Yu Zhang @ IONOS Compute Platform
> > > 08.03.2024
> > >
> > > On Thu, Mar 7, 2024 at 4:37 AM Peter Xu  wrote:
> > > >
> > > > On Thu, Mar 07, 2024 at 02:41:37AM +, Zhijian Li (Fujitsu) via 
> > > > wrote:
> > > > > Yu,
> > > > >
> > > > >
> > > > > On 07/03/2024 00:30, Philippe Mathieu-Daudé wrote:
> > > > > > Cc'ing RDMA migration reviewers/maintainers:
> > > > > >
> > > > > > $ ./scripts/get_maintainer.pl -f migration/rdma.c
> > > > > > Li Zhijian  (reviewer:RDMA Migration)
> > > > > > Peter Xu  (maintainer:Migration)
> > > > > > Fabiano Rosas  (maintainer:Migration)
> > > > > >
> > > > > > On 5/3/24 22:32, Yu Zhang wrote:
> > > > > >> Hello Het and all,
> > > > > >>
> > > > > >> while I was testing qemu-8.2, I saw a lot of our migration test 
> > > > > >> cases failed.
> > > > > >> After debugging the commits of the 8.2 branch, I saw the issue and 
> > > > > >> mad a diff:
> > > > > >>
> > > > > >> diff --git a/migration/rdma.c b/migration/rdma.c
> > > > > >> index 6a29e53daf..f10d56f556 100644
> > > > > >> --- a/migration/rdma.c
> > > > > >> +++ b/migration/rdma.c
> > > > > >> @@ -3353,9 +3353,9 @@ static int qemu_rdma_accept(RDMAContext 
> > > > > >> *rdma)
> > > > > >>   goto err_rdma_dest_wait;
> > > > > >>   }
> > > > > >>
> > > > > >> -isock->host = rdma->host;
> > > > > >> +isock->host = g_strdup_printf("%s", rdma->host);
> > > > > >>   isock->port = g_strdup_printf("%d", rdma->port);
> > > > >
> > > > >
> > > > > Thanks for your analysis.
> > > > >
> > > > > It will be great if you send this as a patch.
> > > > >
> > > > >
> > > > > isock is defined as a _autoptr VVV
> > > > >  _autoptr(InetSocketAddress) isock = g_new0(InetSocketAddress, 1);
> > > > >
> > > > > I'm surprised that it seems the auto free scheme will free the member 
> > > > > of isock as well
> > > > > see below valrind log. That will cause a double free.
> > > >
> > > > Right, all the QAPI-free is a deep one.  Thanks for checking this up,
> > > > Zhijian.
> > > >
> > > > Yu, would you please send a formal patch (better before this week ends) 
> > > > so
> > 

Re: Problem with migration/rdma

2024-03-07 Thread Peter Xu
On Fri, Mar 08, 2024 at 07:03:56AM +, Zhijian Li (Fujitsu) wrote:
> 
> 
> On 08/03/2024 14:55, Peter Xu wrote:
> > On Fri, Mar 08, 2024 at 07:27:59AM +0100, Yu Zhang wrote:
> >> Hello Zhijian and Peter,
> >>
> >> Thank you so much for testing and confirming it.
> >> I created a patch in the email format, unfortunately got an issue for
> >> setting up the
> >> "Application-specific Password" in Gmail. It seems that in my gmail
> >> account there
> >> is no option at all for selecting "mail" before creating the
> >> application password.
> >>
> >> As it's tiny, I attached it in this email at this time (not elegant.),
> > 
> > I ququed it, thanks!
> > 
> 
> > -isock->host = rdma->host;
> > +isock->host = g_strdup_printf("%s", rdma->host);
> 
> 
> Peter,
> 
> g_strdup(rdma->host) is enough i guess.

Thanks Zhijian, I'll touch it up.

-- 
Peter Xu




Re: Problem with migration/rdma

2024-03-07 Thread Zhijian Li (Fujitsu)


On 08/03/2024 14:55, Peter Xu wrote:
> On Fri, Mar 08, 2024 at 07:27:59AM +0100, Yu Zhang wrote:
>> Hello Zhijian and Peter,
>>
>> Thank you so much for testing and confirming it.
>> I created a patch in the email format, unfortunately got an issue for
>> setting up the
>> "Application-specific Password" in Gmail. It seems that in my gmail
>> account there
>> is no option at all for selecting "mail" before creating the
>> application password.
>>
>> As it's tiny, I attached it in this email at this time (not elegant.),
> 
> I ququed it, thanks!
> 

> -isock->host = rdma->host;
> +isock->host = g_strdup_printf("%s", rdma->host);


Peter,

g_strdup(rdma->host) is enough i guess.


Thanks
Zhijian






>  isock->port = g_strdup_printf("%d", rdma->port);


Re: Problem with migration/rdma

2024-03-07 Thread Peter Xu
On Fri, Mar 08, 2024 at 07:27:59AM +0100, Yu Zhang wrote:
> Hello Zhijian and Peter,
> 
> Thank you so much for testing and confirming it.
> I created a patch in the email format, unfortunately got an issue for
> setting up the
> "Application-specific Password" in Gmail. It seems that in my gmail
> account there
> is no option at all for selecting "mail" before creating the
> application password.
> 
> As it's tiny, I attached it in this email at this time (not elegant.),

I ququed it, thanks!

-- 
Peter Xu




Re: Problem with migration/rdma

2024-03-07 Thread Yu Zhang
Hello Zhijian and Peter,

Thank you so much for testing and confirming it.
I created a patch in the email format, unfortunately got an issue for
setting up the
"Application-specific Password" in Gmail. It seems that in my gmail
account there
is no option at all for selecting "mail" before creating the
application password.

As it's tiny, I attached it in this email at this time (not elegant.),
so that it can get
included before the soft freezing.

Really sorry for this inconvenience.
--
>From c9fb6a6debfbd5e103aa90f30e9a028316449104 Mon Sep 17 00:00:00 2001
From: Yu Zhang 
Date: Wed, 6 Mar 2024 09:06:54 +0100
Subject: [PATCH] migration/rdma: Fix a memory issue for migration

In commit 3fa9642ff7 change was made to convert the RDMA backend to
accept MigrateAddress struct. However, the assignment of "host" leads
to data corruption on the target host and the failure of migration.

isock->host = rdma->host;

By allocating the memory explicitly for it with g_strdup_printf(), the
issue is fixed and the migration doesn't fail any more.

Fixes: 3fa9642ff7 ("migration: convert rdma backend to accept MigrateAddress")
Cc: qemu-stable 
Cc: Li Zhijian 
Cc: Peter Xu 
Signed-off-by: Yu Zhang 
---
 migration/rdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index a355dcea89..d6abe718b5 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3357,7 +3357,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 goto err_rdma_dest_wait;
 }

-isock->host = rdma->host;
+isock->host = g_strdup_printf("%s", rdma->host);
 isock->port = g_strdup_printf("%d", rdma->port);

 /*
-- 
2.25.1
--

Best regards,
Yu Zhang @ IONOS Compute Platform
08.03.2024

On Thu, Mar 7, 2024 at 4:37 AM Peter Xu  wrote:
>
> On Thu, Mar 07, 2024 at 02:41:37AM +, Zhijian Li (Fujitsu) via wrote:
> > Yu,
> >
> >
> > On 07/03/2024 00:30, Philippe Mathieu-Daudé wrote:
> > > Cc'ing RDMA migration reviewers/maintainers:
> > >
> > > $ ./scripts/get_maintainer.pl -f migration/rdma.c
> > > Li Zhijian  (reviewer:RDMA Migration)
> > > Peter Xu  (maintainer:Migration)
> > > Fabiano Rosas  (maintainer:Migration)
> > >
> > > On 5/3/24 22:32, Yu Zhang wrote:
> > >> Hello Het and all,
> > >>
> > >> while I was testing qemu-8.2, I saw a lot of our migration test cases 
> > >> failed.
> > >> After debugging the commits of the 8.2 branch, I saw the issue and mad a 
> > >> diff:
> > >>
> > >> diff --git a/migration/rdma.c b/migration/rdma.c
> > >> index 6a29e53daf..f10d56f556 100644
> > >> --- a/migration/rdma.c
> > >> +++ b/migration/rdma.c
> > >> @@ -3353,9 +3353,9 @@ static int qemu_rdma_accept(RDMAContext *rdma)
> > >>   goto err_rdma_dest_wait;
> > >>   }
> > >>
> > >> -isock->host = rdma->host;
> > >> +isock->host = g_strdup_printf("%s", rdma->host);
> > >>   isock->port = g_strdup_printf("%d", rdma->port);
> >
> >
> > Thanks for your analysis.
> >
> > It will be great if you send this as a patch.
> >
> >
> > isock is defined as a _autoptr VVV
> >  _autoptr(InetSocketAddress) isock = g_new0(InetSocketAddress, 1);
> >
> > I'm surprised that it seems the auto free scheme will free the member of 
> > isock as well
> > see below valrind log. That will cause a double free.
>
> Right, all the QAPI-free is a deep one.  Thanks for checking this up,
> Zhijian.
>
> Yu, would you please send a formal patch (better before this week ends) so
> that I can include it for the last pull for 9.0 soft-freeze (March 12th)?
> As 8.2 affected, please also attach proper tags:
>
> Cc: qemu-stable 
> Fixes: 3fa9642ff7 ("migration: convert rdma backend to accept MigrateAddress")
>
> >
> > ==809138== Invalid free() / delete / delete[] / realloc()
> > ==809138==at 0x483A9F5: free (vg_replace_malloc.c:538)
> > ==809138==by 0x598F70C: g_free (in /usr/lib64/libglib-2.0.so.0.6600.8)
> > ==809138==by 0x79B6AD: qemu_rdma_cleanup (rdma.c:2432)
> > ==809138==by 0x79CEE6: qio_channel_rdma_close_rcu (rdma.c:3108)
> > ==809138==by 0xC2E339: call_rcu_thread (rcu.c:301)
> > ==809138==by 0xC2116A: qemu_thread_start (qemu-thread-posix.c:541)
> > ==809138==by 0x72683F8: ??? (in /usr/lib64/libpthread-2.32.so)
> > ==809138==by 0x73824C2: clone (in /usr/lib64/libc-2.32.so)
> > ==809138==  Address 0x13daa070 is 0 bytes inside a block of size 14 free'd
> > ==809138==at 0x483A9F5: free (vg_replace_malloc.c:538)
> > ==809138==by 0x598F70C: g_free (in /usr/lib64/libglib-2.0.so.0.6600.8)
> > ==809138==by 0xC058CF: qapi_dealloc_type_str (qapi-dealloc-visitor.c:68)
> > ==809138==by 0xC09EF3: visit_type_str (qapi-visit-core.c:349)
> > ==809138==by 0xBDDECC: visit_type_InetSocketAddressBase_members 
> > (qapi-visit-sockets.c:29)
> > ==809138==by 0xBDE055: visit_type_InetSocketAddress_members 
> > (qapi-visit-sockets.c:67)
> > ==809138==by 0xBDE30D: visit_type_InetSocketAddress 
> > (qapi-visit-sockets.c:119)
> > ==809138== 

Re: Problem with migration/rdma

2024-03-06 Thread Peter Xu
On Thu, Mar 07, 2024 at 02:41:37AM +, Zhijian Li (Fujitsu) via wrote:
> Yu,
> 
> 
> On 07/03/2024 00:30, Philippe Mathieu-Daudé wrote:
> > Cc'ing RDMA migration reviewers/maintainers:
> > 
> > $ ./scripts/get_maintainer.pl -f migration/rdma.c
> > Li Zhijian  (reviewer:RDMA Migration)
> > Peter Xu  (maintainer:Migration)
> > Fabiano Rosas  (maintainer:Migration)
> > 
> > On 5/3/24 22:32, Yu Zhang wrote:
> >> Hello Het and all,
> >>
> >> while I was testing qemu-8.2, I saw a lot of our migration test cases 
> >> failed.
> >> After debugging the commits of the 8.2 branch, I saw the issue and mad a 
> >> diff:
> >>
> >> diff --git a/migration/rdma.c b/migration/rdma.c
> >> index 6a29e53daf..f10d56f556 100644
> >> --- a/migration/rdma.c
> >> +++ b/migration/rdma.c
> >> @@ -3353,9 +3353,9 @@ static int qemu_rdma_accept(RDMAContext *rdma)
> >>   goto err_rdma_dest_wait;
> >>   }
> >>
> >> -    isock->host = rdma->host;
> >> +    isock->host = g_strdup_printf("%s", rdma->host);
> >>   isock->port = g_strdup_printf("%d", rdma->port);
> 
> 
> Thanks for your analysis.
> 
> It will be great if you send this as a patch.
> 
> 
> isock is defined as a _autoptr VVV
>  _autoptr(InetSocketAddress) isock = g_new0(InetSocketAddress, 1);
> 
> I'm surprised that it seems the auto free scheme will free the member of 
> isock as well
> see below valrind log. That will cause a double free.

Right, all the QAPI-free is a deep one.  Thanks for checking this up,
Zhijian.

Yu, would you please send a formal patch (better before this week ends) so
that I can include it for the last pull for 9.0 soft-freeze (March 12th)?
As 8.2 affected, please also attach proper tags:

Cc: qemu-stable 
Fixes: 3fa9642ff7 ("migration: convert rdma backend to accept MigrateAddress")

> 
> ==809138== Invalid free() / delete / delete[] / realloc()
> ==809138==at 0x483A9F5: free (vg_replace_malloc.c:538)
> ==809138==by 0x598F70C: g_free (in /usr/lib64/libglib-2.0.so.0.6600.8)
> ==809138==by 0x79B6AD: qemu_rdma_cleanup (rdma.c:2432)
> ==809138==by 0x79CEE6: qio_channel_rdma_close_rcu (rdma.c:3108)
> ==809138==by 0xC2E339: call_rcu_thread (rcu.c:301)
> ==809138==by 0xC2116A: qemu_thread_start (qemu-thread-posix.c:541)
> ==809138==by 0x72683F8: ??? (in /usr/lib64/libpthread-2.32.so)
> ==809138==by 0x73824C2: clone (in /usr/lib64/libc-2.32.so)
> ==809138==  Address 0x13daa070 is 0 bytes inside a block of size 14 free'd
> ==809138==at 0x483A9F5: free (vg_replace_malloc.c:538)
> ==809138==by 0x598F70C: g_free (in /usr/lib64/libglib-2.0.so.0.6600.8)
> ==809138==by 0xC058CF: qapi_dealloc_type_str (qapi-dealloc-visitor.c:68)
> ==809138==by 0xC09EF3: visit_type_str (qapi-visit-core.c:349)
> ==809138==by 0xBDDECC: visit_type_InetSocketAddressBase_members 
> (qapi-visit-sockets.c:29)
> ==809138==by 0xBDE055: visit_type_InetSocketAddress_members 
> (qapi-visit-sockets.c:67)
> ==809138==by 0xBDE30D: visit_type_InetSocketAddress 
> (qapi-visit-sockets.c:119)
> ==809138==by 0xBDDB38: qapi_free_InetSocketAddress 
> (qapi-types-sockets.c:51)
> ==809138==by 0x792351: glib_autoptr_clear_InetSocketAddress 
> (qapi-types-sockets.h:109)
> ==809138==by 0x79236F: glib_autoptr_cleanup_InetSocketAddress 
> (qapi-types-sockets.h:109)
> ==809138==by 0x79D956: qemu_rdma_accept (rdma.c:3341)
> ==809138==by 0x79F05A: rdma_accept_incoming_migration (rdma.c:4041)
> ==809138==  Block was alloc'd at
> ==809138==at 0x4839809: malloc (vg_replace_malloc.c:307)
> ==809138==by 0x5992BB8: g_malloc (in /usr/lib64/libglib-2.0.so.0.6600.8)
> ==809138==by 0x59A7FE3: g_strdup (in /usr/lib64/libglib-2.0.so.0.6600.8)
> ==809138==by 0x79C2A8: qemu_rdma_data_init (rdma.c:2731)
> ==809138==by 0x79F183: rdma_start_incoming_migration (rdma.c:4081)
> ==809138==by 0x76F200: qemu_start_incoming_migration (migration.c:581)
> ==809138==by 0x77193A: qmp_migrate_incoming (migration.c:1735)
> ==809138==by 0x74B3D3: qmp_x_exit_preconfig (vl.c:2718)
> ==809138==by 0x74DB6F: qemu_init (vl.c:3753)
> ==809138==by 0xA14F3F: main (main.c:47)

-- 
Peter Xu




Re: Problem with migration/rdma

2024-03-06 Thread Zhijian Li (Fujitsu)
Yu,


On 07/03/2024 00:30, Philippe Mathieu-Daudé wrote:
> Cc'ing RDMA migration reviewers/maintainers:
> 
> $ ./scripts/get_maintainer.pl -f migration/rdma.c
> Li Zhijian  (reviewer:RDMA Migration)
> Peter Xu  (maintainer:Migration)
> Fabiano Rosas  (maintainer:Migration)
> 
> On 5/3/24 22:32, Yu Zhang wrote:
>> Hello Het and all,
>>
>> while I was testing qemu-8.2, I saw a lot of our migration test cases failed.
>> After debugging the commits of the 8.2 branch, I saw the issue and mad a 
>> diff:
>>
>> diff --git a/migration/rdma.c b/migration/rdma.c
>> index 6a29e53daf..f10d56f556 100644
>> --- a/migration/rdma.c
>> +++ b/migration/rdma.c
>> @@ -3353,9 +3353,9 @@ static int qemu_rdma_accept(RDMAContext *rdma)
>>   goto err_rdma_dest_wait;
>>   }
>>
>> -    isock->host = rdma->host;
>> +    isock->host = g_strdup_printf("%s", rdma->host);
>>   isock->port = g_strdup_printf("%d", rdma->port);


Thanks for your analysis.

It will be great if you send this as a patch.


isock is defined as a _autoptr VVV
 _autoptr(InetSocketAddress) isock = g_new0(InetSocketAddress, 1);

I'm surprised that it seems the auto free scheme will free the member of isock 
as well
see below valrind log. That will cause a double free.

==809138== Invalid free() / delete / delete[] / realloc()
==809138==at 0x483A9F5: free (vg_replace_malloc.c:538)
==809138==by 0x598F70C: g_free (in /usr/lib64/libglib-2.0.so.0.6600.8)
==809138==by 0x79B6AD: qemu_rdma_cleanup (rdma.c:2432)
==809138==by 0x79CEE6: qio_channel_rdma_close_rcu (rdma.c:3108)
==809138==by 0xC2E339: call_rcu_thread (rcu.c:301)
==809138==by 0xC2116A: qemu_thread_start (qemu-thread-posix.c:541)
==809138==by 0x72683F8: ??? (in /usr/lib64/libpthread-2.32.so)
==809138==by 0x73824C2: clone (in /usr/lib64/libc-2.32.so)
==809138==  Address 0x13daa070 is 0 bytes inside a block of size 14 free'd
==809138==at 0x483A9F5: free (vg_replace_malloc.c:538)
==809138==by 0x598F70C: g_free (in /usr/lib64/libglib-2.0.so.0.6600.8)
==809138==by 0xC058CF: qapi_dealloc_type_str (qapi-dealloc-visitor.c:68)
==809138==by 0xC09EF3: visit_type_str (qapi-visit-core.c:349)
==809138==by 0xBDDECC: visit_type_InetSocketAddressBase_members 
(qapi-visit-sockets.c:29)
==809138==by 0xBDE055: visit_type_InetSocketAddress_members 
(qapi-visit-sockets.c:67)
==809138==by 0xBDE30D: visit_type_InetSocketAddress 
(qapi-visit-sockets.c:119)
==809138==by 0xBDDB38: qapi_free_InetSocketAddress (qapi-types-sockets.c:51)
==809138==by 0x792351: glib_autoptr_clear_InetSocketAddress 
(qapi-types-sockets.h:109)
==809138==by 0x79236F: glib_autoptr_cleanup_InetSocketAddress 
(qapi-types-sockets.h:109)
==809138==by 0x79D956: qemu_rdma_accept (rdma.c:3341)
==809138==by 0x79F05A: rdma_accept_incoming_migration (rdma.c:4041)
==809138==  Block was alloc'd at
==809138==at 0x4839809: malloc (vg_replace_malloc.c:307)
==809138==by 0x5992BB8: g_malloc (in /usr/lib64/libglib-2.0.so.0.6600.8)
==809138==by 0x59A7FE3: g_strdup (in /usr/lib64/libglib-2.0.so.0.6600.8)
==809138==by 0x79C2A8: qemu_rdma_data_init (rdma.c:2731)
==809138==by 0x79F183: rdma_start_incoming_migration (rdma.c:4081)
==809138==by 0x76F200: qemu_start_incoming_migration (migration.c:581)
==809138==by 0x77193A: qmp_migrate_incoming (migration.c:1735)
==809138==by 0x74B3D3: qmp_x_exit_preconfig (vl.c:2718)
==809138==by 0x74DB6F: qemu_init (vl.c:3753)
==809138==by 0xA14F3F: main (main.c:47)


Thanks
Zhijian


>>
>> which was introduced by the commit below:
>>
>> commit 3fa9642ff7d51f7fc3ba68e6ccd13a939d5bd609 (HEAD)
>> Author: Het Gala 
>> Date:   Mon Oct 23 15:20:45 2023 -0300
>>
>>  migration: convert rdma backend to accept MigrateAddress
>>
>>  RDMA based transport backend for 'migrate'/'migrate-incoming' QAPIs
>>  accept new wire protocol of MigrateAddress struct.
>>
>>  It is achived by parsing 'uri' string and storing migration parameters
>>  required for RDMA connection into well defined InetSocketAddress struct.
>>  ...
>>
>> A debug line
>>   isock->host = rdma->host;
>>   isock->port = g_strdup_printf("%d", rdma->port);
>> +fprintf(stdout, "QEMU: %s, host %s, port %s\n", __func__,
>> isock->host, isock->port);
>>
>> produced this error:
>> QEMU: qemu_rdma_accept, host ::, port 8089
>> corrupted size vs. prev_size in fastbins
>>
>> on the target host, which may indicate a crash related to the memory
>> allocation or a memory
>> corruption of the data. With the patch, it doesn't happen any more,
>> and the migration is fine.
>> Could you be kind to test this and confirm the issue?
>>
>> Furthermore, I'm confused by the two struct:
>>
>> struct InetSocketAddressBase {
>>  char *host;
>>  char *port;
>> };
>>
>> struct InetSocketAddress {
>>  /* Members inherited from InetSocketAddressBase: */
>>  char *host;
>>  char *port;
>>
>> To my understanding, they are used to consolidate the 

Re: Problem with migration/rdma

2024-03-06 Thread Philippe Mathieu-Daudé

Cc'ing RDMA migration reviewers/maintainers:

$ ./scripts/get_maintainer.pl -f migration/rdma.c
Li Zhijian  (reviewer:RDMA Migration)
Peter Xu  (maintainer:Migration)
Fabiano Rosas  (maintainer:Migration)

On 5/3/24 22:32, Yu Zhang wrote:

Hello Het and all,

while I was testing qemu-8.2, I saw a lot of our migration test cases failed.
After debugging the commits of the 8.2 branch, I saw the issue and mad a diff:

diff --git a/migration/rdma.c b/migration/rdma.c
index 6a29e53daf..f10d56f556 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3353,9 +3353,9 @@ static int qemu_rdma_accept(RDMAContext *rdma)
  goto err_rdma_dest_wait;
  }

-isock->host = rdma->host;
+isock->host = g_strdup_printf("%s", rdma->host);
  isock->port = g_strdup_printf("%d", rdma->port);

which was introduced by the commit below:

commit 3fa9642ff7d51f7fc3ba68e6ccd13a939d5bd609 (HEAD)
Author: Het Gala 
Date:   Mon Oct 23 15:20:45 2023 -0300

 migration: convert rdma backend to accept MigrateAddress

 RDMA based transport backend for 'migrate'/'migrate-incoming' QAPIs
 accept new wire protocol of MigrateAddress struct.

 It is achived by parsing 'uri' string and storing migration parameters
 required for RDMA connection into well defined InetSocketAddress struct.
 ...

A debug line
  isock->host = rdma->host;
  isock->port = g_strdup_printf("%d", rdma->port);
+fprintf(stdout, "QEMU: %s, host %s, port %s\n", __func__,
isock->host, isock->port);

produced this error:
QEMU: qemu_rdma_accept, host ::, port 8089
corrupted size vs. prev_size in fastbins

on the target host, which may indicate a crash related to the memory
allocation or a memory
corruption of the data. With the patch, it doesn't happen any more,
and the migration is fine.
Could you be kind to test this and confirm the issue?

Furthermore, I'm confused by the two struct:

struct InetSocketAddressBase {
 char *host;
 char *port;
};

struct InetSocketAddress {
 /* Members inherited from InetSocketAddressBase: */
 char *host;
 char *port;

To my understanding, they are used to consolidate the separated data
to a well-defined
struct "MigrateAddress", while the struct whose member receive their
data has a different type:

typedef struct RDMAContext {
 char *host;
 int port;
 ...
}

Is there any reason to keep "port" like this (char* instead of int) or
can we improve it?
Thank you so much for any of your comments!

Best regards,

Yu Zhang @ IONOS Compute Platform
05.03.2024