On 20/02/2025 23:55, Peter Xu wrote:
> On Thu, Feb 20, 2025 at 05:40:38PM +0800, Li Zhijian wrote:
>> On 19/02/2025 22:11, Peter Xu wrote:
>>>>>>> then
>>>>>>> in the test it tries to detect rdma link and fetch the ip only
>>>>>> It should work without root permission if we just*detect* and*fetch ip*.
>>>>>>
>>>>>> Do you also mean we can split new-rdma-link.sh to 2 separate scripts
>>>>>> - add-rdma-link.sh # optionally, execute by user before the test
>>>>>> (require root permission)
>>>>>> - detect-fetch-rdma.sh # execute from the migration-test
>>>>> Hmm indeed we still need a script to scan over all the ports..
>>>>>
>>>>> If having --rdma is a good idea, maybe we can further make it a parameter
>>>>> to --rdma?
>>>>>
>>>>> $ migration-test --rdma $RDMA_IP
>>>>>
>>>>> Or:
>>>>>
>>>>> $ migration-test --rdma-ip $RDMA_IP
>>>> I think --rdma only makes sense if it's going to do something
>>>> special. The optmimal scenario is that it always runs the test when it
>>>> can and sets up/tears down anything it needs.
>>>>
>>>> If it needs root, I'd prefer the test informs about this and does the
>>>> work itself.
>>>>
>>>> It would also be good to have the add + detect separate so we have more
>>>> flexibility, maybe we manage to enable this in CI even.
>>>>
>>>> So:
>>>>
>>>> ./add.sh
>>>> migration-test
>>>> (runs detect.sh + runs rdma test)
>>>> (leaves stuff behind)
>>>>
>>>> migration-test
>>>> (skips rdma test with message that it needs root)
>>>>
>>>> sudo migration-test
>>>> (runs add.sh + detect.sh + runs rdma test)
>>>> (cleans itself up)
>>>>
>>>> Does that make sense to you? I hope it's not too much work.
>>> Looks good here. We can also keep all the rdma stuff into one file, taking
>>> parameters.
>>>
>>> ./rdma-helper.sh setup
>>> ./rdma-helper.sh detect-ip
>>
>> Hi Peter and Fabiano
>>
>> Many thanks for your kindly idea and suggestion.
>> Please take another look at the changes below.
>> - I don't copy script to the build dir, just execute the script like
>> misc-tests.c
>> - It will automatically create a new RXE if it doesn't exit when running in
>> root
>
> Thanks! This is much better. Comments below.
>
>>
>> [PATCH] migration: Add qtest for migration over RDMA
>>
>> This qtest requires there is RDMA(RoCE) link in the host.
>> Introduce a scripts/rdma-migration-helper.sh to
>> - setup a new RXE if it's root
>> - detect existing RoCE link
>> to make the qtest work smoothly.
>>
>> Test will be skip if there is no available RoCE link.
>> # Start of rdma tests
>> # Running /x86_64/migration/precopy/rdma/plain
>> ok 1 /x86_64/migration/precopy/rdma/plain # SKIP There is no available
>> rdma link in the host.
>> Maybe you are not running with the root permission
>> # End of rdma tests
>>
>> Admin is able to remove the RXE by passing 'cleanup' to this script.
>>
>> Signed-off-by: Li Zhijian <[email protected]>
>> ---
>> scripts/rdma-migration-helper.sh | 40 +++++++++++++++++++
>> tests/qtest/migration/precopy-tests.c | 57 +++++++++++++++++++++++++++
>> 2 files changed, 97 insertions(+)
>> create mode 100755 scripts/rdma-migration-helper.sh
>>
>> diff --git a/scripts/rdma-migration-helper.sh
>> b/scripts/rdma-migration-helper.sh
>> new file mode 100755
>> index 0000000000..4ef62baf0f
>> --- /dev/null
>> +++ b/scripts/rdma-migration-helper.sh
>> @@ -0,0 +1,40 @@
>> +#!/bin/bash
>> +
>> +# Copied from blktests
>> +get_ipv4_addr() {
>> + ip -4 -o addr show dev "$1" |
>> + sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p' |
>> + tr -d '\n'
>> +}
>> +
>> +has_soft_rdma() {
>> + rdma link | grep -q " netdev $1[[:blank:]]*\$"
>> +}
>> +
>> +rdma_rxe_setup_detect()
>> +{
>> + (
>> + cd /sys/class/net &&
>> + for i in *; do
>> + [ -e "$i" ] || continue
>> + [ "$i" = "lo" ] && continue
>> + [ "$(<"$i/addr_len")" = 6 ] || continue
>> + [ "$(<"$i/carrier")" = 1 ] || continue
>> +
>> + has_soft_rdma "$i" && break
>> + [ "$operation" = "setup" ] && rdma link add "${i}_rxe" type
>> rxe netdev "$i" && break
>> + done
>> + has_soft_rdma "$i" || return
>> + get_ipv4_addr $i
>> + )
>> +}
>> +
>> +operation=${1:-setup}
>> +
>> +if [ "$operation" == "setup" ] || [ "$operation" == "detect" ]; then
>> + rdma_rxe_setup_detect
>> +elif [ "$operation" == "cleanup" ]; then
>> + modprobe -r rdma_rxe
>
> Would this affects host when there's existing user of rdma_rxe (or fail
> with -EBUSY)?
'modprobe -r ' will fail with EBUSY in this case.
> If there's no major side effect of leftover rdma link, we
> could also drop the "cleanup" for now and start from simple.
In my understanding, there is no any side effect, I'm fine to drop it in this
time.
>
>> +else
>> + echo "Usage: $0 [setup | cleanup | detect]"
>> +fi
>> diff --git a/tests/qtest/migration/precopy-tests.c
>> b/tests/qtest/migration/precopy-tests.c
>> index ba273d10b9..8c72eb699b 100644
>> --- a/tests/qtest/migration/precopy-tests.c
>> +++ b/tests/qtest/migration/precopy-tests.c
>> @@ -99,6 +99,59 @@ static void test_precopy_unix_dirty_ring(void)
>> test_precopy_common(&args);
>> }
>>
>> +#ifdef CONFIG_RDMA
>> +
>> +#define RDMA_MIGRATION_HELPER "scripts/rdma-migration-helper.sh"
>> +static int new_rdma_link(char *buffer) {
>
> Nit: newline before "{".
Good catch!
>
>> + const char *argument = (geteuid() == 0) ? "setup" : "detect";
>> + char command[1024];
>> +
>> + snprintf(command, sizeof(command), "%s %s", RDMA_MIGRATION_HELPER,
>> argument);
>> +
>> + FILE *pipe = popen(command, "r");
>> + if (pipe == NULL) {
>> + perror("Failed to run script");
>> + return -1;
>> + }
>> +
>> + int idx = 0;
>> + while (fgets(buffer + idx, 128 - idx, pipe) != NULL) {
>> + idx += strlen(buffer);
>> + }
>> +
>> + int status = pclose(pipe);
>> + if (status == -1) {
>> + perror("Error reported by pclose()");
>> + return -1;
>> + } else if (WIFEXITED(status)) {
>> + return WEXITSTATUS(status);
>> + }
>> +
>> + return -1;
>> +}
>> +
>> +static void test_precopy_rdma_plain(void)
>> +{
>> + char buffer[128] = {};
>> +
>> + if (new_rdma_link(buffer)) {
>> + g_test_skip("There is no available rdma link in the host.\n"
>> + "Maybe you are not running with the root permission");
>
> Nit: can be slightly more verbose?
>
> g_test_skip("\nThere is no available rdma link to run RDMA
> migration test. To enable the test:\n"
> "(1) Run \'%s setup\' with root and rerun the test\n"
> "(2) Run the test with root privilege\n",
> RDMA_MIGRATION_HELPER);
Yes, it's much better.
Thanks
Zhijian
>
>
>> + return;
>> + }
>> +
>> + /* FIXME: query a free port instead of hard code. */
>> + g_autofree char *uri = g_strdup_printf("rdma:%s:7777", buffer);
>> +
>> + MigrateCommon args = {
>> + .listen_uri = uri,
>> + .connect_uri = uri,
>> + };
>> +
>> + test_precopy_common(&args);
>> +}
>> +#endif
>> +
>> static void test_precopy_tcp_plain(void)
>> {
>> MigrateCommon args = {
>> @@ -1124,6 +1177,10 @@ static void
>> migration_test_add_precopy_smoke(MigrationTestEnv *env)
>> test_multifd_tcp_uri_none);
>> migration_test_add("/migration/multifd/tcp/plain/cancel",
>> test_multifd_tcp_cancel);
>> +#ifdef CONFIG_RDMA
>> + migration_test_add("/migration/precopy/rdma/plain",
>> + test_precopy_rdma_plain);
>> +#endif
>> }
>>
>> void migration_test_add_precopy(MigrationTestEnv *env)
>> --
>> 2.41.0
>>
>