Hi Dell,
I actually found a very simple reproducer.
The issue is not with the kickstart environment per se, but with
memory allocation within DSU.
It turns out that the RHEL/CentOS kickstart environment sets the
MALLOC_PERTURB_ [1],[2] environment variable. When it is set, glibc
fills newly allocated and freed memory with a byte pattern derived
from its value, instead of leaving the previous contents in place. And
sure enough, as soon as you define it in a regular, non-kickstart
environment, DSU starts exhibiting all sorts of aberrant behavior. This
very likely indicates that parts of DSU's code read memory before it
has been initialized, or mistakenly reuse freed memory.
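
If it helps with triage, here is a minimal sketch of a helper that runs any command twice, once with MALLOC_PERTURB_ unset and once with it set, and reports whether the output or exit status changes. The names perturb_check and PERTURB_VALUE are mine and purely illustrative (they are not part of DSU), and the default of 165 is an arbitrary non-zero byte.

```shell
# Hypothetical helper, not part of DSU: run a command twice, once with
# MALLOC_PERTURB_ unset and once with it set, and compare the results.
perturb_check() {
    # Fill byte to use; any value from 1 to 255 enables perturbation in glibc.
    perturb_value=${PERTURB_VALUE:-165}

    # Baseline run: make sure the variable is absent from the environment.
    clean_out=$( (unset MALLOC_PERTURB_; "$@") 2>&1 )
    clean_rc=$?

    # Perturbed run: same command, with the variable exported.
    dirty_out=$( (MALLOC_PERTURB_=$perturb_value; export MALLOC_PERTURB_; "$@") 2>&1 )
    dirty_rc=$?

    if [ "$clean_rc" -eq "$dirty_rc" ] && [ "$clean_out" = "$dirty_out" ]; then
        echo "same"
        return 0
    fi
    echo "differs (exit $clean_rc vs $dirty_rc)"
    return 1
}
```

Something like "perturb_check dsu -i" should then print "same" for a program that does not depend on uninitialized or freed memory, assuming its output is otherwise deterministic.
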
With MALLOC_PERTURB_ set, the first effect is that collecting the
inventory no longer works in non-interactive mode:
-- 8< ------------------------------------------------------------
# export MALLOC_PERTURB_=$(($RANDOM % 255 + 1))
# dsu -n
Dell System Update 1.4.2
Copyright (C) 2014 Dell Proprietary.
Verifying catalog installation ...
Installing catalog from repository ...
Fetching dsucatalog ...
Reading the catalog ...
Fetching invcol_M18HC_LN64_17.06.000.1171_A00 ...
Verifying inventory collector installation ...
Getting System Inventory ...
Could not get the inventory
Exiting DSU!
-- 8< ------------------------------------------------------------
For reference, here's the "dsu -i" output when MALLOC_PERTURB_ is undefined:
-- 8< ------------------------------------------------------------
[root@sh-101-59 ~]# unset MALLOC_PERTURB_
[root@sh-101-59 ~]# dsu -i
Dell System Update 1.4.2
Copyright (C) 2014 Dell Proprietary.
Verifying catalog installation ...
Installing catalog from repository ...
Fetching dsucatalog ...
Reading the catalog ...
Fetching invcol_M18HC_LN64_17.06.000.1171_A00 ...
Verifying inventory collector installation ...
Getting System Inventory ...
warning: Inventory collector returned with partial failure.
1. OpenManage Server Administrator ( Version : 8.5.0 )
2. BIOS ( Version : 2.4.2 )
3. Lifecycle Controller ( Version : 2.41.40.40 )
4. Dell 32 Bit uEFI Diagnostics ( Version : 4239A36 )
5. OS COLLECTOR 1.1 ( Version : OSC_1.1 )
6. System CPLD ( Version : 1.0.0 )
7. iDRAC ( Version : 2.41.40.40 )
8. Intel(R) Ethernet 10G X520 LOM ( Version : 17.5.10 )
9. Intel(R) Ethernet 10G X520 LOM ( Version : 17.5.10 )
10. OpenManage | iDRAC Service Module ( Version : 0 )
Exiting DSU!
-- 8< ------------------------------------------------------------
Ok, so let's try to generate the inventory by hand:
-- 8< ------------------------------------------------------------
# chmod +x /usr/libexec/dell_dup/invcol_M18HC_LN64_17.06.000.1171_A00
# /usr/libexec/dell_dup/invcol_M18HC_LN64_17.06.000.1171_A00 -outc=/dev/shm/inv.xml
<?xml version="1.0" encoding="UTF-8"?>
<InventoryError lang="en"><SPStatus result="false" module="linddcfg.sh -inv -s">
<Message> Unsupported operating system. </Message>
</SPStatus><SPStatus result="false" module="linddcfg.sh -inv -s"><Message>Invalid inventory results.</Message></SPStatus><SPStatus result="false" module="brcmlinddcfg.sh -inv -s">
<Message> Unsupported operating system. </Message>
</SPStatus><SPStatus result="false" module="brcmlinddcfg.sh -inv -s"><Message>Invalid inventory results.</Message></SPStatus></InventoryError>
-- 8< ------------------------------------------------------------
Setting aside the errors in the XML output, we now have an inventory
file that we can pass to DSU in *non-interactive* mode (-n flag). See
how it now behaves as if it had been invoked in preview mode (-p):
-- 8< ------------------------------------------------------------
# export MALLOC_PERTURB_=$(($RANDOM % 255 + 1))
# dsu --input-inventory-file=/dev/shm/inv.xml -n
Dell System Update 1.4.2
Copyright (C) 2014 Dell Proprietary.
Verifying catalog installation ...
Installing catalog from repository ...
Fetching dsucatalog ...
Reading the catalog ...
Determining Applicable Updates ...
--------- Update Preview ---------
# : Component : Version : Filename
----------------------------------
1 : OpenManage | iDRAC Service Module : 2.5 : Systems-Management_Application_DT30R_LN64_2.5_A00
NOTE: The preview option displays the components which can be updated
based on Catalog. This option does not update.
Run --inventory for checking the component status post DSU commit.
Exiting DSU!
-- 8< ------------------------------------------------------------
Clearly, the memory that DSU reads its options from is not the memory
where the command-line arguments were actually stored.
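
To turn the run above into a pass/fail check, a rough sketch could simply look for the preview banner in the output of a non-interactive run. The function name saw_preview_banner is hypothetical, and the only thing it relies on is the "Update Preview" banner text shown in the output above.

```shell
# Hypothetical check, not part of DSU: succeed if the text on stdin
# contains the "Update Preview" banner seen in the output above.
saw_preview_banner() {
    grep -q 'Update Preview'
}
```

With MALLOC_PERTURB_ exported as above, piping the output of "dsu --input-inventory-file=/dev/shm/inv.xml -n" into saw_preview_banner would then succeed only when -n has been misread as -p.
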
Finally, to quote Ulrich Drepper on the use of MALLOC_PERTURB_:
> This technique can find hard to detect bugs. It is therefore suggested to
> always use this flag (at least temporarily) when testing out code or a new
> distribution.
I would put a lot of emphasis on the word "testing"...
Now that you have an easy way to reproduce the issue, I hope that
you'll take the time to run Valgrind on DSU and fix all the memory
allocation and deallocation issues in a reasonable timeframe. Thanks.
[1]: http://udrepper.livejournal.com/11429.html
[2]: http://jhrozek.livejournal.com/1755.html
Cheers,
--
Kilian
PS: Please don't suggest unsetting MALLOC_PERTURB_ as a workaround for the issue.