Hi,
A. Current Fuel PDF integration status
Fuel and Armband teams have been working on moving to the new PDF as a single
input configuration file.
We have proposed a new installer adapter template for Fuel in Pharos [1], as
well as new PDFs in securedlab for the PODs Fuel uses:
- lf-pod2 [2];
- arm-pod5 has been around for a while;
- ericsson-pod1 and zte-pod1 already have PDFs, but might require small
updates to work with the Fuel installer adapter;
While working on the PDFs, we proposed some patches that should improve/extend
the verify job for securedlab, use the Pharos git repo for PDF parsing, and
do some minor cleanup/code folding:
- securedlab verify job should switch to using Pharos installer adapters and
generate_config [3];
- yamllint fixes and code folding for existing PDFs [4];
- add verify job summary, run the whole test matrix instead of bailing on the
first error [5];
- extend verify with yamllint runs for PDF files, as well as output yaml
file(s) generated by check_jinja2 [6];
- fix missing IPMI credentials for lf-pod4 (caught by linting the output yaml
described above) [7];
- (unrelated to PDF, Fuel cleanup) remove old Fuel configuration files we no
longer use [8];
B. PDF specification limitations for Fuel
Tbh, I have a hard time summarizing this, but I'll try.
Currently, we have some PDFs that define a 'net_config' section (global per
PDF), while the spec and most PDFs don't.
This resulted from:
- the need to support multiple VLANs over the same physical interface;
- installers expecting a network-centric mapping between existing/to-be-created
networks and available interfaces;
Looking at what Fuel expects as input, we'd like to be able to easily map IPs
in a certain network (e.g. admin, mgmt, private, public) to a particular
interface name.
But the PDF does not allow specifying interface names directly, as those depend
not only on target OS, but also on OS config (e.g. net.ifnames=0 biosdevname=1
vs net.ifnames=1 biosdevname=0).
Indexing interfaces is also fragile, as a BIOS upgrade might change the PCI
layout and therefore the NIC ordering (quite rare, but still).
What happens if we add a new interface that ends up at index 1, between
existing interfaces 0 and 2?
The 'net_config' section is a compromise that solves part of the problem, but
still does not provide a solid solution for the physical interface name
mapping, as it also relies on interface index.
What if the controller nodes have a different set of interfaces than the
compute nodes have?
Adding 'net_config' to PODs aligned with the current spec is also problematic,
as it'd duplicate the VLAN information, making the whole thing even more
fragile.
Tl;dr I submitted a proof of concept refactor of net_config, which tries to
align more with the current PDF spec, although it is not 100% compatible (we
can't model multiple VLANs on the same physical interface with the current
spec) [9].
Observations wrt this change:
- network-centric approach, installer-friendly;
- physical interface to virtual interface mapping is NOT 1:1 (multiple VLANs on
same physical NIC);
- virtual interface to network mapping is 1:1;
- network to IP mapping is 1:1;
- networks are global for the POD (including network to virtual network);
- network to physical interface mapping is NOT global per POD, and should be
overridable on a per-node basis;
- features and speed params were silently discarded during net_config addition,
bring them back for the physical interfaces;
- converting from current spec-compatible PDF to this proposed format should be
trivial for PDF files, and should require very little work on the installer
adapters;
Etc.
Please take some time to review this and let's come up with a better solution
if we can find one, or at least align our current PDFs to one format or another.
The PoC does not solve the interface indexing issue, but at least it provides a
mechanism that makes it per-node configurable.
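
To make the per-node override idea a bit more concrete, here is a minimal
Python sketch. The data layout (a plain network name -> interface index dict,
plus an optional per-node dict) is made up for illustration only; it is NOT
the format used in the PoC [9], and the names POD_DEFAULT_MAPPING and
resolve_mapping are hypothetical.

```python
# Hypothetical layout for illustration only; this is not the PoC format.
from typing import Dict, Optional

# POD-wide default: network name -> physical interface index (assumed values)
POD_DEFAULT_MAPPING = {"admin": 0, "mgmt": 1, "private": 1, "public": 2}


def resolve_mapping(pod_default: Dict[str, int],
                    node_override: Optional[Dict[str, int]] = None) -> Dict[str, int]:
    """Return the effective network -> interface index mapping for one node."""
    effective = dict(pod_default)          # copy, so the POD default is untouched
    effective.update(node_override or {})  # per-node entries take precedence
    return effective


# A node with a different NIC layout only overrides the entries that differ.
compute = resolve_mapping(POD_DEFAULT_MAPPING,
                          {"mgmt": 2, "private": 2, "public": 3})
# A node without overrides simply inherits the POD-wide mapping.
controller = resolve_mapping(POD_DEFAULT_MAPPING)
```

This still relies on interface indexes, so it only narrows the blast radius of
a reordering to the affected node(s) instead of the whole POD.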
C. PDF current implementation issues
This section covers some divergent aspects in the current PDFs in securedlab.
The PDF spec is quite clear on some of these, so I don't understand why there
are so many mutations in the wild.
If parsing the PDF is hard / does not align with the installer-expected input,
we can define some new custom filters in generate_config.py, although most of
these should be fixable within the current tool set.
1. 'vlan: 0' vs 'vlan: native'
The spec uses 'native' to differentiate from '0', which might have a
special/reserved meaning on some network equipment.
We currently have both formats in securedlab.
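
Until all PDFs are aligned, a consumer could normalize the value on its side.
A minimal Python sketch, assuming that 'vlan: 0' in the existing PDFs was
always meant as "untagged" (that assumption should be checked per POD);
normalize_vlan is a hypothetical helper, not part of generate_config.py:

```python
def normalize_vlan(value):
    """Map the VLAN spellings seen in PDFs to 'native' or an int tag (1-4094)."""
    if isinstance(value, str) and value.strip().lower() == "native":
        return "native"
    tag = int(value)  # raises ValueError/TypeError for anything non-numeric
    if tag == 0:
        # Assumption: 'vlan: 0' was meant as untagged, i.e. the spec's 'native'.
        return "native"
    if not 1 <= tag <= 4094:
        raise ValueError("VLAN tag out of range: %r" % (value,))
    return tag
```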
2. 'disk_rotation: ssd' or 'disk_rotation: 15000' for SSD drives
Here, the spec does not provide a special value for SSDs. Afaik, no installer
adapter consumes this yet, but we should extend the PDF spec and then adhere to
the new value.
We currently have a mix of both formats in securedlab, and both are wrong imo
:).
3. 'features: dpdk|sriov' vs 'features: dpdk, sriov'
Again, the spec is quite clear on the format using '|', but we have divergent
implementations in the wild.
4. 'features: null' vs 'features: ' vs omitting it altogether
We should agree on the preferred value for no features and align all PDFs.
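
For points 3 and 4, a tolerant parser could accept all spellings currently in
the wild and emit only the spec format. A minimal Python sketch; both helper
names are hypothetical, and it assumes the YAML loader has already turned
'features: null' and an empty 'features: ' into None:

```python
import re


def parse_features(value):
    """Return a sorted list of feature names from any spelling seen in PDFs."""
    if value is None:  # 'features: null' and 'features: ' both load as None
        return []
    parts = re.split(r"[|,]", str(value))  # accept '|' (spec) and ',' (wild)
    return sorted({part.strip() for part in parts if part.strip()})


def format_features(features):
    """Serialize back using the '|' separator the spec asks for."""
    return "|".join(features)
```

A PDF that omits the key altogether would need value = node.get('features')
or similar on the caller side, which also yields None.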
5. inconsistent node naming
Some PDFs use 'pod1-jump' + 'pod1-node1', while others go for different, custom
names (e.g. 'lfpod4-jumpserver' + 'lfpod4-node1', 'CI-POD1-HOST' +
'CI-ERICSSON-POD1-NODE1').
Not the biggest issue, but it'd be nice to agree on a common format here.
6. Address/netmask
The spec defines all addresses as IP/mask, e.g. 10.20.0.2/24.
However, this is hard to parse by different installers, so in practice we see
multiple occurrences of specifying just the IP address without the mask.
Would it be possible to split the spec format in 'address: 10.20.0.2' +
'netmask: 24' (or 'netmask: 255.255.255.0')?
If not, we should agree on the format going forward, and update all current
occurrences that don't respect it.
Note that this also applies to IP address in 'remote_management'.
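
If we keep the IP/mask format, splitting it on the consumer side is cheap, e.g.
as a custom filter. A minimal Python sketch using only the standard 'ipaddress'
module; split_address and the returned key names are hypothetical, not an
existing generate_config.py filter:

```python
import ipaddress


def split_address(value):
    """Split 'IP/mask' into address, prefix length and dotted netmask.

    A bare IP (no '/') is reported with prefix/netmask set to None, so that
    PDFs which dropped the mask can be detected instead of treated as /32.
    """
    text = str(value).strip()
    iface = ipaddress.ip_interface(text)  # raises ValueError on bad input
    if "/" not in text:
        return {"address": str(iface.ip), "prefix": None, "netmask": None}
    return {
        "address": str(iface.ip),
        "prefix": iface.network.prefixlen,
        "netmask": str(iface.netmask),
    }
```

For '10.20.0.2/24' this yields address '10.20.0.2', prefix 24 and netmask
'255.255.255.0'; '10.20.0.2/255.255.255.0' is accepted as well.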
7. 'os' in POD nodes
Imo, this parameter does not belong in the PDF at all - it makes sense for the
jump server, since that is preinstalled, but not for the other nodes.
I think it should be removed, especially since some PDFs only define it for the
first node and not for the rest ...
8. IPMI interface MAC on the 'interfaces' list
Should the IPMI interface be present in the 'interfaces' list? I think I saw
this in 1 or 2 PDFs, so I thought I should ask here first, before removing it.
D. Installer adapter issues
1. Most installer adapter templates generate invalid YAML files (only JOID and
the proposed Fuel pass this test, and not for all PODs);
2. Some installer adapter templates rely on the global 'remote_params', which
is not mandatory, plus each node might define its own parameters;
e.g.:
# wrong: relies on the non-mandatory 'remote_params' (which had a different
# name in pharos/config/pod1.yaml for a start), does not allow per-node creds
ipmi_user: {{ conf['jumphost']['remote_params']['user'] }}
# right
ipmi_user: {{ conf['nodes'][0]['remote_management']['user'] }}
etc.
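
A check along these lines could flag nodes that lack per-node credentials
before a template ever touches them. This is only a sketch in Python: it
assumes the parsed PDF is a dict with a 'nodes' list, that each node has a
'name', and that the password key is called 'pass' (only 'user' appears in
the example above); missing_ipmi_credentials is a hypothetical helper.

```python
REQUIRED_KEYS = ("user", "pass")  # 'pass' is an assumed key name


def missing_ipmi_credentials(conf):
    """List (node name, key) pairs for nodes lacking per-node IPMI creds."""
    missing = []
    for index, node in enumerate(conf.get("nodes", [])):
        mgmt = node.get("remote_management") or {}
        name = node.get("name", "node[%d]" % index)
        for key in REQUIRED_KEYS:
            if not mgmt.get(key):  # absent, None or empty all count as missing
                missing.append((name, key))
    return missing
```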
There are multiple minor design issues in the current installer adapters, and
I'm sure each installer team knows about how fragile the current templates are.
I saw some comments in the templates as well, so I'm sure I don't have to
re-iterate that here.
The above is not an exhaustive list, it merely covers the questions I gathered
while working with PDF + Fuel.
I didn't proofread this, sorry for any mistakes that slipped through.
See you at today's Infra meeting.
BR,
Alex
[1] https://gerrit.opnfv.org/gerrit/#/c/42759/ (Add fuel installer adapter)
[2] https://gerrit.opnfv.org/gerrit/#/c/42875/ (lf-pod2: Pod Descriptor File)
[3] https://gerrit.opnfv.org/gerrit/#/c/42343/ (securedlab: Use Pharos git sub
for PDF validation)
[4] https://gerrit.opnfv.org/gerrit/#/c/42729/ (PDF: Fix yamllint warnings &
fold reusable code)
[5] https://gerrit.opnfv.org/gerrit/#/c/42599/ (PDF: Add result summary to
check-jinja2)
[6] https://gerrit.opnfv.org/gerrit/#/c/42711/ (PDF: Run YAML Linter on pod
descriptors / output)
[7] https://gerrit.opnfv.org/gerrit/#/c/42881/ (lf-pod4: Add missing IPMI
user/pass)
[8] https://gerrit.opnfv.org/gerrit/#/c/42805/ (cleanup: fuel: Remove obsolete
reap, dea, dha)
[9] https://gerrit.opnfv.org/gerrit/#/c/42893/ (PoC: net_config refactor)