Nish and Lucas,

I and a few members of my team have a lot of experience with HW attrib style test scheduling, so please feel free to bring me and Chris Nelson into future detailed discussions on the scheduling part of this proposal. We are both still pretty new to Linux, so you probably know better than we do how to get the attrib info, but we may have some ideas in the space of adding tables and implementing queries to schedule jobs this way.
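To give a feel for the kind of tables and queries I mean (everything below is hypothetical; there is no host_attribute table in autotest today, this is just the shape our old system used), matching a job's HW constraints can boil down to a handful of simple lookups against pre-normalized values:

    # Hypothetical sketch: match jobs to hosts via a host_attribute table
    # with (host_id, name, value) rows, values pre-normalized to standard
    # units. Table and column names are invented for illustration; conn is
    # any DB-API connection.

    def find_matching_hosts(conn, constraints):
        """constraints: list of (attr_name, operator, value) tuples,
        e.g. [('memory_mb', '>=', 3030), ('cpu_cores', '>=', 8)]."""
        allowed_ops = ('=', '>', '<', '>=', '<=')
        hosts = None
        for name, op, value in constraints:
            if op not in allowed_ops:
                raise ValueError('unsupported operator: %r' % op)
            rows = conn.execute(
                'SELECT host_id FROM host_attribute '
                'WHERE name = ? AND CAST(value AS REAL) ' + op + ' ?',
                (name, value)).fetchall()
            matched = set(row[0] for row in rows)
            hosts = matched if hosts is None else hosts & matched
        return hosts or set()

The real thing would obviously go through autotest's existing models rather than raw SQL; the point is just that each constraint stays a cheap, indexable comparison, which is what keeps the scheduler fast.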
Just FYI, we called our prior solution constraints based scheduling (as in HW and SW constraints being options) and had support for constraints being both hard and soft. The idea was that a hard constraint was similar to the autotest "only if needed" labels in that if a job didn't request a hard constraint explicitly, it wouldn't be considered to match the host. I think I like your terminology ("only if needed") better, and think it could be applied to HW attribute scheduling just as well as to label based scheduling. That said, this would need to be a per-host & per-attribute setting, not applied to every host that has the attribute (as opposed to how "only if needed" is currently a label-specific attribute that applies to every host that uses that label). Does that make sense? See below for more comments.

-Dan DeFolo

> -----Original Message-----
> From: autotest-boun...@test.kernel.org [mailto:autotest-
> boun...@test.kernel.org] On Behalf Of Nishanth Aravamudan
> Sent: Wednesday, May 16, 2012 2:51 PM
> To: l...@redhat.com
> Cc: autotest@test.kernel.org
> Subject: [Autotest] [RFC] adding hardware inventory to autotest
>
> Hi everyone,
>
> Lucas and I were discussing a new feature that I think would be very useful
> for Autotest: hardware inventory.
>
> With the upcoming/ongoing feature to add tighter Cobbler support to
> Autotest I, at least, have a need to know what kinds of machines are
> available to run jobs on. I want to know things like: CPU information, amount
> of memory, PCI devices, etc.
>
> - My initial proposal is to use lshw to acquire information from
>   running systems.
> - An alternative would be smolt. I've found some issues with the
>   distribution versions of smolt on ppc64, at least.
> - In theory, we could have a pluggable infrastructure, much like the
>   install_server support itself, and the administrator could specify
>   how to obtain the information.

That makes sense to me. One thing to consider is subdividing the inventory commands by category in a way that makes sense to the average user, and just returning a dictionary of attributes. For example, if the inventory interface command were called hwprobe, I could imagine calling something like this to get CPU info:

    hwprobe -t cpu
    { 'cpu_model': 'Intel(R) Xeon(R) CPU X5550',
      'cpu_speed': '2.27GHz',
      'cpu_cores': 16,
      .... }

Admins could add new categories of things to probe, as well as control where the information for each category of probe should come from. Also, there should be some standard default value to report (e.g. Unknown, "", None, ??, etc) when the architecture-specific probing logic isn't able to come up with a value for an expected field.

Finally, there should be some normalization logic for the size and speed attributes so they are all stored using standard units. For example, convert all CPU speeds to GHz before storing, all memory sizes to either MB or GB (probably MB), all disk sizes to GB, etc. This is the one thing that perhaps shouldn't be completely up to admins. At most, before recording their values into the DB they should be normalized. I'll comment more on this later.
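As a strawman for that normalization step, here is a minimal sketch (in Python; the field names, unit choices and function names are mine and purely hypothetical, nothing like this exists in autotest today) of what could sit between the probe output and the DB write:

    # Hypothetical sketch: normalize probed values to standard units before
    # storing them (CPU speed -> GHz, memory/swap -> MB, disk size -> GB).
    import re

    _TO_MB = {'kb': 1.0 / 1024, 'mb': 1.0, 'gb': 1024.0, 'tb': 1024.0 * 1024}
    _TO_GHZ = {'mhz': 1.0 / 1000, 'ghz': 1.0}

    def _split(value):
        # '2.27GHz' -> (2.27, 'ghz'); '16 GB' -> (16.0, 'gb')
        m = re.match(r'\s*([\d.]+)\s*([a-zA-Z]*)\s*$', str(value))
        if not m:
            return None, None
        return float(m.group(1)), m.group(2).lower()

    def normalize(field, value):
        """Return the value in the standard unit, or None for 'Unknown'."""
        number, unit = _split(value)
        if number is None:
            return None
        if field == 'cpu_speed':                       # store as GHz
            return number * _TO_GHZ.get(unit, 1.0)
        if field in ('memory', 'swap'):                # store as MB
            return number * _TO_MB.get(unit, 1.0)
        if field == 'disk_size':                       # store as GB
            return number * _TO_MB.get(unit, 1.0) / 1024
        return number

    # e.g. normalize('cpu_speed', '2.27GHz') -> 2.27
    #      normalize('memory', '16 GB')      -> 16384.0

The point being that whatever the per-architecture probes report, only one unit per field ever reaches the DB, so the scheduling query stays a plain numeric comparison.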
> - Adding hardware information immediately requires us to extend the
>   information stored in a Machine object
> - My current list includes (sources in parentheses):
>   Machine model (dmidecode, lsvpd)
>   CPU version (/proc/cpuinfo)
>   CPU speed (/proc/cpuinfo)
>   Memory installed (/sys/devices/system/memory or /proc/meminfo)
>   Swap size (/proc/meminfo, /proc/swaps?)
>   Serial number (dmidecode, lsvpd)
>   CPU topology (# of cores, # of hardware threads; sysfs, /proc/cpuinfo)
>   PCI devices (lspci)
>   KVM capable (/proc/cpuinfo)
>   CPU flags (/proc/cpuinfo)
>   Timestamp of last inventory
>   Version of last inventory tool used
>   BIOS/system firmware level

I think all of the above sounds like a great start for the core HW info to gather. I would also add the following to the list of things to gather (I'm still trying to map the HP-UX way of doing this to Linux, so if anything below sounds a bit off, bring me back in line!):

* basic disk device attribs (type, size, device, pci/hw path) mapped to a unique ID that won't change with each OS install (lsscsi, /proc/scsi)
* SAN disk attribs (WWID, transport) (lsscsi -t, ??)
* admin attribs for disks - is the disk available for anyone to use (scratch) vs reserved with no specific purpose (don't touch) vs reserved for a specific purpose (swap, dump, reserved for VM disk images, etc)
* CPU hyperthreading - on/off (??)

> - Reading through this list, I imagine the implementation becomes
>   an InventoryInterface, with per-architecture implementations of
>   how to actually obtain the data.
> - Additionally, one can imagine end-users may have site-specific,
>   internal or controlled extra data they would like to be searchable
>   & storable per-machine. I think it makes sense to have an
>   additional, admin-interface defined table to store a list of labels
>   and how to get the data that should be stored with that label. We
>   would then store a hash of the obtained information in the
>   inventory job.

This hash would have all the HW data, right? Not just the stuff the admin added (so default + admin added)? I think there is value in doing the inventory for each job and actually storing that information with the job, regardless of whether the job was scheduled through classic mechanisms (hostname or label matching) or HW attrib comparative scheduling (discussed below). Saving the HW info in the job is important, as all we can confirm right now is which host the job ran on. If the host is changing over time (having memory, IO cards, storage, etc. added/removed), having that data stored and available in queryable form would be valuable. I recognize much of this info is perhaps available in the job sysinfo files, but if DB tables are going to be added to make these attribs available for scheduling, I would vote to make them available as job attribs as well. I think this is what you were after with "store a hash of the obtained information in the inventory job", but I wasn't clear if there would be an inventory job linked to each test job.

NOTE: one way to handle this would be to revision host entries in the inventory part of the DB, so that host + inventory_revision would be a unique snapshot of these attribs and that would be all you need to store in each test job. A sketch of how that could fit with the per-architecture InventoryInterface is below.

> - Have a contrib script appropriate for
>   /var/lib/cobbler/triggers/add/system/ to automatically run an
>   inventory job on system add.

Would we additionally want to trigger inventory calls each time a job runs on a host (during the VERIFY state)? It seems like you are proposing this in the next bullet below, but I didn't see it explicitly listed.
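Here is that sketch, for what it's worth. All class and function names below are made up for illustration (none of this exists in autotest), and the revision handling is deliberately naive:

    # Hypothetical sketch: per-architecture probing behind an
    # InventoryInterface, plus a revisioned host snapshot. Illustrative only.
    import platform
    import time

    class InventoryInterface(object):
        """Base class; per-architecture subclasses do the real probing."""
        def probe(self, category):
            raise NotImplementedError

    class X86Inventory(InventoryInterface):
        def probe(self, category):
            if category == 'cpu':
                # real code would parse /proc/cpuinfo, dmidecode, ...
                return {'cpu_model': 'Unknown', 'cpu_cores': None}
            return {}

    class Ppc64Inventory(InventoryInterface):
        def probe(self, category):
            if category == 'cpu':
                # real code would parse /proc/cpuinfo, lsvpd, ...
                return {'cpu_model': 'Unknown', 'cpu_cores': None}
            return {}

    _IMPLEMENTATIONS = {'x86_64': X86Inventory, 'ppc64': Ppc64Inventory}

    def take_snapshot(hostname, categories=('cpu', 'memory', 'disk')):
        """Probe every category and return a revisioned snapshot.

        A real version would only bump inventory_revision when the
        attributes actually change; each test job then only needs to
        record (hostname, inventory_revision) to pin down exactly what
        HW it ran on."""
        impl = _IMPLEMENTATIONS.get(platform.machine(), X86Inventory)()
        attrs = {}
        for category in categories:
            attrs.update(impl.probe(category))
        return {'hostname': hostname,
                'inventory_revision': int(time.time()),  # placeholder
                'attributes': attrs}

Running take_snapshot() from the cobbler add trigger and again during VERIFY would keep the revisions current without needing a separate sweep of the whole pool.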
> - Extend the create job UI to allow searching by hardware information.
>   This requires seeing what once satisfied the request, re-running
>   inventory on that host, and if the required data is still present,
>   running the job.

In general it sounds like your process is in alignment with what I was thinking of, but having implemented a similar HW attribute scheduling system before, I know this is getting into a potentially complicated and performance-sensitive part of the discussion. You need scheduling to be fast to keep the queueing efficient and to make reports I may someday want to add (estimated completion time reports using scheduling simulations + historic test duration data) possible. Further, I want to make sure that the HW attribute comparison is actually not finalized until the end of the process you describe above. For example, if after the first evaluation the scheduler picks host 1 and host 1 no longer matches after the re-inventory, the scheduler should just treat that like any other failed host verify (repeat the comparison, perhaps come up with host 3, and then try that).

In terms of implementation comments...

The first issue is units. If a user asks for memory > 3030MB and you are storing your inventory info for memory as a string using a mish-mash of units (perhaps because different architectures report different units), then it becomes very hard to translate the requested criteria for the job into a basic DB query. You first have to get memory for all machines, then convert each value from its own units, then do the comparison. As per the above, I suggest avoiding this by standardizing the units for the inventory interface (so you know what unit is in the DB for every field with a numeric value) and then converting from the user's requested unit to the DB-native unit before you do the scheduling query.

The second issue you have covered above in terms of needing to do a re-inventory. Since we can't re-inventory the entire HW pool each time a new job request comes in, you are right that doing a first-pass guess and then re-checking in the verify stage is the right thing to do. If we do the inventory with every job, or at least after every re-install, it should be pretty accurate and there will be very few cases of that first guess not being a good one.

In this model of delayed HW attribute scheduling, the hw_attribute_scheduler module (think of the metahost scheduler module) would need logic to handle at least 4 scenarios during each scheduler tick (or, if performance is slow, every X ticks); see the sketch after this list:

1) queued - the job still matches a host in a normal state. If so, just wait until one of those hosts is Ready.
2) ready - a matching host becomes "Ready"; queue the job up to run just like a normal job (assuming the re-inventory will put the job back in queued if it no longer matches).
3) starved - the job doesn't match anything but failed hosts (which shouldn't be touched by HW attrib scheduled jobs, in my mind) or reserved hosts (hopefully coming soon as per issue #360). In this case it might be a long time before the job starts, so an email to admins or the user might be in order.
4) orphaned - the job doesn't match any hosts at all. Probably handle the same as starved, but just call out the difference (starved may eventually run; orphaned is unlikely to ever run w/o intervention).

NOTE: Starved and orphaned jobs don't necessarily require immediate action in the scheduler (e.g. aborting the job), as depending on the use case and the attribute the job is starved on, it might be something an admin is staffed to keep an eye on and fix (perhaps swapping IO cards, configuring storage, etc). It might also just clear up after the next inventory refresh, as perhaps a system was temporarily modified for a special job and will be put back to its normal state at the end of it. That said, the job status should clearly show starved and orphaned jobs with a new status value (not just Queued) and somehow bring attention to the fact that there are jobs waiting on HW attribs X, Y, Z.
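As a very rough sketch of that per-tick logic (class and method names here are invented, and host_db.find_matching_hosts() is the same kind of constraint query sketched near the top of this mail; none of it is existing autotest code):

    # Rough sketch only: per-tick handling of HW-attribute-scheduled jobs.
    # HWAttributeScheduler, host_db, notifier and the job/host objects are
    # all invented collaborators used purely for illustration.

    QUEUED, STARVED, ORPHANED = 'Queued', 'Starved', 'Orphaned'

    class HWAttributeScheduler(object):
        def __init__(self, host_db, notifier):
            self.host_db = host_db      # wraps the inventory tables
            self.notifier = notifier    # emails admins / job owners

        def tick(self, pending_jobs):
            for job in pending_jobs:
                matches = self.host_db.find_matching_hosts(job.constraints)
                usable = [h for h in matches
                          if h.status not in ('Failed', 'Reserved')]
                if not matches:
                    # 4) orphaned: no host matches at all
                    job.set_status(ORPHANED)
                    self.notifier.warn(job, 'no host matches these HW attribs')
                elif not usable:
                    # 3) starved: only failed or reserved hosts match
                    job.set_status(STARVED)
                    self.notifier.warn(job, 'only failed/reserved hosts match')
                else:
                    ready = [h for h in usable if h.status == 'Ready']
                    if ready:
                        # 2) ready: hand off to the normal dispatch path; the
                        #    verify-stage re-inventory re-queues on a mismatch
                        job.assign_host(ready[0])
                    else:
                        # 1) queued: matches exist, wait for one to be Ready
                        job.set_status(QUEUED)

The key point is that assign_host() is still provisional: the verify-stage re-inventory is what finalizes the match, and a mismatch just drops the job back into this loop, exactly like any other failed verify.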
> This becomes necessary in particular for the case of
> PCI devices being removed, but also could come about when hostnames
> get recycled (for instance) or hardware is upgraded.

Also, it is necessary for unplanned things like memory dying, disks dying, etc. I'm not proposing a sanity check for HW yet (e.g. during the equivalent of the verify stage, my old automation would avoid using hosts whose HW attributes had unexpectedly changed and ask an admin to review the host), as that can create a lot of overhead depending on how many hosts there are and how volatile your HW is, but there need to be allowances for HW attribs changing frequently.

> Lucas & I thought the discussion from here was better-served being on the
> mailing-list to allow everyone interested to chime in.
>
> Thanks,
> Nish
>
> --
> Nishanth Aravamudan <n...@us.ibm.com>
> IBM Linux Technology Center
>
> _______________________________________________
> Autotest mailing list
> Autotest@test.kernel.org
> http://test.kernel.org/cgi-bin/mailman/listinfo/autotest

_______________________________________________
Autotest mailing list
Autotest@test.kernel.org
http://test.kernel.org/cgi-bin/mailman/listinfo/autotest