Hi Daniel, Paolo,
Here my last questions before wrapping up and send v4, or maybe call off my attempt to add RAPL interface in QEMU. Daniel P. Berrangé, Jan 30, 2024 at 10:39: > > + rcu_register_thread(); > > + > > + /* Get QEMU PID*/ > > + pid = getpid(); > > + > > + /* Nb of CPUS per packages */ > > + maxcpus = vmsr_get_maxcpus(0); > > + > > + /* Nb of Physical Packages on the system */ > > + maxpkgs = vmsr_get_max_physical_package(maxcpus); > > This function can fail so this needs to be checked & reported. > > > + > > + /* Those MSR values should not change as well */ > > + vmsr->msr_unit = vmsr_read_msr(MSR_RAPL_POWER_UNIT, 0, pid, > > + s->msr_energy.socket_path); > > + vmsr->msr_limit = vmsr_read_msr(MSR_PKG_POWER_LIMIT, 0, pid, > > + s->msr_energy.socket_path); > > + vmsr->msr_info = vmsr_read_msr(MSR_PKG_POWER_INFO, 0, pid, > > + s->msr_energy.socket_path); > > This function can fail for a variety of reasons, most especially if someone > gave an incorrect socket path, or if the daemon is not running. This is not > getting diagnosed, and even if we try to report it here, we're in a background > thread at this point. > > I think we need to connect and report errors before even starting this > thread, so that QEMU startup gets aborted upon configuration error. > Fair enough. Would it be ok to do the sanity check before rcu_register_thread() and "return NULL;" in case of error or would you prefer me to check all of this before even calling the qemu_thread_create() ? > > + /* Populate all the thread stats */ > > + for (int i = 0; i < num_threads; i++) { > > + thd_stat[i].utime = g_new0(unsigned long long, 2); > > + thd_stat[i].stime = g_new0(unsigned long long, 2); > > + thd_stat[i].thread_id = thread_ids[i]; > > + vmsr_read_thread_stat(&thd_stat[i], pid, 0); > > It is non-obvious that the 3rd parameter here is an index into > the utime & stime array. This function would be saner to review > if called as: > > vmsr_read_thread_stat(pid, > thd_stat[i].thread_id, > &thd_stat[i].utime[0], > &thd_stat[i].stime[0], > &thd_stat[i].cpu_id); > > so we see what are input parameters and what are output parameters. > > Also this method can fail, eg if the thread has exited already, > so we need to take that into account and stop trying to get info > for that thread in later code. eg by setting 'thread_id' to 0 > and then skipping any thread_id == 0 later. > > Good point. I'll rework the function and return "thread_id" to 0 in case of failure in order to test it later on. > > + thd_stat[i].numa_node_id = > > numa_node_of_cpu(thd_stat[i].cpu_id); > > + } > > + > > + /* Retrieve all packages power plane energy counter */ > > + for (int i = 0; i <= maxpkgs; i++) { > > + for (int j = 0; j < num_threads; j++) { > > + /* > > + * Use the first thread we found that ran on the CPU > > + * of the package to read the packages energy counter > > + */ > > + if (thd_stat[j].numa_node_id == i) { > > 'i' is a CPU ID value, while 'numa_node_id' is a NUMA node ID value. > I don't think it is semantically valid to compare them for equality. > > I'm not sure the NUMA node is even relevant, since IIUC from the docs > earlier, the power values are scoped per package, which would mean per > CPU socket. > 'i' here is the package number on the host. I'm using functions of libnuma to populate the maxpkgs of the host. I tested this on different Intel CPU with multiple packages and this has always returned the good number of packages. A false positive ? So here I'm checking if the thread has run on the package number 'i'. I populate 'numa_node_id' with numa_node_of_cpu(). I did not wanted to reinvent the wheel and the only lib that was talking about "node" was libnuma. Maybe I'm wrong assuming that a "node" (defined as an area where all memory has the same speed as seen from a particular CPU) could lead me to the packages number ? And this is what I see you wrote below: "A numa node isn't a package AFAICT." Regards, Anthony