On Tue, May 13, 2025 at 3:12 PM Ian Rogers <irog...@google.com> wrote:
>
> On Tue, May 13, 2025 at 2:13 PM Arnaldo Carvalho de Melo
> <a...@kernel.org> wrote:
> >
> > On Fri, May 02, 2025 at 01:14:32PM +0530, Mukesh Kumar Chaurasiya wrote:
> > > On Fri, Apr 25, 2025 at 02:46:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > Maybe that max() call in perf_cpu_map__intersect() somehow makes the
> > > > compiler happy.
> >
> > > > And in perf_cpu_map__alloc() all calls seems to validate it.
> >
> > > > Like:
> >
> > > > +++ b/tools/lib/perf/cpumap.c
> > > > @@ -411,7 +411,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, 
> > > > struct perf_cpu_map *other)
> > > >         }
> > > >
> > > >         tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other);
> > > > -       tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> > > > +       tmp_cpus = calloc(tmp_len, sizeof(struct perf_cpu));
> > > >         if (!tmp_cpus)
> > > >                 return -ENOMEM;
> >
> > > > ⬢ [acme@toolbx perf-tools-next]$
> >
> > > > And better, do the max size that the compiler is trying to help us
> > > > catch?
> >
> > > Isn't it better to use perf_cpu_map__nr. That should fix this problem.
> >
> > Maybe, have you tried it?
> >
> > > One question I have, in perf_cpu_map__nr, the function is returning
> > > 1 in case *cpus is NULL. Is it ok to do that? wouldn't it cause problems?
> >
> > Indeed this better be documented, as by just looking at:
> >
> > int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
> > {
> >         return cpus ? __perf_cpu_map__nr(cpus) : 1;
> > }
> >
> > It really doesn't make much sense to say that a NULL map has one entry.
> >
> > But the next functions are:
> >
> > bool perf_cpu_map__has_any_cpu_or_is_empty(const struct perf_cpu_map *map)
> > {
> >         return map ? __perf_cpu_map__cpu(map, 0).cpu == -1 : true;
> > }
> >
> > bool perf_cpu_map__is_any_cpu_or_is_empty(const struct perf_cpu_map *map)
> > {
> >         if (!map)
> >                 return true;
> >
> >         return __perf_cpu_map__nr(map) == 1 && __perf_cpu_map__cpu(map, 
> > 0).cpu == -1;
> > }
> >
> > bool perf_cpu_map__is_empty(const struct perf_cpu_map *map)
> > {
> >         return map == NULL;
> > }
> >
> > So it seems that a NULL cpu map means "any/all CPU) and a map with just
> > one entry would have as its content "-1" that would mean "any/all CPU".
> >
> > Ian did work on trying to simplify/clarify this, so maybe he can chime
> > in :-)
>
> So I've tried to improve the naming but not vary the implementation
> greatly - initially I was in the code fixing reference count checking
> issues. There is an important distinction between "all" meaning a
> range of CPUs like 0-15 on a 16 core/hyperthread system, and "any"
> meaning the special "-1" value. It is possible to have a perf_cpu_map
> to both be "all" and "any", iterating an empty perf_cpu_map has
> strangely also meant the "any" and so the code isn't specific but
> relies on these odd properties.

I just remembered, there was also a discussion with Adrian IIRC. I
wanted "all" to mean all possible CPUs on a system, and "online" to be
either the same or a subset of that, in case CPUs were offline. There
was a different view that the distinction between "all" and "online"
wasn't useful. So there is some confusion in the code in this regard
and some errors may exist when CPU cores are offline. There's
definitely work to make the API better.

Thanks,
Ian

> Anyway, I'm not sure on the implication of this with
> malloc/calloc/unsigned... It would seem reasonable to me for
> __perf_cpu_map__nr to return an unsigned number and to propagate that
> to fix the new GCC issue.
>
> Thanks,
> Ian
>
>
> > - Arnaldo

Reply via email to