libperf: avoid moving of fds at fdarray__filter() call

Alexey Budankov Mon, 29 Jun 2020 11:57:57 -0700


On 26.06.2020 13:06, Alexey Budankov wrote:
> 
> On 26.06.2020 12:37, Jiri Olsa wrote:
>> On Thu, Jun 25, 2020 at 10:32:29PM +0300, Alexey Budankov wrote:
>>>
>>> On 25.06.2020 20:14, Jiri Olsa wrote:
>>>> On Wed, Jun 24, 2020 at 08:19:32PM +0300, Alexey Budankov wrote:
>>>>>
>>>>> On 17.06.2020 11:35, Alexey Budankov wrote:
>>>>>>
>>>>>> Skip fds with zeroed revents field from count and avoid fds moving
>>>>>> at fdarray__filter() call so fds indices returned by fdarray__add()
>>>>>> call stay the same and can be used for direct access and processing
>>>>>> of fd revents status field at entries array of struct fdarray object.
>>>>>>
>>>>>> Signed-off-by: Alexey Budankov <[email protected]>
>>>>>> ---
>>>>>>  tools/lib/api/fd/array.c   | 11 +++++------
>>>>>>  tools/perf/tests/fdarray.c | 20 ++------------------
>>>>>>  2 files changed, 7 insertions(+), 24 deletions(-)
>>>>>>
>>>>>> diff --git a/tools/lib/api/fd/array.c b/tools/lib/api/fd/array.c
>>>>>> index 58d44d5eee31..97843a837370 100644
>>>>>> --- a/tools/lib/api/fd/array.c
>>>>>> +++ b/tools/lib/api/fd/array.c
>>>>>> @@ -93,22 +93,21 @@ int fdarray__filter(struct fdarray *fda, short revents,
>>>>>>                  return 0;
>>>>>>  
>>>>>>          for (fd = 0; fd < fda->nr; ++fd) {
>>>>>> +                if (!fda->entries[fd].revents)
>>>>>> +                        continue;
>>>>>> +
>>>>>
>>>>> So it looks like this condition also filters out event fds that have not
>>>>> signaled, not only control and other fds. This should be avoided so that
>>>>> such event-related fds are still counted. Several options have been
>>>>> proposed so far:
>>>>>
>>>>> 1) Explicit typing of fds via an API extension, with filtering based on
>>>>>    the types:
>>>>>    a) with a separate fdarray__add_stat() call
>>>>>    b) with a type arg added to the existing fdarray__add() call
>>>>>    c) various memory management designs are possible
>>>>>
>>>>> 2) Playing tricks with fd positions inside entries and relying on
>>>>>    assumptions about the ordering of fdarray API calls
>>>>>    - looks more like a hack than a designed solution
>>>>>
>>>>> 3) Rewrite of the fdarray class to allocate a separate object for every
>>>>>    added fd
>>>>>    - can be replaced by not moving fds in __filter()
>>>>>
>>>>> 4) Distinguishing between fd types at fdarray__filter() using the
>>>>>    .revents == 0 condition
>>>>>    - seems to have corner cases and thus is not applicable
>>>>>
>>>>> 5) Extension of fdarray__poll(, *arg_ptr, arg_size) with an arg holding an
>>>>>    array of fds, to atomically poll on the fdarray__add()-ed fds and the
>>>>>    external arg fds, followed by processing of the external arg fds
>>>>>
>>>>> 6) Rewrite of the fdarray class on top of epoll()
>>>>>    - introduces new scalability restrictions for the Perf tool
>>>>
>>>> hum, how many fds for polling do you expect in your workloads?
>>>
>>> Currently it is several hundred, so the default limit of 1K is easily hit,
>>> and the "Profile a Large Number of PMU Events on Multi-Core Systems"
>>> section [1] recommends:
>>>
>>> soft nofile 65535
>>> hard nofile 65535
>>
>> I'm confused, are you talking about the file descriptor limit now?
>> This won't be affected by the epoll change. What am I missing?
> 
> Currently there is already the ulimit -n limit on the number of open file
> descriptors, and the Perf tool process is affected by that limit.
> 
> Moving to epoll() will impose one more limit, max_user_watches, and that can
> additionally confine Perf applicability, even though the default value on
> some machines is currently high enough.


Prior to making v9 I would prefer to agree on a design to be implemented, in
order to avoid guessing and redundant iterations.

The options that I see as well balanced are 1) or 5), plus not moving fds in
fdarray__filter(), to fix the staleness of pos (= fdarray__add() return value).
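
To make option 1b) combined with a non-moving filter more concrete, here is a
minimal sketch. It is not the libapi code: the toy_ names, the fixed-size
array and the type enum are all assumptions made for illustration. Instead of
compacting the array, the filter retires a matching event fd by setting its
fd to -1, which poll() ignores, so positions never shift.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the libapi fdarray. */
enum toy_fd_type { TOY_FD_EVENT, TOY_FD_CTL };

#define TOY_MAX_FDS 16

struct toy_fdarray {
	int nr;
	struct pollfd entries[TOY_MAX_FDS];
	enum toy_fd_type type[TOY_MAX_FDS];
};

/* Add a typed fd; returns its position, or -1 if the array is full. */
static int toy_fdarray_add(struct toy_fdarray *fda, int fd, short events,
			   enum toy_fd_type type)
{
	int pos;

	assert(fda != NULL);
	pos = fda->nr;
	if (pos >= TOY_MAX_FDS)
		return -1;

	fda->entries[pos].fd = fd;
	fda->entries[pos].events = events;
	fda->entries[pos].revents = 0;
	fda->type[pos] = type;
	fda->nr++;
	return pos;
}

/*
 * Retire event fds whose revents match the mask, without moving any entry.
 * Control fds are never retired or counted. Returns the number of event
 * fds that are still live.
 */
static int toy_fdarray_filter(struct toy_fdarray *fda, short revents)
{
	int pos, nr = 0;

	for (pos = 0; pos < fda->nr; ++pos) {
		if (fda->type[pos] != TOY_FD_EVENT)
			continue;
		if (fda->entries[pos].fd < 0)
			continue; /* already retired earlier */
		if (fda->entries[pos].revents & revents) {
			/* poll() ignores entries with a negative fd */
			fda->entries[pos].fd = -1;
			fda->entries[pos].events = 0;
			continue;
		}
		++nr;
	}
	return nr;
}
```

With this shape a position returned by toy_fdarray_add() stays valid across
filter calls, so entries[pos].revents of a control fd can be checked directly
after poll().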

Are there any thoughts so far?

~Aleksei

Re: [PATCH v8 01/13] tools/libperf: avoid moving of fds at fdarray__filter() call
