On 5/29/26 15:14, Vladimir Riabchun wrote:
>
> On 5/29/26 14:20, Pavel Tikhomirov wrote:
>> Expose the per-VE BPF program load limit via two ve cgroup files:
>>
>> bpf_prog_max_nr - rw, writable only from ve0, restricts loads
>> bpf_prog_avail_nr - ro, remaining quota
>>
>> Writes adjust the avail counter by the delta so that already-loaded
>> programs are not retroactively rejected when the cap is lowered.
>>
>> https://virtuozzo.atlassian.net/browse/VSTOR-131947
>> Signed-off-by: Pavel Tikhomirov <[email protected]>
>> Feature: ve: allow BPF in Containers
>> ---
>> kernel/ve/ve.c | 39 +++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 39 insertions(+)
>>
>> diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
>> index 48da546117bb7..9c3be61a4366a 100644
>> --- a/kernel/ve/ve.c
>> +++ b/kernel/ve/ve.c
>> @@ -1315,6 +1315,35 @@ static s64 ve_netif_avail_nr_read(struct cgroup_subsys_state *css, struct cftype
>> return atomic_read(&css_to_ve(css)->netif_avail_nr);
>> }
>> +static u64 ve_bpf_prog_max_nr_read(struct cgroup_subsys_state *css, struct cftype *cft)
>> +{
>> + return css_to_ve(css)->bpf_prog_max_nr;
>
> Read is not protected by ve->op_sem, possible race.
We only serialize writers against each other, so that the bpf_prog_max_nr and
bpf_prog_avail_nr pair stays consistent. Reads are fine with eventual
consistency, which lets us avoid extra locking.
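
To illustrate the locking scheme, here is a minimal user-space sketch, not the
kernel code. The struct and the quota_* helper names are hypothetical, a
pthread mutex stands in for ve->op_sem, and max_nr is made atomic only so that
the unlocked read is well-defined in user-space C11.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the two per-VE fields touched by the patch. */
struct quota {
	pthread_mutex_t lock;	/* plays the role of ve->op_sem */
	atomic_int max_nr;	/* like bpf_prog_max_nr */
	atomic_int avail_nr;	/* like bpf_prog_avail_nr */
};

static void quota_init(struct quota *q, int max_nr)
{
	pthread_mutex_init(&q->lock, NULL);
	atomic_init(&q->max_nr, max_nr);
	atomic_init(&q->avail_nr, max_nr);
}

/* Writers hold the lock so max and avail always move by the same delta. */
static int quota_set_max(struct quota *q, uint64_t val)
{
	int delta;

	if (val > INT_MAX)
		return -EOVERFLOW;

	pthread_mutex_lock(&q->lock);
	delta = (int)val - atomic_load(&q->max_nr);
	atomic_store(&q->max_nr, (int)val);
	atomic_fetch_add(&q->avail_nr, delta);
	pthread_mutex_unlock(&q->lock);
	return 0;
}

/* Readers take no lock: each value is read atomically, but the pair
 * may be observed in the middle of an update. */
static int quota_read_max(struct quota *q)
{
	return atomic_load(&q->max_nr);
}

static int quota_read_avail(struct quota *q)
{
	return atomic_load(&q->avail_nr);
}
```

A reader that races with quota_set_max() can see the new max together with the
old avail for a moment, which is the eventual consistency meant above.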
>
>> +}
>> +
>> +static int ve_bpf_prog_max_nr_write(struct cgroup_subsys_state *css, struct cftype *cft, u64 val)
>> +{
>> + struct ve_struct *ve = css_to_ve(css);
>> + int delta;
>> +
>> + if (!ve_is_super(get_exec_env()))
>> + return -EPERM;
>> +
>> + if (val > INT_MAX)
>> + return -EOVERFLOW;
>> +
>> + down_write(&ve->op_sem);
>> + delta = val - ve->bpf_prog_max_nr;
>> + ve->bpf_prog_max_nr = val;
>> + atomic_add(delta, &ve->bpf_prog_avail_nr);
>
> We should check ve->bpf_prog_avail_nr + delta >= 0, otherwise we can have more
> programs than allowed.
That is OK: existing programs may exceed the new limit, since they are already
loaded. We only reject new loads until avail_nr becomes positive again.
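
A small user-space sketch of that behaviour follows. The load/unload
accounting is not part of this patch, so quota_charge(), quota_uncharge() and
quota_resize() are hypothetical helpers, and the decrement-if-positive charge
is an assumption about how the load path would consume the counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical load path: take one slot only while avail is positive. */
static bool quota_charge(atomic_int *avail)
{
	int old = atomic_load(avail);

	while (old > 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(avail, &old, old - 1))
			return true;
	}
	return false;
}

/* Hypothetical unload path: give the slot back. */
static void quota_uncharge(atomic_int *avail)
{
	atomic_fetch_add(avail, 1);
}

/* Same delta arithmetic as the write handler, without the locking. */
static void quota_resize(atomic_int *avail, int old_max, int new_max)
{
	atomic_fetch_add(avail, new_max - old_max);
}
```

With a cap of 10 and 8 programs loaded, lowering the cap to 5 leaves avail at
-3. The 8 loaded programs stay, further charges fail, and a charge succeeds
again only after 4 programs have been unloaded.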
>
>> + up_write(&ve->op_sem);
>> + return 0;
>> +}
>> +
>> +static s64 ve_bpf_prog_avail_nr_read(struct cgroup_subsys_state *css, struct cftype *cft)
>> +{
>> + return atomic_read(&css_to_ve(css)->bpf_prog_avail_nr);
>> +}
>> +
>> static int ve_os_release_read(struct seq_file *sf, void *v)
>> {
>> struct cgroup_subsys_state *css = seq_css(sf);
>> @@ -1786,6 +1815,16 @@ static struct cftype ve_cftypes[] = {
>> .name = "netif_avail_nr",
>> .read_s64 = ve_netif_avail_nr_read,
>> },
>> + {
>> + .name = "bpf_prog_max_nr",
>> + .flags = CFTYPE_NOT_ON_ROOT,
>> + .read_u64 = ve_bpf_prog_max_nr_read,
>> + .write_u64 = ve_bpf_prog_max_nr_write,
>> + },
>> + {
>> + .name = "bpf_prog_avail_nr",
>> + .read_s64 = ve_bpf_prog_avail_nr_read,
>
> Why signed value?
It can go negative if the limit is set below the number of programs already
loaded.
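
For illustration, this is what userspace would read for avail = -3 with a
signed versus an unsigned 64-bit readout. It is a user-space sketch: the
fmt_* helpers are hypothetical, and it assumes the cgroup core prints read_s64
values as signed decimals and read_u64 values as unsigned decimals.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format the counter the way a signed 64-bit readout would show it. */
static int fmt_as_s64(char *buf, size_t len, int avail)
{
	return snprintf(buf, len, "%" PRId64, (int64_t)avail);
}

/* Format the same counter as an unsigned 64-bit readout:
 * a negative value wraps to a huge number. */
static int fmt_as_u64(char *buf, size_t len, int avail)
{
	return snprintf(buf, len, "%" PRIu64, (uint64_t)(int64_t)avail);
}
```

For -3 the signed form gives "-3", while the unsigned form gives
"18446744073709551613", which would look like a nearly unlimited quota.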
>
>> + },
>> {
>> .name = "os_release",
>> .max_write_len = __NEW_UTS_LEN + 1,
>
> --
> Best regards, Riabchun Vladimir
> Linux Kernel Developer, Virtuozzo
>
--
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.