On Wed, Apr 30, 2025 at 3:47 AM Pingfan Liu <pi...@redhat.com> wrote:
>
> On Wed, Apr 30, 2025 at 8:04 AM Alexei Starovoitov
> <alexei.starovoi...@gmail.com> wrote:
> >
> > On Mon, Apr 28, 2025 at 9:13 PM Pingfan Liu <pi...@redhat.com> wrote:
> > > +__bpf_kfunc struct mem_range_result *bpf_kexec_decompress(char *image_gz_payload, int image_gz_sz,
> > > +			unsigned int expected_decompressed_sz)
> > > +{
> > > +	decompress_fn decompressor;
> > > +	//todo, use flush to cap the memory size used by decompression
> > > +	long (*flush)(void*, unsigned long) = NULL;
> > > +	struct mem_range_result *range;
> > > +	const char *name;
> > > +	void *output_buf;
> > > +	char *input_buf;
> > > +	int ret;
> > > +
> > > +	range = kmalloc(sizeof(struct mem_range_result), GFP_KERNEL);
> > > +	if (!range) {
> > > +		pr_err("fail to allocate mem_range_result\n");
> > > +		return NULL;
> > > +	}
> > > +	refcount_set(&range->usage, 1);
> > > +
> > > +	input_buf = vmalloc(image_gz_sz);
> > > +	if (!input_buf) {
> > > +		pr_err("fail to allocate input buffer\n");
> > > +		kfree(range);
> > > +		return NULL;
> > > +	}
> > > +
> > > +	ret = copy_from_kernel_nofault(input_buf, image_gz_payload, image_gz_sz);
> > > +	if (ret < 0) {
> > > +		pr_err("Error when copying from 0x%px, size:0x%x\n",
> > > +				image_gz_payload, image_gz_sz);
> > > +		kfree(range);
> > > +		vfree(input_buf);
> > > +		return NULL;
> > > +	}
> > > +
> > > +	output_buf = vmalloc(expected_decompressed_sz);
> > > +	if (!output_buf) {
> > > +		pr_err("fail to allocate output buffer\n");
> > > +		kfree(range);
> > > +		vfree(input_buf);
> > > +		return NULL;
> > > +	}
> > > +
> > > +	decompressor = decompress_method(input_buf, image_gz_sz, &name);
> > > +	if (!decompressor) {
> > > +		pr_err("Can not find decompress method\n");
> > > +		kfree(range);
> > > +		vfree(input_buf);
> > > +		vfree(output_buf);
> > > +		return NULL;
> > > +	}
> > > +	//to do, use flush
> > > +	ret = decompressor(image_gz_payload, image_gz_sz, NULL, NULL,
> > > +				output_buf, NULL, NULL);
> > > +
> > > +	/* Update the range map */
> > > +	if (ret == 0) {
> > > +		range->buf = output_buf;
> > > +		range->size = expected_decompressed_sz;
> > > +		range->status = 0;
> > > +	} else {
> > > +		pr_err("Decompress error\n");
> > > +		vfree(output_buf);
> > > +		kfree(range);
> > > +		return NULL;
> > > +	}
> > > +	pr_info("%s, return range 0x%lx\n", __func__, range);
> > > +	return range;
> > > +}
> >
> > These kfuncs look like generic decompress routines.
> > They're not related to kexec and probably should be in kernel/bpf/helpers.c
> > or kernel/bpf/compression.c instead of kernel/kexec_pe_image.c.
> >
>
> Thanks for your suggestion. I originally considered using these kfuncs
> only in kexec context (Later, introducing a dedicated BPF_PROG_TYPE
> for kexec).

We do not add new prog types anymore.
They're frozen just like the list of helpers.

> They are placed under a lock so that a malice attack can
> not exhaust the memory through repeatedly calling to the decompress
> kfunc.

attack? This is all root only anyway and all memory is counted
towards memcg. Make sure to use GFP_KERNEL_ACCOUNT and
something similar to bpf_map_get_memcg.

> To generalize these kfunc, I think I can add some boundary control of
> the memory usage to prevent such attacks.

Don't reinvent the wheel. memcg is the mechanism.

> > They also must be KF_SLEEPABLE.
> > Please test your patches with all kernel debugs enabled.
> > Otherwise you would have seen all these "sleeping while atomic"
> > issues yourself.
> >
>
> See, I will have all these debug options for the V3 test.
>
> Appreciate your insight.
>
> Regards,
>
> Pingfan
>