Re: Advice on modifying Lavapipe to isolate JIT compilation in separate process

2023-04-27 Thread Dave Airlie
On Thu, 27 Apr 2023 at 15:18, Josh Gargus  wrote:
>
> Thanks for your advice!  I hadn't looked at Venus, but that seems like a very 
> promising place to start.
>
> The other approach feels more approachable now too; it feels like there are
> fewer "unknown unknowns", although there are plenty of known unknowns to
> investigate (address independence was one that was already bugging me before
> I wrote to this list).

I think it shouldn't be too horrible to work out. Another option might
be to abuse the shader cache somehow, but I think that still needs
writable + executable memory, which probably doesn't help. Stuff should
be address independent, though, as I do write x86 asm programs to the
cache and read them back out, only relocating around the global symbols.

>
> It seems like Venus is the more straightforward approach, so I'm inclined to 
> just go with it.  However, it seems like there would be a performance hit 
> compared to only doing JIT compilation in a separate process.  Do you have a 
> rough sense of the performance hit of serializing everything over Venus?  The 
> answer will depend on the workload, I know.

Yeah, I think you'll definitely see a larger perf hit than just moving
compilation out to a separate process. I'm not sure of the raw Venus
overhead numbers here; someone else might have more information
available.

Dave.


Re: Advice on modifying Lavapipe to isolate JIT compilation in separate process

2023-04-27 Thread Jose Fonseca
Perhaps I'm getting confused with the terminology, but I don't think moving
compilation to a separate process helps here.  IIUC, compilation (as in LLVM IR
-> x86 code) can happen anywhere; the problem is loading the JITed code (i.e.,
making writable memory executable).

As mentioned, there are many places this can be done.

If Venus does not suit your needs, the easiest way to achieve this would be to:
- back all buffer/texture memory with shared memory, visible to the 2nd process
- modify gallivm to tell LLVM to either spit out .so files or to use shared memory
- modify gallivm_jit_function to return a wrapper that marshals the call into 
the 2nd process:
  - either a generic C wrapper which can introspect the LLVM IR Function 
arguments
  - or hand-written C wrappers for every function prototype returned by
gallivm_jit_function

- there are also a few places where JIT code refers to host process memory
addresses explicitly (e.g., util_format_xxx helper functions) which need to be
handled separately (e.g., by passing these addresses in a structure which can be
rewritten to match the 2nd process)

Jose


From: mesa-dev  on behalf of Dave 
Airlie 
Sent: Thursday, April 27, 2023 08:39
To: Josh Gargus 
Cc: mesa-dev@lists.freedesktop.org 
Subject: Re: Advice on modifying Lavapipe to isolate JIT compilation in 
separate process

!! External Email

On Thu, 27 Apr 2023 at 15:18, Josh Gargus  wrote:
>
> Thanks for your advice!  I hadn't looked at Venus, but that seems like a very 
> promising place to start.
>
> The other approach feels more approachable now too; it feels like there are
> fewer "unknown unknowns", although there are plenty of known unknowns to
> investigate (address independence was one that was already bugging me before
> I wrote to this list).

I think it shouldn't be too horrible to work out. Another option might
be to abuse the shader cache somehow, but I think that still needs
writable + executable memory, which probably doesn't help. Stuff should
be address independent, though, as I do write x86 asm programs to the
cache and read them back out, only relocating around the global symbols.

>
> It seems like Venus is the more straightforward approach, so I'm inclined to 
> just go with it.  However, it seems like there would be a performance hit 
> compared to only doing JIT compilation in a separate process.  Do you have a 
> rough sense of the performance hit of serializing everything over Venus?  The 
> answer will depend on the workload, I know.

Yeah, I think you'll definitely see a larger perf hit than just moving
compilation out to a separate process. I'm not sure of the raw Venus
overhead numbers here; someone else might have more information
available.

Dave.

!! External Email: This email originated from outside of the organization. Do 
not click links or open attachments unless you recognize the sender.


[ANNOUNCE] mesa 23.1.0-rc3

2023-04-27 Thread Eric Engestrom
Hello everyone,

The third release candidate for 23.1.0 is now available.

If you find any issues, please report them here:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/new

The next release candidate is expected in one week, on May 3rd.

Cheers,
  Eric

---

Charmaine Lee (2):
  translate: do not clamp element index in generic_run
  svga: set PIPE_CAP_VERTEX_ATTRIB_ELEMENT_ALIGNED_ONLY for VGPU10 device

Daniel Schürmann (2):
  radv/rt: fix total stack size computation
  radv/rt: properly destroy radv_ray_tracing_lib_pipeline on error

Emma Anholt (2):
  blob: Don't valgrind assert for defined memory if we aren't writing.
  util/log: Fix log messages over 1024 characters.

Eric Engestrom (6):
  .pick_status.json: Update to 3017d01c9ded9c9fd097b600081b1bbe86e90fb8
  .pick_status.json: Update to a18a51a708a86f51e0a5ab031b379f65bc84fb49
  .pick_status.json: Update to c060b649c5a866f42e5df73f41c6e2809cf30e99
  ci: rework vulkan validation layer build script
  .pick_status.json: Update to 3f14fd8578549e34db2f564396f300819b2ff10f
  VERSION: bump for 23.1.0-rc3

Filip Gawin (1):
  nine: add fallback for D3DFMT_D16 in d3d9_to_pipe_format_checked

Friedrich Vock (4):
  radv/rmv: Fix creating RT pipelines
  radv/rmv: Fix import memory
  radv/rt: Plug some memory leaks during shader creation
  radv: Don't leak the RT prolog binary

Gert Wollny (2):
  r600/sfn: Lower tess levels to vectors in TCS
  r600/sfn: make sure f2u32 is lowered late and correctly for 64 bit floats

Hans-Kristian Arntzen (1):
  wsi/x11: Fix present ID signal when IDLE comes before COMPLETE.

Iago Toral Quiroga (3):
  broadcom/compiler: fix v3d_qpu_uses_sfu
  broadcom/compiler: add a v3d_qpu_instr_is_legacy_sfu helper
  broadcom/compiler: fix incorrect check for SFU op

Jordan Justen (1):
  intel/compiler/gfx12.5+: Lower 64-bit cluster_broadcast with 32-bit ops

Juan A. Suarez Romero (1):
  v3d: use primitive type to get stream output offset

Karol Herbst (3):
  radeonsi: lower mul_high
  ac/llvm: support shifts on 16 bit vec2
  rusticl: don't set size_t-is-usize for >=bindgen-0.65

Lionel Landwerlin (1):
  anv: rework Wa_14017076903 to only apply with occlusion queries

M Henning (1):
  nouveau/codegen: Check nir_dest_num_components

Marek Olšák (1):
  nir: fix 2 bugs in nir_create_passthrough_tcs

Michel Zou (3):
  vulkan/wsi: fix -Wnarrowing warning
  vk/entry_points:: fix mingw build
  mesa/draw: fix -Wformat warning

Mike Blumenkrantz (19):
  zink: manually re-set framebuffer after msrtss replicate blit
  zink: handle 'blitting' flag better in msrtss replication
  zink: skip msrtss replicate if the attachment will be full-cleared
  zink: avoid recursion during msrtss blits from flushing clears
  nir/lower_alpha_test: rzalloc state slots
  zink: fix non-db bindless texture buffers
  zink: emit demote cap when using demote
  zink: only print copy box warning once per resource
  util/debug: move null checks out of debug message macro
  zink: don't bitcast bool deref loads/stores
  drisw: don't leak the winsys
  zink: check for extendedDynamicState3DepthClipNegativeOneToOne for ds3 support
  draw: fix viewmask iterating
  zink: don't pin flush queue threads if no threads exist
  zink: add z32s8 as mandatory GL3.0 profile attachment format
  nir/gs: fix array type copying for passthrough gs
  zink: fix array copying in pv lowering
  gallivm: break out native vector width calc for reuse
  llvmpipe: do late init for llvm builder

Patrick Lerda (2):
  r600/sfn: fix memory leak related to sh_info->arrays
  aux/draw: fix memory leak related to ureg_get_tokens()

Pavel Ondračka (1):
  r300: fix unconditional KIL on R300/R400

Qiang Yu (1):
  aco: fix nir_f2u64 translation

Rhys Perry (3):
  aco: remove SMEM_instruction::prevent_overflow
  ac/nir/ps: fix null export write mask miss set to 0xf
  aco: don't move exec reads around exec writes

Rob Clark (2):
  freedreno/a6xx: Fix valid_format_cast logic for newer a6xx
  freedreno: Fix resource tracking vs rebind/invalidate

Samuel Pitoiset (8):
  radv: do not allow 1D block-compressed images with (extended) storage on GFX6
  radv: fix usage flag for 3D compressed 128 bpp images on GFX9
  radv: update binning settings to work around GPU hangs
  radv/amdgpu: fix adding continue preambles and postambles BOs to the list
  radv: wait for occlusion queries in the resolve query shader
  radv: delay enabling/disabling occlusion queries at draw time
  radv: track DB_COUNT_CONTROL changes to avoid context rolls
  radv: add the perf counters BO to the preambles BO list

SoroushIMG (4):
  zink: do not emit line stipple dynamic state when emulating
  zink: take location_frac into account in lower_line_smooth_gs
  zink: fix 

Re: Advice on modifying Lavapipe to isolate JIT compilation in separate process

2023-04-27 Thread Josh Gargus
Thanks Jose,

You're right, compilation is just data transformation, the security issues
arise when the process executes the generated code.  I'm realizing that I
don't *deeply* grok the mindset of our security folk; I'll have a talk with
them.  For example, the client process has the ability to load shared
libraries (otherwise it couldn't load libvulkan.so, nor the ICD obtained
from the loader).  So if a client exploit:
  1) generated some malicious code
  2) wrote that code to a file
  3) loaded that file as a .so
  4) ran the malicious code
... then I don't see how the outcome is different from the case where 2) and 3)
are replaced by generating the same malicious code directly into
writable/executable memory.  I have some ideas about how Fuchsia's
capability model might address this, but they're pure speculation that
isn't worth writing down.  I'll report back here, in case this is of
interest to anyone.

In your sketched non-Venus solution, I don't understand the first step
("back all buffer/texture memory with shared memory, visible to the 2nd
process").  I don't think you're proposing that rendering would occur in
the second process, only shader compilation (otherwise there wouldn't be a
need to "to tell LLVM either spit out .so files, or to use shared
memory").  So then why does the compilation process need access to the
buffer/texture memory?

Thanks,
Josh

On Thu, Apr 27, 2023 at 6:23 AM Jose Fonseca  wrote:

> Perhaps I'm getting confused with the terminology, but I don't think
> moving *compilation* to a separate process helps here.  IIUC, compilation
> (as in LLVM IR -> x86 code) can happen anywhere; the problem is loading the
> JITed code (i.e., making writable memory executable).
>
> As mentioned, there are many places this can be done.
>
> If Venus does not suit your needs, the easiest way to achieve this would
> be to:
> - back all buffer/texture memory with shared memory, visible to the 2nd
>  process
> - modify gallivm to tell LLVM to either spit out .so files or to use shared
> memory
> - modify gallivm_jit_function to return a wrapper that marshals the call
> into the 2nd process:
>   - either a generic C wrapper which can introspect the LLVM IR Function
> arguments
>   - or hand-written C wrappers for every function prototype returned by
> gallivm_jit_function
>
> - there are also a few places where JIT code refers to host process memory
> addresses explicitly (e.g., util_format_xxx helper functions) which need to
> be handled separately (e.g., by passing these addresses in a structure which
> can be rewritten to match the 2nd process)
>
> Jose
>
> --
> *From:* mesa-dev  on behalf of
> Dave Airlie 
> *Sent:* Thursday, April 27, 2023 08:39
> *To:* Josh Gargus 
> *Cc:* mesa-dev@lists.freedesktop.org 
> *Subject:* Re: Advice on modifying Lavapipe to isolate JIT compilation in
> separate process
>
> On Thu, 27 Apr 2023 at 15:18, Josh Gargus  wrote:
> >
> > Thanks for your advice!  I hadn't looked at Venus, but that seems like a
> very promising place to start.
> >
> > The other approach feels more approachable now too; it feels like there
> are fewer "unknown unknowns", although there are plenty of known unknowns to
> investigate (address independence was one that was already bugging me
> before I wrote to this list).
>
> I think it shouldn't be too horrible to work out. Another option might
> be to abuse the shader cache somehow, but I think that still needs
> writable + executable memory, which probably doesn't help. Stuff should
> be address independent, though, as I do write x86 asm programs to the
> cache and read them back out, only relocating around the global symbols.
>
> >
> > It seems like Venus is the more straightforward approach, so I'm
> inclined to just go with it.  However, it seems like there would be a
> performance hit compared to only doing JIT compilation in a separate
> process.  Do you have a rough sense of the performance hit of serializing
> everything over Venus?  The answer will depend on the workload, I know.
>
> Yeah, I think you'll definitely see a larger perf hit than just moving
> compilation out to a separate process. I'm not sure of the raw Venus
> overhead numbers here; someone else might have more information
> available.
>
> Dave.