Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] amd: Apply elf relocations and allow code with relocations

2019-06-20 Thread Jan Vesely
On Sat, 2019-06-15 at 07:38 +0200, Dieter Nützel wrote:
> On 14.06.2019 08:13, Jan Vesely wrote:
> > On Thu, 2019-06-13 at 21:20 +0200, Dieter Nützel wrote:
> > > On 13.06.2019 07:10, Marek Olšák wrote:
> > > > FYI, I just pushed the new linker.
> > > > 
> > > > Marek
> > > 
> > > Thank you very much Marek and _Nicolai_ for this GREAT stuff.
> > > It brings back some speed after 1/8 drop with glmark2, lately.
> > > Maybe my amd-staging-drm-next tree (5.2-rc1) didn't honor the kernel
> > > mitigation parameter right.
> > > 
> > > @Jan
> > > Go ahead with your nice relocation and image work.
> > > Send me what you have in the works.
> > 
> > The relocation work is no longer needed as the new linker handles
> > things.
> > The corruption is caused either by (still faulty) conversion builtins,
> > or incorrect buffer coherence handling. Both need fixing, but I'm not
> > sure which one is to blame in this case.
> > 
> > > Latest Mesa git (with Nicolai's new linker) let all 3 luxmark versions
> > > run.
> > > Only 'Hotel lobby' (with v3.0 and v3.1) show some corruption but do NOT
> > > crash any longer. Numbers for 'Neumann TLM-102 SE' (medium) show ~43000K
> > > (!!!).
> > > 
> > > https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1106085-linux-kernel-set-to-expose-hidden-nvidia-hda-controllers-helping-laptop-users?p=1106199#post1106199
> > > 
> > > Blender crash as expected ;-)
> > > 
> > > /home/dieter> trying to save userpref at
> > > /home/dieter/.config/blender/2.79/config/userpref.blend ok
> > > Read blend: /data/Blender/barbershop_interior_gpu.blend
> > > scripts disabled for "/data/Blender/barbershop_interior_gpu.blend",
> > > skipping 'generate_customprops.py'
> > > skipping driver 'var', automatic scripts are disabled
> > > skipping driver 'var', automatic scripts are disabled
> > > skipping driver 'var', automatic scripts are disabled
> > > skipping driver 'var', automatic scripts are disabled
> > > skipping driver 'var', automatic scripts are disabled
> > > skipping driver 'var', automatic scripts are disabled
> > > skipping driver 'var', automatic scripts are disabled
> > > skipping driver 'var', automatic scripts are disabled
> > > skipping driver 'var', automatic scripts are disabled
> > > Device init success
> > > Compiling OpenCL program split
> > > Kernel compilation of split finished in 8.41s.
> > > 
> > > Compiling OpenCL program base
> > > Kernel compilation of base finished in 4.55s.
> > > 
> > > Compiling OpenCL program denoising
> > > Kernel compilation of denoising finished in 2.08s.
> > > 
> > > blender: ../src/gallium/drivers/radeonsi/si_compute.c:319:
> > > si_set_global_binding: Assertion `first + n <= MAX_GLOBAL_BUFFERS'
> > > failed.
> > > 
> > > [1]Abbruch   blender (core dumped)
> > 
> > The number of max global buffers was bumped in 06bf56725d to fix
> > similar crash in luxmark. I guess it needs another bump.
> 
> Hello Jan,
> 
> I'm so blind...
> ...bumping it to 48 and 64 (first try) works. 33 does not ;-)
> We shouldn't waste too much memory.

Feel free to post a patch. I'm not sure at which point Marek wants to
switch to dynamic allocation (or if at all), but there's no limit in
OCL so we might end up bumping this every time a new app pushes
against the limit.
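
For illustration only, a minimal sketch of what growing the binding table on demand could look like, as opposed to asserting against a fixed MAX_GLOBAL_BUFFERS. The struct and function names (global_bindings, global_bindings_reserve, global_bindings_set) and the starting capacity of 16 are made up for this example and are not radeonsi code:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's table of bound global buffers. */
struct global_bindings {
    void **slots;      /* one pointer per binding slot, NULL when unbound */
    unsigned capacity; /* number of allocated slots */
};

/* Make sure slots [0, needed) exist. Returns 0 on success, -1 on failure. */
static int global_bindings_reserve(struct global_bindings *gb, unsigned needed)
{
    if (needed <= gb->capacity)
        return 0;

    /* Arbitrary starting size for the sketch; double until it fits. */
    unsigned new_cap = gb->capacity ? gb->capacity : 16;
    while (new_cap < needed) {
        if (new_cap > UINT_MAX / 2) { /* avoid wrapping around */
            new_cap = needed;
            break;
        }
        new_cap *= 2;
    }

    void **slots = realloc(gb->slots, (size_t)new_cap * sizeof(*slots));
    if (!slots)
        return -1; /* the old table is still valid */

    /* Clear only the new tail so existing bindings survive the growth. */
    for (unsigned i = gb->capacity; i < new_cap; i++)
        slots[i] = NULL;

    gb->slots = slots;
    gb->capacity = new_cap;
    return 0;
}

/* Bind n resources starting at slot first; resources == NULL unbinds them. */
static int global_bindings_set(struct global_bindings *gb, unsigned first,
                               unsigned n, void **resources)
{
    if (n > UINT_MAX - first) /* first + n would overflow */
        return -1;
    if (global_bindings_reserve(gb, first + n) != 0)
        return -1;

    for (unsigned i = 0; i < n; i++)
        gb->slots[first + i] = resources ? resources[i] : NULL;
    return 0;
}
```

With this shape, a call such as global_bindings_set(&gb, 40, 2, res) on an empty table grows it to 64 slots instead of tripping an assertion; the cost is a realloc on the rare growth path and an error return that the caller has to handle.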

> Now, let's start with the libclc work.
> Luxmark 'Hotel' is very blocky and Blender 'barbershop_interior_gpu' 
> mostly black. I have some images.
> 
> Shouldn't we open a new ticket? Any hints for a good name?
> Or do we have one already? I can put my pictures there.
> Simpler scenes work, but mostly gray (without colors/texture).

Feel free to create an LLVM bug for libclc. The best reproducer is
probably OCL CTS convert test failures.

There are several buffer synchronization bugs reported for clover, so
I don't think we need a new one.

Sorry for the delay, my day job projects require more time and
attention than usual.

Jan

> 
> Dieter

-- 
Jan Vesely 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] amd: Apply elf relocations and allow code with relocations

2019-06-14 Thread Dieter Nützel

On 14.06.2019 08:13, Jan Vesely wrote:
> On Thu, 2019-06-13 at 21:20 +0200, Dieter Nützel wrote:
> > On 13.06.2019 07:10, Marek Olšák wrote:
> > > FYI, I just pushed the new linker.
> > >
> > > Marek
> >
> > Thank you very much Marek and _Nicolai_ for this GREAT stuff.
> > It brings back some speed after 1/8 drop with glmark2, lately.
> > Maybe my amd-staging-drm-next tree (5.2-rc1) didn't honor the kernel
> > mitigation parameter right.
> >
> > @Jan
> > Go ahead with your nice relocation and image work.
> > Send me what you have in the works.
>
> The relocation work is no longer needed as the new linker handles
> things.
> The corruption is caused either by (still faulty) conversion builtins,
> or incorrect buffer coherence handling. Both need fixing, but I'm not
> sure which one is to blame in this case.
>
> > Latest Mesa git (with Nicolai's new linker) let all 3 luxmark versions
> > run.
> > Only 'Hotel lobby' (with v3.0 and v3.1) show some corruption but do NOT
> > crash any longer. Numbers for 'Neumann TLM-102 SE' (medium) show ~43000K
> > (!!!).
> >
> > https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1106085-linux-kernel-set-to-expose-hidden-nvidia-hda-controllers-helping-laptop-users?p=1106199#post1106199
> >
> > Blender crash as expected ;-)
> >
> > /home/dieter> trying to save userpref at
> > /home/dieter/.config/blender/2.79/config/userpref.blend ok
> > Read blend: /data/Blender/barbershop_interior_gpu.blend
> > scripts disabled for "/data/Blender/barbershop_interior_gpu.blend",
> > skipping 'generate_customprops.py'
> > skipping driver 'var', automatic scripts are disabled
> > skipping driver 'var', automatic scripts are disabled
> > skipping driver 'var', automatic scripts are disabled
> > skipping driver 'var', automatic scripts are disabled
> > skipping driver 'var', automatic scripts are disabled
> > skipping driver 'var', automatic scripts are disabled
> > skipping driver 'var', automatic scripts are disabled
> > skipping driver 'var', automatic scripts are disabled
> > skipping driver 'var', automatic scripts are disabled
> > Device init success
> > Compiling OpenCL program split
> > Kernel compilation of split finished in 8.41s.
> >
> > Compiling OpenCL program base
> > Kernel compilation of base finished in 4.55s.
> >
> > Compiling OpenCL program denoising
> > Kernel compilation of denoising finished in 2.08s.
> >
> > blender: ../src/gallium/drivers/radeonsi/si_compute.c:319:
> > si_set_global_binding: Assertion `first + n <= MAX_GLOBAL_BUFFERS'
> > failed.
> >
> > [1]Abbruch   blender (core dumped)
>
> The number of max global buffers was bumped in 06bf56725d to fix
> similar crash in luxmark. I guess it needs another bump.

Hello Jan,

I'm so blind...
...bumping it to 48 and 64 (first try) works. 33 does not ;-)
We shouldn't waste too much memory.
Now, let's start with the libclc work.
Luxmark 'Hotel' is very blocky and Blender 'barbershop_interior_gpu'
mostly black. I have some images.

Shouldn't we open a new ticket? Any hints for a good name?
Or do we have one already? I can put my pictures there.
Simpler scenes work, but mostly gray (without colors/texture).

Dieter

Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] amd: Apply elf relocations and allow code with relocations

2019-06-14 Thread Jan Vesely
On Thu, 2019-06-13 at 21:20 +0200, Dieter Nützel wrote:
> On 13.06.2019 07:10, Marek Olšák wrote:
> > FYI, I just pushed the new linker.
> > 
> > Marek
> 
> Thank you very much Marek and _Nicolai_ for this GREAT stuff.
> It brings back some speed after 1/8 drop with glmark2, lately.
> Maybe my amd-staging-drm-next tree (5.2-rc1) didn't honor the kernel 
> mitigation parameter right.
> 
> @Jan
> Go ahead with your nice relocation and image work.
> Send me what you have in the works.

The relocation work is no longer needed as the new linker handles
things.
The corruption is caused either by (still faulty) conversion builtins,
or incorrect buffer coherence handling. Both need fixing, but I'm not
sure which one is to blame in this case.

> 
> Latest Mesa git (with Nicolai's new linker) let all 3 luxmark versions 
> run.
> Only 'Hotel lobby' (with v3.0 and v3.1) show some corruption but do NOT 
> crash any longer. Numbers for 'Neumann TLM-102 SE' (medium) show ~43000K 
> (!!!).
> 
> https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1106085-linux-kernel-set-to-expose-hidden-nvidia-hda-controllers-helping-laptop-users?p=1106199#post1106199
> 
> Blender crash as expected ;-)
> 
> /home/dieter> trying to save userpref at 
> /home/dieter/.config/blender/2.79/config/userpref.blend ok
> Read blend: /data/Blender/barbershop_interior_gpu.blend
> scripts disabled for "/data/Blender/barbershop_interior_gpu.blend", 
> skipping 'generate_customprops.py'
> skipping driver 'var', automatic scripts are disabled
> skipping driver 'var', automatic scripts are disabled
> skipping driver 'var', automatic scripts are disabled
> skipping driver 'var', automatic scripts are disabled
> skipping driver 'var', automatic scripts are disabled
> skipping driver 'var', automatic scripts are disabled
> skipping driver 'var', automatic scripts are disabled
> skipping driver 'var', automatic scripts are disabled
> skipping driver 'var', automatic scripts are disabled
> Device init success
> Compiling OpenCL program split
> Kernel compilation of split finished in 8.41s.
> 
> Compiling OpenCL program base
> Kernel compilation of base finished in 4.55s.
> 
> Compiling OpenCL program denoising
> Kernel compilation of denoising finished in 2.08s.
> 
> blender: ../src/gallium/drivers/radeonsi/si_compute.c:319: 
> si_set_global_binding: Assertion `first + n <= MAX_GLOBAL_BUFFERS' 
> failed.
> 
> [1]Abbruch   blender (core dumped)

The number of max global buffers was bumped in 06bf56725d to fix
similar crash in luxmark. I guess it needs another bump.

Jan

> 
> Greetings,
> Dieter
> 
> > On Mon, Jun 3, 2019 at 10:39 PM Jan Vesely wrote:
> > 
> > > Fixes piglits:
> > > call.cl
> > > calls-larget-struct.cl
> > > calls-struct.cl
> > > calls-workitem-id.cl
> > > realign-stack.cl
> > > tail-calls.cl
> > > 
> > > Cc: mesa-sta...@lists.freedesktop.org
> > > Signed-off-by: Jan Vesely 
> > > ---
> > > The piglit test now pass using llvm-7,8,git.
> > > ImageMagick works on my raven, but some test still fail on
> > > carrizo/iceland.
> > > Other workloads (like shoc) that used function calls also work ok.
> > > ocltoys work after removing static keyword from .cl files.
> > > src/amd/common/ac_binary.c| 30
> > > +++
> > > src/gallium/drivers/radeonsi/si_compute.c |  6 -
> > > 2 files changed, 30 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c
> > > index 18dc72c61f0..4d152fcf1be 100644
> > > --- a/src/amd/common/ac_binary.c
> > > +++ b/src/amd/common/ac_binary.c
> > > @@ -178,6 +178,36 @@ bool ac_elf_read(const char *elf_data, unsigned
> > > elf_size,
> > > 
> > > parse_relocs(elf, relocs, symbols, symbol_sh_link, binary);
> > > 
> > > +   // Apply relocations
> > > +   for (int i = 0; i < binary->reloc_count; ++i) {
> > > +   struct ac_shader_reloc *r = &binary->relocs[i];
> > > +   uint32_t *loc = (uint32_t*)(binary->code +
> > > r->offset);
> > > +   /* Section target relocations store symbol offsets
> > > as
> > > +* values in reloc location. We're expected to
> > > adjust it for
> > > +* start of the section. However, R_AMDGPU_REL32 are
> > > +* PC relative relocations, so we need to recompute
> > > the
> > > +* delta between reloc locatin and the target
> > > adress.
> > > +*/
> > > +   if (r->target_type == 0x3) { // section relocation
> > > +   uint32_t target_offset = *loc; // already
> > > adjusted
> > > +   int64_t diff = 

Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] amd: Apply elf relocations and allow code with relocations

2019-06-13 Thread Dieter Nützel

On 13.06.2019 07:10, Marek Olšák wrote:
> FYI, I just pushed the new linker.
>
> Marek


Thank you very much Marek and _Nicolai_ for this GREAT stuff.
It brings back some speed after 1/8 drop with glmark2, lately.
Maybe my amd-staging-drm-next tree (5.2-rc1) didn't honor the kernel 
mitigation parameter right.


@Jan
Go ahead with your nice relocation and image work.
Send me what you have in the works.

Latest Mesa git (with Nicolai's new linker) let all 3 luxmark versions 
run.
Only 'Hotel lobby' (with v3.0 and v3.1) show some corruption but do NOT 
crash any longer. Numbers for 'Neumann TLM-102 SE' (medium) show ~43000K 
(!!!).


https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1106085-linux-kernel-set-to-expose-hidden-nvidia-hda-controllers-helping-laptop-users?p=1106199#post1106199

Blender crash as expected ;-)

/home/dieter> trying to save userpref at 
/home/dieter/.config/blender/2.79/config/userpref.blend ok

Read blend: /data/Blender/barbershop_interior_gpu.blend
scripts disabled for "/data/Blender/barbershop_interior_gpu.blend", 
skipping 'generate_customprops.py'

skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
Device init success
Compiling OpenCL program split
Kernel compilation of split finished in 8.41s.

Compiling OpenCL program base
Kernel compilation of base finished in 4.55s.

Compiling OpenCL program denoising
Kernel compilation of denoising finished in 2.08s.

blender: ../src/gallium/drivers/radeonsi/si_compute.c:319: 
si_set_global_binding: Assertion `first + n <= MAX_GLOBAL_BUFFERS' 
failed.


[1]Abbruch   blender (core dumped)

Greetings,
Dieter


On Mon, Jun 3, 2019 at 10:39 PM Jan Vesely wrote:

Fixes piglits:
call.cl
calls-larget-struct.cl
calls-struct.cl
calls-workitem-id.cl
realign-stack.cl
tail-calls.cl

Cc: mesa-sta...@lists.freedesktop.org
Signed-off-by: Jan Vesely 
---
The piglit test now pass using llvm-7,8,git.
ImageMagick works on my raven, but some test still fail on
carrizo/iceland.
Other workloads (like shoc) that used function calls also work ok.
ocltoys work after removing static keyword from .cl files.
src/amd/common/ac_binary.c| 30
+++
src/gallium/drivers/radeonsi/si_compute.c |  6 -
2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c
index 18dc72c61f0..4d152fcf1be 100644
--- a/src/amd/common/ac_binary.c
+++ b/src/amd/common/ac_binary.c
@@ -178,6 +178,36 @@ bool ac_elf_read(const char *elf_data, unsigned elf_size,

        parse_relocs(elf, relocs, symbols, symbol_sh_link, binary);

+       // Apply relocations
+       for (int i = 0; i < binary->reloc_count; ++i) {
+               struct ac_shader_reloc *r = &binary->relocs[i];
+               uint32_t *loc = (uint32_t*)(binary->code + r->offset);
+               /* Section target relocations store symbol offsets as
+                * values in reloc location. We're expected to adjust it for
+                * start of the section. However, R_AMDGPU_REL32 are
+                * PC relative relocations, so we need to recompute the
+                * delta between reloc location and the target address.
+                */
+               if (r->target_type == 0x3) { // section relocation
+                       uint32_t target_offset = *loc; // already adjusted
+                       int64_t diff = target_offset - r->offset;
+                       if (r->type == 0xa) { // R_AMDGPU_REL32_LO
+                               // address of the 'lo' instruction is 4B below
+                               // the relocation point, but the target has
+                               // already been adjusted.
+                               *loc = (diff & 0xffffffff);
+                       } else if (r->type == 0xb) { // R_AMDGPU_REL32_HI
+                               // 'hi' relocation is 8B above 'lo' relocation
+                               *loc = ((diff - 8) >> 32);
+                       } else {
+                               success = false;
+                               fprintf(stderr, "Unsupported section relocation: type: %d, offset: %lx, value: %x\n",
+                                       r->type, r->offset, *loc);
+                       }
+               } else
+                       success = false;
+       }
+
        if (elf){
                elf_end(elf);
        }
diff --git a/src/gallium/drivers/radeonsi/si_compute.c
b/src/gallium/drivers/radeonsi/si_compute.c
index b9cea00eeeb..88631369a62 100644
--- 

Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] amd: Apply elf relocations and allow code with relocations

2019-06-13 Thread Jan Vesely
On Thu, 2019-06-13 at 01:10 -0400, Marek Olšák wrote:
> FYI, I just pushed the new linker.

Thanks. I've checked the handling and the current approach works for
sections as well (even if not handled explicitly).
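
As a reference for the arithmetic involved, here is a small sketch of how the two REL32 halves are defined in the AMDGPU ELF ABI: both are computed from S + A - P (symbol address, addend, and the address of the relocated word), with the LO variant keeping the low 32 bits and the HI variant the high 32 bits. The helper names below are hypothetical, and this is neither the code from the quoted patch nor the code in the new linker:

```c
#include <assert.h>
#include <stdint.h>

/* S + A - P, computed in unsigned 64-bit arithmetic so that a backward
 * reference (target below the relocated word) wraps in a well-defined way. */
static uint64_t rel_value(uint64_t s, int64_t a, uint64_t p)
{
    return s + (uint64_t)a - p;
}

/* R_AMDGPU_REL32_LO (type 10): low 32 bits of S + A - P. */
static uint32_t rel32_lo(uint64_t s, int64_t a, uint64_t p)
{
    return (uint32_t)(rel_value(s, a, p) & 0xffffffffu);
}

/* R_AMDGPU_REL32_HI (type 11): high 32 bits of S + A - P. */
static uint32_t rel32_hi(uint64_t s, int64_t a, uint64_t p)
{
    return (uint32_t)(rel_value(s, a, p) >> 32);
}
```

A backward reference yields a negative delta, so the HI word comes out as 0xffffffff rather than 0; that is why the difference has to be carried in 64 bits before it is split into the two words.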

Jan

> 
> Marek
> 
> On Mon, Jun 3, 2019 at 10:39 PM Jan Vesely  wrote:
> 
> > Fixes piglits:
> > call.cl
> > calls-larget-struct.cl
> > calls-struct.cl
> > calls-workitem-id.cl
> > realign-stack.cl
> > tail-calls.cl
> > 
> > Cc: mesa-sta...@lists.freedesktop.org
> > Signed-off-by: Jan Vesely 
> > ---
> > The piglit test now pass using llvm-7,8,git.
> > ImageMagick works on my raven, but some test still fail on
> > carrizo/iceland.
> > Other workloads (like shoc) that used function calls also work ok.
> > ocltoys work after removing static keyword from .cl files.
> >  src/amd/common/ac_binary.c| 30 +++
> >  src/gallium/drivers/radeonsi/si_compute.c |  6 -
> >  2 files changed, 30 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c
> > index 18dc72c61f0..4d152fcf1be 100644
> > --- a/src/amd/common/ac_binary.c
> > +++ b/src/amd/common/ac_binary.c
> > @@ -178,6 +178,36 @@ bool ac_elf_read(const char *elf_data, unsigned
> > elf_size,
> > 
> > parse_relocs(elf, relocs, symbols, symbol_sh_link, binary);
> > 
> > +   // Apply relocations
> > +   for (int i = 0; i < binary->reloc_count; ++i) {
> > +   struct ac_shader_reloc *r = &binary->relocs[i];
> > +   uint32_t *loc = (uint32_t*)(binary->code + r->offset);
> > +   /* Section target relocations store symbol offsets as
> > +* values in reloc location. We're expected to adjust it
> > for
> > +* start of the section. However, R_AMDGPU_REL32 are
> > +* PC relative relocations, so we need to recompute the
> > +* delta between reloc locatin and the target adress.
> > +*/
> > +   if (r->target_type == 0x3) { // section relocation
> > +   uint32_t target_offset = *loc; // already adjusted
> > +   int64_t diff = target_offset - r->offset;
> > +   if (r->type == 0xa) { // R_AMDGPU_REL32_LO
> > +   // address of the 'lo' instruction is 4B
> > below
> > +   // the relocation point, but the target has
> > +   // alredy been adjusted.
> > +   *loc = (diff & 0xffffffff);
> > +   } else if (r->type == 0xb) { // R_AMDGPU_REL32_HI
> > +   // 'hi' relocation is 8B above 'lo'
> > relocation
> > +   *loc = ((diff - 8) >> 32);
> > +   } else {
> > +   success = false;
> > +   fprintf(stderr, "Unsupported section
> > relocation: type: %d, offset: %lx, value: %x\n",
> > +   r->type, r->offset, *loc);
> > +   }
> > +   } else
> > +   success = false;
> > +   }
> > +
> > if (elf){
> > elf_end(elf);
> > }
> > diff --git a/src/gallium/drivers/radeonsi/si_compute.c
> > b/src/gallium/drivers/radeonsi/si_compute.c
> > index b9cea00eeeb..88631369a62 100644
> > --- a/src/gallium/drivers/radeonsi/si_compute.c
> > +++ b/src/gallium/drivers/radeonsi/si_compute.c
> > @@ -246,12 +246,6 @@ static void *si_create_compute_state(
> > const amd_kernel_code_t *code_object =
> > si_compute_get_code_object(program, 0);
> > code_object_to_config(code_object, &program->shader.config);
> > -   if (program->shader.binary.reloc_count != 0) {
> > -   fprintf(stderr, "Error: %d unsupported
> > relocations\n",
> > -
> >  program->shader.binary.reloc_count);
> > -   FREE(program);
> > -   return NULL;
> > -   }
> > } else {
> > 
> > si_shader_binary_read_config(&program->shader.binary,
> >  &program->shader.config, 0);
> > --
> > 2.21.0
> > 
> > ___
> > mesa-stable mailing list
> > mesa-sta...@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-stable

-- 
Jan Vesely 



Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] amd: Apply elf relocations and allow code with relocations

2019-06-12 Thread Marek Olšák
FYI, I just pushed the new linker.

Marek

On Mon, Jun 3, 2019 at 10:39 PM Jan Vesely  wrote:

> Fixes piglits:
> call.cl
> calls-larget-struct.cl
> calls-struct.cl
> calls-workitem-id.cl
> realign-stack.cl
> tail-calls.cl
>
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Jan Vesely 
> ---
> The piglit test now pass using llvm-7,8,git.
> ImageMagick works on my raven, but some test still fail on
> carrizo/iceland.
> Other workloads (like shoc) that used function calls also work ok.
> ocltoys work after removing static keyword from .cl files.
>  src/amd/common/ac_binary.c| 30 +++
>  src/gallium/drivers/radeonsi/si_compute.c |  6 -
>  2 files changed, 30 insertions(+), 6 deletions(-)
>
> diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c
> index 18dc72c61f0..4d152fcf1be 100644
> --- a/src/amd/common/ac_binary.c
> +++ b/src/amd/common/ac_binary.c
> @@ -178,6 +178,36 @@ bool ac_elf_read(const char *elf_data, unsigned
> elf_size,
>
> parse_relocs(elf, relocs, symbols, symbol_sh_link, binary);
>
> +   // Apply relocations
> +   for (int i = 0; i < binary->reloc_count; ++i) {
> +   struct ac_shader_reloc *r = &binary->relocs[i];
> +   uint32_t *loc = (uint32_t*)(binary->code + r->offset);
> +   /* Section target relocations store symbol offsets as
> +* values in reloc location. We're expected to adjust it
> for
> +* start of the section. However, R_AMDGPU_REL32 are
> +* PC relative relocations, so we need to recompute the
> +* delta between reloc locatin and the target adress.
> +*/
> +   if (r->target_type == 0x3) { // section relocation
> +   uint32_t target_offset = *loc; // already adjusted
> +   int64_t diff = target_offset - r->offset;
> +   if (r->type == 0xa) { // R_AMDGPU_REL32_LO
> +   // address of the 'lo' instruction is 4B
> below
> +   // the relocation point, but the target has
> +   // alredy been adjusted.
> +   *loc = (diff & 0xffffffff);
> +   } else if (r->type == 0xb) { // R_AMDGPU_REL32_HI
> +   // 'hi' relocation is 8B above 'lo'
> relocation
> +   *loc = ((diff - 8) >> 32);
> +   } else {
> +   success = false;
> +   fprintf(stderr, "Unsupported section
> relocation: type: %d, offset: %lx, value: %x\n",
> +   r->type, r->offset, *loc);
> +   }
> +   } else
> +   success = false;
> +   }
> +
> if (elf){
> elf_end(elf);
> }
> diff --git a/src/gallium/drivers/radeonsi/si_compute.c
> b/src/gallium/drivers/radeonsi/si_compute.c
> index b9cea00eeeb..88631369a62 100644
> --- a/src/gallium/drivers/radeonsi/si_compute.c
> +++ b/src/gallium/drivers/radeonsi/si_compute.c
> @@ -246,12 +246,6 @@ static void *si_create_compute_state(
> const amd_kernel_code_t *code_object =
> si_compute_get_code_object(program, 0);
> code_object_to_config(code_object, &program->shader.config);
> -   if (program->shader.binary.reloc_count != 0) {
> -   fprintf(stderr, "Error: %d unsupported
> relocations\n",
> -
>  program->shader.binary.reloc_count);
> -   FREE(program);
> -   return NULL;
> -   }
> } else {
>
> si_shader_binary_read_config(&program->shader.binary,
>  &program->shader.config, 0);
> --
> 2.21.0
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable