Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-26 Thread Robinson, Paul via Dwarf-discuss
> > If it ever became necessary, you can always add a 2nd attribute for it.
> > As an example, in our Ada compiler decades ago, we did this for
> > DW_AT_artificial.  It's just a flag, so either present or not-present.
> > We added a 2nd DW_AT_artificial_kind with a whole bunch of different
> > enums for the various kinds our compiler generated.  The point is you
> > still can get there even if DW_AT_tensor is just a flag.
> 
> Totally, not opposed to that if that is the way that people want to
> handle it. My only (admittedly weak) argument against doing it that way
> is that there will now be two attributes rather than one, and the
> space that they take up.

With DW_FORM_flag_present, there's no extra space taken up in the DIE.
--paulr

-- 
Dwarf-discuss mailing list
Dwarf-discuss@lists.dwarfstd.org
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss


Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-25 Thread Ben Woodard via Dwarf-discuss


On 4/24/23 13:17, Todd Allen via Dwarf-discuss wrote:

On 4/24/23 13:27, Ben Woodard via Dwarf-discuss wrote:

As for NEON vs. SVE, is there a need to differentiate them?  And can it
not be done by shape of the type?

That one continues to be hard. ARM processors that support SVE also have
NEON registers, which, like Intel's SSE/MMX/AVX vector registers, are
architecturally specified as having a specific number of bits.
Handling those is trivial.

The weird thing about SVE registers (and the same also applies to
RVV) is that the number of bits is not architecturally defined and is
therefore unknown at compile time. The size of the registers can even
vary from hardware implementation to hardware implementation. So a
simple processor may only have a 128-bit-wide SVE register while a monster
performance core may have 2048-bit-wide SVE registers. The predicate
registers scale the same way. I think it can even vary from core to core
within a CPU, sort of like Intel's P-cores vs. E-cores. To even
know how much a loop is vectorized, you need to read a core-specific
register that specifies how wide the vector registers are on this
particular core. Things like induction variables are incremented by the
constant in that core-specific register divided by the size of the type
being acted upon. So some of the techniques used to select lanes in
DWARF don't quite work the same way.

Just to make things even more difficult, when one of these registers is
spilled to memory, such as onto the stack, its size is unknown at compile
time, so any subsequent spill code has to determine the size it takes up
at run time. Any subsequent offsets therefore need to use DWARF
expressions that reference the width of the vector.

...and then there is SME which is like SVE but they are matrices rather
than vectors. The mind boggles.


So the variability of the vector size is the only significant difference
that you've identified?  If so, then I think the shape of the array type
probably is sufficient.  For SVE, the DW_TAG_subrange_type will have a
DW_AT_upper_bound which is a variable (reference or DWARF expr), or the
DW_TAG_array_type's DW_AT_{byte,bit}_size will be a variable, or both.
Meanwhile, NEON would use DW_AT_bit_size 128 (or DW_AT_byte_size 16) and
a constant DW_AT_upper_bound (128/bitsizeof(elementtype)).  That seems
like it very directly reflects the difference between the two vector types.


I went back and revisited the research I did on behalf of customers
a few years back, when they first got access to SVE and started
debugging it. The state of the art has advanced since I did that work.


Back then we ran into problems because the only way to get the size of
the hardware vector was to read a core-specific register. A big problem
was that if you were debugging something like a core file, you didn't
have access to that core-specific register, and there was no way to
reference the core-specific register from DWARF.


Furthermore, while all the cores were the same on the systems I was
looking at, the architecture allows different vector register sizes
depending on which core you are running on.


At the time, we realized that there needed to be some "magic", which
didn't exist then, to provide the debugger with the width of
the vector. It was this complexity that really left me feeling that SVE
needed to be its own special thing.


At the time we discussed several options. One was pushing the size of
the vector into a normal variable so that it could be referenced by
DWARF; however, we didn't know how to make that work because the size
could change depending on which core the code was executing on. There was
also a kernel problem associated with that: the information about where
the process was executing needed to be included in the crash dumps. There
was also a feeling that something was wrong with this approach,
because the only reason for the variable to exist would be to support
debugging, and keeping it up to date added overhead and probably some
kernel support.


Another idea we kicked around was giving the core-specific register a
name and number in the register file so that DWARF could access it. This
broke the ABI, so at the time that option was immediately shot down.


I wasn't able to give the customers a good answer; I didn't know how to
solve the problem. Word evidently got back to ARM, and they wrote:
https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#dwarf-register-names
The big innovation that made this possible is that ARM introduced a
"pseudo register", which they call VG, that is specified to exist in the
execution environment. They even gave some examples of how the DWARF
should look for these types:
https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#vector-types-beta
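A sketch of the idea (the exact forms and operator sequence here are illustrative, not copied from the ARM document): for an SVE vector of 32-bit elements, the element count is VG * 2, since VG holds the vector length in 64-bit granules.

```
DW_TAG_array_type
    DW_AT_type  <ref to DW_TAG_base_type "float">    ; 32-bit element
    DW_TAG_subrange_type
        DW_AT_lower_bound 0
        DW_AT_upper_bound <expr: DW_OP_bregx VG 0      ; VL in 64-bit units
                                 DW_OP_lit1 DW_OP_shl  ; *2 = 32-bit lanes
                                 DW_OP_lit1 DW_OP_minus>
```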



I haven't looked at how GCC implements the VG
register yet. So I don't know how it handles some of the problems that
Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-24 Thread Todd Allen via Dwarf-discuss
On 4/24/23 13:27, Ben Woodard via Dwarf-discuss wrote:
>
>> As for NEON vs. SVE, is there a need to differentiate them?  And can it
>> not be done by shape of the type?
>
> That one continues to be hard. ARM processors that support SVE also have
> NEON registers, which, like Intel's SSE/MMX/AVX vector registers, are
> architecturally specified as having a specific number of bits.
> Handling those is trivial.
>
> The weird thing about SVE registers (and the same also applies to
> RVV) is that the number of bits is not architecturally defined and is
> therefore unknown at compile time. The size of the registers can even
> vary from hardware implementation to hardware implementation. So a
> simple processor may only have a 128-bit-wide SVE register while a monster
> performance core may have 2048-bit-wide SVE registers. The predicate
> registers scale the same way. I think it can even vary from core to core
> within a CPU, sort of like Intel's P-cores vs. E-cores. To even
> know how much a loop is vectorized, you need to read a core-specific
> register that specifies how wide the vector registers are on this
> particular core. Things like induction variables are incremented by the
> constant in that core-specific register divided by the size of the type
> being acted upon. So some of the techniques used to select lanes in
> DWARF don't quite work the same way.
>
> Just to make things even more difficult, when one of these registers is
> spilled to memory, such as onto the stack, its size is unknown at compile
> time, so any subsequent spill code has to determine the size it takes up
> at run time. Any subsequent offsets therefore need to use DWARF
> expressions that reference the width of the vector.
>
> ...and then there is SME which is like SVE but they are matrices rather
> than vectors. The mind boggles.
>
So the variability of the vector size is the only significant difference 
that you've identified?  If so, then I think the shape of the array type 
probably is sufficient.  For SVE, the DW_TAG_subrange_type will have a 
DW_AT_upper_bound which is a variable (reference or DWARF expr), or the 
DW_TAG_array_type's DW_AT_{byte,bit}_size will be a variable, or both.  
Meanwhile, NEON would use DW_AT_bit_size 128 (or DW_AT_byte_size 16) and 
a constant DW_AT_upper_bound (128/bitsizeof(elementtype)).  That seems 
like it very directly reflects the difference between the two vector types.
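A side-by-side sketch of the two shapes described here (illustrative DIE fragments, not actual compiler output):

```
; NEON: everything is a compile-time constant.
DW_TAG_array_type
    DW_AT_type      <ref to "float">
    DW_AT_byte_size 16                   ; 128 bits
    DW_TAG_subrange_type
        DW_AT_upper_bound 3              ; 128/bitsizeof(float) - 1

; SVE: the bounds become run-time DWARF expressions.
DW_TAG_array_type
    DW_AT_type      <ref to "float">
    DW_AT_byte_size <DWARF expression yielding the vector width>
    DW_TAG_subrange_type
        DW_AT_upper_bound <DWARF expression, e.g. reading ARM's VG>
```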

>> You argued that it still should be an enum, but with only one "default"
>> value defined.  And I guess any other values that might be added later
>> would be (or at least start as) vendor extensions. It's peculiar, and I
>> don't think we have that anywhere else in the standard.
> I guess that my point is that I'm fairly certain that SVE and RVV will
> need special handling and when the compilers start handling the matrix
> types that the hardware is starting to support, they are going need some
> help as well.
If there's something more peculiar about the types inhabiting these 
vector registers than "variable size", that might convince me.  But 
merely being variable-sized doesn't.
>> If it ever became necessary, you can always add a 2nd attribute for it.
>> As an example, in our Ada compiler decades ago, we did this for
>> DW_AT_artificial.  It's just a flag, so either present or not-present.
>> We added a 2nd DW_AT_artificial_kind with a whole bunch of different
>> enums for the various kinds our compiler generated.  The point is you
>> still can get there even if DW_AT_tensor is just a flag.
>
> Totally, not opposed to that if that is the way that people want to
> handle it. My only (admittedly weak) argument against doing it that way
> is that there will now be two attributes rather than one, and the
> space that they take up. John DelSignore was just dealing with a program
> that had 4.9GB of DWARF; it would be nice to keep it as compact as
> possible. Of course, most of that is likely location lists and template
> instantiations and stuff like that, not the relatively rare case like
> this. The cases where this shows up are likely going to be fairly rare.
>
> Would this be an acceptable compromise for V4 of my proposal? I drop it
> back to just being a flag for the time being. Then in a subsequent
> submission (which may or may not be in the DWARF6 cycle -- but hopefully
> is in time for DWARF6), if I find it necessary to make a flavor to
> support SVE, RVV, or SME, then my submission for that will include
> changing DW_AT_tensor to require a constant that references an
> enum like I did above. If it comes out before DWARF6 is released, then
> great, we don't have to redefine anything. If it gets bumped to DWARF7, then
> we add a _kind attribute.

You can submit it in whichever form you prefer.  I supposed you were 
soliciting comments here to get it in a form as close to acceptable as 
possible before submitting it.  After you do, the committee will discuss 
it, probably ad nauseam.  (And I'll be 

Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-24 Thread Ben Woodard via Dwarf-discuss


On 4/24/23 09:50, Todd Allen via Dwarf-discuss wrote:

On 4/21/23 16:31, Ben Woodard via Dwarf-discuss wrote:

     Insert the following paragraph between the first paragraph of
     normative text describing DW_TAG_array_type and the second
     paragraph dealing with multidimensional ordering.

     An array type that refers to a vector or matrix type shall be
     denoted with DW_AT_tensor, whose integer constant will specify the
     kind of tensor it is. The default kind of tensor shall be the kind
     used by the vector registers in the target architecture.

     Table 5.4: Tensor attribute values
     --
     Name              | Meaning
     --
     DW_TENSOR_default | Default encoding and semantics used by target
                       | architecture's vector registers
     DW_TENSOR_boolean | Boolean vectors map to vector mask registers.
     DW_TENSOR_opencl  | OpenCL vector encoding and semantics
     DW_TENSOR_neon    | NEON vector encoding and semantics
     DW_TENSOR_sve     | SVE vector encoding and semantics
     --

As someone who was not sitting in on your debugging GPUs discussions,
this table is baffling.  Is it based on the "Vector Operations" table on
the clang LanguageExtensions page you mentioned?

Yes

That page is a wall of text, so I might have missed another table, but
these values are a subset of columns from that table.

One of the values here is a source language (opencl), two reflect
specific vector registers of one specific architecture (neon & sve), and
I don't even know what boolean is meant to be.  Maybe a type that you
would associate with predicate registers?  I think this table needs a
lot more explanation.

This was something that Pedro pointed out and it was something that I
hadn't thought of. The overall justification is that these
types are semantically different from normal C arrays in several
distinct ways. There is this table which explains the differences:
https://clang.llvm.org/docs/LanguageExtensions.html#vector-operations
The argument is that the semantics of different flavors are different
enough that they need to be distinct.

I really do not know much of anything about OpenCL style vectors, I
wouldn't at all be against folding that constant in because it is
something that could be inferred from the source language. I left it in
because I thought there might be cases where clang compiles
some OpenCL code that references intrinsics written in another
language like C/C++ that depend on the semantics of OpenCL vector
types.

NEON, yeah I think we should drop that one. The current GCC semantics
are really Intel's vector semantics. By changing it from "GCC semantics"
to "Default encoding and semantics used by target architecture's vector
registers" I think we eliminate the need for that.

You are correct boolean is for predicate register types. After looking
at the calling conventions, these are not passed as types themselves. So
for the purpose of this submission, I don't think we need it. I believe
that some of the material that Tony and the AMD and Intel guys are almost
ready to submit has DWARF examples of how to make use of predicate
registers in SIMD and SIMT and how to access variables using predicate
registers; that should be sufficient for those cases.

ARM SVE and RISC-V RVV are really weird because their register widths,
and therefore their type widths, are implementation defined rather than
architecturally defined. It has been a couple of compiler generations
since I looked at the DWARF for those, but when I last looked, the
compilers didn't know what to do with those and so they didn't generate
usable DWARF. So I feel there are additional unsolved problems with
the SVE and RVV types that will need to be addressed. It is a problem
that I know I need to look into -- but right now I do not have any
"quality of DWARF" user issues pulling it closer to the top of my
priority list. The only processor I've seen with SVE is the A64FX used
in Fugaku and the HPE Apollo 80s; the Apple M1 and M2 don't have it, and
I haven't seen any of the newer ARM enterprise CPUs. I don't think there
are any chips with RVV yet. Once more users have access to hardware that
supports it, I know it will be more of a problem. I kind of feel
like that will be a whole submission in and of itself.



So you're thinking that "OpenCL vector semantics" ought to be
determinable from DW_AT_language DW_LANG_OpenCL?  Seems reasonable.

DW_TENSOR_boolean: Could it just be determinable from the shape of the
array?  For example:

  DW_TAG_base_type
             DW_AT_bit_size    : 1

      DW_TAG_array_type
             DW_AT_name    : predicate_t
             DW_AT_byte_size   

Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-24 Thread Todd Allen via Dwarf-discuss
On 4/21/23 16:31, Ben Woodard via Dwarf-discuss wrote:
>>
>>>     Insert the following paragraph between the first paragraph of
>>>     normative text describing DW_TAG_array_type and the second
>>>     paragraph dealing with multidimensional ordering.
>>>
>>>     An array type that refers to a vector or matrix type shall be
>>>     denoted with DW_AT_tensor, whose integer constant will specify the
>>>     kind of tensor it is. The default kind of tensor shall be the kind
>>>     used by the vector registers in the target architecture.
>>>
>>>     Table 5.4: Tensor attribute values
>>>     --
>>>     Name              | Meaning
>>>     --
>>>     DW_TENSOR_default | Default encoding and semantics used by target
>>>                       | architecture's vector registers
>>>     DW_TENSOR_boolean | Boolean vectors map to vector mask registers.
>>>     DW_TENSOR_opencl  | OpenCL vector encoding and semantics
>>>     DW_TENSOR_neon    | NEON vector encoding and semantics
>>>     DW_TENSOR_sve     | SVE vector encoding and semantics
>>>     --
>> As someone who was not sitting in on your debugging GPUs discussions, 
>> this table
>> is baffling.  Is it based on the "Vector Operations" table on the clang
>> LanguageExtensions page you mentioned?
> Yes
>> That page is a wall of text, so I might
>> have missed another table, but these values are a subset of columns 
>> from that
>> table.
>>
>> One of the values here is a source language (opencl), two reflect
>> specific vector registers of one specific architecture (neon & sve),
>> and I don't even know what boolean is meant to be.  Maybe a type that
>> you would associate with predicate registers?  I think this table
>> needs a lot more explanation.
>
> This was something that Pedro pointed out and it was something that I
> hadn't thought of. The overall justification is that these
> types are semantically different from normal C arrays in several
> distinct ways. There is this table which explains the differences:
> https://clang.llvm.org/docs/LanguageExtensions.html#vector-operations
> The argument is that the semantics of different flavors are different
> enough that they need to be distinct.
>
> I really do not know much of anything about OpenCL style vectors, I
> wouldn't at all be against folding that constant in because it is
> something that could be inferred from the source language. I left it in
> because I thought there might be cases where clang compiles
> some OpenCL code that references intrinsics written in another
> language like C/C++ that depend on the semantics of OpenCL vector 
> types.
>
> NEON, yeah I think we should drop that one. The current GCC semantics
> are really Intel's vector semantics. By changing it from "GCC semantics"
> to "Default encoding and semantics used by target architecture's vector
> registers" I think we eliminate the need for that.
>
> You are correct boolean is for predicate register types. After looking
> at the calling conventions, these are not passed as types themselves. So
> for the purpose of this submission, I don't think we need it. I believe
> that some of the material that Tony and the AMD and Intel guys are almost
> ready to submit has DWARF examples of how to make use of predicate
> registers in SIMD and SIMT and how to access variables using predicate
> registers; that should be sufficient for those cases.
>
> ARM SVE and RISC-V RVV are really weird because their register widths,
> and therefore their type widths, are implementation defined rather than
> architecturally defined. It has been a couple of compiler generations
> since I looked at the DWARF for those, but when I last looked, the
> compilers didn't know what to do with those and so they didn't generate
> usable DWARF. So I feel there are additional unsolved problems with
> the SVE and RVV types that will need to be addressed. It is a problem
> that I know I need to look into -- but right now I do not have any
> "quality of DWARF" user issues pulling it closer to the top of my
> priority list. The only processor I've seen with SVE is the A64FX used
> in Fugaku and the HPE Apollo 80s; the Apple M1 and M2 don't have it, and
> I haven't seen any of the newer ARM enterprise CPUs. I don't think there
> are any chips with RVV yet. Once more users have access to hardware that
> supports it, I know it will be more of a problem. I kind of feel
> like that will be a whole submission in and of itself.
>
>
So you're thinking that "OpenCL vector semantics" ought to be 
determinable from DW_AT_language DW_LANG_OpenCL?  Seems reasonable.

DW_TENSOR_boolean: Could it just be determinable from the shape of the 

Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-23 Thread Metzger, Markus T via Dwarf-discuss
Hello Ben,

An array type that refers to a vector or matrix type shall be
denoted with DW_AT_tensor, whose integer constant will specify the
kind of tensor it is. The default kind of tensor shall be the kind
used by the vector registers in the target architecture.

Table 5.4: Tensor attribute values
--
Name              | Meaning
--
DW_TENSOR_default | Default encoding and semantics used by target
                  | architecture's vector registers
DW_TENSOR_boolean | Boolean vectors map to vector mask registers.
DW_TENSOR_opencl  | OpenCL vector encoding and semantics
DW_TENSOR_neon    | NEON vector encoding and semantics
DW_TENSOR_sve     | SVE vector encoding and semantics
--

The width and, when applicable, the number of rows of the type
shall be specified as array dimensions. The type contained
within the tensor array type must be a DW_TAG_base_type entry.
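For concreteness, under the proposed wording a 4x4 matrix of floats might be described like this (a sketch; the forms and encodings are illustrative):

```
DW_TAG_array_type
    DW_AT_name    "mat4"
    DW_AT_tensor  DW_TENSOR_default
    DW_AT_type    <ref to DW_TAG_base_type "float">
    DW_TAG_subrange_type
        DW_AT_upper_bound 3       ; 4 rows
    DW_TAG_subrange_type
        DW_AT_upper_bound 3       ; 4 columns
```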

I don’t think this should refer to h/w registers at all.  It describes a
source language type. Looking at the CLANG table, they may allow different
notation, e.g. OpenCL, and they seem to allow different operations.

I’m also not clear what the default encoding and semantics of e.g. YMM
registers are.

Regards,
Markus.

Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de 
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928


Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-21 Thread Ben Woodard via Dwarf-discuss



On 4/21/23 12:56, Todd Allen via Dwarf-discuss wrote:

I've been playing catch-up on this discussion today.  I was convinced of the
value early on just based on the need of this information to follow the ABI
parameter passing rules on certain architectures.
Really -- that is what I care about too. Everything else that I have 
done was just to make it acceptable to everyone else.

  And I was with you right
up until this V3 version.  Comments below:

On Thu, Apr 13, 2023 at 11:57:08AM -0700, Dwarf Discussion wrote:

I didn't put back any changes that would allow these tensor types to
appear on the DWARF stack. I feel that particular topic hasn't been
settled yet. The general plan is I will work with Jakub and create some
cases where a compiler could want to put these vector types on the DWARF
stack. Tony Tye and the AMD team believe that the vector types do not need
to be on the stack and believe that all the cases where the debuggers
would want to access elements within the vector can be addressed with
offsetting. IIUC a key point seems to be that they have never seen a case
where an induction variable was embedded in a slot in a vector register,
it always is a scalar. (I am not sure that I fully grokked their argument
-- so please correct me) In the cases where it was, it could still be
accessed as an implicit. Once I've got some examples of how a debugger
might want to put vector types on the DWARF stack, the AMD team can
suggest alternative approaches. I said that I would make a V4 proposal if
the group ultimately comes to a consensus that vector registers are in
fact needed on the stack.

A proposal to allow vector types on the DWARF expression stack easily could be
a distinct proposal, although it obviously would have a dependency on this one.
This seems like a good application of the "keep proposals small" philosophy.


Insert the following paragraph between the first paragraph of
normative text describing DW_TAG_array_type and the second paragraph
dealing with multidimensional ordering.


An array type that refers to a vector or matrix type shall be
denoted with DW_AT_tensor, whose integer constant will specify the
kind of tensor it is. The default kind of tensor shall be the kind
used by the vector registers in the target architecture.

Table 5.4: Tensor attribute values
--
Name              | Meaning
--
DW_TENSOR_default | Default encoding and semantics used by target
                  | architecture's vector registers
DW_TENSOR_boolean | Boolean vectors map to vector mask registers.
DW_TENSOR_opencl  | OpenCL vector encoding and semantics
DW_TENSOR_neon    | NEON vector encoding and semantics
DW_TENSOR_sve     | SVE vector encoding and semantics
--

As someone who was not sitting in on your debugging GPUs discussions, this
table is baffling.  Is it based on the "Vector Operations" table on the clang
LanguageExtensions page you mentioned?

Yes

That page is a wall of text, so I might
have missed another table, but these values are a subset of columns from that
table.

One of the values here is a source language (opencl), two reflect specific vector
registers of one specific architecture (neon & sve), and I don't even know what
boolean is meant to be.  Maybe a type that you would associate with predicate
registers?  I think this table needs a lot more explanation.


This was something that Pedro pointed out and it was something that I 
hadn't thought of. The overall justification is that these 
types are semantically different from normal C arrays in several 
distinct ways. There is this table which explains the differences: 
https://clang.llvm.org/docs/LanguageExtensions.html#vector-operations 
The argument is that the semantics of different flavors are different 
enough that they need to be distinct.


I really do not know much of anything about OpenCL style vectors, I 
wouldn't at all be against folding that constant in because it is 
something that could be inferred from the source language. I left it in 
because I thought there might be cases where clang compiles 
some OpenCL code that references intrinsics written in another 
language like C/C++ that depend on the semantics of OpenCL vector types.


NEON, yeah I think we should drop that one. The current GCC semantics 
are really Intel's vector semantics. By changing it from "GCC semantics" 
to "Default encoding and semantics used by target architecture's vector 
registers" I think we eliminate the need for that.


You are correct boolean is for predicate register 

Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-18 Thread Cary Coutant via Dwarf-discuss
Added as Issue 230413.1:

https://dwarfstd.org/issues/230413.1.html

-cary

On Thu, Apr 13, 2023 at 11:57 AM Ben Woodard via Dwarf-discuss <
dwarf-discuss@lists.dwarfstd.org> wrote:

> Here is V3 of what was my vector types proposal.
>
> Changes since V2:
>
> We discussed this extensively in the DWARF for GPUs meeting. Cary
> originally wanted it to be a TAG rather than an attribute on an array and
> quite frankly, I don't care and so my default position is "What Cary wants,
> Cary gets". However, Pedro pointed out LLVM's different flavors of vector
> types, which bubbled up from the target architecture to the language
> source through intrinsics. Each of these different vector flavors has
> slightly different semantics. There is a really nice table on
> https://clang.llvm.org/docs/LanguageExtensions.html#id15 that summarizes
> the differences. This changed the course of discussion and it seemed that
> the group moved back to making it an attribute on an array. Since there are
> multiple flavors of vector, this led to adding a parameter to the attribute
> that defines the flavor and a table which defines what those constants mean.
>
> I brought up the point about matrix registers. Jakub is right: there are
> currently no compilers that make use of matrix types.
> Even AMD's GPUs, which do have intrinsics for matrix operations, end up
> implementing them with arrays of vector registers. This area is rapidly
> evolving due to its heavy use in HPC and AI. The challenge appears to be
> that the compilers haven't supported these operations yet. Cary came up with the
> idea of calling it a "tensor" rather than defining DW_AT_vector and then
> later adding DW_AT_matrix. So throughout the entire document, vector has been
> changed to tensor.
>
> Markus pointed out a few problems in my V2 version; I tried to address
> those. They were pretty minor and obvious. Markus, please verify that I did
> it to your satisfaction; otherwise, V4.
>
> What has not changed since V2:
>
> I didn't put back any changes that would allow these tensor types to
> appear on the DWARF stack. I feel that particular topic hasn't been settled
> yet. The general plan is I will work with Jakub and create some cases where
> a compiler could want to put these vector types on the DWARF stack. Tony
> Tye and the AMD team believe that the vector types do not need to be on the
> stack and believe that all the cases where the debuggers would want to
> access elements within the vector can be addressed with offsetting. IIUC a
> key point seems to be that they have never seen a case where an induction
> variable was embedded in a slot in a vector register, it always is a
> scalar. (I am not sure that I fully grokked their argument -- so please
> correct me) In the cases where it was, it could still be accessed as an
> implicit. Once I've got some examples of how a debugger might want to put
Re: [Dwarf-discuss] ISSUE: tensor types. V3

2023-04-13 Thread Ben Woodard via Dwarf-discuss

Here is V3 of what was my vector types proposal.

Changes since V2:

We discussed this extensively in the DWARF for GPUs meeting. Cary 
originally wanted it to be a TAG rather than an attribute on an array, 
and quite frankly I don't care, so my default position is "What Cary 
wants, Cary gets". However, Pedro pointed out LLVM's different flavors of 
vector types, which, like the vector types themselves, bubbled up from 
the target architecture into language source through intrinsics. Each of 
these vector flavors has slightly different semantics. There is a 
really nice table on 
https://clang.llvm.org/docs/LanguageExtensions.html#id15 that summarizes 
the differences. This changed the course of the discussion, and the group 
seemed to move back to making it an attribute on an array. Since 
there are multiple flavors of vector, this led to adding a parameter to 
the attribute that defines the flavor, and a table which defines what 
those constants mean.
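To make the shape of this concrete, a DIE for a 4-element int vector 
under this scheme might look something like the sketch below. The 
attribute name follows the proposal; the flavor constant is a 
hypothetical placeholder, since the table of constants is still being 
defined:

```
DW_TAG_array_type
    DW_AT_type        -> DW_TAG_base_type ("int")
    DW_AT_tensor      DW_TENSOR_gnu      ; hypothetical flavor constant
    DW_TAG_subrange_type
        DW_AT_upper_bound 3              ; elements 0..3
```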


I brought up the point about matrix registers. Jakub is right that there 
are currently no compilers which make use of matrix types. Even AMD's 
GPUs, which do have intrinsics for matrix operations, end up 
implementing them with arrays of vector registers. This area is rapidly 
evolving due to its heavy use in HPC and AI; the challenge appears to be 
that compilers haven't supported these operations yet. Cary came up with 
the idea of calling it a "tensor" rather than defining DW_AT_vector and 
then later adding DW_AT_matrix, so throughout the document, vector 
has been changed to tensor.


Markus pointed out a few problems in my V2 version, and I tried to 
address those. They were pretty minor and obvious. Markus, please verify 
that I did it to your satisfaction; otherwise, V4.


What has not changed since V2:

I didn't put back any changes that would allow these tensor types to 
appear on the DWARF stack. I feel that particular topic hasn't been 
settled yet. The general plan is that I will work with Jakub and create 
some cases where a compiler could want to put these vector types on the 
DWARF stack. Tony Tye and the AMD team believe that vector types do not 
need to be on the stack, and that all the cases where a debugger 
would want to access elements within a vector can be 
addressed with offsetting. IIUC, a key point seems to be that they have 
never seen a case where an induction variable was embedded in a slot in 
a vector register; it is always a scalar. (I am not sure that I fully 
grokked their argument, so please correct me.) In the cases where it 
was, it could still be accessed as an implicit. Once I've got some 
examples of how a debugger might want to put vector types on the DWARF 
stack, the AMD team can suggest alternative approaches. I said that I 
would make a V4 proposal if the group ultimately comes to a consensus 
that vector registers are in fact needed on the stack.
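As I understand the offsetting idea, accessing element i of a vector 
that lives in a SIMD register would use a location description for the 
register plus an offset into it, rather than pushing the whole vector 
value onto the DWARF stack. A rough sketch, using AMD's proposed offset 
operation (spelled DW_OP_LLVM_offset in the LLVM extension documents; 
the register number here is a made-up placeholder):

```
; location of a[2], where a is a 4 x int32 vector held in SIMD register v0
DW_OP_regx   <v0>        ; register location description
DW_OP_lit8               ; byte offset of element 2 (2 * 4 bytes)
DW_OP_LLVM_offset        ; AMD extension: offset within the location
```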


As for DWARF consumers, according to Cary, the reason why DWARF 
operations are currently limited to base types is to make it relatively 
easy on the consumers. If vector registers are in fact needed on the 
stack, Zoran is fairly certain that the changes he's preparing to 
enable gdb to support GPUs would also automatically handle vector 
registers on the stack. The problem for gdb would be client/server 
operation with gdbserver. The goal with gdbserver has been to keep its 
execution footprint very small; while having huge registers on the DWARF 
stack is reasonable when gdb is operating on the target or on the client 
side, on the server end it may pose a problem. John DelSignore said that 
TotalView has a similar concern because of its client/server 
architecture. He did point out, though, that DWARF expressions are ephemeral.


My impression was that Cary wanted to add this tensor types issue to the 
DWARF issue queue for discussion; once the question of vector registers 
on the stack is settled, this proposal can be amended or a new proposal 
addressing just that issue can be filed.



Tensor types

Some languages support vector data types, which are not possible to
represent today in standard DWARF.  A vector is an array of values.
These can be allocated to a SIMD vector register, if available, either
permanently or temporarily, and operations on vectors make use of SIMD
instructions, again if available.

For example, as an extension to C and C++, GCC supports defining
vector data types as described here:

https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html

In this C/C++ extension, vector types are similar to arrays, and you
can index and initialize them similarly, but they have some important
differences.  For example:

- C arrays automatically decay to pointers.  Vector types do not.

- Vector types can be passed by value to functions, and likewise
  functions can return vector types by value, neither of which can be
  done with C arrays.

- Vector types can be