Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-19 Thread Joseph Myers
On Fri, 19 Oct 2018, Richard Sandiford wrote:

> Joseph Myers  writes:
> > On Thu, 18 Oct 2018, Richard Sandiford wrote:
> >> - Type introspection for things like parsing format strings
> >> 
> >>   It sounded like the type descriptors would be fixed-sized types,
> >>   a bit like a C version of std::type_info.
> >
> > It wasn't clear if people might also want to e.g. extract a list of all 
> > members of a structure type from such an object (which of course could 
> > either involve variable-sized data, or fixed-size data pointing to arrays, 
> > or something else along those lines).
> 
> OK.  But wouldn't that basically be a tree structure?  Or a flexible
> array if flattened?  It doesn't sound like it would need changes to

I don't know (but you mention flexible arrays, and initializers for 
flexible array members, where the size of the object ends up bigger than 
sizeof its type, are also a GNU extension).  Raise that question in the 
WG14 discussion; I'm not the right person to answer questions around 
everyone else's ideas for extensions to the C type system.  As far as I'm 
concerned, this is all a preliminary exploration of ideas that might or 
might not end up involving type system additions, and WG14 is a much 
better place for that than separate single-implementation discussions - 
the point should be to float and explore possible ideas in this space, and 
their benefits and disadvantages, rather than pushing too early for one 
particular approach.  And given how much C++ tends to use class-based 
interfaces where C uses built-in types (complex numbers, decimal floating 
point, ...), I definitely do not want to start from an assumption that the 
right interface or language concepts for this in C++ should look like 
those in C.

For me, thinking of SVE types as something like VLAs but passed by value 
seems a more natural model in C than having them sizeless - but if they 
are sizeless, that pushes them closer to other ideas for types that might 
also be sizeless (and if those other use cases are indeed best specified 
using sizeless types, that provides more justification for using sizeless 
types for SVE).

> > Is there something wrong with a model in C++ where these types have
> > some fixed small sizeof (which carries through to sizeof for
> > containing types), but where different ABIs are used for them, and
> > where much the same raw memory operations on them are disallowed as
> > would be disallowed for a class-based implementation?  (Whether
> > implemented entirely in the compiler or through some combination of
> > the compiler and class implementations in a header - though with the
> > latter you might still need some new language feature, albeit only for
> > use within the header rather than more generally.)
> 
> Having different ABIs would defeat the primary purpose of the extension,
> which is to provide access to the single-vector SVE ABI types in C and C++.

My suggestion is that the ABI for C++ would be different from that 
resulting for a class-based implementation using purely standard C++ (the 
difference being to make it the same as the SVE C API - as with the 
decimal floating-point classes).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-19 Thread Richard Sandiford
Joseph Myers  writes:
> On Thu, 18 Oct 2018, Richard Sandiford wrote:
>> - Type introspection for things like parsing format strings
>> 
>>   It sounded like the type descriptors would be fixed-sized types,
>>   a bit like a C version of std::type_info.
>
> It wasn't clear if people might also want to e.g. extract a list of all 
> members of a structure type from such an object (which of course could 
> either involve variable-sized data, or fixed-size data pointing to arrays, 
> or something else along those lines).

OK.  But wouldn't that basically be a tree structure?  Or a flexible
array if flattened?  It doesn't sound like it would need changes to
the type system.  We can already describe this kind of thing with tree
types in GCC.  (The memory needed to represent the data could of course
be allocated using a single block of stack if that's what's wanted.)

>> So I didn't see anything there that was really related, or anything that
>> relied on sizeof being variable (which as I say seems to be a very high
>> hurdle for C++).
>
> The references you gave regarding the removal of one version of VLAs from 
> C++ didn't seem to make clear whether there were supposed to be general 
> issues with variable-size types fitting in the overall C++ object model, 
> or whether the concerns were more specific to things in the particular 
> proposal - but in either case, the SVE proposals would need to be compared 
> to the actual specific concerns.

But this is also one of my concerns about moving this discussion to the
WG14 list.  It doesn't seem to be publicly readable, and I only knew
about the bignum discussion because you gave me a direct link to the
first article in the thread.  I had to read the rest by wgetting the
individual messages.  So any objections raised there would presumably
be shrouded in mystery to most people, and wouldn't e.g. show up in a
web search.

If we move it to a different forum, I'd rather it be a public one that
would treat C and C++ equally.  But maybe such a thing doesn't exist. :-)

> Anyway, the correct model in C++ need not be the same as the correct model 
> in C.  For example, for decimal floating point, C++ chose a class-based 
> model whereas C chose _Decimal* keywords (and then there's some compiler 
> magic to use appropriate ABIs for std::decimal types, I think).
>
> If you were implementing the SVE API for C++ for non-SVE hardware, you 
> might have a class-based implementation where the class internally 
> contains a pointer to underlying storage and does allocation / 
> deallocation, for example - sizeof would give some fixed small size to the 
> objects with that class type, but e.g. copying them with memcpy would not 
> work correctly (and would be diagnosed with -Wclass-memaccess).

One important point here is that the SVE API isn't a new API whose
primary target happens to be SVE.  It's an API whose *only* target
is SVE.  Anyone wanting to write vector code that runs on non-SVE
hardware should use something that's designed to be cross-platform,
(e.g. P0214 or whatever).  They certainly shouldn't be using this.

Like other vector intrinsics, the SVE ACLE is supposed to be the last
line of defence before resorting to asm, and isn't designed to be any
more portable than asm would be.

> Is there something wrong with a model in C++ where these types have
> some fixed small sizeof (which carries through to sizeof for
> containing types), but where different ABIs are used for them, and
> where much the same raw memory operations on them are disallowed as
> would be disallowed for a class-based implementation?  (Whether
> implemented entirely in the compiler or through some combination of
> the compiler and class implementations in a header - though with the
> latter you might still need some new language feature, albeit only for
> use within the header rather than more generally.)

Having different ABIs would defeat the primary purpose of the extension,
which is to provide access to the single-vector SVE ABI types in C and C++.
We want types that in both C and C++ represent the contents of SVE vector
and predicate registers.  E.g.:

  svfloat64_t vector_sin(svbool_t pg, svfloat64_t vx)

has to map pg to a predicate register (P0), vx to a vector register (Z0)
and return the result in a vector register (Z0), in both C and C++.

The main objection to the details of the sizeless type proposal seems
to be that sizeof was too useful for us to make it invalid.  But if
sizeof has different values for C and C++, wouldn't that defeat the
point?  Users would be forced to use the SVE vector length functions
after all.  Also, for:

  void (*update_vector)(svfloat64_t *px);

how would the caller of update_vector know whether the target
function is using the C or the C++ representation of svfloat64_t
when accessing *px?

> Even if that model doesn't work for some reason, it doesn't mean the only 
> alternatives for C++ are something like VLAs or a new concept of sizeless 
> types for C++ 

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-18 Thread Uecker, Martin
Am Donnerstag, den 18.10.2018, 20:53 +0100 schrieb Richard Sandiford:
> "Uecker, Martin"  writes:
> > Hi Richard,
> > 
> > responding here to a couple of points.
> > 
> > For bignums and for a type-describing type 'type'
> > there were proposals (including from me) to implement
> > these as variable-sized types which have some restrictions,
> > i.e. they cannot be stored in a struct/union.
> 
> But do you mean variable-sized types in the sense that they
> are completely self-contained and don't refer to separate storage?
> I.e. the moral equivalent of:
> 
>   1: struct { int size; int contents[size]; };
> 
> rather than either:
> 
>   2: struct { int size; int *contents; };

I was thinking about 1 not 2. But I would leave this to the
implementation. If it can unwind the stack and free
all allocated storage automatically whenever this is
necessary, it could also allocate it somewhere else.
Not that this would offer any real advantage...

In both cases the only real problem is when storing
these in structs. So this should simply be forbidden
as it is for VLAs.

> or:
> 
>   3: union {
>        // embedded storage up to N bits (N constant)
>        // description of separately-allocated storage (for >N bits)
>      };

This is essentially an optimized version of 2.

> ?  If so, how would that work with the example I gave in the earlier
> message:
> 
> bignum x = ...;
> for (int i = 0; i < var; ++i)
>   x += x;
> 
> Each time the addition result grows beyond the original size of x,
> I assume you'd need to allocate a new stack bignum for the new size,
> which would result in a series of ever-increasing allocas.  Won't that
> soon blow the stack?

That depends on the final size of x relative to the size of the stack.
But this is no different to:

for (int i = 0; i < var; ++i)
{
   int x[i];
}

or to a recursive function. There are many ways to exhaust the
stack. It is also possible to exhaust other kinds of resources.
I don't really see the problem.

> Option 3 (as for LLVM's APInt) seems far less surprising, and can
> be made efficient for a chosen N.

Far less surprising in what sense?

>  What makes it difficult for C
> isn't the lack of general variable-length types but the lack of
> user-defined constructor, destructor, copy and move operations.

C already has generic variable-length types (VLAs). So yes, 
this is not what makes it difficult.
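
As a reminder of what C already provides here: the size of a VLA is fixed when its declaration is reached, and sizeof applied to it is evaluated at run time rather than being a constant. A minimal sketch, assuming a compiler with VLA support (required in C99, optional from C11 on); the function name vla_bytes is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the run-time size in bytes of an n-element int VLA.
   n must be positive; a VLA with a non-positive length is undefined. */
size_t vla_bytes(int n)
{
    assert(n > 0);
    int x[n];               /* storage is released when x goes out of scope */
    return sizeof x;        /* evaluated at run time, not a constant */
}
```

Calling vla_bytes(10) yields 10 * sizeof(int), whatever that is on the target.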

Yes, destructors would be needed to make it possible to store
these types in struct without memory leakage. But the destructors
don't need to be user defined, it could be a special purpose
destructor which only frees the special type.

But it doesn't really fit in the way C works and I kind of like
that C doesn't do anything behind my back.

Best,
Martin


> Thanks,
> Richard
> 
> > Most of the restrictions for these types would be the same
> > as proposed for your sizeless types. 
> > 
> > Because all these types fall into the same overall class
> > of types which do not have a size known at compile
> > time, I would suggest to add this concept to the standard
> > and then define your vector types as a subclass which
> > may have additional restrictions (no sizeof) instead
> > of adding a very specific concept which only works for
> > your proposal.
> > 
> > Best,
> > Martin
> > 
> > 
> > 
> > 
> > 
> > Am Donnerstag, den 18.10.2018, 13:47 +0100 schrieb Richard Sandiford:
> > > Joseph Myers  writes:
> > > > On Wed, 17 Oct 2018, Richard Sandiford wrote:
> > > > > Yeah, can't deny that if you look at it as a general-purpose extension.
> > > > > But that's not really what this is supposed to be.  It's fairly special
> > > > > purpose: there has to be some underlying variable-length/sizeless
> > > > > built-in type that you want to provide via a library.
> > > > > 
> > > > > What the extension allows is enough to support the intended use case,
> > > > > and it does that with no enforced overhead.
> > > > 
> > > > Part of my point is that there are various *other* possible cases of 
> > > > non-VLA-variable-size-type people have suggested in WG14 reflector 
> > > > discussions - so any set of concepts for such types ought to take into 
> > > > account more than just the SVE use case (even if other use cases need 
> > > > further concepts added on top of the ones needed for SVE).
> > > 
> > > [Answered this in the other thread -- sorry, took me a while to go
> > > through the full discussion.]
> > > 
> > > > > > Surely, the processor knows the size when it computes using these
> > > > > > types, so one could make it available using 'sizeof'.
> > > > > 
> > > > > The argument's similar here: we don't really need sizeof to be available
> > > > > for vector use because the library provides easy ways of getting
> > > > > vector-length-based constants.  Usually what you want to know is
> > > > > "how many elements of type X are there?", with bytes just being one
> > > > > of the available element sizes.
> > > > 
> > > > But if 

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-18 Thread Joseph Myers
On Thu, 18 Oct 2018, Uecker, Martin wrote:

> Most of the restrictions for these types would be the same
> as proposed for your sizeless types. 
> 
> Because all these types fall into the same overall class
> of types which do not have a size known at compile
> time, I would suggest to add this concept to the standard
> and then define your vector types as a subclass which
> may have additional restrictions (no sizeof) instead
> of adding a very specific concept which only works for
> your proposal.

And an underlying point here is:

Various people are exploring various ideas for C language and library 
features that might involve extending the kinds of types present in C.  
Maybe some of the ideas will turn out to be fundamentally flawed; maybe 
some will work with existing kinds of types rather than needing new kinds 
of variable-sized types.  But since all those ideas are currently under 
discussion in WG14, the SVE issues should be brought into the exploration 
process taking place there, with a view to getting a better-defined set of 
concepts for such types out of that process than from considering just one 
proposal for concepts for one set of requirements in the context of one 
implementation.

Once that discussion has resulted in a more generally applicable set of 
concepts, experience in implementing that set of concepts - likely various 
different people implementing them, in different C implementations, with a 
view to the different use cases they are exploring - could help inform any 
standardization of such features.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-18 Thread Joseph Myers
On Thu, 18 Oct 2018, Richard Sandiford wrote:

> - Type introspection for things like parsing format strings
> 
>   It sounded like the type descriptors would be fixed-sized types,
>   a bit like a C version of std::type_info.

It wasn't clear if people might also want to e.g. extract a list of all 
members of a structure type from such an object (which of course could 
either involve variable-sized data, or fixed-size data pointing to arrays, 
or something else along those lines).

> So I didn't see anything there that was really related, or anything that
> relied on sizeof being variable (which as I say seems to be a very high
> hurdle for C++).

The references you gave regarding the removal of one version of VLAs from 
C++ didn't seem to make clear whether there were supposed to be general 
issues with variable-size types fitting in the overall C++ object model, 
or whether the concerns were more specific to things in the particular 
proposal - but in either case, the SVE proposals would need to be compared 
to the actual specific concerns.

Anyway, the correct model in C++ need not be the same as the correct model 
in C.  For example, for decimal floating point, C++ chose a class-based 
model whereas C chose _Decimal* keywords (and then there's some compiler 
magic to use appropriate ABIs for std::decimal types, I think).

If you were implementing the SVE API for C++ for non-SVE hardware, you 
might have a class-based implementation where the class internally 
contains a pointer to underlying storage and does allocation / 
deallocation, for example - sizeof would give some fixed small size to the 
objects with that class type, but e.g. copying them with memcpy would not 
work correctly (and would be diagnosed with -Wclass-memaccess).  Is there 
something wrong with a model in C++ where these types have some fixed 
small sizeof (which carries through to sizeof for containing types), but 
where different ABIs are used for them, and where much the same raw memory 
operations on them are disallowed as would be disallowed for a class-based 
implementation?  (Whether implemented entirely in the compiler or through 
some combination of the compiler and class implementations in a header - 
though with the latter you might still need some new language feature, 
albeit only for use within the header rather than more generally.)

Even if that model doesn't work for some reason, it doesn't mean the only 
alternatives for C++ are something like VLAs or a new concept of sizeless 
types for C++ - but I don't have the C++ expertise to judge what other 
options for interfacing to SVE might fit best into the C++ language.
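
The memcpy hazard of a handle-plus-storage representation can be shown in plain C as a rough analogue of the class-based model (the model under discussion is C++; C is used here only for illustration). This is a sketch under assumptions, not a proposed implementation: struct emu_vec and the emu_vec_* functions are hypothetical names. sizeof the handle is small and fixed, but a raw memory copy duplicates only the pointer, so both handles then share the same lanes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical emulated vector: fixed-size handle, out-of-line lanes. */
struct emu_vec {
    size_t lanes;
    double *data;           /* owned, heap-allocated storage */
};

/* Create a vector with `lanes` elements, all set to `value`.
   On allocation failure the result has lanes == 0 and data == NULL. */
struct emu_vec emu_vec_splat(size_t lanes, double value)
{
    struct emu_vec v = { 0, NULL };
    v.data = malloc(lanes * sizeof *v.data);
    if (v.data != NULL) {
        v.lanes = lanes;
        for (size_t i = 0; i < lanes; ++i)
            v.data[i] = value;
    }
    return v;
}

/* Deep copy: the result owns separate storage. */
struct emu_vec emu_vec_copy(const struct emu_vec *src)
{
    assert(src != NULL && src->data != NULL);
    struct emu_vec v = emu_vec_splat(src->lanes, 0.0);
    if (v.data != NULL)
        memcpy(v.data, src->data, src->lanes * sizeof *v.data);
    return v;
}

/* Release the storage; C has no destructor to do this automatically. */
void emu_vec_free(struct emu_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->lanes = 0;
}
```

A memcpy of the struct itself gives a second handle whose data pointer equals the first one's, which is the kind of copy a C++ class would rule out and -Wclass-memaccess would flag; only emu_vec_copy produces an independent object.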

> I think it would look something like this (referring back to
> 
> *Object types are further partitioned into sized and
> sizeless; all basic and derived types defined in this standard are
> sized, but an implementation may provide additional sizeless types.*
> 
> in the RFC), not really in standardese yet:
> 
> Each implementation-specific sizeless type may have a set of
> implementation-specific "configurations".  The configuration of
> such a type may change in implementation-defined ways at any given
> sequence point.
> 
> The configuration of a sizeless structure is a tuple containing the
> configuration of each member.  Thus the configuration of a sizeless
> structure changes if and only if the configuration of one of its
> members changes.
> 
> The configuration of an object of sizeless type T is the configuration
> of T at the point that the object is created.
> 
> And then borrowing slightly from your 6.7.6.2#6 reference:
> 
> If an object of sizeless type T is accessed when T has a different
> configuration from the object, the behavior is undefined.
> 
> Is that the kind of thing you mean?

Yes.  But I wonder if it would be better to disallow such changing of 
configurations, so that all code in a program always uses the same 
configuration as far as the standard is concerned, so that there is indeed 
a size for a given vector type that's constant throughout the execution of 
a program (which would be used by calls to sizeof on such types), and so 
that communicating with a thread using a different configuration is just 
as much outside the scope of the defined language as processes using 
different ABIs communicating is today.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-18 Thread Richard Sandiford
"Uecker, Martin"  writes:
> Hi Richard,
>
> responding here to a couple of points.
>
> For bignums and for a type-describing type 'type'
> there were proposals (including from me) to implement
> these as variable-sized types which have some restrictions,
> i.e. they cannot be stored in a struct/union.

But do you mean variable-sized types in the sense that they
are completely self-contained and don't refer to separate storage?
I.e. the moral equivalent of:

  1: struct { int size; int contents[size]; };

rather than either:

  2: struct { int size; int *contents; };

or:

  3: union {
       // embedded storage up to N bits (N constant)
       // description of separately-allocated storage (for >N bits)
     };

?  If so, how would that work with the example I gave in the earlier
message:

bignum x = ...;
for (int i = 0; i < var; ++i)
  x += x;

Each time the addition result grows beyond the original size of x,
I assume you'd need to allocate a new stack bignum for the new size,
which would result in a series of ever-increasing allocas.  Won't that
soon blow the stack?

Option 3 (as for LLVM's APInt) seems far less surprising, and can
be made efficient for a chosen N.  What makes it difficult for C
isn't the lack of general variable-length types but the lack of
user-defined constructor, destructor, copy and move operations.
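
To make option 3 concrete, here is a rough C sketch, assuming N = 128 bits of embedded storage and 64-bit words. The names struct bignum, INLINE_WORDS and the bignum_* functions are hypothetical, and this is not how APInt itself is written. The type has a fixed sizeof regardless of the value it holds; the cost in C is that initialisation and release have to be called by hand.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_WORDS 2      /* hypothetical N = 128 bits of embedded storage */

struct bignum {
    size_t words;           /* number of 64-bit words in use */
    union {
        uint64_t inline_words[INLINE_WORDS];  /* used when words <= INLINE_WORDS */
        uint64_t *heap_words;                 /* used otherwise */
    } u;
};

/* Pointer to the active word storage, wherever it lives. */
uint64_t *bignum_words(struct bignum *b)
{
    return b->words <= INLINE_WORDS ? b->u.inline_words : b->u.heap_words;
}

/* Initialise to zero with room for `words` words.
   Returns 0 on success, -1 on allocation failure. */
int bignum_init(struct bignum *b, size_t words)
{
    assert(b != NULL && words > 0);
    if (words > INLINE_WORDS) {
        uint64_t *p = calloc(words, sizeof *p);
        if (p == NULL)
            return -1;
        b->u.heap_words = p;
    } else {
        memset(b->u.inline_words, 0, sizeof b->u.inline_words);
    }
    b->words = words;
    return 0;
}

/* C has no destructors, so the caller must release storage explicitly. */
void bignum_free(struct bignum *b)
{
    if (b->words > INLINE_WORDS)
        free(b->u.heap_words);
    b->words = 0;
}
```

Growing x in the loop above would then mean reallocating the heap words rather than the stack object, but every copy or assignment of a struct bignum would still need an explicit helper, which is the missing-language-support problem.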

Thanks,
Richard

> Most of the restrictions for these types would be the same
> as proposed for your sizeless types. 
>
> Because all these types fall into the same overall class
> of types which do not have a size known at compile
> time, I would suggest to add this concept to the standard
> and then define your vector types as a subclass which
> may have additional restrictions (no sizeof) instead
> of adding a very specific concept which only works for
> your proposal.
>
> Best,
> Martin
>
>
>
>
>
> Am Donnerstag, den 18.10.2018, 13:47 +0100 schrieb Richard Sandiford:
>> Joseph Myers  writes:
>> > On Wed, 17 Oct 2018, Richard Sandiford wrote:
>> > > Yeah, can't deny that if you look at it as a general-purpose extension.
>> > > But that's not really what this is supposed to be.  It's fairly special
>> > > purpose: there has to be some underlying variable-length/sizeless
>> > > built-in type that you want to provide via a library.
>> > > 
>> > > What the extension allows is enough to support the intended use case,
>> > > and it does that with no enforced overhead.
>> > 
>> > Part of my point is that there are various *other* possible cases of 
>> > non-VLA-variable-size-type people have suggested in WG14 reflector 
>> > discussions - so any set of concepts for such types ought to take into 
>> > account more than just the SVE use case (even if other use cases need 
>> > further concepts added on top of the ones needed for SVE).
>> 
>> [Answered this in the other thread -- sorry, took me a while to go
>> through the full discussion.]
>> 
>> > > > Surely, the processor knows the size when it computes using these
>> > > > types, so one could make it available using 'sizeof'.
>> > > 
>> > > The argument's similar here: we don't really need sizeof to be available
>> > > for vector use because the library provides easy ways of getting
>> > > vector-length-based constants.  Usually what you want to know is
>> > > "how many elements of type X are there?", with bytes just being one
>> > > of the available element sizes.
>> > 
>> > But if having sizeof available makes for a more natural language feature 
>> > (one where a few places referencing VLAs need to change to reference a 
>> > more general class of variable-size types, and a few constraints on VLAs 
>> > and variably modified types need to be relaxed to allow what you want with 
>> > these types), that may be a case for doing so, even if sizeof won't 
>> > generally be used.
>> 
>> I agree that might be all that's needed in C.  But since C++ doesn't
>> even have VLAs yet (and since something less ambitious than VLAs was
>> rejected) the situation is very different there.
>> 
>> I think we'd need a compelling reason to make sizeof variable in C++.
>> The fact that it isn't going to be generally used for SVE anyway
>> would undercut that.
>> 
>> > If the processor in fact knows the size, do you actually need to include 
>> > it in the object to be able to provide it when sizeof is called?  (With 
>> > undefined behavior still present if passing the object from a thread with 
>> > one value of sizeof for that type to a thread with a different value of 
>> > sizeof for that type, of course - the rule on VLA type compatibility would 
>> > still need to be extended to apply to sizes of these types, and those they 
>> > contain, recursively.)
>> 
>> No, if we go the undefined behaviour route, we wouldn't need to store it.
>> This was just to answer Martin's suggestion that we could make sizeof(x)
>> do the right thing for a sizeless object x by storing the size with x.
>> 
>> Thanks,
>> Richard


Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-18 Thread Uecker, Martin

Hi Richard,

responding here to a couple of points.

For bignums and for a type-describing type 'type'
there were proposals (including from me) to implement
these as variable-sized types which have some restrictions,
i.e. they cannot be stored in a struct/union.

Most of the restrictions for these types would be the same
as proposed for your sizeless types. 

Because all these types fall into the same overall class
of types which do not have a size known at compile
time, I would suggest to add this concept to the standard
and then define your vector types as a subclass which
may have additional restrictions (no sizeof) instead
of adding a very specific concept which only works for
your proposal.

Best,
Martin





Am Donnerstag, den 18.10.2018, 13:47 +0100 schrieb Richard Sandiford:
> Joseph Myers  writes:
> > On Wed, 17 Oct 2018, Richard Sandiford wrote:
> > > Yeah, can't deny that if you look at it as a general-purpose extension.
> > > But that's not really what this is supposed to be.  It's fairly special
> > > purpose: there has to be some underlying variable-length/sizeless
> > > built-in type that you want to provide via a library.
> > > 
> > > What the extension allows is enough to support the intended use case,
> > > and it does that with no enforced overhead.
> > 
> > Part of my point is that there are various *other* possible cases of 
> > non-VLA-variable-size-type people have suggested in WG14 reflector 
> > discussions - so any set of concepts for such types ought to take into 
> > account more than just the SVE use case (even if other use cases need 
> > further concepts added on top of the ones needed for SVE).
> 
> [Answered this in the other thread -- sorry, took me a while to go
> through the full discussion.]
> 
> > > > Surely, the processor knows the size when it computes using these
> > > > types, so one could make it available using 'sizeof'.
> > > 
> > > The argument's similar here: we don't really need sizeof to be available
> > > for vector use because the library provides easy ways of getting
> > > vector-length-based constants.  Usually what you want to know is
> > > "how many elements of type X are there?", with bytes just being one
> > > of the available element sizes.
> > 
> > But if having sizeof available makes for a more natural language feature 
> > (one where a few places referencing VLAs need to change to reference a 
> > more general class of variable-size types, and a few constraints on VLAs 
> > and variably modified types need to be relaxed to allow what you want with 
> > these types), that may be a case for doing so, even if sizeof won't 
> > generally be used.
> 
> I agree that might be all that's needed in C.  But since C++ doesn't
> even have VLAs yet (and since something less ambitious than VLAs was
> rejected) the situation is very different there.
> 
> I think we'd need a compelling reason to make sizeof variable in C++.
> The fact that it isn't going to be generally used for SVE anyway
> would undercut that.
> 
> > If the processor in fact knows the size, do you actually need to include 
> > it in the object to be able to provide it when sizeof is called?  (With 
> > undefined behavior still present if passing the object from a thread with 
> > one value of sizeof for that type to a thread with a different value of 
> > sizeof for that type, of course - the rule on VLA type compatibility would 
> > still need to be extended to apply to sizes of these types, and those they 
> > contain, recursively.)
> 
> No, if we go the undefined behaviour route, we wouldn't need to store it.
> This was just to answer Martin's suggestion that we could make sizeof(x)
> do the right thing for a sizeless object x by storing the size with x.
> 
> Thanks,
> Richard

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-18 Thread Richard Sandiford
Joseph Myers  writes:
> On Wed, 17 Oct 2018, Richard Sandiford wrote:
>> Yeah, can't deny that if you look at it as a general-purpose extension.
>> But that's not really what this is supposed to be.  It's fairly special
>> purpose: there has to be some underlying variable-length/sizeless
>> built-in type that you want to provide via a library.
>> 
>> What the extension allows is enough to support the intended use case,
>> and it does that with no enforced overhead.
>
> Part of my point is that there are various *other* possible cases of 
> non-VLA-variable-size-type people have suggested in WG14 reflector 
> discussions - so any set of concepts for such types ought to take into 
> account more than just the SVE use case (even if other use cases need 
> further concepts added on top of the ones needed for SVE).

[Answered this in the other thread -- sorry, took me a while to go
through the full discussion.]

>> > Surely, the processor knows the size when it computes using these
>> > types, so one could make it available using 'sizeof'.
>> 
>> The argument's similar here: we don't really need sizeof to be available
>> for vector use because the library provides easy ways of getting
>> vector-length-based constants.  Usually what you want to know is
>> "how many elements of type X are there?", with bytes just being one
>> of the available element sizes.
>
> But if having sizeof available makes for a more natural language feature 
> (one where a few places referencing VLAs need to change to reference a 
> more general class of variable-size types, and a few constraints on VLAs 
> and variably modified types need to be relaxed to allow what you want with 
> these types), that may be a case for doing so, even if sizeof won't 
> generally be used.

I agree that might be all that's needed in C.  But since C++ doesn't
even have VLAs yet (and since something less ambitious than VLAs was
rejected) the situation is very different there.

I think we'd need a compelling reason to make sizeof variable in C++.
The fact that it isn't going to be generally used for SVE anyway
would undercut that.

> If the processor in fact knows the size, do you actually need to include 
> it in the object to be able to provide it when sizeof is called?  (With 
> undefined behavior still present if passing the object from a thread with 
> one value of sizeof for that type to a thread with a different value of 
> sizeof for that type, of course - the rule on VLA type compatibility would 
> still need to be extended to apply to sizes of these types, and those they 
> contain, recursively.)

No, if we go the undefined behaviour route, we wouldn't need to store it.
This was just to answer Martin's suggestion that we could make sizeof(x)
do the right thing for a sizeless object x by storing the size with x.

Thanks,
Richard


Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-18 Thread Richard Sandiford
Joseph Myers  writes:
> On Wed, 17 Oct 2018, Richard Sandiford wrote:
>
>> > But as shown in the related discussions, there are other possible features 
>> > that might also involve non-VLA types whose size is not a compile-time 
>> > constant.  And so it's necessary to work with the people interested in 
>> > those features in order to clarify what the underlying concepts ought to 
>> > look like to support different such features.
>> 
>> Could you give pointers to the specific proposals/papers you mean?
>
> They're generally reflector discussions rather than written up as papers, 
> exploring the space of problems and solutions in various areas (including 
> bignums and runtime introspection of types).  I think the first message in 
> those discussions is number 15529 
>  and then relevant 
> discussions continue for much of the next 200 messages or so.

OK, thanks.  I've read from there to the latest message at the time
of writing (15720).  There seemed to be various ideas:

- a new int128_t, which started the discussion off.

- support for parameterised fixed-size integers like _Int(40), which
  seemed to be a C version of C++ template and wouldn't need
  variable-length types.

- bignums that extend as necessary.  On that I agree with what you said in:


A bignum type, in the sense of one that grows its storage if you
store a too-big number in it (as opposed to fixed-width int where
you can specify an arbitrary integer constant expression for N),
cannot meet other requirements for C integer types such as being
directly represented in binary - it has to, effectively, be a fixed
size but contain a pointer to allocated storage (and then there are
considerations of how such a type should handle errors for
allocation failure).

  and Hans Boehm said in:


2) Provide an integral type that is reasonably efficient for small
integers, but gracefully overflows to something along the lines of
(1). A common way to do that in other languages is to represent
e.g. 63-bit integers directly by adding a zero bit on the right.
On overflow a more complex result is represented by e.g. a 64-bit
aligned pointer with the low bit set to one. That way integer
addition is just an add instruction followed by an overflow check in
the normal case. Probably a better way to do integer arithmetic in
many, maybe even most, cases. Especially since such integers need to
be usable as array elements, I don't see how to avoid memory
allocation under the covers, along the slow path.

  This IIRC is how LLVM's APInt is implemented.  It doesn't need
  variable-length types, and although it would need some kind of
  memory management support for C, it doesn't need any language
  changes at all for C++.

  It's also similar to what GCC does with auto_vec and LLVM does
  with SmallVector: the types have embedded room for common cases and
  fall back to separately-allocated storage if the contents get too big.
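
  As a rough illustration of the fast path Hans describes (not code from
  the reflector thread), a tagged word could look like the sketch below.
  The names tagged_int, tagged_from_int, tagged_to_int and tagged_add are
  hypothetical, the slow path that allocates storage is deliberately left
  out, and __builtin_add_overflow/__builtin_mul_overflow are assumed to be
  available as GCC/Clang builtins:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tagged word.  Low bit 0: a 63-bit integer stored directly,
   shifted left by one.  Low bit 1 would mean "pointer to separately
   allocated storage"; that slow path is not implemented in this sketch.  */
typedef int64_t tagged_int;

static bool
tagged_is_small (tagged_int t)
{
  return (t & 1) == 0;
}

/* Encode V.  Returns false if V does not fit in 63 bits.  */
static bool
tagged_from_int (int64_t v, tagged_int *out)
{
  /* Doubling V appends the zero tag bit; overflow means V needs 64 bits.  */
  return !__builtin_mul_overflow (v, (int64_t) 2, out);
}

static int64_t
tagged_to_int (tagged_int t)
{
  assert (tagged_is_small (t));
  return t / 2;  /* Exact, because the low bit is zero.  */
}

/* Fast path: 2*a + 2*b == 2*(a + b), so the tag bit stays zero and the
   whole operation is one add plus an overflow check.  Returns false on
   overflow, which is where a real implementation would fall back to
   allocated storage.  */
static bool
tagged_add (tagged_int a, tagged_int b, tagged_int *sum)
{
  assert (tagged_is_small (a) && tagged_is_small (b));
  return !__builtin_add_overflow (a, b, sum);
}
```

  The point of the sketch is only that the common case needs no
  variable-length type: the object is a fixed 64 bits either way.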

  There was talk about having it as a true variable-length type in:


(2) is difficult because of the requirements for memory management and
the necessity to deal with allocation failures.

For avoiding integer overflow vulnerabilities, there is a variant of (2)
which is not possible to implement in a library, where expressions are
evaluated with a sufficient number of bits to obtain the mathematically
correct result.  GNAT has implemented something in this direction
(MINIMIZED and ELIMINATED):




I think that for expressions which do not involve shifts by
non-constants, it should be possible to determine the required storage
at compile time, so it would avoid the memory allocation issue.  Unlike
Ada, C doesn't have a power operator, so the storage requirements would
grow with the size of the expression (still under the assumption that
left shifts are excluded).

  But AIUI that was intended to be more special purpose, for
  intermediate results while evaluating an expression.  It solves
  the memory allocation issue because the (stack) memory used for
  evaluating the expression could be recovered after evaluation is
  complete.

  This approach wouldn't work if it was extended to an assignable bignum
  object type.  E.g. prohibiting left shifts wouldn't then help since:

 bignum x = ...;
 x <<= var; // invalid

  would be equivalent to:

 bignum x = ...;
 for (int i = 0; i < var; ++i)
   x += x; // valid

  Thus it would be easy to create what are effectively allocas of O(1<> ...and here is that any size changes come only from changes in the
>> implementation-defined built-in sizeless 

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-17 Thread Joseph Myers
On Wed, 17 Oct 2018, Richard Sandiford wrote:

> Yeah, can't deny that if you look at it as a general-purpose extension.
> But that's not really what this is supposed to be.  It's fairly special
> purpose: there has to be some underlying variable-length/sizeless
> built-in type that you want to provide via a library.
> 
> What the extension allows is enough to support the intended use case,
> and it does that with no enforced overhead.

Part of my point is that there are various *other* possible cases of 
non-VLA-variable-size-type people have suggested in WG14 reflector 
discussions - so any set of concepts for such types ought to take into 
account more than just the SVE use case (even if other use cases need 
further concepts added on top of the ones needed for SVE).

> > Surely, the processor knows the size when it computes using these
> > types, so one could make it available using 'sizeof'.
> 
> The argument's similar here: we don't really need sizeof to be available
> for vector use because the library provides easy ways of getting
> vector-length-based constants.  Usually what you want to know is
> "how many elements of type X are there?", with bytes just being one
> of the available element sizes.

But if having sizeof available makes for a more natural language feature 
(one where a few places referencing VLAs need to change to reference a 
more general class of variable-size types, and a few constraints on VLAs 
and variably modified types need to be relaxed to allow what you want with 
these types), that may be a case for doing so, even if sizeof won't 
generally be used.

If the processor in fact knows the size, do you actually need to include 
it in the object to be able to provide it when sizeof is called?  (With 
undefined behavior still present if passing the object from a thread with 
one value of sizeof for that type to a thread with a different value of 
sizeof for that type, of course - the rule on VLA type compatibility would 
still need to be extended to apply to sizes of these types, and those they 
contain, recursively.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-17 Thread Richard Sandiford
"Uecker, Martin"  writes:
> Am Mittwoch, den 17.10.2018, 13:30 +0100 schrieb Richard Sandiford:
>> [ Sorry that there were so many typos in my last reply, will try to
>> do better
>>   this time... ]
>
> ...
>> I think the key difference between sizeless types and full C99-style
>> VLAs is that the size and layout of sizeless types never matters for
>> semantic analysis.  Rather than the sizes of types becoming variable
>> (and the offsets of members becoming variable, and constexprs
>> becoming
>> variable-sized, etc.), we simply don't make those concepts available
>> for sizeless types.
>> 
>> So nothing at the language level becomes variable that was constant
>> before.
>> All that happens is that some things become invalid for sizeless
>> types
>> that would be valid for sized ones.
>> 

>> The idea was really for the language to provide a framework for
>> implementations to define implementation-specific types with
>> implementation-specific rules while disturbing the language itself
>> as little as possible.
>
> I guess this makes it much easier for C++, but also much less useful.

Yeah, can't deny that if you look at it as a general-purpose extension.
But that's not really what this is supposed to be.  It's fairly special
purpose: there has to be some underlying variable-length/sizeless
built-in type that you want to provide via a library.

What the extension allows is enough to support the intended use case,
and it does that with no enforced overhead.

The examples with one thread accessing a vector from a different thread
aren't interesting in practice; any sharing or caching should happen via
normal arrays or malloced memory instead.  These corner cases are just
something that needs to be addressed once you allow pointers to things.

> Surely, the processor knows the size when it computes using these
> types, so one could make it available using 'sizeof'.

The argument's similar here: we don't really need sizeof to be available
for vector use because the library provides easy ways of getting
vector-length-based constants.  Usually what you want to know is
"how many elements of type X are there?", with bytes just being one
of the available element sizes.
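
To make the "how many elements?" question concrete, here is a portable
sketch rather than real SVE code.  The helper name elements_per_vector is
hypothetical, and the check that the length is a multiple of 16 bytes is
my assumption about SVE rather than something stated in this thread.  On
an SVE target I believe the ACLE exposes the same information through
functions such as svcntb() and svcntw() in <arm_sve.h>:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of ELEM_BYTES-sized elements in a vector
   that is VL_BYTES bytes long.  The range 16..256 bytes corresponds to
   the 128..2048 bit range quoted for SVE; the multiple-of-16 check is
   an assumption of this sketch.  */
static size_t
elements_per_vector (size_t vl_bytes, size_t elem_bytes)
{
  assert (vl_bytes >= 16 && vl_bytes <= 256 && vl_bytes % 16 == 0);
  assert (elem_bytes == 1 || elem_bytes == 2
          || elem_bytes == 4 || elem_bytes == 8);
  return vl_bytes / elem_bytes;
}
```

With a 256-bit vector, elements_per_vector (32, 4) gives 8 32-bit
elements, and asking for 1-byte elements gives the size in bytes.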

And of course, making sizeof variable would be a can of worms in C++...
(The rejected ARB proposal didn't try to do that.)

> If a value of the type is stored in addressable memory (on the stack),
> can't we also store the size so that it is available for other threads? 

But the problem is that once the size becomes a part of the object,
it becomes difficult to get rid of it again for the intended use case.
E.g. say a vector is passed on the stack due to running out of registers.
Does the caller provide both the size and the contents, or just the
contents?  In the latter case it would be the callee's responsibility
to "fill in" the missing size, but at what point should it compute the
size?  In the former case, would a callee-copies ABI require the callee
to copy the contents with the size given in the argument or the value
that the callee would use if it were creating an object from scratch?

These aren't insurmountable problems.  They just seem like an unnecessary
complication when the only reason for doing them is to support sizeof,
which isn't something that the use case needs.

Also, storing the size with the object would make the size field
*become* part of the size of the object, so sizeof (vector) would give
you something bigger than the size of the vector itself.  It would also
open up oddities like:

  sizeof (x) != sizeof (typeof (x))

being true in some cases.  I.e. even if sizeof (x) correctly reported
the size that an object x actually has, it isn't necessarily the size
that a new object of that type would have, so users would have to be
very careful about what they ask.  I think both these things would just
open the door to more confusion.
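
A sketch of the first point, using ordinary fixed-size structs as a
stand-in for the vector types.  The names raw_vector, sized_vector and
size_field_overhead, and the 32-byte payload, are all hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define PAYLOAD_BYTES 32  /* Stand-in for a 256-bit vector.  */

/* What the use case wants: nothing but the register contents.  */
struct raw_vector
{
  unsigned char bytes[PAYLOAD_BYTES];
};

/* What "store the size with the object" would imply.  */
struct sized_vector
{
  size_t size;  /* Size recorded when the object was created.  */
  unsigned char bytes[PAYLOAD_BYTES];
};

/* Extra bytes that sizeof would report beyond the vector contents.  */
static size_t
size_field_overhead (void)
{
  return sizeof (struct sized_vector) - sizeof (struct raw_vector);
}
```

Here sizeof (struct sized_vector) is necessarily larger than the 32
bytes of vector data, which is the mismatch described above.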

> Just making it undefined to access a variable with the wrong size for the
> current thread seems rather fragile to me.

In many ways it seems similar to memory management.  It's the
programmer's responsibility to ensure that they don't access vectors
with the "wrong" size in the same way that it's their responsibility
not to dereference freed memory or access beyond the amount of memory
allocated.

And as I mentioned above, no one should be doing that anyway :-)
These types shouldn't live longer than a vector loop, with that loop
potentially calling functions that are easily identified as "vector
functions".

Thanks,
Richard


Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-17 Thread Joseph Myers
On Wed, 17 Oct 2018, Richard Sandiford wrote:

> > But as shown in the related discussions, there are other possible features 
> > that might also involve non-VLA types whose size is not a compile-time 
> > constant.  And so it's necessary to work with the people interested in 
> > those features in order to clarify what the underlying concepts ought to 
> > look like to support different such features.
> 
> Could you give pointers to the specific proposals/papers you mean?

They're generally reflector discussions rather than written up as papers, 
exploring the space of problems and solutions in various areas (including 
bignums and runtime introspection of types).  I think the first message in 
those discussions is number 15529 
 and then relevant 
discussions continue for much of the next 200 messages or so.

> ...and here is that any size changes come only from changes in the
> implementation-defined built-in sizeless types.  The user can't define

But then I think you still need to define in the standard edits something 
of what the type-compatibility rules are.

> > Can these types be passed to variadic functions and named in va_arg?  
> > Again, I don't see anything to say they can't.
> 
> Yes, this is allowed (and covered by the tests FWIW).

How does that work with not knowing the size even at runtime?

At least, this seems like another place where there would be special type 
compatibility considerations that need to be applied between caller and 
callee.

> Except for bit-fields *and sizeless structures*, objects are
> composed of contiguous sequences of one or more bytes, the number,
> order, and encoding of which are either explicitly specified or
> implementation-defined.
> 
> TBH the possibility of a discontiguous representation was an early idea
> that we've never actually used so far, so if that's a problem, we could
> probably drop it.  It just seemed to be a natural extension of the
> principle that the layout is completely implementation-defined.

If you have discontiguous representations, I don't see how "->" on 
structure pointers (or indeed unary "*") is supposed to work; disallowing 
discontiguous representations would seem to fit a lot more naturally with 
the C object model.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-17 Thread Uecker, Martin


Am Mittwoch, den 17.10.2018, 13:30 +0100 schrieb Richard Sandiford:
> [ Sorry that there were so many typos in my last reply, will try to
> do better
>   this time... ]

...
> I think the key difference between sizeless types and full C99-style
> VLAs is that the size and layout of sizeless types never matters for
> semantic analysis.  Rather than the sizes of types becoming variable
> (and the offsets of members becoming variable, and constexprs
> becoming
> variable-sized, etc.), we simply don't make those concepts available
> for sizeless types.
> 
> So nothing at the language level becomes variable that was constant
> before.
> All that happens is that some things become invalid for sizeless
> types
> that would be valid for sized ones.
> 
> The idea was really for the language to provide a framework for
> implementations to define implementation-specific types with
> implementation-specific rules while disturbing the language itself
> as little as possible.

I guess this makes it much easier for C++, but also much less useful.
Surely, the processor knows the size when it computes using these
types, so one could make it available using 'sizeof'. If a value
of the type is stored in addressable memory (on the stack), can't
we also store the size so that it is available for other threads? 
Just making it undefined to access a variable with the wrong size
for the current thread seems rather fragile to me. 

Best,
Martin







Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-17 Thread Richard Sandiford
[ Sorry that there were so many typos in my last reply, will try to do better
  this time... ]

Joseph Myers  writes:
> On Tue, 16 Oct 2018, Richard Sandiford wrote:
>> The patches therefore add a new "__sizeless_struct" keyword to denote
>> structures that are sizeless rather than sized.  Unlike normal
>> structures, these structures can have members of sizeless type in
>> addition to members of sized type.  On the other hand, they have all
>> the same limitations as other sizeless types (described in earlier
>> sections).
>
> I don't see anything here disallowing offsetof on such structures.

I didn't think this needed to be done explicitly since:

offsetof(type, member-designator)

which expands to an integer constant expression that has type size_t,
the value of which is the offset in bytes, to the structure
member (designated by member-designator), from the beginning of its
structure (designated by type). The type and member designator shall be
such that given

static type t;

then the expression &(t.member-designator) evaluates to an address
constant. (If the specified member is a bit-field, the behavior is
undefined.)

implicitly rejects sizeless types on the basis that "static type t;"
would be invalid.  I think that's the same way that it rejects
incomplete structure types.

But yeah, it looks like I forgot to handle this in GCC. :-(
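
For comparison, this is how the existing wording already plays out for
ordinary complete and incomplete structure types.  The type and function
names below are made up for the example:

```c
#include <assert.h>
#include <stddef.h>

struct incomplete;  /* Declared but never defined in this file.  */

struct complete
{
  char tag;
  int value;
};

static size_t
value_offset (void)
{
  /* Fine: "static struct complete t;" would be a valid declaration,
     so &(t.value) is an address constant.  */
  return offsetof (struct complete, value);
}

/* offsetof (struct incomplete, value) is rejected, because
   "static struct incomplete t;" is not a valid declaration.  The
   argument above is that sizeless types fall out the same way.  */
```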

> On Tue, 16 Oct 2018, Richard Sandiford wrote:
>> > as Joseph pointed out, there are some related discussions
>> > on the WG14 reflector. How about moving the discussion
>> > there?
>> 
>> The idea was to get a feel for what would be acceptable to GCC
>> maintainers.  When Arm presented an extension of P0214 to support SVE
>> at the last C++ committee meeting, using this sizeless type extension
>> as a possible way of providing the underlying vector types, the feeling
>> seemed to be that it wouldn't be considered unless it had already been
>> proven in compilers.
>
> But as shown in the related discussions, there are other possible features 
> that might also involve non-VLA types whose size is not a compile-time 
> constant.  And so it's necessary to work with the people interested in 
> those features in order to clarify what the underlying concepts ought to 
> look like to support different such features.

Could you give pointers to the specific proposals/papers you mean?

>> I think it is for some people though.  If the vectors don't decay to
>> pointers, they're more akin to a VLA wrapped in a structure rather than
>> a stand-alone VLA.  There is a GNU extension for that, e.g.:
>> 
>>   int
>>   f (int n)
>>   {
>>     struct s {
>>       int x[n];
>>     } foo;
>>     return sizeof (foo.x);
>>   }
>> 
>> But even though clang supports VLAs (of course), it rejects the
>> above with:
>> 
>>   error: fields must have a constant size: 'variable length array in 
>> structure' extension will never be supported
>> 
>> This gives a strong impression that wrapping a VLA type like this
>> is a bridge too far for some :-)  The message makes it clear that's
>> a case of "don't even bother asking".
>
> What are the clang concerns about VLAs in structs that are the reason for 
> not supporting them?

The user manual says:

-  clang does not support the gcc extension that allows variable-length
   arrays in structures. This is for a few reasons: one, it is tricky to
   implement, two, the extension is completely undocumented, and three,
   the extension appears to be rarely used. Note that clang *does*
   support flexible array members (arrays with a zero or unspecified
   size at the end of a structure).

So I guess defining it would remove the second objection.
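
For reference, the flexible array members that the manual contrasts with
VLAs in structs are standard C99 and look like this.  The struct and
function names are only for illustration:

```c
#include <assert.h>
#include <stdlib.h>

struct int_buffer
{
  size_t count;
  int items[];  /* Flexible array member: must be the last member.  */
};

/* Allocate a buffer with room for COUNT items and fill it with
   0, 1, ..., COUNT - 1.  Returns NULL if allocation fails.  */
static struct int_buffer *
int_buffer_new (size_t count)
{
  struct int_buffer *buf
    = malloc (sizeof *buf + count * sizeof buf->items[0]);
  if (buf == NULL)
    return NULL;
  buf->count = count;
  for (size_t i = 0; i < count; ++i)
    buf->items[i] = (int) i;
  return buf;
}
```

Note that sizeof (struct int_buffer) stays a compile-time constant here:
it does not include the items, so the frontend never has to deal with a
variable-sized structure type.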

> How do the sizeless structs with sizeless members in your proposal
> avoid those concerns about the definition of VLAs in structs?

The key difference is that the size, offset and layout don't have to be
known to the frontend and available during semantic analysis (unlike for
VLAs in structs).  In the clang implementation of sizeless types those
details only start to matter when translating clang ASTs into LLVM IR.
(With GCC it's a bit different, since TYPE_SIZE is set as soon as the
type definition is complete, even though for SVE TYPE_SIZE should only
matter in the mid and backend.)

>> The problem isn't so much that the size is only known at runtime,
>> but that the size isn't necessarily invariant, and the size of an
>> object doesn't carry the size information with it.
>> 
>> This means you can't tell what size a given object is, even at runtime.
>
> How then is e.g. passing a pointer to such a struct (containing such 
> unknown-size members) to another function supposed to work?  Or is there 
> something in your proposed standard text edits that would disallow passing 
> such a pointer, or disallow using "->" with it to access members?

The idea here...

>> All you can tell is what size the object would be if you created it
>> from scratch.  E.g.:
>> 
>>   

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-16 Thread Joseph Myers
On Tue, 16 Oct 2018, Richard Sandiford wrote:

> > as Joseph pointed out, there are some related discussions
> > on the WG14 reflector. How about moving the discussion
> > there?
> 
> The idea was to get a feel for what would be acceptable to GCC
> maintainers.  When Arm presented an extension of P0214 to support SVE
> at the last C++ committee meeting, using this sizeless type extension
> as a possible way of providing the underlying vector types, the feeling
> seemed to be that it wouldn't be considered unless it had already been
> proven in compilers.

But as shown in the related discussions, there are other possible features 
that might also involve non-VLA types whose size is not a compile-time 
constant.  And so it's necessary to work with the people interested in 
those features in order to clarify what the underlying concepts ought to 
look like to support different such features.

> I think it is for some people though.  If the vectors don't decay to
> pointers, they're more akin to a VLA wrapped in a structure rather than
> a stand-alone VLA.  There is a GNU extension for that, e.g.:
> 
>   int
>   f (int n)
>   {
>     struct s {
>       int x[n];
>     } foo;
>     return sizeof (foo.x);
>   }
> 
> But even though clang supports VLAs (of course), it rejects the
> above with:
> 
>   error: fields must have a constant size: 'variable length array in 
> structure' extension will never be supported
> 
> This gives a strong impression that wrapping a VLA type like this
> is a bridge too far for some :-)  The message makes it clear that's
> a case of "don't even bother asking".

What are the clang concerns about VLAs in structs that are the reason for 
not supporting them?  How do the sizeless structs with sizeless members in 
your proposal avoid those concerns about the definition of VLAs in 
structs?

> The problem isn't so much that the size is only known at runtime,
> but that the size isn't necessarily invariant, and the size of an
> object doesn't carry the size information with it.
> 
> This means you can't tell what size a given object is, even at runtime.

How then is e.g. passing a pointer to such a struct (containing such 
unknown-size members) to another function supposed to work?  Or is there 
something in your proposed standard text edits that would disallow passing 
such a pointer, or disallow using "->" with it to access members?

> All you can tell is what size the object would be if you created it
> from scratch.  E.g.:
> 
>   svint8_t *ptr;  // pointer to variable-length vector type
> 
>   void thread1 (void)
>   {
>     svint8_t local;
>     *ptr = 
>     ...run for a long time...
>   }
> 
>   void thread2 (void)
>   {
>     ... sizeof (*ptr); ...;
>   }
> 
> If thread1 and thread2 have different vector lengths, thread2 has no way
> of knowing what size *ptr is.
> 
> Of course, thread2 can't validly use *ptr if it has wider vectors than
> thread1, but if we resort to saying "undefined behavior" for the above,
> then it becomes difficult to define when the size actually is defined.

What in your standard text edits serves to make that undefined?  
Generally, what in those edits serves to say when conversions involving 
such types, or pointers thereto, or accesses through compatible types in 
different places, are or are not defined?

In standard C, for example, we have for VLAs 6.7.6.2#6, "If the two array 
types are used in a context which requires them to be compatible, it is 
undefined behavior if the two size specifiers evaluate to unequal 
values.".  What is the analogue of this for sizeless types?  Since unlike 
VLAs you're allowing these types, and sizeless structs containing them, to 
be passed by value, assigned, etc., you need something like that to 
determine whether assignment, conditional expression, function argument 
passing, function return, access via a pointer, etc., are valid.
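
As a concrete instance of the VLA rule (my own example, not text from the
standard), consider a pointer to a variably modified row type.  The
function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* MATRIX points to rows of the variably modified type int[cols].  */
static int
sum_rows (size_t rows, size_t cols, int (*matrix)[cols])
{
  int total = 0;
  for (size_t r = 0; r < rows; ++r)
    for (size_t c = 0; c < cols; ++c)
      total += matrix[r][c];
  return total;
}

static int
sum_matching_rows (void)
{
  int data[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };
  /* Defined: the argument has type int (*)[3] and the parameter's row
     type int[cols] is evaluated with cols == 3, so the sizes agree.  */
  return sum_rows (2, 3, data);
  /* sum_rows (2, 4, data) would use int[3] and int[4] in a context that
     requires compatible types, which 6.7.6.2#6 makes undefined.  */
}
```

The question for sizeless types is what plays the role of "cols" in such
a rule, given that there is no size expression to compare.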

Can these types be used with _Atomic?  I don't see anything to say they 
can't.

Can these types be passed to variadic functions and named in va_arg?  
Again, I don't see anything to say they can't.

Can you have file-scope, and so static-storage-duration, compound literals 
with these types?  You're allowing compound literals with these types, and 
what you have disallowing objects with these types with non-automatic 
storage duration seems to be specific to the case of "an identifier for an 
object".

I don't see any change proposed to 6.2.6.1#2 corresponding to what you say 
elsewhere about discontiguous representations of sizeless structures.

> But do you have any feel for whether this would ever be acceptable
> in C++?  One of the main requirements for this was that it needs
> to work in both C and C++, with the same ABI representation.
> I thought VLAs were added to an early draft of C++14 and then
> removed before it was published.  They weren't added back for C++17,
> and I'd seen other proposals about classes having a "sizeof field"
> instead (i.e. the type would carry the size 

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-16 Thread Richard Sandiford
Hi Martin,

Thanks for the reply.

"Uecker, Martin"  writes:
> Hi Richard,
>
> as Joseph pointed out, there are some related discussions
> on the WG14 reflector. How about moving the discussion
> there?

The idea was to get a feel for what would be acceptable to GCC
maintainers.  When Arm presented an extension of P0214 to support SVE
at the last C++ committee meeting, using this sizeless type extension
as a possible way of providing the underlying vector types, the feeling
seemed to be that it wouldn't be considered unless it had already been
proven in compilers.

> I find your approach very interesting and that it already
> comes with an implementation is of course very useful
>
> But I don't really understand the reasons why this is not based
> on (2). These types are not "sizeless" at all, their size
> just isn't known at compile time. So this seems to me
> a misnomer.
>
> In fact, to me these types *do* in fact seem very similar
> to VLAs as VLAs are also complete types which also do not
> have a known size at compile time.
>
> That arrays decay to pointers doesn't mean that we
> couldn't have similar vectors types which don't decay.
> This is hardly a fundamental problem.

I think it is for some people though.  If the vectors don't decay to
pointers, they're more akin to a VLA wrapped in a structure rather than
a stand-alone VLA.  There is a GNU extension for that, e.g.:

  int
  f (int n)
  {
    struct s {
      int x[n];
    } foo;
    return sizeof (foo.x);
  }

But even though clang supports VLAs (of course), it rejects the
above with:

  error: fields must have a constant size: 'variable length array in structure'
  extension will never be supported

This gives a strong impression that wrapping a VLA type like this
is a bridge too far for some :-)  The message makes it clear that's
a case of "don't even bother asking".

The vector tuple types would be very similar to this if modelled as VLAs.

> I also don't understand the problem about the array
> size. If I understand this correctly, the size is somehow
> known at run-time and implicitly passed along with the
> values. So these new types do not need to have a
> size expression (as in your proposal). 

The problem isn't so much that the size is only known at runtime,
but that the size isn't necessarily invariant, and the size of an
object doesn't carry the size information with it.

This means you can't tell what size a given object is, even at runtime.
All you can tell is what size the object would be if you created it
from scratch.  E.g.:

  svint8_t *ptr;  // pointer to variable-length vector type

  void thread1 (void)
  {
    svint8_t local;
    *ptr = 
    ...run for a long time...
  }

  void thread2 (void)
  {
    ... sizeof (*ptr); ...;
  }

If thread1 and thread2 have different vector lengths, thread2 has no way
of knowing what size *ptr is.

Of course, thread2 can't validly use *ptr if it has wider vectors than
thread1, but if we resort to saying "undefined behavior" for the above,
then it becomes difficult to define when the size actually is defined.
It's simpler not to make it measurable via sizeof at all.  (And that's
a much less invasive change to the language.)

> Assignment, the possibility to return the type from
> functions, and something like __sizeless_structs would
> make sense for VLAs too.
>
> So creating a new category "variable-length types" for 
> both VLAs and variable-length vector types seems to make
> much more sense to me. As I see it, this would be mainly
> a change in terminology and not so much of the underlying
> approach.

But do you have any feel for whether this would ever be acceptable
in C++?  One of the main requirements for this was that it needs
to work in both C and C++, with the same ABI representation.
I thought VLAs were added to an early draft of C++14 and then
removed before it was published.  They weren't added back for C++17,
and I'd seen other proposals about classes having a "sizeof field"
instead (i.e. the type would carry the size information with it,
which we don't want).  So the prospects didn't look good.

Thanks,
Richard


[00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-16 Thread Richard Sandiford
The C standard says:

At various points within a translation unit an object type may be
"incomplete" (lacking sufficient information to determine the size of
objects of that type) or "complete" (having sufficient information).

For AArch64 SVE, we'd like to split this into two concepts:

  * has the type been fully defined?
  * would fully-defining the type determine its size?

This is because we'd like to be able to represent SVE vectors as C and C++
types.  Since SVE is a "vector-length agnostic" architecture, the size
of the vectors is determined by the runtime environment rather than the
programmer or compiler.  In that sense, defining an SVE vector type does
not determine its size.  It's nevertheless possible to use SVE vector types
in meaningful ways, such as having automatic vector variables and passing
vectors between functions.

The main questions in the RFC are:

  1) is splitting the definition like this OK in principle?
  2) are the specific rules described below OK?
  3) coding-wise, how should the split be represented in GCC?

Terminology
-----------

Going back to the second bullet above:

  * would fully-defining the type determine its size?

the rest of the RFC calls a type "sized" if fully defining it would
determine its size.  The type is "sizeless" otherwise.

Contents
--------

The RFC is organised as follows.  I've erred on the side of including
detail rather than leaving it out, but each section is meant to be
self-contained and skippable:

  - An earlier RFC
  - Quick overview of SVE
  - Why we need SVE types in C and C++
  - How we ended up with this definition
  - The SVE types in more detail
  - Outline of the type system changes
  - Sizeless structures (and testing on non-SVE targets)
  - Other variable-length vector architectures
  - Edits to the C standard
      - Base changes
      - Updates for consistency
      - Sizeless structures
  - Edits to the C++ standard
  - GCC implementation questions

I'll follow up with patches that implement the split.



An earlier RFC
==============

For the record (in case this sounds familiar) I sent an RFC about the
sizeless type extension a while ago:

https://gcc.gnu.org/ml/gcc/2017-08/msg00012.html

The rules haven't changed since then, but this version includes more
information and includes support for sizeless structures.


Quick overview of SVE
=====================

SVE is a vector extension to AArch64.  A detailed description is
available here:

https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf

but the only feature that really matters for this RFC is that SVE has no
fixed or preferred vector length.  Implementations can instead choose
from a range of possible vector lengths, with 128 bits being the minimum
and 2048 bits being the maximum.  Privileged software can further
constrain the vector length within the range offered by the implementation;
e.g. Linux currently provides per-thread control of the vector length.


Why we need SVE types in C and C++
==================================

SVE was designed to be an easy target for autovectorising normal scalar
code.  There are also various language extensions that support explicit
data parallelism or that make explicit vector chunking easier to do in
an architecture-neutral way (e.g. C++ P0214).  This means that many users
won't need to do anything SVE-specific.

Even so, there's always going to be a place for writing SVE-specific
optimisations, with full access to the underlying ISA.  As with other
vector architectures, we'd like users to be able to write such routines
in C and C++ rather than force them to go all the way to assembly.

We'd also like C and C++ functions to be able to take SVE vector
parameters and return SVE vector results, which is particularly useful
when implementing things like vector math routines.  In this case in
particular, the types need to map directly to something that fits in
an SVE register, so that passing and returning vectors has minimal
overhead.


How we ended up with this definition
====================================

Requirements
------------

We need the SVE vector types to define and use SVE intrinsic functions
and to write SVE vector library routines.  The key requirements when
defining the types were:

  * They must be available in both C and C++ (because we want to be able
    to add SVE optimisations to C-only codebases).

  * They must fit in an SVE vector register (so there can be no on-the-side
    information).

  * It must be possible to define automatic variables with these types.

  * It must be possible to pass and return objects of these types
    (since that's what intrinsics and vector library routines need to do).

  * It must be possible to use the types in _Generic associations
    (so that _Generic can be used to provide tgmath.h-style overloads).

  * It must be possible to use pointers or references to the types
    (for passing or returning by pointer or reference, and because not

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-16 Thread Richard Biener
On Mon, Oct 15, 2018 at 8:40 PM Uecker, Martin
 wrote:
>
>
> Hi Richard,
>
> as Joseph pointed out, there are some related discussions
> on the WG14 reflector. How about moving the discussion
> there?
>
> I find your approach very interesting and that it already
> comes with an implementation is of course very useful.
>
> But I don't really understand the reasons why this is not based
> on (2). These types are not "sizeless" at all, their size
> just isn't known at compile time. So this seems to me
> a misnomer.
>
> In fact, to me these types *do* seem very similar
> to VLAs, as VLAs are also complete types which do not
> have a known size at compile time.
>
> That arrays decay to pointers doesn't mean that we
> couldn't have similar vector types which don't decay.
> This is hardly a fundamental problem.
>
> I also don't understand the problem about the array
> size. If I understand this correctly, the size is somehow
> known at run-time and implicitly passed along with the
> values. So these new types do not need to have a
> size expression (as in your proposal).
>
> Assignment, the possibility to return the type from
> functions, and something like __sizeless_structs would
> make sense for VLAs too.
>
> So creating a new category "variable-length types" for
> both VLAs and variable-length vector types seems to make
> much more sense to me. As I see it, this would be mainly
> a change in terminology and not so much of the underlying
> approach.

I agree - those types very much feel like VLAs.

I think there's also the existing vector extension of GCC to
factor in which introduces (non-VLA) vector types to the
language and allows arithmetic on them as well as
indexing and passing and returning them by value.
Variable-size vectors should play well in that context.
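
For reference, a minimal sketch of the existing fixed-size vector extension, assuming GCC or a compatible compiler (Clang accepts the same `vector_size` attribute) and a 32-bit `int`.  The names `v4si`, `v4si_add` and `v4si_lane` are made up for illustration:

```c
/* Four 32-bit ints in one 16-byte vector (GNU C extension, not ISO C). */
typedef int v4si __attribute__((vector_size(16)));

/* Vectors are passed and returned by value; `+` is applied lane by lane. */
v4si v4si_add(v4si a, v4si b)
{
    return a + b;
}

/* Individual lanes can be read with array-style indexing. */
int v4si_lane(v4si v, int i)
{
    return v[i];
}
```

Here the size is fixed by the typedef, so `sizeof(v4si)` is an ordinary compile-time constant; that is the property a variable-size vector type would not have.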

Richard.

>
> Best,
> Martin
>
> Am Montag, den 15.10.2018, 15:30 +0100 schrieb Richard Sandiford:
> > The C standard says:
> >
> >     At various points within a translation unit an object type may be
> >     "incomplete" (lacking sufficient information to determine the size of
> >     objects of that type) or "complete" (having sufficient information).
> >
> > For AArch64 SVE, we'd like to split this into two concepts:
> >
> >   * has the type been fully defined?
> >   * would fully-defining the type determine its size?
> >
> > This is because we'd like to be able to represent SVE vectors as C and C++
> > types.  Since SVE is a "vector-length agnostic" architecture, the size
> > of the vectors is determined by the runtime environment rather than the
> > programmer or compiler.  In that sense, defining an SVE vector type does
> > not determine its size.  It's nevertheless possible to use SVE vector types
> > in meaningful ways, such as having automatic vector variables and passing
> > vectors between functions.
> >
> > The main questions in the RFC are:
> >
> >   1) is splitting the definition like this OK in principle?
> >   2) are the specific rules described below OK?
> >   3) coding-wise, how should the split be represented in GCC?
> >
> > Terminology
> > -----------
> >
> > Going back to the second bullet above:
> >
> >   * would fully-defining the type determine its size?
> >
> > the rest of the RFC calls a type "sized" if fully defining it would
> > determine its size.  The type is "sizeless" otherwise.
> >
> > Contents
> > --------
> >
> > The RFC is organised as follows.  I've erred on the side of including
> > detail rather than leaving it out, but each section is meant to be
> > self-contained and skippable:
> >
> >   - An earlier RFC
> >   - Quick overview of SVE
> >   - Why we need SVE types in C and C++
> >   - How we ended up with this definition
> >   - The SVE types in more detail
> >   - Outline of the type system changes
> >   - Sizeless structures (and testing on non-SVE targets)
> >   - Other variable-length vector architectures
> >   - Edits to the C standard
> >       - Base changes
> >       - Updates for consistency
> >       - Sizeless structures
> >   - Edits to the C++ standard
> >   - GCC implementation questions
> >
> > I'll follow up with patches that implement the split.
> >
> >
> >
> > An earlier RFC
> > ==============
> >
> > For the record (in case this sounds familiar) I sent an RFC about the
> > sizeless type extension a while ago:
> >
> > https://gcc.gnu.org/ml/gcc/2017-08/msg00012.html
> >
> > The rules haven't changed since then, but this version includes more
> > information and includes support for sizeless structures.
> >
> >
> > Quick overview of SVE
> > =====================
> >
> > SVE is a vector extension to AArch64.  A detailed description is
> > available here:
> >
> > https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf
> >
> > but the only feature that really matters for this RFC is that SVE has no
> > fixed or preferred vector length.  Implementations can instead choose
> > from a range of possible vector lengths, with 128 bits being the minimum
> > and 

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-15 Thread Uecker, Martin

Hi Richard,

as Joseph pointed out, there are some related discussions
on the WG14 reflector. How about moving the discussion
there?

I find your approach very interesting and that it already
comes with an implementation is of course very useful.

But I don't really understand the reasons why this is not based
on (2). These types are not "sizeless" at all, their size
just isn't known at compile time. So this seems to me
a misnomer.

In fact, to me these types *do* seem very similar
to VLAs, as VLAs are also complete types which do not
have a known size at compile time.
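
To illustrate the comparison, here is a small sketch in C.  It assumes a compiler that supports VLAs (they are optional in C11 but GCC provides them), and the function name is made up for the example:

```c
#include <stddef.h>

/* Size in bytes of a variable-length array of n ints (n must be nonzero).
   Unlike for other types, sizeof applied to a VLA is evaluated at run time. */
size_t vla_size_in_bytes(size_t n)
{
    int buf[n];          /* complete type; its size is fixed when this line runs */
    return sizeof buf;   /* n * sizeof(int), computed at run time */
}
```

The array type is complete even though no compile-time constant describes its size, which is the sense in which the vector types resemble VLAs.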

That arrays decay to pointers doesn't mean that we
couldn't have similar vector types which don't decay.
This is hardly a fundamental problem.

I also don't understand the problem about the array
size. If I understand this correctly, the size is somehow
known at run-time and implicitly passed along with the
values. So these new types do not need to have a
size expression (as in your proposal). 

Assignment, the possibility to return the type from
functions, and something like __sizeless_structs would
make sense for VLAs too.

So creating a new category "variable-length types" for 
both VLAs and variable-length vector types seems to make
much more sense to me. As I see it, this would be mainly
a change in terminology and not so much of the underlying
approach.


Best,
Martin

Am Montag, den 15.10.2018, 15:30 +0100 schrieb Richard Sandiford:
> The C standard says:
> 
>     At various points within a translation unit an object type may be
>     "incomplete" (lacking sufficient information to determine the size of
>     objects of that type) or "complete" (having sufficient information).
> 
> For AArch64 SVE, we'd like to split this into two concepts:
> 
>   * has the type been fully defined?
>   * would fully-defining the type determine its size?
> 
> This is because we'd like to be able to represent SVE vectors as C and C++
> types.  Since SVE is a "vector-length agnostic" architecture, the size
> of the vectors is determined by the runtime environment rather than the
> programmer or compiler.  In that sense, defining an SVE vector type does
> not determine its size.  It's nevertheless possible to use SVE vector types
> in meaningful ways, such as having automatic vector variables and passing
> vectors between functions.
> 
> The main questions in the RFC are:
> 
>   1) is splitting the definition like this OK in principle?
>   2) are the specific rules described below OK?
>   3) coding-wise, how should the split be represented in GCC?
> 
> Terminology
> -----------
> 
> Going back to the second bullet above:
> 
>   * would fully-defining the type determine its size?
> 
> the rest of the RFC calls a type "sized" if fully defining it would
> determine its size.  The type is "sizeless" otherwise.
> 
> Contents
> --------
> 
> The RFC is organised as follows.  I've erred on the side of including
> detail rather than leaving it out, but each section is meant to be
> self-contained and skippable:
> 
>   - An earlier RFC
>   - Quick overview of SVE
>   - Why we need SVE types in C and C++
>   - How we ended up with this definition
>   - The SVE types in more detail
>   - Outline of the type system changes
>   - Sizeless structures (and testing on non-SVE targets)
>   - Other variable-length vector architectures
>   - Edits to the C standard
>       - Base changes
>       - Updates for consistency
>       - Sizeless structures
>   - Edits to the C++ standard
>   - GCC implementation questions
> 
> I'll follow up with patches that implement the split.
> 
> 
> 
> An earlier RFC
> ==============
> 
> For the record (in case this sounds familiar) I sent an RFC about the
> sizeless type extension a while ago:
> 
> https://gcc.gnu.org/ml/gcc/2017-08/msg00012.html
> 
> The rules haven't changed since then, but this version includes more
> information and includes support for sizeless structures.
> 
> 
> Quick overview of SVE
> =====================
> 
> SVE is a vector extension to AArch64.  A detailed description is
> available here:
> 
> https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf
> 
> but the only feature that really matters for this RFC is that SVE has no
> fixed or preferred vector length.  Implementations can instead choose
> from a range of possible vector lengths, with 128 bits being the minimum
> and 2048 bits being the maximum.  Privileged software can further
> constrain the vector length within the range offered by the implementation;
> e.g. Linux currently provides per-thread control of the vector length.
> 
> 
> Why we need SVE types in C and C++
> ==================================
> 
> SVE was designed to be an easy target for autovectorising normal scalar
> code.  There are also various language extensions that support explicit
> data parallelism or that make explicit vector chunking easier to do in
> an architecture-neutral way (e.g. C++ P0214).  This means that many users
> won't need to do 

Re: [00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-15 Thread Joseph Myers
On Mon, 15 Oct 2018, Richard Sandiford wrote:

> The patches therefore add a new "__sizeless_struct" keyword to denote
> structures that are sizeless rather than sized.  Unlike normal
> structures, these structures can have members of sizeless type in
> addition to members of sized type.  On the other hand, they have all
> the same limitations as other sizeless types (described in earlier
> sections).

I don't see anything here disallowing offsetof on such structures.

> Edits to the C standard
> =======================
> 
> This section specifies the behaviour for sizeless types as an edit to N1570.

That's a very old standard version.

I'm not in Pittsburgh this week, but I don't see anything to do with these 
ideas on the agenda.  I haven't seen any contributions from Arm to the 
ongoing discussions on the WG14 reflector that include issues relating to 
possibly runtime sized types (vectors, bignums, types representing 
information about another type, for example), unless they're using 
not-obviously-Arm email addresses.  Is Arm going to be engaging in those 
discussions and working with people interested in these areas to produce 
proposals that take account of the different ideas people have for use of 
non-VLA types that may not have a compile-time-constant size (some of 
which may not end up in the C standard, of course)?  (It might of course 
require multiple papers, e.g. starting with fixed-width vector types which 
as a widely-implemented feature are something it might be natural to 
consider for C2x.)

-- 
Joseph S. Myers
jos...@codesourcery.com


[00/10][RFC] Splitting the C and C++ concept of "complete type"

2018-10-15 Thread Richard Sandiford
The C standard says:

    At various points within a translation unit an object type may be
    "incomplete" (lacking sufficient information to determine the size of
    objects of that type) or "complete" (having sufficient information).

For AArch64 SVE, we'd like to split this into two concepts:

  * has the type been fully defined?
  * would fully-defining the type determine its size?

This is because we'd like to be able to represent SVE vectors as C and C++
types.  Since SVE is a "vector-length agnostic" architecture, the size
of the vectors is determined by the runtime environment rather than the
programmer or compiler.  In that sense, defining an SVE vector type does
not determine its size.  It's nevertheless possible to use SVE vector types
in meaningful ways, such as having automatic vector variables and passing
vectors between functions.

The main questions in the RFC are:

  1) is splitting the definition like this OK in principle?
  2) are the specific rules described below OK?
  3) coding-wise, how should the split be represented in GCC?

Terminology
-----------

Going back to the second bullet above:

  * would fully-defining the type determine its size?

the rest of the RFC calls a type "sized" if fully defining it would
determine its size.  The type is "sizeless" otherwise.
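
As a reminder of the existing incomplete/complete distinction that this terminology builds on, here is a minimal sketch in standard C.  `struct counter` and `counter_next` are hypothetical names used only for illustration:

```c
struct counter;                           /* incomplete: size not yet known */

/* Pointers to an incomplete type can already be declared and passed around. */
int counter_next(struct counter *c);

struct counter { int value; int step; };  /* complete from here on */

/* Members can now be accessed and sizeof(struct counter) is valid. */
int counter_next(struct counter *c)
{
    c->value += c->step;
    return c->value;
}
```

In the terms used here, `struct counter` is a sized type: fully defining it determines its size, even though the size is unknown while the type is still incomplete.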

Contents
--------

The RFC is organised as follows.  I've erred on the side of including
detail rather than leaving it out, but each section is meant to be
self-contained and skippable:

  - An earlier RFC
  - Quick overview of SVE
  - Why we need SVE types in C and C++
  - How we ended up with this definition
  - The SVE types in more detail
  - Outline of the type system changes
  - Sizeless structures (and testing on non-SVE targets)
  - Other variable-length vector architectures
  - Edits to the C standard
      - Base changes
      - Updates for consistency
      - Sizeless structures
  - Edits to the C++ standard
  - GCC implementation questions

I'll follow up with patches that implement the split.



An earlier RFC
==============

For the record (in case this sounds familiar) I sent an RFC about the
sizeless type extension a while ago:

https://gcc.gnu.org/ml/gcc/2017-08/msg00012.html

The rules haven't changed since then, but this version includes more
information and includes support for sizeless structures.


Quick overview of SVE
=====================

SVE is a vector extension to AArch64.  A detailed description is
available here:

https://static.docs.arm.com/ddi0584/a/DDI0584A_a_SVE_supp_armv8A.pdf

but the only feature that really matters for this RFC is that SVE has no
fixed or preferred vector length.  Implementations can instead choose
from a range of possible vector lengths, with 128 bits being the minimum
and 2048 bits being the maximum.  Privileged software can further
constrain the vector length within the range offered by the implementation;
e.g. Linux currently provides per-thread control of the vector length.
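
To make the consequence concrete, here is a rough sketch in plain C (no SVE hardware needed).  The 128-bit step between lengths is an assumption that is not stated above, and all names are hypothetical:

```c
enum {
    SVE_MIN_VL_BITS  = 128,   /* minimum vector length, from the text */
    SVE_MAX_VL_BITS  = 2048,  /* maximum vector length, from the text */
    SVE_VL_STEP_BITS = 128    /* assumption: lengths are multiples of 128 bits */
};

/* Nonzero if vl_bits is a vector length an implementation could provide. */
int sve_vl_is_possible(unsigned vl_bits)
{
    return vl_bits >= SVE_MIN_VL_BITS
        && vl_bits <= SVE_MAX_VL_BITS
        && vl_bits % SVE_VL_STEP_BITS == 0;
}

/* Number of elem_bits-wide elements that fit in a vector of vl_bits. */
unsigned sve_elements_per_vector(unsigned vl_bits, unsigned elem_bits)
{
    return vl_bits / elem_bits;
}
```

Under these assumptions a vector of 32-bit elements holds anywhere from 4 to 64 elements depending on the hardware and its configuration, so neither the element count nor the byte size can be treated as a compile-time constant.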


Why we need SVE types in C and C++
==================================

SVE was designed to be an easy target for autovectorising normal scalar
code.  There are also various language extensions that support explicit
data parallelism or that make explicit vector chunking easier to do in
an architecture-neutral way (e.g. C++ P0214).  This means that many users
won't need to do anything SVE-specific.

Even so, there's always going to be a place for writing SVE-specific
optimisations, with full access to the underlying ISA.  As with other
vector architectures, we'd like users to be able to write such routines
in C and C++ rather than force them to go all the way to assembly.

We'd also like C and C++ functions to be able to take SVE vector
parameters and return SVE vector results, which is particularly useful
when implementing things like vector math routines.  In this case in
particular, the types need to map directly to something that fits in
an SVE register, so that passing and returning vectors has minimal
overhead.


How we ended up with this definition
====================================

Requirements
------------

We need the SVE vector types to define and use SVE intrinsic functions
and to write SVE vector library routines.  The key requirements when
defining the types were:

  * They must be available in both C and C++ (because we want to be able
    to add SVE optimisations to C-only codebases).

  * They must fit in an SVE vector register (so there can be no on-the-side
    information).

  * It must be possible to define automatic variables with these types.

  * It must be possible to pass and return objects of these types
    (since that's what intrinsics and vector library routines need to do).

  * It must be possible to use the types in _Generic associations
    (so that _Generic can be used to provide tgmath.h-style overloads).

  * It must be possible to use pointers or references to the types
    (for passing or returning by pointer or reference, and because not