[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2021-05-07 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has a resolution that has been APPLIED. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: Applied
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text:https://austingroupbugs.net/view.php?id=697#c5332 
Resolution: Accepted As Marked
Fixed in Version:   
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2021-05-07 15:06 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

Issue History 
Date Modified    Username        Field                    Change
==
2013-05-15 10:31 steffen         New Issue
2013-05-15 10:31 steffen         Name                     => Steffen Nurpmeso
2013-05-15 10:31 steffen         Section                  => none
2013-05-15 10:31 steffen         Page Number              => none
2013-05-15 10:31 steffen         Line Number              => none
2013-05-15 22:08 jilles          Note Added: 0001607
2013-05-16 10:22 steffen         Note Added: 0001608
2013-05-30 15:37 eblake          Relationship added       related to 696
2013-05-30 15:57 geoffclare      Note Added: 0001629
2014-03-30 00:33 sstewartgallus  Issue Monitored: sstewartgallus
2020-08-28 08:21 geoffclare      Note Added: 0004947
2020-08-28 08:27 geoffclare      Note Edited: 0004947
2020-08-28 17:07 shware_systems  Note Added: 0004948
2020-08-28 17:52 kre             Note Added: 0004949
2020-08-28 22:42 philip-guenther Note Added: 0004952
2020-08-28 22:52 philip-guenther Note Added: 0004953
2020-08-29 11:34 shware_systems  Note Added: 0004955
2020-08-29 11:34 shware_systems  Note Added: 0004956
2020-08-29 11:41 shware_systems  Note Edited: 0004956
2020-08-29 11:44 shware_systems  Note Deleted: 0004955
2020-08-29 13:07 shware_systems  Note Added: 0004957
2020-08-29 13:11 shware_systems  Note Edited: 0004957
2020-08-30 23:06 philip-guenther Note Added: 0004958
2020-08-30 23:11 philip-guenther Note Added: 0004959
2020-08-30 23:44 shware_systems  Note Added: 0004960
2020-10-02 09:00 geoffclare      Note Edited: 0004947
2020-10-02 09:11 geoffclare      Note Added: 0005016
2020-10-03 12:53 dennisw         Note Added: 0005022
2020-10-05 08:03 geoffclare      Note Edited: 0004947
2020-10-05 08:11 geoffclare      Note Added: 0005023
2020-10-07 08:44 geoffclare      Note Edited: 0004947
2020-10-07 08:47 geoffclare      Note Added: 0005032
2020-10-07 11:29 dennisw         Note Added: 0005033
2020-10-07 13:23 geoffclare      Note Added: 0005034
2020-10-07 14:28 shware_systems  Note Added: 0005036
2020-10-08 16:32 geoffclare      Note Edited: 0004947
2020-10-08 16:33 geoffclare      Note Edited: 0005034
2020-10-23 14:38 geoffclare      Note Added: 0005060
2021-04-29 15:11 geoffclare      Note Added: 0005332
2021-04-29 15:11 geoffclare      Interp Status            => ---

[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2021-04-29 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


The following issue has been RESOLVED. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: Resolved
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text:https://austingroupbugs.net/view.php?id=697#c5332 
Resolution: Accepted As Marked
Fixed in Version:   
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2021-04-29 15:11 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 


[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2021-04-29 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2021-04-29 15:11 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0005332) geoffclare (manager) - 2021-04-29 15:11
 https://austingroupbugs.net/view.php?id=697#c5332 
-- 
Make the changes from "Additional APIs for Issue 8, Part 1" (Austin/1110). 


[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-10-23 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-10-23 14:38 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0005060) geoffclare (manager) - 2020-10-23 14:38
 https://austingroupbugs.net/view.php?id=697#c5060 
-- 
The posix_getdents() addition has been made in the Issue8NewAPIs branch in
gitlab, based on https://austingroupbugs.net/view.php?id=697#c4947. 


Re: Overflow conditions for read() and fread() (was: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function)

2020-10-08 Thread Robert Elz via austin-group-l at The Open Group
Date:        Thu, 8 Oct 2020 13:52:16 +
From:        Wojtek Lerch
Message-ID:  <136001a9769d49babe52348edeaf2...@blackberry.com>

  | Chapter and verse please?  :)

I don't do C standards stuff (a committee that does far too much invention
for my tastes) but as I understand it, a size_t is defined as being large
enough to hold the size of any definable object, and SIZE_MAX is its max
value, so an array bigger than SIZE_MAX should be impossible (typically
size_t is big enough for the entire VA space of the system, but I am not
sure if that is required).

  | Not necessarily.  An application can carefully prepare the file [...]

Yes, you're right, I didn't consider short reads.

  | Technically it's not an "overflow":

I speak (type) in the vernacular...

kre




RE: Overflow conditions for read() and fread() (was: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function)

2020-10-08 Thread Wojtek Lerch via austin-group-l at The Open Group
Robert Elz wrote:
>From:"Geoff Clare via austin-group-l at The Open Group" 
> 
> | Isn't an array whose total size exceeds SIZE_MAX bytes an impossibility?
>
> It is, [...]

Chapter and verse please?  :)

> Of course, any application that attempts to test this is going to result in a 
> copy into memory that cannot exist,

Not necessarily.  An application can carefully prepare the file to make sure 
that there aren't enough bytes in it to overflow its buffer, and then call 
fread() with arguments such that n*size is mathematically greater than SIZE_MAX 
but very small after conversion to size_t.  As far as I can tell, the C 
standard expects fread() to read all the bytes from the file in such 
circumstances, rather than reading just (size_t)(n*size) bytes or returning a 
bogus error.

> and so will get EFAULT or SIGSEGV or something similar - but that shouldn't 
> need any specific comment in the
> text, or not related to the n*size multiplication, as the same thing can 
> happen without any kind of overflow
> from that multiplication.

Technically it's not an "overflow":

"A computation involving unsigned operands can never overflow, because a result 
that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can 
be represented by the resulting type." (6.2.5#9)
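
To make the wraparound discussed above concrete, here is a minimal C sketch. The helper names mul_would_wrap and wrapped_product are hypothetical, introduced only for illustration; they do not come from the thread or from either standard. The sketch shows how a caller can tell whether the mathematical product n*size exceeds SIZE_MAX, and what the size_t product becomes when it does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the mathematical product n * size is
 * greater than SIZE_MAX, i.e. the size_t product would be reduced
 * modulo SIZE_MAX + 1. Uses division so the check itself cannot wrap. */
static bool mul_would_wrap(size_t n, size_t size)
{
    return size != 0 && n > SIZE_MAX / size;
}

/* The product as unsigned arithmetic computes it: no overflow in the
 * C sense, just reduction modulo SIZE_MAX + 1. */
static size_t wrapped_product(size_t n, size_t size)
{
    return n * size;
}
```

For example, with n = SIZE_MAX / 2 + 1 and size = 2, mul_would_wrap() reports true and wrapped_product() yields 0, which is the "very small after conversion to size_t" case described in the message.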

--
This transmission (including any attachments) may contain confidential 
information, privileged material (including material protected by the 
solicitor-client or other applicable privileges), or constitute non-public 
information. Any use of this information by anyone other than the intended 
recipient is prohibited. If you have received this transmission in error, 
please immediately reply to the sender and delete this information from your 
system. Use, dissemination, distribution, or reproduction of this transmission 
by unintended recipients is not authorized and may be unlawful.



Re: Overflow conditions for read() and fread() (was: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function)

2020-10-08 Thread Robert Elz via austin-group-l at The Open Group
Date:        Thu, 8 Oct 2020 09:49:26 +0100
From:        "Geoff Clare via austin-group-l at The Open Group"
Message-ID:  <20201008084926.GA19154@localhost>

  | Isn't an array whose total size exceeds SIZE_MAX bytes an impossibility?

It is, I think the point might have been that there's no need to say
anything about it, as there's no need for the implementation to actually
perform the multiplication, and by so doing produce an overflow.

Of course, any application that attempts to test this is going to result in
a copy into memory that cannot exist, and so will get EFAULT or SIGSEGV
or something similar - but that shouldn't need any specific comment in the
text, or not related to the n*size multiplication, as the same thing can
happen without any kind of overflow from that multiplication.

kre



RE: Overflow conditions for read() and fread() (was: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function)

2020-10-08 Thread Wojtek Lerch via austin-group-l at The Open Group
Geoff Clare wrote:
> Wojtek Lerch wrote, on 07 Oct 2020:
> > Geoff Clare wrote:
> > > For fread(), the return type is size_t not ssize_t, so it doesn't 
> > > have quite the same problem. The question is what should happen if 
> > > the mathematical product of the size and nitems arguments is greater 
> > > than SIZE_MAX.  POSIX defers to the C standard on this and there is 
> > > no reason for us to state anything specific about it.  (The C 
> > > standard is silent on the matter, so the behaviour is implicitly 
> > > undefined.)
> > 
> > The C standard is not silent on the matter -- it just does not treat 
> > it as a special case. The way fread() is specified does not depend on 
> > the total size of the array not exceeding SIZE_MAX bytes.
>
> Isn't an array whose total size exceeds SIZE_MAX bytes an impossibility?

I don't know of any text in the C standard that forbids it or makes it 
undefined.
 
> Or are you claiming it's possible to define such an array, you just can't use 
> sizeof on it?

My understanding is that apart from implementation limits, there's nothing in 
the C standard that forbids constructing types, declaring objects, or asking 
calloc() to allocate objects, whose size is greater than SIZE_MAX.  The 
requirements of the sizeof operator when applied to such types or objects are 
of course impossible to satisfy, and that fact is supposed to imply undefined 
behavior (even though specifying impossible requirements is not the same as 
"omission of any explicit definition of behavior"); but that's an issue with 
the sizeof operator, not with sizes of objects or types.

typedef char BIGARR[ SIZE_MAX ][ SIZE_MAX ]; // Allowed, as far as I can tell
sizeof(BIGARR);  // undefined
BIGARR bigarr; // Potentially okay, subject to implementation limits
sizeof(bigarr); // undefined
calloc(SIZE_MAX, SIZE_MAX);  // Defined, but of course may fail

I have always thought that the reason calloc() and fread() use two integers to 
specify a number of bytes was to allow sizes greater than SIZE_MAX (or 
originally INT_MAX) -- what else could it be?




Re: Overflow conditions for read() and fread() (was: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function)

2020-10-08 Thread Geoff Clare via austin-group-l at The Open Group
Wojtek Lerch wrote, on 07 Oct 2020:
>
> Geoff Clare wrote:
> > For fread(), the return type is size_t not ssize_t, so it doesn't
> > have quite the same problem. The question is what should happen if
> > the mathematical product of the size and nitems arguments is greater
> > than SIZE_MAX.  POSIX defers to the C standard on this and there is no
> > reason for us to state anything specific about it.  (The C standard
> > is silent on the matter, so the behaviour is implicitly undefined.)
> 
> The C standard is not silent on the matter -- it just does not treat
> it as a special case. The way fread() is specified does not depend on
> the total size of the array not exceeding SIZE_MAX bytes.

Isn't an array whose total size exceeds SIZE_MAX bytes an impossibility?

Or are you claiming it's possible to define such an array, you just
can't use sizeof on it?

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



RE: Overflow conditions for read() and fread() (was: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function)

2020-10-07 Thread Wojtek Lerch via austin-group-l at The Open Group
Geoff Clare wrote:
> For fread(), the return type is size_t not ssize_t, so it doesn't
> have quite the same problem. The question is what should happen if
> the mathematical product of the size and nitems arguments is greater
> than SIZE_MAX.  POSIX defers to the C standard on this and there is no
> reason for us to state anything specific about it.  (The C standard
> is silent on the matter, so the behaviour is implicitly undefined.)

The C standard is not silent on the matter -- it just does not treat it as a 
special case.  The way fread() is specified does not depend on the total size 
of the array not exceeding SIZE_MAX bytes.




RE: Overflow conditions for read() and fread() (was: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function)

2020-10-07 Thread shwaresyst via austin-group-l at The Open Group

The C standard leaves it undefined for fread() because it doesn't require 
EOVERFLOW in <errno.h>, that I see, or presumes size_t will always be a short 
or int type. Since POSIX does have it and does not presume a limited width I 
feel this is a place where a CX extension is warranted as a portability 
consideration.
On Wednesday, October 7, 2020 Geoff Clare via austin-group-l at The Open Group 
 wrote:
> -- 
>  (0005036) shware_systems (reporter) - 2020-10-07 14:28
>  https://austingroupbugs.net/view.php?id=697#c5036 
> -- 
> That is an error in read(), and fread() as well; that these should have
> that case also as a may fail type.

> The above was in reply to my note about posix_getdents() EOVERFLOW
> that said:
>
>     This set me thinking about why that part of the EOVERFLOW error is
>     there at all. There is no equivalent EOVERFLOW for read(), nor
>     should there be.
>
> I continue to believe that for read() there should not be an EOVERFLOW
> error.  There is absolutely no reason for read() to fail when it could
> instead successfully return SSIZE_MAX bytes.  Perhaps we should add a
> statement:
>
>     If nbyte is greater than SSIZE_MAX, read() shall
>     behave as if nbyte had the value SSIZE_MAX.
>
> For fread(), the return type is size_t not ssize_t, so it doesn't
> have quite the same problem. The question is what should happen if
> the mathematical product of the size and nitems arguments is greater
> than SIZE_MAX.  POSIX defers to the C standard on this and there is no
> reason for us to state anything specific about it.  (The C standard
> is silent on the matter, so the behaviour is implicitly undefined.)
>
> -- 
> Geoff Clare 
> The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Overflow conditions for read() and fread() (was: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function)

2020-10-07 Thread Geoff Clare via austin-group-l at The Open Group
> -- 
>  (0005036) shware_systems (reporter) - 2020-10-07 14:28
>  https://austingroupbugs.net/view.php?id=697#c5036 
> -- 
> That is an error in read(), and fread() as well; that these should have
> that case also as a may fail type.

The above was in reply to my note about posix_getdents() EOVERFLOW
that said:

This set me thinking about why that part of the EOVERFLOW error is
there at all. There is no equivalent EOVERFLOW for read(), nor
should there be.

I continue to believe that for read() there should not be an EOVERFLOW
error.  There is absolutely no reason for read() to fail when it could
instead successfully return SSIZE_MAX bytes.  Perhaps we should add a
statement:

If nbyte is greater than SSIZE_MAX, read() shall
behave as if nbyte had the value SSIZE_MAX.
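
The proposed wording can be modelled in user space. The following is a minimal sketch, assuming a POSIX system where SSIZE_MAX is provided by <limits.h>; clamp_nbyte and read_clamped are hypothetical names used only for illustration. It shows a caller obtaining the "behave as if nbyte had the value SSIZE_MAX" behaviour regardless of what the underlying read() does with larger requests.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: limit a request to the largest byte count that
 * the ssize_t return value of read() can represent. */
static size_t clamp_nbyte(size_t nbyte)
{
    return nbyte > (size_t)SSIZE_MAX ? (size_t)SSIZE_MAX : nbyte;
}

/* Hypothetical wrapper: read() that never sees nbyte > SSIZE_MAX.
 * The caller must still supply a buffer of at least nbyte bytes. */
static ssize_t read_clamped(int fd, void *buf, size_t nbyte)
{
    return read(fd, buf, clamp_nbyte(nbyte));
}
```

With this wrapper an oversized request simply becomes a short read, so the result is always representable in ssize_t and no EOVERFLOW case arises.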

For fread(), the return type is size_t not ssize_t, so it doesn't
have quite the same problem. The question is what should happen if
the mathematical product of the size and nitems arguments is greater
than SIZE_MAX.  POSIX defers to the C standard on this and there is no
reason for us to state anything specific about it.  (The C standard
is silent on the matter, so the behaviour is implicitly undefined.)

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-10-07 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-10-07 14:28 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0005036) shware_systems (reporter) - 2020-10-07 14:28
 https://austingroupbugs.net/view.php?id=697#c5036 
-- 
That is an error in read(), and fread() as well; that these should have
that case also as a may fail type. I remember this being discussed for some
other interface where an argument was size_t or ssize_t but return was int.
If an implementation has a SSIZE_MAX that needs a long long int to
represent it, a value larger than INT_MAX is a plausible desired return
value but is not representable. Similar holds when SIZE_MAX is larger than
SSIZE_MAX. This was a non-issue when no platform had more than 2gb RAM, for
int as a 32 bit type, but is one now. 
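
The representability problem described in this note can be illustrated with a short C sketch. The function count_to_int is a hypothetical helper, not part of any interface under discussion; it assumes a POSIX <errno.h> that defines EOVERFLOW. It shows the check an interface with a size_t argument and an int return would have to make.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: store count in *out if it fits in an int.
 * Returns 0 on success, or EOVERFLOW (leaving *out untouched) when
 * the value is larger than INT_MAX. */
static int count_to_int(size_t count, int *out)
{
    if (count > (size_t)INT_MAX)
        return EOVERFLOW;
    *out = (int)count;
    return 0;
}
```

The same pattern applies to a size_t count and an ssize_t return, with SSIZE_MAX in place of INT_MAX.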


[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-10-07 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-10-07 13:23 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0005034) geoffclare (manager) - 2020-10-07 13:23
 https://austingroupbugs.net/view.php?id=697#c5034 
-- 
Re https://austingroupbugs.net/view.php?id=697#c5033
This set me thinking about why that part of the EOVERFLOW error is there
at all. There is no equivalent EOVERFLOW for read(), nor should there be.

I think if posix_getdents() reaches the point where adding another entry to
the buffer would make its length unrepresentable in the return type, then
it should return (successfully) at that point. I.e. the equivalent of a
short read().

So I think the EOVERFLOW error should be changed to just:

    One of the values in a structure to be placed in buf cannot be
    represented correctly.
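
The "short read" behaviour suggested in this note can be sketched as a simple packing loop. This is a hypothetical model, not the posix_getdents() specification or any implementation of it: pack_entries, its parameters, and the idea of passing the representability limit as an argument (standing in for SSIZE_MAX so it can be exercised with small numbers) are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical model: given the record lengths of the pending
 * directory entries, return how many bytes would be placed in a
 * buffer of bufsize bytes. Packing stops, successfully, before the
 * first entry that would overrun the buffer or push the total past
 * `limit` (the largest value the return type can represent). */
static size_t pack_entries(const size_t *reclen, size_t nrec,
                           size_t bufsize, size_t limit)
{
    size_t cap = bufsize < limit ? bufsize : limit;
    size_t total = 0;

    for (size_t i = 0; i < nrec; i++) {
        /* total <= cap always holds, so this cannot wrap. */
        if (reclen[i] > cap - total)
            break; /* short return instead of an EOVERFLOW failure */
        total += reclen[i];
    }
    return total;
}
```

The sketch does not model the case where even the first entry does not fit; a real interface would need a separate rule for that.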


[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-10-07 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://www.austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-10-07 11:29 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0005033) dennisw (reporter) - 2020-10-07 11:29
 https://www.austingroupbugs.net/view.php?id=697#c5033 
-- 
I noticed another issue:
The EOVERFLOW error requires the function to fail if the result would be
larger than INT_MAX. But the return type was changed to ssize_t, so this
error needs to be updated. 

Issue History 
Date Modified    Username        Field                    Change
== 
2013-05-15 10:31 steffen         New Issue
2013-05-15 10:31 steffen         Name                     => Steffen Nurpmeso
2013-05-15 10:31 steffen         Section                  => none
2013-05-15 10:31 steffen         Page Number              => none
2013-05-15 10:31 steffen         Line Number              => none
2013-05-15 22:08 jilles          Note Added: 0001607
2013-05-16 10:22 steffen         Note Added: 0001608
2013-05-30 15:37 eblake          Relationship added       related to 696
2013-05-30 15:57 geoffclare      Note Added: 0001629
2014-03-30 00:33 sstewartgallus  Issue Monitored: sstewartgallus
2020-08-28 08:21 geoffclare      Note Added: 0004947
2020-08-28 08:27 geoffclare      Note Edited: 0004947
2020-08-28 17:07 shware_systems  Note Added: 0004948
2020-08-28 17:52 kre             Note Added: 0004949
2020-08-28 22:42 philip-guenther Note Added: 0004952
2020-08-28 22:52 philip-guenther Note Added: 0004953
2020-08-29 11:34 shware_systems  Note Added: 0004955
2020-08-29 11:34 shware_systems  Note Added: 0004956
2020-08-29 11:41 shware_systems  Note Edited: 0004956
2020-08-29 11:44 shware_systems  Note Deleted: 0004955
2020-08-29 13:07 shware_systems  Note Added: 0004957
2020-08-29 13:11 shware_systems  Note Edited: 0004957
2020-08-30 23:06 philip-guenther Note Added: 0004958
2020-08-30 23:11 philip-guenther Note Added: 0004959
2020-08-30 23:44 shware_systems  Note Added: 0004960
2020-10-02 09:00 geoffclare      Note Edited: 0004947
2020-10-02 09:11 geoffclare      Note Added: 0005016
2020-10-03 12:53 dennisw         Note Added: 0005022
2020-10-05 08:03 geoffclare      Note Edited: 0004947
2020-10-05 08:11 geoffclare      Note Added: 0005023
2020-10-07 08:44 geoffclare      Note Edited: 0004947
2020-10-07 08:47 geoffclare      Note Added: 0005032
2020-10-07 11:29 dennisw         Note Added: 0005033
==




[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-10-07 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-10-07 08:47 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0005032) geoffclare (manager) - 2020-10-07 08:47
 https://austingroupbugs.net/view.php?id=697#c5032 
-- 
https://austingroupbugs.net/view.php?id=697#c4947 has been updated with changes
agreed in the Oct 5th
teleconference, changing two occurrences of "that reads the entire
directory" to "that reads from offset zero to end-of-file". 





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-10-05 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-10-05 08:11 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0005023) geoffclare (manager) - 2020-10-05 08:11
 https://austingroupbugs.net/view.php?id=697#c5023 
-- 
Re https://austingroupbugs.net/view.php?id=697#c5022, the condition in the last
paragraph of DESCRIPTION, "If a
sequence of calls to posix_getdents() is made that reads the entire
directory" is intended to cover the case of lseek() to the beginning then
reading to end-of-file. (It was based on the readdir() page saying "after
the most recent call to opendir() or rewinddir()".) Perhaps this should be
clarified by changing it to:

    If a sequence of calls to posix_getdents() is made that reads from
    offset 0 to end-of-file, ...

Thanks for catching the extraneous parentheses after fildes - I have
removed them from https://austingroupbugs.net/view.php?id=697#c4947. 
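
The readdir() wording that the note refers to can be illustrated with the existing interfaces. The sketch below makes one complete pass that starts right after rewinddir() and runs to the end of the directory. It deliberately uses opendir()/readdir() rather than posix_getdents(), which is not assumed to be available, and count_after_rewind() is a hypothetical name chosen for this example:

```c
#include <dirent.h>
#include <stddef.h>

/* Count the entries seen in one full pass that starts right after
 * rewinddir(). Returns -1 if the directory cannot be opened.
 * Note: readdir() also returns NULL on error; this sketch does not
 * distinguish that case from end of directory. */
static long count_after_rewind(const char *path)
{
    DIR *d = opendir(path);
    long n = 0;

    if (d == NULL)
        return -1;
    rewinddir(d);               /* start of the pass */
    while (readdir(d) != NULL)  /* read through to the end of the directory */
        n++;
    closedir(d);
    return n;
}
```

Whether entries added or removed during such a pass are seen is exactly what the quoted paragraph leaves unspecified.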


[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-10-03 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://www.austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-10-03 12:53 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0005022) dennisw (reporter) - 2020-10-03 12:53
 https://www.austingroupbugs.net/view.php?id=697#c5022 
-- 
I think the specification should make it explicit under which circumstances
(if any) the application can rely on posix_getdents() to return the current
state of the directory.
Right now it only says that it is unspecified when a sequence of
posix_getdents() calls reads the entire directory.
Is posix_getdents() guaranteed to return the current state after an lseek()
to the beginning?

Also in the second last paragraph of the DESCRIPTION, there should be no
parentheses after fildes. 





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-10-02 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-10-02 09:11 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0005016) geoffclare (manager) - 2020-10-02 09:11
 https://austingroupbugs.net/view.php?id=697#c5016 
-- 
https://austingroupbugs.net/view.php?id=697#c4947 has been updated with changes
agreed in the Oct 1st
teleconference. 

Notable changes are:

Return type of posix_getdents() is now ssize_t (to match read()) and
<dirent.h> is required to define that type.

The last entry in buf must have a d_reclen that includes any needed padding
for alignment. (So applications can grow the buffer and append to it
without needing to do an alignment calculation.)

The wording of the condition for d_type not being DT_UNKNOWN is now "if the
file type can be determined without needing to use the file serial number
to obtain the file's metadata".

Added a paragraph about concurrent file operations.

Knock-on effects of d_name[] possibly being a flexible array member, where
the size of the structure is mentioned and in the first para of APPLICATION
USAGE.

Added a paragraph to RATIONALE about posix_getdents() not being allowed to
return directory entry structures for deleted directory entries. 
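
The d_reclen point in the list above can be sketched as follows. This is an illustration under stated assumptions, not the structure from the proposal: struct demo_dent, its field types, and the helpers round_up(), put_entry() and count_entries() are all hypothetical. Because every record length, including the last one, already contains its alignment padding, the end offset of the data is always a valid place to append the next record.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout for illustration only; the real structure
 * and its field types come from the implementation's <dirent.h>. */
struct demo_dent {
    uint64_t d_ino;
    uint16_t d_reclen;     /* total record length, padding included */
    unsigned char d_type;
    char d_name[];         /* flexible array member */
};

/* Round n up to the alignment of the record type. */
static size_t round_up(size_t n)
{
    size_t a = _Alignof(struct demo_dent);
    return (n + a - 1) / a * a;
}

/* Append one record at byte offset off; returns the new end offset.
 * The caller must ensure the buffer is large enough. */
static size_t put_entry(unsigned char *buf, size_t off, uint64_t ino,
                        const char *name)
{
    struct demo_dent hdr = { .d_ino = ino, .d_type = 0 };
    size_t name_off = offsetof(struct demo_dent, d_name);
    size_t need = name_off + strlen(name) + 1;

    hdr.d_reclen = (uint16_t)round_up(need);
    memset(buf + off, 0, hdr.d_reclen);           /* zero the padding too */
    memcpy(buf + off, &hdr, name_off);            /* fixed-size fields */
    memcpy(buf + off + name_off, name, strlen(name) + 1);
    return off + hdr.d_reclen;
}

/* Walk the records in buf[0..len) using only d_reclen; returns the count. */
static size_t count_entries(const unsigned char *buf, size_t len)
{
    size_t n = 0, off = 0;

    while (off < len) {
        struct demo_dent hdr;
        memcpy(&hdr, buf + off, offsetof(struct demo_dent, d_name));
        if (hdr.d_reclen == 0)
            break;                                /* malformed: avoid looping forever */
        off += hdr.d_reclen;
        n++;
    }
    return n;
}
```

The header fields are copied with memcpy() so that the sketch does not depend on the alignment of the caller's byte buffer.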


Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Wojtek Lerch via austin-group-l at The Open Group
> Yes I made the flexible member a "short" on purpose -- I wanted that
> byte of padding before the flexible array.
>  
>   
>  
> No, the sizeof can't be 5 or 6 unless the implementation is okay with
> unaligned access.  If I declare an array of these structs, the int32
> inside each element needs to be aligned to a multiple of 4 --
> therefore the size of the struct must be a multiple of 4 as well. 
> The same applies to a struct without a flexible member.
>  
>   
>  
> No, the requirements on sizeof have nothing to do with how many flex
> members are "present".  All that is required is that the sizeof is
> either the same as it would be for a struct without the flexible
> member (which is still 8, on any implementation that requires
> alignment), or greater, if the struct requires more padding
> (presumably also for alignment).  Apart from that, the C standard
> says nothing about whether there's enough room between the offsetof
> and the sizeof for one or more elements of the flexible array.
>  
>   
>  
> What you described with malloc() has nothing to do with what the C
> standard refers to as “padding”.
>  
>   
>  
> Also, while I understand the need to page-align data structures in
> some situations, I still don’t see its relevance to a discussion of
> the C standard’s requirements regarding padding in struct types and
> how it’s affected by flexible arrays.
>  
>   
>  
> From: shwaresyst <mailto:shwares...@aol.com> 
> Sent: September 2, 2020 1:58 PM
> To: Wojtek Lerch <mailto:wle...@blackberry.com>; 
> mailto:austin-group-l@opengroup.org
> Subject: RE: [1003.1(2013)/Issue7+TC1 697]: Adding of a
> getdirentries() function
>  
>   
>  
> That example still has a byte of added padding, or the offsetof would
> be 5. The sizeof value is just incorrect, as it assumes one flex
> member is present. It should be 5 or 6, and which is the required
> value is what is ambiguous.
>  
> As you say, these are used most often with malloc(). Padding after
> the array is usually an artifact of this operation. You do a
> malloc(12) and you may get 16 or 32 bytes actually allocated. Mapping
> this as a short s[] an application can safely access s[5], but a
> compiler may not block an access to s[7] too, in that the memory for
> it is allocated. You map a long long l[] and you can only access l[0]
> safely, the remaining 4 bytes out of the 12 plus what malloc adds are
> tail padding, but a compiler may allow an l[1] access because the
> total allocated permits it.
>  
> I mentioned page aligned because when you are buffering multiple
> sectors directly from media the malloc()s for these will usually be
> in multiples of pages, and efficient management of these happens when
> these don't straddle pages so are page aligned too. Such isn't
> required by the standard, but it's common enough as desirable
> aligned_alloc() was added. As I've seen no one use FLA as an acronym
> for flexible array, I consider VLA as applying to any array of
> indeterminate size, sorry if this confuses anyone.
>  
>   
>  
> On Tuesday, September 1, 2020 Wojtek Lerch <mailto:wle...@blackberry.com>
> wrote:
>  
> My understanding is that they meant to allow an implementation where
>  “struct a { int32_t x; char y; short flex[]; }”  produces
>  sizeof(struct a)==8  but  offsetof(struct a,flex)==6.
>  
>  
>  
> I don’t like that they talk about padding “after” the flexible member
> – since the flexible array has a flexible size, rather than a

RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread shwaresyst via austin-group-l at The Open Group

No, it does not need to be aligned to a multiple of 4, except on some lame RISC 
architectures. The logical model is that unaligned accesses are always permitted; 
aligned accesses are the exception, not the rule. This is why the language is 
"padding bytes may be added", not "shall be added". The standard expects 
applications to use the int_fastN_t or int_leastN_t types if they want to take 
advantage of platform-specific alignment optimizations. The allocation 
functions only recently added the only alignment requirement, namely that any 
pointer returned be aligned for an access to an intmax_t value, and that the 
region be minimally sizeof(intmax_t) in length.
On Wednesday, September 2, 2020 Wojtek Lerch  wrote:
Yes I made the flexible member a "short" on purpose -- I wanted that byte of 
padding before the flexible array.
 
  
 
No, the sizeof can't be 5 or 6 unless the implementation is okay with unaligned 
access.  If I declare an array of these structs, the int32 inside each element 
needs to be aligned to a multiple of 4 -- therefore the size of the struct must 
be a multiple of 4 as well.  The same applies to a struct without a flexible 
member.
 
  
 
No, the requirements on sizeof have nothing to do with how many flex members 
are "present".  All that is required is that the sizeof is either the same as 
it would be for a struct without the flexible member (which is still 8, on any 
implementation that requires alignment), or greater, if the struct requires 
more padding (presumably also for alignment).  Apart from that, the C standard 
says nothing about whether there's enough room between the offsetof and the 
sizeof for one or more elements of the flexible array.
 
  
 
What you described with malloc() has nothing to do with what the C standard 
refers to as “padding”.
 
  
 
Also, while I understand the need to page-align data structures in some 
situations, I still don’t see its relevance to a discussion of the C standard’s 
requirements regarding padding in struct types and how it’s affected by 
flexible arrays.
 
  
 
From: shwaresyst  
Sent: September 2, 2020 1:58 PM
To: Wojtek Lerch ; austin-group-l@opengroup.org
Subject: RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() 
function
 
  
 
That example still has a byte of added padding, or the offsetof would be 5. The 
sizeof value is just incorrect, as it assumes one flex member is present. It 
should be 5 or 6, and which is the required value is what is ambiguous.
 
As you say, these are used most often with malloc(). Padding after the array is 
usually an artifact of this operation. You do a malloc(12) and you may get 16 
or 32 bytes actually allocated. Mapping this as a short s[] an application can 
safely access s[5], but a compiler may not block an access to s[7] too, in that 
the memory for it is allocated. You map a long long l[] and you can only access 
l[0] safely, the remaining 4 bytes out of the 12 plus what malloc adds are tail 
padding, but a compiler may allow an l[1] access because the total allocated 
permits it.
 
I mentioned page aligned because when you are buffering multiple sectors 
directly from media the malloc()s for these will usually be in multiples of 
pages, and efficient managem

Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Steffen Nurpmeso via austin-group-l at The Open Group
Hallo Jörg.

Joerg Schilling wrote in
 <5f4fabb0.NZ6ZB9gXMVdfs/6x%joerg.schill...@fokus.fraunhofer.de>:
 |Steffen Nurpmeso via austin-group-l at The Open Group  wrote:
 |
 |> I personally would say that these should be skipped.  The data is
 |> copied over to user buffers, and these entries are simply not
 |> copied.  That seems to be the best.  The Group does not seem to
 |> want to add DT_WHITEOUT or similar things.
 |
 |A nice idea from 1986 from SunOS-3.5 that did not make it into SVr4...
 |
 |The question is whether this is POSIX compliant at all. If you like to see
 |such entries, I would expect that you need to open() the directory with
 |a specific open flag first.
 |
 |So my questions:
 |
 |- When do you see such entries?
 |
 |- What happens when you stat() such a name?

These are good questions.  I have never seen them myself, i never
used union mounts (on *BSD).  In the tracker discussion you will
find myself digging through FreeBSD C library code, and i think it
was no good.  To answer your questions, i seem to recall that
whiteout is used on union mounts, and i think if you try to stat
one of those their overlay that exists in an upper layer is found
instead.

The Plan9 operating system of Bell Labs as developed by the real,
real heroes of the scene makes (made, but for the still existing
and continued 9front fork) heavy use of such bind mounts.  The
Plan9Port code base which makes lots of this code available on
POSIX systems makes use of the getdents/direntries system calls
for its directory listings, that much i remember.  I had to look
how _they_ handle such entries in their bind mounts.

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Steffen Nurpmeso via austin-group-l at The Open Group
Philip Guenther via austin-group-l at The Open Group wrote in
 :
 |On Tue, Sep 1, 2020 at 1:20 PM Steffen Nurpmeso  wrote:
 |> Philip Guenther wrote in
 |>  :
 |>|On Tue, Sep 1, 2020 at 6:22 AM Steffen Nurpmeso via austin-group-l at The
 |>|Open Group  wrote:
 |>|> Robert Elz via austin-group-l at The Open Group wrote in
 |>|>  <9252.1598969...@jinx.noi.kre.to>:
 |>|>|Date:Tue, 1 Sep 2020 10:32:55 +0100
 |>|>|From:"Geoff Clare via austin-group-l at The Open Group" \
 |>|>|
 |>|>|Message-ID:  <20200901093255.GA7629@localhost>
 |>|>   ...
 |>|>|What's more important is what happens if the application buffer isn't
 |>|>|big enough for the next entry. What do the existing getdents()
 |>|>|implementations do in that case? If they're all the same then
 |>  ..
 |>|> Isn't that covered nicely by the posted text?  There must be space
 |>|> for at least one entry, otherwise EINVAL occurs?  And upon success
 |>  ..
 |>|A quick review of FreeBSD, NetBSD, and OpenBSD finds they all return
 |> EINVAL
 |>|if the buffer isn't "big enough".
 |>|For OpenBSD, the minimum buffer size is 512 bytes; if I'm reading it
 |>|correctly NetBSD is similar, possibly varying based on filesystem
 |>|formatting.
 |>|FreeBSD requires space for the next entry.
 |>
 |> They document that "the size must be greater than or equal to the
 |> block size associated with the file", but i cannot find this "next
 |> entry" of yours?
 |>
 |
 |What document are you quoting from?

  $ git show origin/master:lib/libc/sys/getdirentries.2|mandoc|less

It uses "next" only in conjunction with seeking or entry hopping.
But it is likely you used "next" to mean "at least one", then this
is just a misunderstanding.

 |I actually tested the behavior of FreeBSD 11.3 getdirentries(2), after
 |looking at the code.
 |The manpage on that system says:

..EINVAL, yes..

 |...
 |>|At least NetBSD and OpenBSD will return an entry with d_ino == 0 if the
 |>|first entry in a block is removed.  I suspect others may do this as well;
 |>|glibc at least includes code to skip such entries in its generic
 |> readdir()
 |>|implementation.
 |>|
 |>|The question really is "is this supposed to be a API that can be
 |> trivially
 |>|supported by all the existing versions, even if that makes it more clunky
 |>|to use, or should it be easy to use even if every single existing
 |>|implementation needs to bend?"
 |>
 |> But .. it is already supported by all, and it has always been
 |> used?!
 |
 |As I've been describing, that is not true for at least NetBSD and OpenBSD:
 | * they both require the buffer to be some size larger than a single entry
 | * they both can return entries with d_ino == 0 and d_name that doesn't
 |correspond
 |   to a file in the directory
 |
 |"The same, but different" means "NOT THE SAME".  Code that follows the
 |proposed description on those points would not behave as expected if used
 |with NetBSD's or OpenBSD's getdents(2).

All i can say is that i would skip those entries.
There is no undeletion facility, so passing as much information as
possible about directory content is fine, but it must be somehow
useful in the end; and then specific programs which include very
specific headers and use very specific ioctls can do _that_ job.
My opinion.
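
A minimal sketch of the "skip them while copying" idea, under loud assumptions: struct demo_hdr is a hypothetical record header rather than any system's real directory entry layout, d_ino == 0 is taken to mark a deleted slot as described for NetBSD and OpenBSD above, and copy_live() and put_hdr() are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size record header, for illustration only. */
struct demo_hdr {
    uint64_t d_ino;     /* 0 marks a deleted slot in this sketch */
    uint16_t d_reclen;  /* total length of this record in bytes */
};

/* Copy only live records (d_ino != 0) from src to dst and return the
 * number of bytes written. dst must have room for len bytes. */
static size_t copy_live(unsigned char *dst, const unsigned char *src,
                        size_t len)
{
    size_t in = 0, out = 0;

    while (in + sizeof(struct demo_hdr) <= len) {
        struct demo_hdr h;
        memcpy(&h, src + in, sizeof h);
        if (h.d_reclen < sizeof h || h.d_reclen > len - in)
            break;                      /* malformed record: stop, do not overrun */
        if (h.d_ino != 0) {
            memcpy(dst + out, src + in, h.d_reclen);
            out += h.d_reclen;
        }
        in += h.d_reclen;
    }
    return out;
}

/* Helper for trying the sketch: write a header-only record at off and
 * return the new end offset. */
static size_t put_hdr(unsigned char *buf, size_t off, uint64_t ino)
{
    struct demo_hdr h = { .d_ino = ino,
                          .d_reclen = (uint16_t)sizeof(struct demo_hdr) };
    memcpy(buf + off, &h, sizeof h);
    return off + sizeof h;
}
```

The extra pass is a single linear walk over a buffer that was just filled, which matches the cost argument made further down in this message.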

  ...
 |>|If the former, then define a minimum buffer size
 |>|(pathconf(_PC_DIRBUFMIN)...?), permit d_ino==0 as entries where d_name
 |> and
 |>
 |> However there is _PC_NAME_MAX already, and so the number must be
 |> nearby, no?  Isn't that overengineering?
 |
 |The buffers required by NetBSD and OpenBSD are larger than and unrelated to
 |the value returned by pathconf(_PC_NAME_MAX) and instead related to a block
 |size of the filesystem.

In fact i personally would not call this function with less than
a page full of memory, conditionally more.  That reminds me that
it is a pity that there is no "get_usable_size" for a size and
"get_usable_size_ptr" (or so, overloading not in C) for malloc.
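
For illustration, sizing the buffer at one page could look like the sketch below. alloc_dir_buffer() is a hypothetical helper, and the 4096-byte fallback is an assumption made for this example, not a value taken from any specification:

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate a directory-read buffer of one page,
 * falling back to 4096 bytes if the page size cannot be queried.
 * On success the buffer size is stored in *size_out. */
static void *alloc_dir_buffer(size_t *size_out)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t size = page > 0 ? (size_t)page : 4096;
    void *p = malloc(size);

    if (p != NULL)
        *size_out = size;
    return p;
}
```

The caller owns the returned memory and releases it with free().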

 |>|d_type are unspecified, and let d_name be either a fixed-size array or a
 |>|flexible array member.
 |>
 |> You know, my personal position would be to just skip those entries
 |> when copying data over to user buffers.  The costs of walking over
 |> the user buffer once (if it is done like that, and that memory
 |> should be hot even, then) seem to be low compared to collecting
 |> the directory entry information.
 |
 |Sure, I'm fine with this new API specifying that those deleted entries be
 |suppressed, but it's inconsistent to do that and then insist that d_name's
 |nature be unspecified to make the API compatible with existing
 |implementation, despite making it more annoying to program with.

As far as i understood the d_name nature was about reuse of
existing structures which use fixed-size names, and has nothing to
do with "currently" non-existing files, to which i count whiteouts
too, by the way.

 |>|If the latter, then require it to work with very small buffers, require
 |> all
 

RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Wojtek Lerch via austin-group-l at The Open Group
Yes I made the flexible member a "short" on purpose -- I wanted that byte of 
padding before the flexible array.



No, the sizeof can't be 5 or 6 unless the implementation is okay with unaligned 
access.  If I declare an array of these structs, the int32 inside each element 
needs to be aligned to a multiple of 4 -- therefore the size of the struct must 
be a multiple of 4 as well.  The same applies to a struct without a flexible 
member.



No, the requirements on sizeof have nothing to do with how many flex members 
are "present".  All that is required is that the sizeof is either the same as 
it would be for a struct without the flexible member (which is still 8, on any 
implementation that requires alignment), or greater, if the struct requires 
more padding (presumably also for alignment).  Apart from that, the C standard 
says nothing about whether there's enough room between the offsetof and the 
sizeof for one or more elements of the flexible array.
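
To make the relationships above concrete, here is a minimal C sketch (not part of the original message). It assumes a C11 compiler for _Alignof; the function names are invented for illustration. It checks only properties that the language guarantees, and reports the sizeof/offsetof difference as a signed value because, as noted above, the standard does not say how the two compare.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The struct discussed in the thread. */
struct a {
    int32_t x;
    char    y;
    short   flex[];   /* flexible array member */
};

/* sizeof minus the offset of the flexible member.  Kept signed on purpose:
 * the C standard does not promise that this is positive, or even zero. */
long bytes_between_flex_offset_and_sizeof(void)
{
    return (long)sizeof(struct a) - (long)offsetof(struct a, flex);
}

/* Properties that hold on every conforming implementation. */
void check_layout_invariants(void)
{
    /* The size of any complete object type is a multiple of its alignment. */
    assert(sizeof(struct a) % _Alignof(struct a) == 0);
    /* The flexible member starts on a boundary suitable for its element type. */
    assert(offsetof(struct a, flex) % _Alignof(short) == 0);
    /* The flexible member comes after the last fixed member. */
    assert(offsetof(struct a, flex) >= offsetof(struct a, y) + sizeof(char));
}
```

On an implementation with 4-byte alignment for int32_t and 2-byte alignment for short, one would expect sizeof to be 8 and the offset of flex to be 6, but the sketch deliberately does not assert those values.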



What you described with malloc() has nothing to do with what the C standard 
refers to as “padding”.



Also, while I understand the need to page-align data structures in some 
situations, I still don’t see its relevance to a discussion of the C standard’s 
requirements regarding padding in struct types and how it’s affected by 
flexible arrays.

From: shwaresyst 
Sent: September 2, 2020 1:58 PM
To: Wojtek Lerch ; austin-group-l@opengroup.org
Subject: RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() 
function


That example still has a byte of added padding, or the offsetof would be 5. The 
sizeof value is just incorrect, as it assumes one flex member is present. It 
should be 5 or 6, and which is the required value is what is ambiguous.

As you say, these are used most often with malloc(). Padding after the array is 
usually an artifact of this operation. You do a malloc(12) and you may get 16 
or 32 bytes actually allocated. Mapping this as a short s[] an application can 
safely access s[5], but a compiler may not block an access to s[7] too, in that 
the memory for it is allocated. You map a long long l[] and you can only access 
l[0] safely, the remaining 4 bytes out of the 12 plus what malloc adds are tail 
padding, but a compiler may allow an l[1] access because the total allocated 
permits it.

I mentioned page aligned because when you are buffering multiple sectors 
directly from media the malloc()s for these will usually be in multiples of 
pages, and efficient management of these happens when these don't straddle 
pages so are page aligned too. Such isn't required by the standard, but it's 
desirable often enough that aligned_alloc() was added. As I've seen no one use 
FLA as an acronym for flexible array, I consider VLA as applying to any array 
of indeterminate size, sorry if this confuses anyone.


On Tuesday, September 1, 2020 Wojtek Lerch <wle...@blackberry.com> wrote:

My understanding is that they meant to allow an implementation where  “struct a 
{ int32_t x; char y; short flex[]; }”  produces  sizeof(struct a)==8  but  
offsetof(struct a,flex)==6.



I don’t like that they talk about padding “after” the flexible member – since 
the flexible array has a flexible size, rather than a zero size, that padding 
really overlaps the beginning of the array.



Personally I think that the standard could be made clearer if a structure with 
a flexible member were considered an incomplete type.  You wouldn’t be allowed 
to apply sizeof to it at all, and you wouldn’t be able to declare objects whose 
type is the structure, but you could still use pointers to it and dereference 
members – since the main purpose of such structures is to allocate them via 
malloc(), I don’t think anybody would mind those restrictions.



Also, I don’t understand why struct s would need to be page aligned or why you 
mention a VLA.  A flexible array is not a VLA, in the sense C uses the term.



From: shwaresyst <shwares...@aol.com>
Sent: September 1, 2020 4:55 PM
To: Wojtek Lerch <wle...@blackberry.com>; austin-group-l@opengroup.org
Subject: RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() 
function



What that refers to, it looks, is any tail padding for the structure as a 
whole. The standard still permits internal padding between individual fields as 
required, e.g. a struct s { short a; double b[] } might need 6 bytes of this 
padding to align access for b[0]. This would still be needed if b[] only has a 
few members as a VLA but s is being page aligned, and so would reserve a lot of 
tail padding too. There would be 2 padding regions, however, is what that 
change forces.





On Tuesday, September 1, 2020 Wojtek Lerch <wle...@blackberry.com> wrote:

Actually the intent was the opposite.  The original C99 did contain a wording 
that matches y

RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread shwaresyst via austin-group-l at The Open Group

That example still has a byte of added padding, or the offsetof would be 5. The 
sizeof value is just incorrect, as it assumes one flex member is present. It 
should be 5 or 6, and which is the required value is what is ambiguous.


As you say, these are used most often with malloc(). Padding after the array is 
usually an artifact of this operation. You do a malloc(12) and you may get 16 
or 32 bytes actually allocated. Mapping this as a short s[] an application can 
safely access s[5], but a compiler may not block an access to s[7] too, in that 
the memory for it is allocated. You map a long long l[] and you can only access 
l[0] safely, the remaining 4 bytes out of the 12 plus what malloc adds are tail 
padding, but a compiler may allow an l[1] access because the total allocated 
permits it.

I mentioned page aligned because when you are buffering multiple sectors 
directly from media the malloc()s for these will usually be in multiples of 
pages, and efficient management of these happens when these don't straddle 
pages so are page aligned too. Such isn't required by the standard, but it's 
desirable often enough that aligned_alloc() was added. As I've seen no one use 
FLA as an acronym for flexible array, I consider VLA as applying to any array 
of indeterminate size, sorry if this confuses anyone.
On Tuesday, September 1, 2020 Wojtek Lerch  wrote:
My understanding is that they meant to allow an implementation where  “struct a 
{ int32_t x; char y; short flex[]; }”  produces  sizeof(struct a)==8  but  
offsetof(struct a,flex)==6.
 
  
 
I don’t like that they talk about padding “after” the flexible member – since 
the flexible array has a flexible size, rather than a zero size, that padding 
really overlaps the beginning of the array.
 
  
 
Personally I think that the standard could be made clearer if a structure with 
a flexible member were considered an incomplete type.  You wouldn’t be allowed 
to apply sizeof to it at all, and you wouldn’t be able to declare objects whose 
type is the structure, but you could still use pointers to it and dereference 
members – since the main purpose of such structures is to allocate them via 
malloc(), I don’t think anybody would mind those restrictions.
 
  
 
Also, I don’t understand why struct s would need to be page aligned or why you 
mention a VLA.  A flexible array is not a VLA, in the sense C uses the term.
 
  
 
From: shwaresyst  
Sent: September 1, 2020 4:55 PM
To: Wojtek Lerch ; austin-group-l@opengroup.org
Subject: RE: [1003.1(2013)/Issue7+TC1 697]: Adding of a getdirentries() 
function
 
  
 
What that refers to, it looks, is any tail padding for the structure as a 
whole. The standard still permits internal padding between individual fields as 
required, e.g. a struct s { short a; double b[] } might need 6 bytes of this 
padding to align access for b[0]. This would still be needed if b[] only has a 
few members as a VLA but s is being page aligned, and so would reserve a lot of 
tail padding too. There would be 2 padding regions, however, is what that 
change forces.
 
  
 
On Tuesday, September 1, 2020 Wojtek Lerch  wrote:
 
Actually the intent was the opposite.  The original C99 did contain a wording 
that matches your interpretation:
 
 
 
… the size of the structure shall be equal to the offset of the last element of 
an otherwise identical structure that replaces the flexible array member with 
an array of unspecified length.
 
 
 
But this was reported as a defect, and corrected in TC2.
 
 
 
Summary
 6.7.2.1 Structure and union specifiers, paragraphs 15 and 16 require that any 
padding for alignment of a structure containing a flexible array member must 
precede the flexible array member.  This contradicts existing

Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Wojtek Lerch via austin-group-l at The Open Group

On 2020-09-02 10:34, Joerg Schilling wrote:

Wojtek Lerch via austin-group-l at The Open Group 
 wrote:


A structure member can be a "flexible array" in standard C, but that's not the 
same thing as a VLA.

Are you speaking about array[] in contrast to array[size] with size being a
variable?



Yes.

(Pedantically speaking, the "size" can be any arbitrary expression 
rather than just a variable; either way, because it has to be computed 
at runtime, such syntax is only allowed inside a function.)
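
The distinction can be shown in a few lines of C. This is only an illustrative sketch (the struct and function names are invented here, not taken from any implementation): the first part uses a flexible array member, whose length is decided by how much memory is allocated, and the second uses a VLA, which is an automatic array sized at run time and is an optional feature since C11 (hence the __STDC_NO_VLA__ check).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Flexible array member: legal as the last member of a struct. */
struct packet {
    size_t count;
    int    values[];   /* incomplete array type, not a VLA */
};

/* Allocate a packet with room for 'count' values, filled with 0..count-1.
 * Returns NULL if the allocation fails. */
struct packet *packet_new(size_t count)
{
    struct packet *p = malloc(sizeof *p + count * sizeof p->values[0]);

    if (p != NULL) {
        p->count = count;
        for (size_t i = 0; i < count; i++)
            p->values[i] = (int)i;
    }
    return p;
}

/* VLA: only possible inside a function, because the length is an
 * expression evaluated at run time.  Returns 0 + 1 + ... + (n - 1). */
int sum_first_n(size_t n)
{
    int total = 0;

    assert(n > 0);   /* a VLA must have a length greater than zero */
#ifndef __STDC_NO_VLA__
    {
        int scratch[n];   /* variable length array */

        for (size_t i = 0; i < n; i++)
            scratch[i] = (int)i;
        for (size_t i = 0; i < n; i++)
            total += scratch[i];
    }
#else
    /* Fallback for implementations without VLA support. */
    for (size_t i = 0; i < n; i++)
        total += (int)i;
#endif
    return total;
}
```

A caller owns the result of packet_new() and releases it with free(); sum_first_n() needs no cleanup because the VLA lives on the stack for the duration of the block.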




Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Joerg Schilling via austin-group-l at The Open Group
Wojtek Lerch via austin-group-l at The Open Group 
 wrote:

> A structure member can be a "flexible array" in standard C, but that's not 
> the same thing as a VLA.

Are you speaking about array[] in contrast to array[size] with size being a 
variable?

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Joerg Schilling via austin-group-l at The Open Group
Steffen Nurpmeso via austin-group-l at The Open Group 
 wrote:

> I personally would say that these should be skipped.  The data is
> copied over to user buffers, and these entries are simply not
> copied.  That seems to be the best.  The Group does not seem to
> want to add DT_WHITEOUT or similar things.

A nice idea from 1986 from SunOS-3.5 that did not make it into SVr4...

The question is whether this is POSIX compliant at all. If you like to see
such entries, I would expect that you need to open() the directory with
a specific open flag first.

So my questions:

-   When do you see such entries?

-   What happens when you stat() such a name?

Jörg




Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Geoff Clare via austin-group-l at The Open Group
Steffen Nurpmeso wrote, on 01 Sep 2020:
> 
>  |Do the existing implementations ever return such things?   Do they
>  |hide them by making the reclen of the previous entry (if there is
>  |one in the buffer) bigger, or do they squash them out, moving the
>  |next existing entry down to follow immediately after the previous one
>  |(where all the reclen's are as small as possible to contain the
>  |struct header, the name (and its \0) and alignment padding.)   This is
>  |a case where we don't necessarily need to specify one scheme that
>  |must be used - we can leave that for the implementation, as long as
>  |applications are informed what might happen.
> 
> The proposed text says that filenames are NUL terminated and
> hopping from entry to entry happens by adding the reclen to the
> current entry (cast to char*).  So it seems there could be data
> in between.

I have put a proposed rationale addition in the etherpad to make it
clear that this solution is allowed:

Some existing getdents() functions include deleted
directory entries in buf, marked with a special value of
one of the structure members. This behavior is not allowed for
posix_getdents(), although the data from a deleted
directory entry may be present in buf in the form of extra
padding on the end of the previous entry.
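
As an illustration of why that is harmless to applications, here is a small self-contained C sketch. It does not call posix_getdents(); instead it uses a made-up record type, demo_dent, with an assumed set of fields (d_ino, d_reclen, d_type, d_name) and fills a buffer by hand. The point is only that a reader which steps from record to record by adding d_reclen to a char pointer never looks at whatever bytes sit between the end of one name and the start of the next record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record layout, for illustration only. */
struct demo_dent {
    unsigned long  d_ino;
    unsigned short d_reclen;   /* distance in bytes to the next record */
    unsigned char  d_type;
    char           d_name[];   /* NUL-terminated name */
};

/* Write one record at buf + off.  'slack' adds extra trailing bytes, standing
 * in for a deleted entry folded into this record's padding.  Returns the
 * offset at which the next record starts. */
size_t demo_put(char *buf, size_t off, unsigned long ino,
                const char *name, size_t slack)
{
    struct demo_dent *d = (struct demo_dent *)(buf + off);
    size_t align = _Alignof(struct demo_dent);
    size_t len = offsetof(struct demo_dent, d_name) + strlen(name) + 1 + slack;

    len = (len + align - 1) / align * align;   /* keep the next record aligned */
    d->d_ino = ino;
    d->d_type = 0;
    d->d_reclen = (unsigned short)len;
    strcpy(d->d_name, name);
    return off + len;
}

/* Walk the records by adding d_reclen; returns how many records were seen. */
size_t demo_count(const char *buf, size_t nbytes)
{
    size_t off = 0, n = 0;

    while (off < nbytes) {
        const struct demo_dent *d = (const struct demo_dent *)(buf + off);

        assert(d->d_reclen > 0);   /* a zero length would loop forever */
        n++;
        off += d->d_reclen;
    }
    return n;
}

/* Build three records, the second with 40 bytes of extra padding, and count
 * them.  Returns 0 if the buffer cannot be allocated. */
size_t demo_run(void)
{
    char *buf = malloc(512);
    size_t end, n;

    if (buf == NULL)
        return 0;
    end = demo_put(buf, 0, 1, "a", 0);
    end = demo_put(buf, end, 2, "bb", 40);
    end = demo_put(buf, end, 3, "ccc", 0);
    n = demo_count(buf, end);
    free(buf);
    return n;
}
```

The buffer comes from malloc() so that it is suitably aligned for the record type, and each record length is rounded up to the record's alignment for the same reason.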

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Geoff Clare via austin-group-l at The Open Group
Philip Guenther wrote, on 01 Sep 2020:
>
> If a posix_getdents() implementation returned the names of all the files
> that ever existed in the given directory, including ones that were removed
> before the fd for this call was opened, what requirement in the standard
> would that violate?  I don't see any, thus my suggested wording for such a
> requirement.

It would not comply with the very first sentence of the description:

The posix_getdents() function shall attempt to read directory
entries from the directory associated with the open file
descriptor fildes and shall place information about the directory
entries and the files they refer to in ...

Note "the files they refer to" (present tense).




Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-02 Thread Geoff Clare via austin-group-l at The Open Group
Wojtek Lerch wrote, on 01 Sep 2020:
>
> Geoff Clare wrote:
> > We can't require d_name in struct dirent to be a VLA since there are 
> > implementations where it is not.
> 
> Another good reason is that standard C does not allow structure members to be 
> VLAs.

Mea culpa.  I tried to save some typing by using VLA instead of flexible
array member, thinking they amounted to the same thing. Thanks for the
correction.  (Shows how little attention I have paid to both, since
they were - up to now - not relevant to POSIX.)




Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Philip Guenther via austin-group-l at The Open Group
On Tue, Sep 1, 2020 at 1:20 PM Steffen Nurpmeso  wrote:

> Philip Guenther wrote in
>  :
>  |On Tue, Sep 1, 2020 at 6:22 AM Steffen Nurpmeso via austin-group-l at The
>  |Open Group  wrote:
>  |> Robert Elz via austin-group-l at The Open Group wrote in
>  |>  <9252.1598969...@jinx.noi.kre.to>:
>  |>|Date:Tue, 1 Sep 2020 10:32:55 +0100
>  |>|From:"Geoff Clare via austin-group-l at The Open Group" \
>  |>|
>  |>|Message-ID:  <20200901093255.GA7629@localhost>
>  |>   ...
>  |>|What's more important is what happens if the application buffer isn't
>  |>|big enough for the next entry.  What do the existing getdents()
>  |>|implementations do in that case?   If they're all the same then
>  ..
>  |> Isn't that covered nicely by the posted text?  There must be space
>  |> for at least one entry, otherwise EINVAL occurs?  And upon success
>  ..
>  |A quick review of FreeBSD, NetBSD, and OpenBSD finds they all return EINVAL
>  |if the buffer isn't "big enough".
>  |For OpenBSD, the minimum buffer size is 512 bytes; if I'm reading it
>  |correctly NetBSD is similar, possibly varying based on filesystem
>  |formatting.
>  |FreeBSD requires space for the next entry.
>
> They document that "the size must be greater than or equal to the
> block size associated with the file", but i cannot find this "next
> entry" of yours?
>

What document are you quoting from?

I actually tested the behavior of FreeBSD 11.3 getdirentries(2), after
looking at the code.
The manpage on that system says:

 [EINVAL]   The file referenced by fd is not a directory, or
            nbytes is too small for returning a directory entry or
            block of entries, or the current position pointer is
            invalid.

...

>  |At least NetBSD and OpenBSD will return an entry with d_ino == 0 if the
>  |first entry in a block is removed.  I suspect others may do this as well;
>  |glibc at least includes code to skip such entries in its generic readdir()
>  |implementation.
>  |
>  |The question really is "is this supposed to be an API that can be trivially
>  |supported by all the existing versions, even if that makes it more clunky
>  |to use, or should it be easy to use even if every single existing
>  |implementation needs to bend?"
>
> But .. it is already supported by all, and it has always been
> used?!


As I've been describing, that is not true for at least NetBSD and OpenBSD:
 * they both require the buffer to be some size larger than a single entry
 * they both can return entries with d_ino == 0 and d_name that doesn't
   correspond to a file in the directory

"The same, but different" means "NOT THE SAME".  Code that follows the
proposed description on those points would not behave as expected if used
with NetBSD's or OpenBSD's getdents(2).



> And i like this forward-looking approach that has been
> taken by the group, having that stat(2) call removed is fine.
> (Even though that is easily doable with the current standard and
> fstatat(), which is a totally different situation to twenty years
> ago!  Yay!)
>
>  |If the former, then define a minimum buffer size
>  |(pathconf(_PC_DIRBUFMIN)...?), permit d_ino==0 as entries where d_name and
>
> However there is _PC_NAME_MAX already, and so the number must be
> nearby, no?  Isn't that overengineering?
>

The buffers required by NetBSD and OpenBSD are larger than and unrelated to
the value returned by pathconf(_PC_NAME_MAX) and instead related to a block
size of the filesystem.
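
For what it's worth, an application that wants to be safe on such systems could size its buffer from several inputs rather than from NAME_MAX alone. The following C sketch is an assumption-laden illustration, not a description of any implementation's rule: the helper names, the 4096-byte floor and the 64-byte per-entry header allowance are all invented here. It takes the largest of that floor, the st_blksize reported by fstat(), and room for one maximum-length name as reported by fpathconf(_PC_NAME_MAX).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for st_blksize and fpathconf() declarations */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Suggest a buffer size for reading directory entries from dirfd.
 * Returns 0 if dirfd cannot be examined. */
size_t suggest_dirbuf_size(int dirfd)
{
    struct stat st;
    size_t size = 4096;   /* arbitrary floor, an assumption of this sketch */
    long name_max;

    if (fstat(dirfd, &st) != 0)
        return 0;

    /* Some systems tie the minimum to the block size of the file. */
    if (st.st_blksize > 0 && (size_t)st.st_blksize > size)
        size = (size_t)st.st_blksize;

    /* fpathconf() returns -1 both on error and when there is no limit;
     * in either case it is simply ignored here. */
    name_max = fpathconf(dirfd, _PC_NAME_MAX);
    if (name_max > 0) {
        size_t one_entry = (size_t)name_max + 1 + 64;   /* 64: assumed header room */

        if (one_entry > size)
            size = one_entry;
    }
    return size;
}

/* Convenience wrapper: open a directory by path, ask, and close it again. */
size_t suggest_dirbuf_size_for_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    size_t size;

    if (fd < 0)
        return 0;
    size = suggest_dirbuf_size(fd);
    close(fd);
    return size;
}
```

Nothing here guarantees that EINVAL cannot occur; it only avoids the obviously too-small cases described above.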



>  |d_type are unspecified, and let d_name be either a fixed-size array or a
>  |flexible array member.
>
> You know, my personal position would be to just skip those entries
> when copying data over to user buffers.  The costs of walking over
> the user buffer once (if it is done like that, and that memory
> should be hot even, then) seem to be low compared to collecting
> the directory entry information.
>

Sure, I'm fine with this new API specifying that those deleted entries be
suppressed, but it's inconsistent to do that and then insist that d_name's
nature be unspecified to make the API compatible with existing
implementation, despite making it more annoying to program with.



>  |If the latter, then require it to work with very small buffers, require all
>  |entries to have valid d_name and d_type, and specify d_name as a FAM.
>
> That d_type from the start would be great.
>

If you mean "require d_type to have a real value and never DT_UNKNOWN",
then that's a step further which I don't think any existing getd*ent* API
has taken and which would make this API _slower_  than readdir() on some
implementation+filesystem combos.


Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread shwaresyst via austin-group-l at The Open Group

No, that is not what I would want, nor would anyone else. NAME_MAX doesn't 
guarantee that no d_name will ever be longer than this value; what it says is 
that all drivers for file systems provided by the implementation are capable 
of processing names up to that length. Some of those provided may support much 
longer names too; the standard leaves that open. Because of this latter 
possibility, no compile-time constant suitable for use in a macro can 
guarantee that EINVAL won't occur. Something that examines the media at 
runtime is required, which a macro might be an alias for, as a wrapper, but 
something still needs to be implemented to be wrapped.
On Tuesday, September 1, 2020 Steffen Nurpmeso  wrote:
shwaresyst wrote in
 <1739483391.1543785.1598977118...@mail.yahoo.com>:
 |No, it couldn't introduce such a macro, because such would have to \
 |assume all d_name entries are the same length. Adding an option to \

Well it has to go for NAME_MAX + the_size_of_posix_dent for each
and every entry, this is what you want here?  Except for what
Philip Guenther said, of course.  But if it would be left
implementation defined then even that could be covered by the
macro, better than by anything else.

I for one feel you are very brave to apply sizeof() to anything
with a "flexible array member", i would not dare that for portable
code.  (But my code has to work with ISO C89 too, so i have to use
macros to switch between [a-number] and [] as applicable, and also
to SIZEOF these types.)

Really, you are very brave!  Just the bugs i had to work around
since 2018 or what for a really tiny set of primitive tools!
(Like some gregarious animal not inlining for -Os, and another
huge one requiring explicit this-> to find superclass fields in
one class, but not the other.)

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter          he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Steffen Nurpmeso via austin-group-l at The Open Group
Philip Guenther wrote in
 :
 |On Tue, Sep 1, 2020 at 6:22 AM Steffen Nurpmeso via austin-group-l at The
 |Open Group  wrote:
 |> Robert Elz via austin-group-l at The Open Group wrote in
 |>  <9252.1598969...@jinx.noi.kre.to>:
 |>|Date:Tue, 1 Sep 2020 10:32:55 +0100
 |>|From:"Geoff Clare via austin-group-l at The Open Group" \
 |>|
 |>|Message-ID:  <20200901093255.GA7629@localhost>
 |>   ...
 |>|What's more important is what happens if the application buffer isn't
 |>|big enough for the next entry.  What do the existing getdents()
 |>|implementations do in that case?   If they're all the same then
 ..
 |> Isn't that covered nicely by the posted text?  There must be space
 |> for at least one entry, otherwise EINVAL occurs?  And upon success
 ..
 |A quick review of FreeBSD, NetBSD, and OpenBSD finds they all return EINVAL
 |if the buffer isn't "big enough".
 |For OpenBSD, the minimum buffer size is 512 bytes; if I'm reading it
 |correctly NetBSD is similar, possibly varying based on filesystem
 |formatting.
 |FreeBSD requires space for the next entry.

They document that "the size must be greater than or equal to the
block size associated with the file", but i cannot find this "next
entry" of yours?

 |>|Similarly for what is done for directory pieces that don't contain
 |>|files, on filesystems that allow that (inode number == 0 or perhaps
 |>|a file type for "dummy entry" or something, or whatever).
 |>
 |> I personally would say that these should be skipped.  The data is
 ...
 |> copied.  That seems to be the best.  The Group does not seem to
 |> want to add DT_WHITEOUT or similar things.
 |
 |DT_WHITEOUT is different, related to union mounts.

Yes.  I think it was mentioned in the tracker discussion.

 |>|Do the existing implementations ever return such things?   Do they
 |>
 |> I personally have not seen it, but this likely is a very
 |> filesystem dependent thing, which possibly even changes over time
 |
 |At least NetBSD and OpenBSD will return an entry with d_ino == 0 if the
 |first entry in a block is removed.  I suspect others may do this as well;
 |glibc at least includes code to skip such entries in its generic readdir()
 |implementation.
 |
 |The question really is "is this supposed to be an API that can be trivially
 |supported by all the existing versions, even if that makes it more clunky
 |to use, or should it be easy to use even if every single existing
 |implementation needs to bend?"

But .. it is already supported by all, and it has always been
used?!  And i like this forward-looking approach that has been
taken by the group, having that stat(2) call removed is fine.
(Even though that is easily doable with the current standard and
fstatat(), which is a totally different situation to twenty years
ago!  Yay!)

 |If the former, then define a minimum buffer size
 |(pathconf(_PC_DIRBUFMIN)...?), permit d_ino==0 as entries where d_name and

However there is _PC_NAME_MAX already, and so the number must be
nearby, no?  Isn't that overengineering?

 |d_type are unspecified, and let d_name be either a fixed-size array or a
 |flexible array member.

You know, my personal position would be to just skip those entries
when copying data over to user buffers.  The costs of walking over
the user buffer once (if it is done like that, and that memory
should be hot even, then) seem to be low compared to collecting
the directory entry information.

 |If the latter, then require it to work with very small buffers, require all
 |entries to have valid d_name and d_type, and specify d_name as a FAM.

That d_type from the start would be great.

--steffen



RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Wojtek Lerch via austin-group-l at The Open Group
My understanding is that they meant to allow an implementation where  “struct a 
{ int32_t x; char y; short flex[]; }”  produces  sizeof(struct a)==8  but  
offsetof(struct a,flex)==6.

I don’t like that they talk about padding “after” the flexible member – since 
the flexible array has a flexible size, rather than a zero size, that padding 
really overlaps the beginning of the array.

Personally I think that the standard could be made clearer if a structure with 
a flexible member were considered an incomplete type.  You wouldn’t be allowed 
to apply sizeof to it at all, and you wouldn’t be able to declare objects whose 
type is the structure, but you could still use pointers to it and dereference 
members – since the main purpose of such structures is to allocate them via 
malloc(), I don’t think anybody would mind those restrictions.
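
For readers who want to see what allocating such a structure without trusting sizeof alone looks like, here is a hedged C sketch (the helper names are invented for illustration). It computes the space needed from offsetof plus the element count, and then never allocates less than sizeof(struct a), which sidesteps the question of how sizeof and offsetof relate on a given implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct a {
    int32_t x;
    char    y;
    short   flex[];   /* flexible array member */
};

/* Bytes to allocate for a struct a holding n elements in flex. */
size_t struct_a_alloc_size(size_t n)
{
    size_t need;

    /* Refuse element counts whose byte size would overflow size_t. */
    assert(n <= (SIZE_MAX - offsetof(struct a, flex)) / sizeof(short));
    need = offsetof(struct a, flex) + n * sizeof(short);

    /* Never allocate less than the struct's own size. */
    return need < sizeof(struct a) ? sizeof(struct a) : need;
}

/* Allocate and zero-initialise a struct a with n flex elements.
 * Stores n in x purely for the sake of the example.  Returns NULL on failure. */
struct a *struct_a_new(size_t n)
{
    struct a *p = malloc(struct_a_alloc_size(n));

    if (p != NULL) {
        p->x = (int32_t)n;
        p->y = 0;
        for (size_t i = 0; i < n; i++)
            p->flex[i] = 0;
    }
    return p;
}
```

The more common idiom, malloc(sizeof *p + n * sizeof p->flex[0]), is also fine in practice; it may simply allocate a few bytes more than strictly needed when the padding overlaps the start of the array.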

Also, I don’t understand why struct s would need to be page aligned or why you 
mention a VLA.  A flexible array is not a VLA, in the sense C uses the term.

From: shwaresyst 
Sent: September 1, 2020 4:55 PM
To: Wojtek Lerch ; austin-group-l@opengroup.org
Subject: RE: [1003.1(2013)/Issue7+TC1 697]: Adding of a getdirentries() 
function


What that refers to, it looks, is any tail padding for the structure as a 
whole. The standard still permits internal padding between individual fields as 
required, e.g. a struct s { short a; double b[] } might need 6 bytes of this 
padding to align access for b[0]. This would still be needed if b[] only has a 
few members as a VLA but s is being page aligned, and so would reserve a lot of 
tail padding too. There would be 2 padding regions, however, is what that 
change forces.


On Tuesday, September 1, 2020 Wojtek Lerch <wle...@blackberry.com> wrote:

Actually the intent was the opposite.  The original C99 did contain a wording 
that matches your interpretation:



… the size of the structure shall be equal to the offset of the last element of 
an otherwise identical structure that replaces the flexible array member with 
an array of unspecified length.



But this was reported as a defect, and corrected in TC2.



Summary
 6.7.2.1 Structure and union specifiers, paragraphs 15 and 16 require that any 
padding for alignment of a structure containing a flexible array member must 
precede the flexible array member.  This contradicts existing implementations.  
We do not believe this was the intent of the C99 specification.

Details

If a struct contains a flexible array member and also requires padding for 
alignment, then the current C99 specification requires the implementation to 
put this padding before the flexible array member.  However, existing 
implementations, including at least GNU C, Compaq C, and Sun C, put the padding 
after the flexible array member.

The layout used by existing implementations can be more efficient. Furthermore, 
requiring these existing implementations to change their layout would break 
binary backwards compatibility with previous versions.



See DR282 for more details: 
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_282.htm





From: shwaresyst <shwares...@aol.com>
Sent: September 1, 2020 2:27 PM
To: Wojtek Lerch <wle...@blackberry.com>; austin-group-l@opengroup.org
Subject: RE: [1003.1(2013)/Issue7+TC1 697]: Adding of a getdirentries() 
function



I agree some additional clarity might be useful there, in the C standard. I'm 
reading it as the intent being sizeof is equivalent to offsetof the VLA in 
accordance with the restrictions placed on it by use of the . or -> operators, 
which may not need extra bytes (so &s->vla == (&s + sizeof(s)) is a truism, in 
other words) but it is not that specific.





On Tuesday, September 1, 2020 Wojtek Lerch <wle...@blackberry.com> wrote:

That sounds a little backwards – it’s everything else that works as if the 
flexible (not “variable”) member were not present.  The sizeof operator, as an 
exception, can return a greater value.  (The “.” and “->” operators are another 
exception.)



The standard does not say how much greater the value may be, or promise that it 
must be greater, even if padding is necessary to align the flexible member – as 
far as I can tell, sizeof(structure) can be less than offsetof(structure, 
flexible).



From: austin-group-l@opengroup.org<mailto:austin-group-l@opengroup.org> 
mailto:austin-group-l@opengroup.org>>
Sent: September 1, 2020 10:52 AM
To: g...@opengroup.org<mailto:g...@opengroup.org>; 
austin-group-l@opengroup.org<mailto:austin-gr

Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Steffen Nurpmeso via austin-group-l at The Open Group
shwaresyst wrote in
 <1739483391.1543785.1598977118...@mail.yahoo.com>:
 |No, it couldn't introduce such a macro, because such would have to \
 |assume all d_name entries are the same length. Adding an option to \

Well it has to go for NAME_MAX + the_size_of_posix_dent for each
and every entry, this is what you want here?  Except for what
Philip Guenther said, of course.  But if it would be left
implementation defined then even that could be covered by the
macro, better than by anything else.

I for one feel you are very brave to apply sizeof() to anything
with a "flexible array member", i would not dare that for portable
code.  (But my code has to work with ISO C89 too, so i have to use
macros to switch between [a-number] and [] as applicable, and also
to SIZEOF these types.)

Really, you are very brave!  Just the bugs i had to work around
since 2018 or what for a really tiny set of primitive tools!
(Like some gregarious animal not inlining for -Os, and another
huge one requiring explicit this-> to find superclass fields in
one class, but not the other.)

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Steffen Nurpmeso via austin-group-l at The Open Group
Wojtek Lerch via austin-group-l at The Open Group wrote in
 :
 |Geoff Clare wrote:
 |> We can't require d_name in struct dirent to be a VLA since there \
 |> are implementations where it is not.
 |
 |Another good reason is that standard C does not allow structure members \
 |to be VLAs.
 |
 |C11 6.7.2.1#9 "A member of a structure or union may have any complete \
 |object type other than a variably modified type."

And there is __STDC_NO_VLA__ under conditional feature macros in
the draft i have.

 |If implementations that define d_name as a VLA do in fact exist, they'd \
 |have to use some strange compiler extension.  (GCC does allow VLAs \
 |in structures, but only when the struct is defined inside a function \
 |-- a typedef in a header will not work.)
 |
 |A structure member can be a "flexible array" in standard C, but that's \
 |not the same thing as a VLA.

As usual, "the last element of a structure with more than one
named member may have an incomplete array type; this is called
a flexible array member."
So, it would be a syntax error as a VLA, since that requires an
unflexible definition of its variability for sure.

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread shwaresyst via austin-group-l at The Open Group

What that refers to, it looks, is any tail padding for the structure as a 
whole. The standard still permits internal padding between individual fields as 
required, e.g. a struct s { short a; double b[]; } might need 6 bytes of this 
padding to align access for b[0]. This would still be needed if b[] only has a 
few members as a VLA but s is being page aligned, and so would reserve a lot of 
tail padding too. There would be 2 padding regions, however, is what that 
change forces.
On Tuesday, September 1, 2020 Wojtek Lerch  wrote:
Actually the intent was the opposite.  The original C99 did contain a wording 
that matches your interpretation:
 
  
 
… the size of the structure shall be equal to the offset of the last element of 
an otherwise identical structure that replaces the flexible array member with 
an array of unspecified length.
 
  
 
But this was reported as a defect, and corrected in TC2.
 
  
 
Summary
 6.7.2.1 Structure and union specifiers, paragraphs 15 and 16 require that any 
padding for alignment of a structure containing a flexible array member must 
preceed the flexible array member.  This contradicts existing implementations.  
We do not believe this was the intent of the C99 specification.
 
Details
 
If a struct contains a flexible array member and also requires padding for 
alignment, then the current C99 specification requires the implementation to 
put this padding before the flexible array member.  However, existing 
implementations, including at least GNU C, Compaq C, and Sun C, put the 
padding after the flexible array member.
 
The layout used by existing implementations can be more efficient. Furthermore, 
requiring these existing implementations to change their layout would break 
binary backwards compatibility with previous versions.
 
  
 
See DR282 for more details: 
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_282.htm
 
  
 
  
 
From: shwaresyst  
Sent: September 1, 2020 2:27 PM
To: Wojtek Lerch ; austin-group-l@opengroup.org
Subject: RE: [1003.1(2013)/Issue7+TC1 697]: Adding of a getdirentries() 
function
 
  
 
I agree some additional clarity might be useful there, in the C standard. I'm 
reading it as the intent being sizeof is equivalent to offsetof the VLA in 
accordance with the restrictions placed on it by use of the . or -> operators, 
which may not need extra bytes (so >vla == ( + sizeof(s)) is a truism, in 
other words) but it is not that specific.
 
  
 
On Tuesday, September 1, 2020 Wojtek Lerch  wrote:
 
That sounds a little backwards – it’s everything else that works as if the 
flexible (not “variable”) member were not present.  The sizeof operator, as an 
exception, can return a greater value.  (The “.” and “->” operators are another 
exception.)
 
 
 
The standard does not say how much greater the value may be, or promise that it 
must be greater, even if padding is necessary to align the flexible member – as 
far as I can tell, sizeof(structure) can be less than offsetof(structure, 
flexible).
 
 
 
From: austin-group-l@opengroup.org 
Sent: September 1, 2020 10:52 AM
To: g...@opengroup.org; austin-group-l@opengroup.org
Subject: Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() 
function
 
 
 
It's my understanding, by C11 6.7.2.1p18, sizeof on a struct with a variable 
array works as if the variable member was not present, but does count any bytes 
added for alignment padding, as this will be a fixed amount for each use of the 
struct. It is up to the application, like with variable argument lists, to 
establish a protocol that allows it to determine the effective size of the 
final member.
 
This transmission (including any attachments) may contain confidential 
information, privileged material (including material protected by the 
solicitor-client or other applicable privileges), or constitute non-public 
information. Any use of this information by anyone other than the intended 
recipient is p

Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Philip Guenther via austin-group-l at The Open Group
On Tue, Sep 1, 2020 at 6:22 AM Steffen Nurpmeso via austin-group-l at The
Open Group  wrote:

> Robert Elz via austin-group-l at The Open Group wrote in
>  <9252.1598969...@jinx.noi.kre.to>:
>  |Date:Tue, 1 Sep 2020 10:32:55 +0100
>  |From:"Geoff Clare via austin-group-l at The Open Group" \
>  |
>  |Message-ID:  <20200901093255.GA7629@localhost>
>   ...
>  |What's more important is what happens if the application buffer isn't
>  |big enough for the next entry.  What do the existing getdents()
>  |implementations do in that case?   If they're all the same then
>  |posix_getdent() should do the same thing (EINVAL?  E2BIG?) - if they
>  |differ, then we can decide what's best.
>
> Isn't that covered nicely by the posted text?  There must be space
> for at least one entry, otherwise EINVAL occurs?  And upon success
> "a non-negative integer shall be returned indicating the number of
> bytes occupied by the posix_dent structures placed in
> buf", which even for a non-native tongue implies that there
> may be pad left.  I think you are overcomplicating here.
>

A quick review of FreeBSD, NetBSD, and OpenBSD finds they all return EINVAL
if the buffer isn't "big enough".
For OpenBSD, the minimum buffer size is 512 bytes; if I'm reading it
correctly NetBSD is similar, possibly varying based on filesystem
formatting.
FreeBSD requires space for the next entry.


>  |Similarly for what is done for directory pieces that don't contain
>  |files, on filesystems that allow that (inode number == 0 or perhaps
>  |a file type for "dummy entry" or something, or whatever).
>
> I personally would say that these should be skipped.  The data is
> copied over to user buffers, and these entries are simply not
> copied.  That seems to be the best.  The Group does not seem to
> want to add DT_WHITEOUT or similar things.
>

DT_WHITEOUT is different, related to union mounts.

>  |Do the existing implementations ever return such things?   Do they
>
> I personally have not seen it, but this likely is a very
> filesystem dependent thing, which possibly even changes over time


At least NetBSD and OpenBSD will return an entry with d_ino == 0 if the
first entry in a block is removed.  I suspect others may do this as well;
glibc at least includes code to skip such entries in its generic readdir()
implementation.
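
For illustration only, here is roughly what "skip such entries" looks like on
the consumer side.  This is a sketch under assumptions: struct slot_dent and
next_live() are hypothetical names, the fixed-size d_name is chosen just to
keep the example short, and none of this is taken from glibc or any BSD.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout (not struct posix_dent); d_name is a
 * fixed-size array here only to keep the example compact. */
struct slot_dent {
    unsigned long  d_ino;      /* 0 marks a removed/unused slot */
    unsigned short d_reclen;   /* bytes from this record to the next */
    char           d_name[14];
};

/* Return the first record in [p, end) whose d_ino is non-zero, or NULL
 * if only dead slots remain.  Advances by d_reclen, never by indexing. */
static const struct slot_dent *next_live(const unsigned char *p,
                                         const unsigned char *end)
{
    while (p < end) {
        const struct slot_dent *de = (const struct slot_dent *)p;

        if (de->d_reclen == 0)      /* malformed record: stop, don't spin */
            break;
        if (de->d_ino != 0)
            return de;
        p += de->d_reclen;
    }
    return NULL;
}
```

If the interface permitted d_ino == 0 entries, every caller would need a loop
of this shape; if it forbade them, the same loop would move into the
implementation.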


The question really is "is this supposed to be a API that can be trivially
supported by all the existing versions, even if that makes it more clunky
to use, or should it be easy to use even if every single existing
implementation needs to bend?"

If the former, then define a minimum buffer size
(pathconf(_PC_DIRBUFMIN)...?), permit d_ino==0 as entries where d_name and
d_type are unspecified, and let d_name be either a fixed-size array or a
flexible array member.

If the latter, then require it to work with very small buffers, require all
entries to have valid d_name and d_type, and specify d_name as a FAM.


Philip Guenther


RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Wojtek Lerch via austin-group-l at The Open Group
Actually the intent was the opposite.  The original C99 did contain a wording 
that matches your interpretation:

… the size of the structure shall be equal to the offset of the last element of 
an otherwise identical structure that replaces the flexible array member with 
an array of unspecified length.

But this was reported as a defect, and corrected in TC2.


Summary
 6.7.2.1 Structure and union specifiers, paragraphs 15 and 16 require that any 
padding for alignment of a structure containing a flexible array member must 
preceed the flexible array member.  This contradicts existing implementations.  
We do not believe this was the intent of the C99 specification.

Details

If a struct contains a flexible array member and also requires padding for 
alignment, then the current C99 specification requires the implementation to 
put this padding before the flexible array member.  However, existing 
implementations, including at least GNU C, Compaq C, and Sun C, put the padding 
after the flexible array member.

The layout used by existing implementations can be more efficient. Furthermore, 
requiring these existing implementations to change their layout would break 
binary backwards compatibility with previous versions.

See DR282 for more details: 
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_282.htm


From: shwaresyst 
Sent: September 1, 2020 2:27 PM
To: Wojtek Lerch ; austin-group-l@opengroup.org
Subject: RE: [1003.1(2013)/Issue7+TC1 697]: Adding of a getdirentries() 
function


I agree some additional clarity might be useful there, in the C standard. I'm 
reading it as the intent being sizeof is equivalent to offsetof the VLA in 
accordance with the restrictions placed on it by use of the . or -> operators, 
which may not need extra bytes (so >vla == ( + sizeof(s)) is a truism, in 
other words) but it is not that specific.


On Tuesday, September 1, 2020 Wojtek Lerch <wle...@blackberry.com> wrote:

That sounds a little backwards – it’s everything else that works as if the 
flexible (not “variable”) member were not present.  The sizeof operator, as an 
exception, can return a greater value.  (The “.” and “->” operators are another 
exception.)



The standard does not say how much greater the value may be, or promise that it 
must be greater, even if padding is necessary to align the flexible member – as 
far as I can tell, sizeof(structure) can be less than offsetof(structure, 
flexible).



From: austin-group-l@opengroup.org
Sent: September 1, 2020 10:52 AM
To: g...@opengroup.org; austin-group-l@opengroup.org
Subject: Re: [1003.1(2013)/Issue7+TC1 697]: Adding of a getdirentries() 
function


It's my understanding, by C11 6.7.2.1p18, sizeof on a struct with a variable 
array works as if the variable member was not present, but does count any bytes 
added for alignment padding, as this will be a fixed amount for each use of the 
struct. It is up to the application, like with variable argument lists, to 
establish a protocol that allows it to determine the effective size of the 
final member.

This transmission (including any attachments) may contain confidential 
information, privileged material (including material protected by the 
solicitor-client or other applicable privileges), or constitute non-public 
information. Any use of this information by anyone other than the intended 
recipient is prohibited. If you have received this transmission in error, 
please immediately reply to the sender and delete this information from your 
system. Use, dissemination, distribution, or reproduction of this transmission 
by unintended recipients is not authorized and may be unlawful.




Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Philip Guenther via austin-group-l at The Open Group
On Tue, Sep 1, 2020 at 5:40 AM Geoff Clare via austin-group-l at The Open
Group  wrote:

> > --
> >  (0004958) philip-guenther (reporter) - 2020-08-30 23:06
> >  https://austingroupbugs.net/view.php?id=697#c4958
> > --
> > The proposed text includes:
> > The d_name member shall be a filename string, and (if not dot or
> > dot-dot) shall contain the same byte sequence as the last pathname
> > component of the string used to create the directory entry, plus the
> > terminating byte.
> >
> > That would seem to require that all returned entries correspond to
> > filenames that existed in the directory at _some_ point in time.
>
> It is just copied from existing text for readdir() in Issue 8 draft 1.
> See bug 293.
>

That part is fine, that the returned names match the creation names.  My
concern in that comment is that there's no requirement on posix_getdents()
to only return _currently_ existing names.

If a posix_getdents() implementation returned the names of all the files
that ever existed in the given directory, including ones that were removed
before the fd for this call was opened, what requirement in the standard
would that violate?  I don't see any, thus my suggested wording for such a
requirement.

Philip


RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread shwaresyst via austin-group-l at The Open Group

I agree some additional clarity might be useful there, in the C standard. I'm 
reading it as the intent being sizeof is equivalent to offsetof the VLA in 
accordance with the restrictions placed on it by use of the . or -> operators, 
which may not need extra bytes (so >vla == ( + sizeof(s)) is a truism, in 
other words) but it is not that specific.
On Tuesday, September 1, 2020 Wojtek Lerch  wrote:
That sounds a little backwards – it’s everything else that works as if the 
flexible (not “variable”) member were not present.  The sizeof operator, as an 
exception, can return a greater value.  (The “.” and “->” operators are another 
exception.)
 
  
 
The standard does not say how much greater the value may be, or promise that it 
must be greater, even if padding is necessary to align the flexible member – as 
far as I can tell, sizeof(structure) can be less than offsetof(structure, 
flexible).
 

 
  
 
From: austin-group-l@opengroup.org 
Sent: September 1, 2020 10:52 AM
To: g...@opengroup.org; austin-group-l@opengroup.org
Subject: Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() 
function
 
  
 

It's my understanding, by C11 6.7.2.1p18, sizeof on a struct with a variable 
array works as if the variable member was not present, but does count any bytes 
added for alignment padding, as this will be a fixed amount for each use of the 
struct. It is up to the application, like with variable argument lists, to 
establish a protocol that allows it to determine the effective size of the 
final member.


RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Wojtek Lerch via austin-group-l at The Open Group
That sounds a little backwards – it’s everything else that works as if the 
flexible (not “variable”) member were not present.  The sizeof operator, as an 
exception, can return a greater value.  (The “.” and “->” operators are another 
exception.)

The standard does not say how much greater the value may be, or promise that it 
must be greater, even if padding is necessary to align the flexible member – as 
far as I can tell, sizeof(structure) can be less than offsetof(structure, 
flexible).
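
A small sketch of what this means when allocating such a structure.  The
type struct rec and the helpers rec_bytes() and rec_new() are hypothetical,
made up for this example.  Since nothing is assumed here about how
sizeof(struct rec) compares with offsetof(struct rec, vals), the allocation
size is taken as the larger of the two candidates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
struct rec {
    short  tag;
    double vals[];              /* flexible array member */
};

/* Bytes to allocate for a struct rec carrying n doubles.  The
 * offsetof-based sum covers the array; taking the maximum with
 * sizeof(struct rec) avoids relying on how the two values compare.
 * (Overflow of the multiplication is not checked in this sketch.) */
static size_t rec_bytes(size_t n)
{
    size_t need = offsetof(struct rec, vals) + n * sizeof(double);

    return need > sizeof(struct rec) ? need : sizeof(struct rec);
}

/* Allocate a record and copy n values into its flexible array member.
 * The caller frees the result with free(). */
static struct rec *rec_new(short tag, const double *src, size_t n)
{
    struct rec *r;

    assert(src != NULL || n == 0);
    r = malloc(rec_bytes(n));
    if (r == NULL)
        return NULL;
    r->tag = tag;
    if (n > 0)
        memcpy(r->vals, src, n * sizeof(double));
    return r;
}
```

The common idiom "sizeof(struct rec) + n * sizeof(double)" also works and may
over-allocate by the tail padding; the version above just avoids depending on
that relationship.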

From: austin-group-l@opengroup.org 
Sent: September 1, 2020 10:52 AM
To: g...@opengroup.org; austin-group-l@opengroup.org
Subject: Re: [1003.1(2013)/Issue7+TC1 697]: Adding of a getdirentries() 
function


It's my understanding, by C11 6.7.2.1p18, sizeof on a struct with a variable 
array works as if the variable member was not present, but does count any bytes 
added for alignment padding, as this will be a fixed amount for each use of the 
struct. It is up to the application, like with variable argument lists, to 
establish a protocol that allows it to determine the effective size of the 
final member.



RE: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Wojtek Lerch via austin-group-l at The Open Group
Geoff Clare wrote:
> We can't require d_name in struct dirent to be a VLA since there are 
> implementations where it is not.

Another good reason is that standard C does not allow structure members to be 
VLAs.

C11 6.7.2.1#9 "A member of a structure or union may have any complete object 
type other than a variably modified type."

If implementations that define d_name as a VLA do in fact exist, they'd have to 
use some strange compiler extension.  (GCC does allow VLAs in structures, but 
only when the struct is defined inside a function -- a typedef in a header will 
not work.)

A structure member can be a "flexible array" in standard C, but that's not the 
same thing as a VLA.
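
To make the distinction concrete, a short sketch (the names struct name_rec,
vla_bytes() and fam_struct_bytes() are invented for this example, and it
assumes a compiler that supports VLAs, i.e. one that does not define
__STDC_NO_VLA__):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: the last member is a flexible array member.  It
 * has an incomplete array type; it is not a variable length array. */
struct name_rec {
    unsigned short len;
    char           name[];
};

/* A VLA is an array with automatic storage whose length is evaluated
 * at run time, and sizeof applied to it is evaluated at run time too. */
static size_t vla_bytes(size_t n)
{
    assert(n > 0);              /* a VLA must have a positive length */
    char scratch[n];            /* variable length array */

    scratch[0] = '\0';
    return sizeof scratch;      /* equals n */
}

/* The struct's size is a compile-time constant; it does not change
 * with however much storage is later allocated for name[]. */
static size_t fam_struct_bytes(void)
{
    return sizeof(struct name_rec);
}
```

Declaring "char name[n];" inside struct name_rec at file scope would be
rejected, which is the constraint quoted from 6.7.2.1 above.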




Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread shwaresyst via austin-group-l at The Open Group

No, it couldn't introduce such a macro, because such would have to assume all 
d_name entries are the same length. Adding an option to the interface to do a 
count, as a vararg parameter, and directly malloc the necessary space, returned 
via my suggested change to buf as a **, is plausible. Since we are merging 
common behaviors with this interface introduction, not describing a single 
reference implementation, such changes are permitted if someone commits to 
doing an implementation, afaik.
On Tuesday, September 1, 2020 Steffen Nurpmeso via austin-group-l at The Open 
Group  wrote:
Geoff Clare via austin-group-l at The Open Group wrote in
 <20200901143300.GB24606@localhost>:
 |> -- 
 |>  (0004953) philip-guenther (reporter) - 2020-08-28 22:52
 |>  https://www.austingroupbugs.net/view.php?id=697#c4953 
 |> -- 
 |> I think the unspecified nature of the d_name member in the new posix_dent
 |> makes writing portable software more difficult while providing only \
 |> minimal
 |> benefit to programs that don't care.  I would support requiring it \
 |> to be a
 |> flexible array member and thus eliminating the error of declaring \
 |> an array
 |> and trying to walk it via indexing instead of by advancing a char pointer
 |> by d_reclen.
 |
 |I think we should keep the requirements for d_name the same between
 |struct dirent and struct posix_dent.  Some implementations of
 |getdents() and getdirentries() use struct dirent and they should be
 |able to make posix_getdents() a synonym (or a light wrapper) for the
 |existing function by making struct posix_dent be identical to struct
 |dirent.  We can't require d_name in struct dirent to be a VLA since
 |there are implementations where it is not.

The standard could also introduce a macro which could be used to
space a buffer accordingly, something like (very ugly)
POSIX_GETDENTS_BYTES_FOR_DENTS(number-of-desired-dents), and use
it in the example.
Like that any possible errors with buffer space allocation would
not even be introduced (except for possible integer overflows,
maybe).

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter          he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Geoff Clare via austin-group-l at The Open Group
shwaresyst wrote, on 01 Sep 2020:
> 
> It's my understanding, by C11 6.7.2.1p18, sizeof on a struct with a variable 
> array works as if the variable member was not present

Thanks for the pointer. 

So it looks like removing "or equal to" is all that is needed here.

> On Tuesday, September 1, 2020 Geoff Clare via austin-group-l at The Open 
> Group  wrote:
> Per Mildner wrote, on 30 Aug 2020:
> >
> > The posix_getdents() function shall ... place ... posix_dent structures in 
> > the buffer pointed to by buf up to a maximum of nbyte bytes"
> > "The array d_name ... shall contain a filename of at most {NAME_MAX} bytes 
> > followed by a terminating null byte." (so could need up to {NAME_MAX} + 1 
> > bytes).
> > "Implementations may define the d_name array .. to ... use a flexible array 
> > member" (meaning the d_name array does not affect the "size of the 
> > posix_dent structure").
> > 
> > Does the above not imply that the following should use "greater than", 
> > rather than "greater than or equal", to make room for "a terminating null 
> > byte"?
> > 
> > "The number of posix_dent structures populated in buf ... shall be at least 
> > one if nbyte is greater than or equal to the size of the posix_dent 
> > structure plus {NAME_MAX} ..."
> > 
> 
> Good catch. This text predates the stuff about d_name possibly being a
> flexible array member, and needs updating.  For now I have marked "or
> equal to" for deletion in the etherpad, but I think there is still a
> problem with "size of the posix_dent structure" as you can't use
> sizeof(struct posix_dent) if d_name is a flexible array member.
> 
> To be correct it would have to distinguish the two cases:
> 
>     ... greater than {NAME_MAX} plus
>     * the size of the posix_dent structure, if d_name is not a
>       flexible array member, or
>     * the offset of d_name in the posix_dent structure, if d_name is a
>       flexible array member.
> 
> but I'm not sure how useful this would be to applications.  In any case
> it would be highly unusual for an application to use such a small buffer.
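
As a rough illustration of the two cases listed above, the threshold could be
computed like this.  Both struct layouts are hypothetical stand-ins for struct
posix_dent (the member types are assumptions), and the fallback value for
NAME_MAX is only there so the sketch compiles where <limits.h> does not
define it.

```c
#include <limits.h>
#include <stddef.h>

#ifndef NAME_MAX
#define NAME_MAX 255            /* assumption: fallback for this sketch only */
#endif

/* Hypothetical layout with d_name as a fixed-size array. */
struct dent_fixed {
    unsigned long  d_ino;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[NAME_MAX + 1];
};

/* Hypothetical layout with d_name as a flexible array member. */
struct dent_fam {
    unsigned long  d_ino;
    unsigned short d_reclen;
    unsigned char  d_type;
    char           d_name[];
};

/* Smallest nbyte that is "greater than {NAME_MAX} plus the size of the
 * structure" (d_name not a flexible array member). */
static size_t min_nbyte_fixed(void)
{
    return sizeof(struct dent_fixed) + NAME_MAX + 1;
}

/* Smallest nbyte that is "greater than {NAME_MAX} plus the offset of
 * d_name" (d_name a flexible array member): room for the header, a
 * NAME_MAX-byte name and its terminating null byte. */
static size_t min_nbyte_fam(void)
{
    return offsetof(struct dent_fam, d_name) + NAME_MAX + 1;
}
```

An application cannot tell from the header alone which case applies, which is
part of why the usefulness of this wording is questioned above.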

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Steffen Nurpmeso via austin-group-l at The Open Group
Geoff Clare via austin-group-l at The Open Group wrote in
 <20200901143300.GB24606@localhost>:
 |> -- 
 |>  (0004953) philip-guenther (reporter) - 2020-08-28 22:52
 |>  https://www.austingroupbugs.net/view.php?id=697#c4953 
 |> -- 
 |> I think the unspecified nature of the d_name member in the new posix_dent
 |> makes writing portable software more difficult while providing only \
 |> minimal
 |> benefit to programs that don't care.  I would support requiring it \
 |> to be a
 |> flexible array member and thus eliminating the error of declaring \
 |> an array
 |> and trying to walk it via indexing instead of by advancing a char pointer
 |> by d_reclen.
 |
 |I think we should keep the requirements for d_name the same between
 |struct dirent and struct posix_dent.  Some implementations of
 |getdents() and getdirentries() use struct dirent and they should be
 |able to make posix_getdents() a synonym (or a light wrapper) for the
 |existing function by making struct posix_dent be identical to struct
 |dirent.  We can't require d_name in struct dirent to be a VLA since
 |there are implementations where it is not.

The standard could also introduce a macro which could be used to
space a buffer accordingly, something like (very ugly)
POSIX_GETDENTS_BYTES_FOR_DENTS(number-of-desired-dents), and use
it in the example.
Like that any possible errors with buffer space allocation would
not even be introduced (except for possible integer overflows,
maybe).

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Steffen Nurpmeso via austin-group-l at The Open Group
Robert Elz via austin-group-l at The Open Group wrote in
 <9252.1598969...@jinx.noi.kre.to>:
 |Date:Tue, 1 Sep 2020 10:32:55 +0100
 |From:"Geoff Clare via austin-group-l at The Open Group" \
 |
 |Message-ID:  <20200901093255.GA7629@localhost>
 |
 || but I'm not sure how useful this would be to applications.  In any case
 || it would be highly unusual for an application to use such a small buffer.
 |
 |I would suggest that we stop worrying about telling applications how
 |to write code to use the interfaces, or at least until the interface is
 |properly specified.

But the interfaces are decades old, isn't that wording too strong?

   if(_impl->cast.ui1p < _impl->maxcast || a_FillBuffer(_impl)) {
      de = _impl->cast.dep;
      _impl->cast.ui1p += de->d_reclen;
   } else
      de = NIL;

This code worked two decades ago, and it would be great if it
would still work in two decades from now on.

   _impl->cast.ui1p = _impl->buffer;
   _impl->maxcast = _impl->buffer + u.osret;

where u.osret is "what posix_getdents()" returned (positively).
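
For readers without the surrounding class, the same walk can be sketched as
plain C.  Everything named demo_* here is hypothetical: struct demo_dent only
approximates a posix_dent-like layout, and demo_put() stands in for whatever
fills the buffer.  The buffer is assumed to be suitably aligned (for example,
obtained from malloc()).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record layout approximating a posix_dent-like entry. */
struct demo_dent {
    unsigned long  d_ino;
    unsigned short d_reclen;    /* bytes from this record to the next */
    unsigned char  d_type;
    char           d_name[];
};

/* Append one record at buf + off and return the offset of the next
 * one.  Stands in for the system call that would fill the buffer. */
static size_t demo_put(unsigned char *buf, size_t off,
                       unsigned long ino, const char *name)
{
    size_t align = _Alignof(struct demo_dent);
    size_t len = offsetof(struct demo_dent, d_name) + strlen(name) + 1;
    struct demo_dent *de = (struct demo_dent *)(buf + off);

    len = (len + align - 1) / align * align;    /* keep records aligned */
    de->d_ino = ino;
    de->d_reclen = (unsigned short)len;
    de->d_type = 0;
    strcpy(de->d_name, name);
    return off + len;
}

/* Walk nbytes of records by advancing a byte pointer by d_reclen.
 * Returns the number of entries; *name_bytes gets the summed lengths. */
static size_t demo_count(const unsigned char *buf, size_t nbytes,
                         size_t *name_bytes)
{
    const unsigned char *p = buf, *end = buf + nbytes;
    size_t n = 0;

    *name_bytes = 0;
    while (p < end) {
        const struct demo_dent *de = (const struct demo_dent *)p;

        if (de->d_reclen == 0)  /* malformed record: stop, don't spin */
            break;
        *name_bytes += strlen(de->d_name);
        n++;
        p += de->d_reclen;
    }
    return n;
}
```

Here nbytes plays the role of u.osret: the end pointer comes from the
returned byte count, not from the size of the buffer that was passed in.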

 |What's more important is what happens if the application buffer isn't
 |big enough for the next entry.  What do the existing getdents()
 |implementations do in that case?   If they're all the same then
 |posix_getdent() should do the same thing (EINVAL?  E2BIG?) - if they
 |differ, then we can decide what's best.

Isn't that covered nicely by the posted text?  There must be space
for at least one entry, otherwise EINVAL occurs?  And upon success
"a non-negative integer shall be returned indicating the number of
bytes occupied by the posix_dent structures placed in
buf", which even for a non-native tongue implies that there
may be pad left.  I think you are overcomplicating here.

 |Similarly for what is done for directory pieces that don't contain
 |files, on filesystems that allow that (inode number == 0 or perhaps
 |a file type for "dummy entry" or something, or whatever).

I personally would say that these should be skipped.  The data is
copied over to user buffers, and these entries are simply not
copied.  That seems to be the best.  The Group does not seem to
want to add DT_WHITEOUT or similar things.

With directory entries you always have races, as you of course
surely know, so any data you see can anyway only be an indication,
the standard itself talks about races regarding this (and
introduced the "at" series to overcome some of them).

 |Do the existing implementations ever return such things?   Do they

I personally have not seen it, but this likely is a very
filesystem dependent thing, which possibly even changes over time.

 |hide them by making the reclen of the previous entry (if there is
 |one in the buffer) bigger, or do they squash them out, moving the
 |next existing entry down to follow immediately after the previous one
 |(where all the reclen's are as small as possible to contain the
 |struct header, the name (and its \0) and alignment padding.)   This is
 |a case where we don't necessarily need to specify one scheme that
 |must be used - we can leave that for the implementation, as long as
 |applications are informed what might happen.

The proposed text says that filenames are NUL terminated and
hopping from entry to entry happens by adding the reclen to the
current entry (cast to char*).  So it seems there could be data
in between.  Empty names are not allowed, so this.  As some
getdent implementations used to use d_ino fields, others d_fileno,
it may be necessary anyway to create a very small posix_getdents()
system call wrapper, one which boils down the huge number of
filesystem informations to what posix_dent actually serves?

 |If after all of this (and perhaps more) is worked out, if there is
 |an example application fragment that can usefully be included to
 |demonstrate how the interface might be used, then fine - but this
 |is a bonus extra, not really required in the standard.

It is wonderful you say this, having a way to directly read
directory content into buffers without having to use these other
functions, which may perform memory allocations which may impose
locking noise etc etc, and getting the d_type field directly as
well, and you know _how_ terrible it was to write that code in the
past, where you possibly even had to have valid path names around
in order to stat(2) a directory entry, at least for the
theoretical case that d_type does not exist!

Thanks,

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Geoff Clare via austin-group-l at The Open Group
> -- 
>  (0004958) philip-guenther (reporter) - 2020-08-30 23:06
>  https://austingroupbugs.net/view.php?id=697#c4958 
> -- 
> The proposed text includes:
> The d_name member shall be a filename string, and (if not dot or
> dot-dot) shall contain the same byte sequence as the last pathname
> component of the string used to create the directory entry, plus the
> terminating <NUL> byte.
> 
> That would seem to require that all returned entries correspond to
> filenames that existed in the directory at _some_ point in time.

It is just copied from existing text for readdir() in Issue 8 draft 1.
See bug 293.

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Geoff Clare via austin-group-l at The Open Group
> -- 
>  (0004953) philip-guenther (reporter) - 2020-08-28 22:52
>  https://www.austingroupbugs.net/view.php?id=697#c4953 
> -- 
> I think the unspecified nature of the d_name member in the new posix_dent
> makes writing portable software more difficult while providing only minimal
> benefit to programs that don't care.  I would support requiring it to be a
> flexible array member and thus eliminating the error of declaring an array
> and trying to walk it via indexing instead of by advancing a char pointer
> by d_reclen.

I think we should keep the requirements for d_name the same between
struct dirent and struct posix_dent.  Some implementations of
getdents() and getdirentries() use struct dirent and they should be
able to make posix_getdents() a synonym (or a light wrapper) for the
existing function by making struct posix_dent be identical to struct
dirent.  We can't require d_name in struct dirent to be a flexible
array member since there are implementations where it is not.

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Geoff Clare via austin-group-l at The Open Group
> -- 
>  (0004949) kre (reporter) - 2020-08-28 17:52
>  https://www.austingroupbugs.net/view.php?id=697#c4949 
> -- 
> I suspect that the intent of all of this is good, but this one
> phrase's wording (wrt the d_name field, it is used for that
> field in both struct dirent and struct posix_dent):
> 
>  but shall contain a filename of at most {NAME_MAX} bytes
> 
> is incomprehensible to me.   I can read it as saying
> 
>  no file names longer than NAME_MAX bytes can ever occur in a d_name
> or as
>  the d_name field must be able to contain file names at least
>  NAME_MAX bytes long
> 
> I suspect that the latter is most likely what was intended, but I really
> don't know for sure - that one allows implementations to support
> filesystems where directories might contain names longer than can be
> regularly passed to system calls, for example, and generally allowing
> implementation extensions is desirable, but the former allows an
> application to declare an array of NAME_MAX+1 bytes and be sure that any
> d_name entry will fit.

The current wording is from bug 291 - see bugnote 578 which includes this:

At line 7577 (XBD dirent.h DESCRIPTION), change:

The character array d_name is of unspecified size, but the number of
bytes preceding the terminating null byte shall not exceed {NAME_MAX}.

to:

The array d_name is of unspecified size, but shall contain a filename
of at most {NAME_MAX} bytes followed by a terminating null byte.

If you want to get this changed, you should look back at the discussions
that led to the above change to understand the reasons behind it, and
then submit a separate Mantis bug in the Issue7+TC2 project.

Bug 697 is not the right place to make extra changes of this nature - it
should be limited to what's needed to add posix_getdents().

> 
> This phrase ought to be reworded to make it clear what is intended there.
> Since it is talking about a variable length array, better phrasing would
> probably concentrate on what bounds exist for the size of that array,
> rather than the length of what might be stored within it.

The array itself, in the struct dirent definition, can be 1 byte in size
(as stated in existing RATIONALE).  It can also be a flexible array member,
which has no size.  So talking about the size of the array is not
meaningful.  The only thing that matters is how many bytes are stored at
that address.

> And while I'm here, I don't think we need to be providing C tutorials,
> so I'd drop all the stuff about arrays of posix_dent structures
> completely - attempting to write a program using such a thing would be
> folly 

The reason for including it is that some implementations might have
d_name as full size, others might use the d_name[1] trick or have it
as a flexible array member.  An application writer who develops an
application on an implementation with a full size d_name might use:

struct posix_dent buf[100];

and it would work fine on that system, but would cause problems if
it is later ported to other systems.  This is entirely in keeping with
the purpose of the APPLICATION USAGE section, i.e. to warn about
potential portability issues.
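
As a rough illustration of that warning, the sketch below uses a
hypothetical struct demo_dent (a stand-in for struct posix_dent, whose
real layout is left to the implementation) with d_name as a flexible
array member.  The helpers demo_put() and demo_next() are invented for
this example.  demo_next() advances by d_reclen, which is the step the
proposed text relies on, whereas indexing an array of structures steps
by sizeof instead.  It assumes a C11 compiler for _Alignof.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct posix_dent; the member types are
 * assumptions made for this sketch only. */
struct demo_dent {
    unsigned long long d_ino;
    unsigned short     d_reclen;  /* total size of this record in bytes */
    unsigned char      d_type;
    char               d_name[];  /* flexible array member */
};

/* Hypothetical helper: write one record at offset off in buf and pad it
 * so that the next record is suitably aligned.  Returns the offset just
 * past the new record.  The caller guarantees buf is aligned and large
 * enough. */
static size_t demo_put(unsigned char *buf, size_t off,
                       unsigned long long ino, const char *name)
{
    size_t align = _Alignof(struct demo_dent);
    size_t len = offsetof(struct demo_dent, d_name) + strlen(name) + 1;
    struct demo_dent *d = (struct demo_dent *)(buf + off);

    len = (len + align - 1) / align * align;  /* round up for alignment */
    d->d_ino = ino;
    d->d_reclen = (unsigned short)len;
    d->d_type = 0;
    strcpy(d->d_name, name);
    return off + len;
}

/* Portable step: advance a char pointer by d_reclen. */
static const struct demo_dent *demo_next(const struct demo_dent *d)
{
    return (const struct demo_dent *)((const char *)d + d->d_reclen);
}
```

When the first name is longer than a few bytes, first + 1 and
demo_next(first) point at different addresses, so buf[1] on an array of
structures would land in the middle of the first record.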

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Robert Elz via austin-group-l at The Open Group
Date:Tue, 1 Sep 2020 10:32:55 +0100
From:"Geoff Clare via austin-group-l at The Open Group" 

Message-ID:  <20200901093255.GA7629@localhost>


  | but I'm not sure how useful this would be to applications.  In any case
  | it would be highly unusual for an application to use such a small buffer.

I would suggest that we stop worrying about telling applications how
to write code to use the interfaces, or at least until the interface is
properly specified.

What's more important is what happens if the application buffer isn't
big enough for the next entry.   What do the existing getdents()
implementations do in that case?   If they're all the same then
posix_getdents() should do the same thing (EINVAL?  E2BIG?) - if they
differ, then we can decide what's best.

Similarly for what is done for directory pieces that don't contain
files, on filesystems that allow that (inode number == 0 or perhaps
a file type for "dummy entry" or something, or whatever).

Do the existing implementations ever return such things?   Do they
hide them by making the reclen of the previous entry (if there is
one in the buffer) bigger, or do they squash them out, moving the
next existing entry down to follow immediately after the previous one
(where all the reclen's are as small as possible to contain the
struct header, the name (and its \0) and alignment padding.)   This is
a case where we don't necessarily need to specify one scheme that
must be used - we can leave that for the implementation, as long as
applications are informed what might happen.

If after all of this (and perhaps more) is worked out, if there is
an example application fragment that can usefully be included to
demonstrate how the interface might be used, then fine - but this
is a bonus extra, not really required in the standard.

kre



Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-09-01 Thread Geoff Clare via austin-group-l at The Open Group
Per Mildner wrote, on 30 Aug 2020:
>
> "The posix_getdents() function shall ... place ... posix_dent structures in
> the buffer pointed to by buf up to a maximum of nbyte bytes"
> "The array d_name ... shall contain a filename of at most {NAME_MAX} bytes 
> followed by a terminating null byte." (so could need up to {NAME_MAX} + 1 
> bytes).
> "Implementations may define the d_name array .. to ... use a flexible array 
> member" (meaning the d_name array does not affect the "size of the posix_dent 
> structure").
> 
> Does the above not imply that the following should use "greater than", rather 
> than "greater than or equal", to make room for "a terminating null byte"?
> 
> "The number of posix_dent structures populated in buf ... shall be at least 
> one if nbyte is greater than or equal to the size of the posix_dent structure 
> plus {NAME_MAX} ..."
> 

Good catch. This text predates the stuff about d_name possibly being a
flexible array member, and needs updating.  For now I have marked "or
equal to" for deletion in the etherpad, but I think there is still a
problem with "size of the posix_dent structure" as you can't use
sizeof(struct posix_dent) if d_name is a flexible array member.

To be correct it would have to distinguish the two cases:

... greater than {NAME_MAX} plus
* the size of the posix_dent structure, if d_name is not a
  flexible array member, or
* the offset of d_name in the posix_dent structure, if d_name is a
  flexible array member.

but I'm not sure how useful this would be to applications.  In any case
it would be highly unusual for an application to use such a small buffer.
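
The two cases can be written out as a small sketch.  Both structure
layouts below are hypothetical (the member types and the 256-byte array
are assumptions made for illustration), and each function returns the
smallest nbyte that is strictly greater than the threshold for its case.

```c
#include <stddef.h>

/* Hypothetical layout with d_name declared at full size. */
struct dent_fixed {
    unsigned long long d_ino;
    unsigned short     d_reclen;
    unsigned char      d_type;
    char               d_name[256];
};

/* Hypothetical layout with d_name as a flexible array member. */
struct dent_fam {
    unsigned long long d_ino;
    unsigned short     d_reclen;
    unsigned char      d_type;
    char               d_name[];
};

/* Case 1: nbyte must be greater than sizeof(struct) plus NAME_MAX. */
static size_t min_nbyte_fixed(size_t name_max)
{
    return sizeof(struct dent_fixed) + name_max + 1;
}

/* Case 2: nbyte must be greater than the offset of d_name plus NAME_MAX. */
static size_t min_nbyte_fam(size_t name_max)
{
    return offsetof(struct dent_fam, d_name) + name_max + 1;
}
```

With these layouts the flexible-array case yields the smaller bound,
since the name storage is not counted twice.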

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-30 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-30 23:44 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004960) shware_systems (reporter) - 2020-08-30 23:44
 https://austingroupbugs.net/view.php?id=697#c4960 
-- 
For file systems that relate inode values to start sector index, inode 0 is
the superblock for that volume, and entirely valid. Saying it is illegal I
don't see happening; that code is simply buggy, conceptually. The safer
value to use is (ino_t)-1, if it is to be overloaded that way, but defining
a DT_DELETED value for d_type is even safer. 





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-30 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-30 23:11 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004959) philip-guenther (reporter) - 2020-08-30 23:11
 https://austingroupbugs.net/view.php?id=697#c4959 
-- 
Oh, and if (b) is chosen and d_ino==0 means an entry that should be
skipped, then XBD's entry for "File Serial Number" should be updated to
indicate that no file will have a file serial number of zero.  (It appears
that the standard does not currently reserve that value.) 





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-30 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-30 23:06 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004958) philip-guenther (reporter) - 2020-08-30 23:06
 https://austingroupbugs.net/view.php?id=697#c4958 
-- 
The proposed text includes:
The d_name member shall be a filename string, and (if not dot or
dot-dot) shall contain the same byte sequence as the last pathname
component of the string used to create the directory entry, plus the
terminating <NUL> byte.

That would seem to require that all returned entries correspond to
filenames that existed in the directory at _some_ point in time.  However,
I see no requirement that posix_getdents() only return currently existing
files, nor any requirement that non-existing files be flagged in any way. 
The former at least seems like an oversight that should be corrected.

I interpret the definition of readdir(), including its description of the
DIR type:
The type DIR, which is defined in the <dirent.h> header, represents a
directory stream, which is an ordered sequence of all the directory
entries in a particular directory.
and later text as effectively requiring that readdir() may not return an
entry for a file that did not exist at some point between when opendir() or
rewinddir() was last called and when readdir() returned it.

Since posix_getdents() is all about bulk transfer with no buffering, I
think its description should be updated EITHER to require that
a) all returned entries must have existed at some point during the call,
OR
b) all returned entries with d_ino != 0 must have existed at some point
   during the call and specify that entries with d_ino == 0 may have
   d_name[0] == '\0'

Specifying (b) is more in line with historical BSD behavior, but does
require additional application logic, so I'm sympathetic to a view that
specifying (a) is the cleaner choice. 
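
The additional application logic that option (b) implies is small.  A
minimal sketch, assuming a hypothetical struct skip_dent in place of
struct posix_dent and a made-up helper count_live(): the caller walks
the buffer by d_reclen and ignores any record whose d_ino is zero.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct posix_dent; layout is an assumption. */
struct skip_dent {
    unsigned long long d_ino;
    unsigned short     d_reclen;  /* total size of this record in bytes */
    unsigned char      d_type;
    char               d_name[];
};

/* Count the records in buf[0..nbytes) that refer to existing files,
 * treating d_ino == 0 as "skip this entry" (historical BSD convention). */
static size_t count_live(const void *buf, size_t nbytes)
{
    const char *p = buf;
    const char *end = p + nbytes;
    size_t live = 0;

    while (p < end) {
        const struct skip_dent *d = (const struct skip_dent *)p;

        /* Defensive checks: a zero or oversized d_reclen would loop
         * forever or run past the end of the buffer. */
        if (d->d_reclen == 0 || d->d_reclen > (size_t)(end - p))
            break;
        if (d->d_ino != 0)
            live++;
        p += d->d_reclen;
    }
    return live;
}
```

Under option (a) the d_ino test would simply never fire, so the same
loop would work unchanged.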


Re: [1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-30 Thread Per Mildner via austin-group-l at The Open Group
"The posix_getdents() function shall ... place ... posix_dent structures in the 
buffer pointed to by buf up to a maximum of nbyte bytes"
"The array d_name ... shall contain a filename of at most {NAME_MAX} bytes 
followed by a terminating null byte." (so could need up to {NAME_MAX} + 1 
bytes).
"Implementations may define the d_name array .. to ... use a flexible array 
member" (meaning the d_name array does not affect the "size of the posix_dent 
structure").

Does the above not imply that the following should use "greater than", rather 
than "greater than or equal", to make room for "a terminating null byte"?

"The number of posix_dent structures populated in buf ... shall be at least one 
if nbyte is greater than or equal to the size of the posix_dent structure plus 
{NAME_MAX} ..."



Per Mildner
Ph.D.
Digital Systems
Department Computer Science
Unit Computer Systems

D: +46 10 228 43 11
per.mild...@ri.se

RISE Research Institutes of Sweden | ri.se




[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-29 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-29 13:07 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004957) shware_systems (reporter) - 2020-08-29 13:07
 https://austingroupbugs.net/view.php?id=697#c4957 
-- 
Re: 4952
Packing of records does not imply record removal, unless flags are defined
for this, such as my suggestions. So, seeing inode values of zero isn't
precluded. 


The only case I see where this might not hold is if a file system does not
report an entry as part of the "file" because it has failed some file
system specific validity test. Something like this would be invisible to
the implementation, however. 





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-29 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-29 11:34 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004955) shware_systems (reporter) - 2020-08-29 11:34
 https://austingroupbugs.net/view.php?id=697#c4955 
-- 
As written the intent is more the former. The function attempts to pack
into buf as many records as will fit, only adding padding between records to
satisfy alignment requirements for accessing the d_ino field of a
following record. If ino_t is 64 bits, the d_reclen field will probably be
a multiple of 8 to reflect this. In this scenario d_reclen, and the length
of the record, may be as low as 16.

To ensure success for the corner case where a filename is the maximum
length, the buffer passed in should be at least large enough to hold one of
these records; a malloc(sizeof(struct posix_dent) + fpathconf(fildes,
_PC_NAME_MAX) + sizeof(ino_t)) should suffice. Normally the buffer will be
much larger, however, based on the size reported by a fstatat(fildes, ".")
call. 





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-29 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-29 11:34 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004956) shware_systems (reporter) - 2020-08-29 11:34
 https://austingroupbugs.net/view.php?id=697#c4956 
-- 
As written the intent is more the former. The function attempts to pack
into buf as many records as will fit, only adding padding between records to
satisfy alignment requirements for accessing the d_ino field of a
following record. If ino_t is 64 bits, the d_reclen field will probably be
a multiple of 8 to reflect this. In this scenario d_reclen, and the length
of the record, may be as low as 16.

To ensure success for the corner case where a filename is the maximum
length, the buffer passed in should be at least large enough to hold one of
these records; a malloc(sizeof(struct posix_dent) + fpathconf(fildes,
_PC_NAME_MAX) + sizeof(ino_t)) should suffice. Normally the buffer will be
much larger, however, based on the size reported by a fstatat(fildes, ".")
call. 





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-28 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://www.austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-28 22:52 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004953) philip-guenther (reporter) - 2020-08-28 22:52
 https://www.austingroupbugs.net/view.php?id=697#c4953 
-- 
I think the unspecified nature of the d_name member in the new posix_dent
makes writing portable software more difficult while providing only minimal
benefit to programs that don't care.  I would support requiring it to be a
flexible array member and thus eliminating the error of declaring an array
and trying to walk it via indexing instead of by advancing a char pointer
by d_reclen.

(The unspecified nature of d_name in struct dirent made using readdir_r()
unnecessarily painful; thankfully the push to use that API is dead with the
wide recognition that readdir() is thread safe on a per-DIR basis, but the
memory of that pain lingers.) 





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-28 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://www.austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-28 22:42 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004952) philip-guenther (reporter) - 2020-08-28 22:42
 https://www.austingroupbugs.net/view.php?id=697#c4952 
-- 
I believe all the historical versions of this interface can return entries
that did not exist (any longer) and that the caller is expected to skip. 
At least on the BSDs these are the entries with d_ino == 0.  I believe
this had a positive side effect of making readdir() work "more often" after
a seekdir() to the position of a file deleted after the telldir(), as the
position could remain valid despite the deletion.  I take it that this new
interface intentionally does not support this and that the implementation
is required to only expose valid entries. 





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-28 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://www.austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-28 17:52 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004949) kre (reporter) - 2020-08-28 17:52
 https://www.austingroupbugs.net/view.php?id=697#c4949 
-- 
I suspect that the intent of all of this is good, but the wording of this
one phrase (about the d_name field; it is used for that field in both
struct dirent and struct posix_dent):

 but shall contain a filename of at most {NAME_MAX} bytes

is incomprehensible to me.   I can read it as saying

 no file names longer than NAME_MAX bytes can ever occur in a d_name
or as
 the d_name field must be able to contain file names at least
 NAME_MAX bytes long

I suspect that the latter is most likely what was intended, but I really
don't know for sure. That reading allows implementations to support
filesystems where directories might contain names longer than can be
regularly passed to system calls, for example, and generally allowing
implementation extensions is desirable, but the former allows an
application to declare an array of NAME_MAX+1 bytes and be sure that any
d_name entry will fit.

This phrase ought to be reworded to make it clear what is intended there.
Since it is talking about a variable length array, better phrasing would
probably concentrate on what bounds exist for the size of that array,
rather than the length of what might be stored within it.

And while I'm here, I don't think we need to be providing C tutorials,
so I'd drop all the stuff about arrays of posix_dent structures
completely. Attempting to write a program using such a thing would be
folly, as it is for any variable sized objects: arrays (in C at least)
require equal sized objects so that addresses can be calculated by
arithmetic on the index. The technique for handling several dirent or
posix_dent (or anything similar) objects that are to be combined into a
single linear data struct is the sequence (not a C data type...), where
each item follows on after the one before it, and the only way (without an
ancillary data struct) to locate the N'th item is to examine each of the
preceding N-1 in order. (Network packets are full of this kind of object.)
But none of this needs to be in the standard: how to write code is the
application programmer's problem, and text books or "how to" or "xxx for
dummies", or whatever, are the places where programming techniques belong.
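
To make the "sequence" technique described above concrete, here is a minimal
sketch. It does not use posix_dent; the record layout (a 16-bit length
followed by a null-terminated name, padded to a multiple of 8 bytes) and the
names demo_append() and demo_nth_name() are assumptions invented for this
illustration. Locating the N'th record means stepping over each preceding
record by its stored length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout, for illustration only:
 * [uint16_t reclen][name bytes][NUL][padding up to a multiple of 8]. */
#define DEMO_HDR sizeof(uint16_t)

/* Append one record at offset `used`. Returns the new end offset,
 * or `used` unchanged if the record does not fit. */
static size_t demo_append(unsigned char *buf, size_t cap, size_t used,
                          const char *name)
{
    assert(buf != NULL && name != NULL && used <= cap);
    size_t need = DEMO_HDR + strlen(name) + 1;
    need = (need + 7u) & ~(size_t)7u;          /* round up to 8 bytes */
    if (need > cap - used || need > UINT16_MAX)
        return used;
    uint16_t reclen = (uint16_t)need;
    memset(buf + used, 0, need);               /* zero NUL and padding */
    memcpy(buf + used, &reclen, DEMO_HDR);
    memcpy(buf + used + DEMO_HDR, name, strlen(name));
    return used + need;
}

/* Return the name stored in the n'th record (0-based), or NULL if there
 * are fewer records or a stored length is implausible. The only way to
 * reach record n is to walk over records 0..n-1 in order. */
static const char *demo_nth_name(const unsigned char *buf, size_t used,
                                 size_t n)
{
    size_t off = 0;
    while (off + DEMO_HDR <= used) {
        uint16_t reclen;
        memcpy(&reclen, buf + off, DEMO_HDR);  /* avoids unaligned access */
        if (reclen < DEMO_HDR + 1 || reclen > used - off)
            return NULL;
        if (n == 0)
            return (const char *)(buf + off + DEMO_HDR);
        n--;
        off += reclen;
    }
    return NULL;
}
```

The length is read with memcpy() rather than through a cast pointer so that
the sketch makes no alignment assumptions about the buffer.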


[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-28 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-28 17:07 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004948) shware_systems (reporter) - 2020-08-28 17:07
 https://austingroupbugs.net/view.php?id=697#c4948 
-- 
Since NAME_MAX can vary for subdirs that reference file systems other than
the root, '/' or "//", I think it is better to characterize the possible
length of a posix_dent structure in terms of "as if fpathconf(fd,
_PC_NAME_MAX) is used".

Since nbytes is size_t, and maybe should be ofs_t, the function return
value should be of the same type.

I think there should be a note about the Return Value, that a value of 0 is
possible on a first call, indicating an empty directory on a file system
that does not store '.' and '..' entries.

I feel it should also be emphasized that if the fd used to access the file
is not opened with O_RDWR | O_EXCL, only the first call to the interface may
be successful. Any additional calls looking for a 0 return may return
garbage data due to another thread modifying the directory's entries. An
alternative is to have buf typed as posix_dent**, which if NULL on entry
indicates the interface should malloc() sufficient space for all the
records at once.

I would prefer to see the prototype as a varargs one, so any flags an
implementation defines that require additional arguments have a place for
them. For example, a DT_ONLY_TYPE or DT_NOT_TYPE flag would need an
additional value to specify which DT_* file type to include or exclude,
respectively, from the returned records.

As a wrapper for multiple readdir() calls, conceptually, some might like an
additional prototype that uses a DIR * rather than a file descriptor, e.g.
posix_fgetdents(DIR *fildes, ...). 
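
The per-directory limit mentioned in the first paragraph of this note can be
queried with the existing fpathconf() interface. The following is a minimal
sketch, not part of the proposal: the helper name name_max_for_dir() and its
fallback parameter are assumptions for illustration. fpathconf() returns -1
both on error and when the limit is indeterminate, so the sketch treats both
cases the same way and returns the caller's fallback.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Return the maximum filename length for the directory at `path`, as
 * reported by fpathconf(fd, _PC_NAME_MAX). If the directory cannot be
 * opened, or the limit is indeterminate or cannot be queried, return
 * `fallback` instead. */
static long name_max_for_dir(const char *path, long fallback)
{
    assert(path != NULL && fallback > 0);
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        return fallback;
    long limit = fpathconf(fd, _PC_NAME_MAX);
    close(fd);
    return limit == -1 ? fallback : limit;
}
```

An application could use the result plus one (for the terminating null byte)
when sizing storage for a single name from that directory, for example
name_max_for_dir(".", 255) + 1.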





[1003.1(2013)/Issue7+TC1 0000697]: Adding of a getdirentries() function

2020-08-28 Thread Austin Group Bug Tracker via austin-group-l at The Open Group


A NOTE has been added to this issue. 
== 
https://austingroupbugs.net/view.php?id=697 
== 
Reported By:steffen
Assigned To:
== 
Project:1003.1(2013)/Issue7+TC1
Issue ID:   697
Category:   System Interfaces
Type:   Clarification Requested
Severity:   Comment
Priority:   normal
Status: New
Name:   Steffen Nurpmeso 
Organization:
User Reference:  
Section:none 
Page Number:none 
Line Number:none 
Interp Status:  --- 
Final Accepted Text: 
== 
Date Submitted: 2013-05-15 10:31 UTC
Last Modified:  2020-08-28 08:21 UTC
== 
Summary:Adding of a getdirentries() function
==
Relationships   ID  Summary
--
related to  696 either NAME_MAX shouldn't be optional, ...
== 

-- 
 (0004947) geoffclare (manager) - 2020-08-28 08:21
 https://austingroupbugs.net/view.php?id=697#c4947 
-- 
In the Aug 27th teleconference the following proposed changes were agreed
to be sufficiently mature to form the basis for a call for sponsorship to
The Open Group Base Working Group.  There may be further minor changes,
particularly in the APPLICATION USAGE for posix_getdents(), before they are
considered ready to go into a formal document to be submitted for The Open
Group company review.

(Note that the first <dirent.h> change removes XSI shading from d_ino in
struct dirent. This shading is already questionable because the readdir()
page states an unshaded requirement about how d_ino is set.)
 
Page and line numbers are for the 2016/2018 edition.
 
On page 231 line 7773 section <dirent.h>, change:

    It shall also define the structure dirent which shall include the
    following members:

    [XSI]ino_t  d_ino     File serial number.[/XSI]
    char        d_name[]  Filename string of entry.

    [XSI]The <dirent.h> header shall define the ino_t type as described
    in <sys/types.h>.[/XSI]

    The array d_name is of unspecified size, but shall contain a
    filename of at most {NAME_MAX} bytes followed by a terminating null
    byte.

to:

    It shall also define the structure dirent which shall include the
    following members:

    ino_t  d_ino     File serial number.
    char   d_name[]  Filename string of this entry.

    and the structure posix_dent which shall include the following
    members:

    ino_t          d_ino     File serial number.
    reclen_t       d_reclen  Length of this entry, including trailing
                             padding if necessary. See posix_getdents().
    unsigned char  d_type    File type or unknown-file-type indication.
    char           d_name[]  Filename string of this entry.

    The array d_name in each of these structures is of unspecified size,
    but shall contain a filename of at most {NAME_MAX} bytes followed by
    a terminating null byte.

    The <dirent.h> header shall define the ino_t, reclen_t, and size_t
    types as described in <sys/types.h>.

    The <dirent.h> header shall define the following symbolic constants
    for the file types and unknown-file-type indicator returned in the
    d_type member of the posix_dent structure. The values shall be
    distinct and shall be suitable for use in #if preprocessing
    directives:

    DT_BLK      Block special.
    DT_CHR      Character special.
    DT_DIR      Directory.
    DT_FIFO     FIFO special.
    DT_LNK      Symbolic link.
    DT_REG      Regular.
    DT_SOCK     Socket.
    DT_UNKNOWN  Unknown file type.

    The implementation may implement message queues, semaphores, shared
    memory objects [TYM]or typed memory objects[/TYM] as distinct file
    types. The following macros shall be provided to represent these
    types. The values shall be distinct from each other and from the
    above symbolic constants beginning with DT_, except when a distinct
    file type is not implemented, in which case the corresponding
    constant shall have a value that is never returned in d_type by
    posix_getdents(). The values shall be suitable for use in #if
    preprocessing directives:

    DT_MQ   Message queue.
    DT_SEM  Semaphore.
    DT_SHM  Shared memory object.
    [TYM]DT_TMO  Typed memory object.[/TYM]
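
As an illustration of how an application might consume the d_type values
listed above, here is a minimal sketch. It does not call posix_getdents(),
which may not be available; instead it assumes the existing extension found
on glibc and the BSDs, where struct dirent as returned by readdir() carries a
d_type member using the same DT_* names. The helper name count_by_type() is
hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1  /* assumption: exposes d_type and DT_* on glibc */
#endif
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Count the entries in the directory at `path` whose d_type equals
 * `type`. Returns -1 if the directory cannot be opened. Entries that
 * the file system reports as DT_UNKNOWN are counted only when `type`
 * is DT_UNKNOWN; a real program would fall back to lstat() for them. */
static long count_by_type(const char *path, unsigned char type)
{
    assert(path != NULL);
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;
    long count = 0;
    const struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_type == type)
            count++;
    }
    closedir(dir);
    return count;
}
```

For example, count_by_type(".", DT_DIR) would count the subdirectories of the
current directory on file systems that report file types, typically
including "." and "..".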
On page 231 line 7784 section <dirent.h>, add:

    int posix_getdents(int, void *, size_t, int);

On page 231 line 7804 section <dirent.h>, add new paragraphs to RATIONALE:

The posix_dent structure was based on existing