Re: can not download IMAP messages with isync/mbsync

2022-11-15 Thread Mouse
> And, the wraparound seems to happen at 0x7fffffff instead of
> 0xffffffff.  Don't know ARM well enough to explain why.

It's probably using a signed, instead of unsigned, conditional branch
instruction.  (I think for ARM it's the branch rather than the compare
that differs for signed vs unsigned.)

If the ARM ABI can place data both above and below the 0x80000000
divide, that's another bug waiting to happen in the ARM assembly
strnlen; it will misbehave for a string that crosses that point, even
when given a non-ludicrous second argument.

But I suspect it really should just get rid of the "end = str +
maxlen;" and "ptr < end" paradigm altogether, whether or not it's
written in assembly.
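
To make that last point concrete, here is a minimal sketch in C of a
strnlen that never forms str + maxlen.  The name bounded_len is invented
for illustration; this is not the NetBSD code, and it makes no attempt at
the word-at-a-time tricks the assembly version uses.

```c
#include <stddef.h>

/*
 * Hypothetical strnlen-alike (illustrative name, not libc's code).
 * It compares a count against maxlen instead of comparing pointers,
 * so it never computes str + maxlen and has nothing to wrap around.
 */
size_t
bounded_len(const char *str, size_t maxlen)
{
    size_t n;

    for (n = 0; n < maxlen; n++) {
        if (str[n] == '\0')
            break;
    }
    return (n);
}
```

Because the only comparison is between two size_t counts, a ludicrous
maxlen such as (size_t)-1 simply makes it behave like strlen.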

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTML          mo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: can not download IMAP messages with isync/mbsync

2022-11-14 Thread Mouse
>> My guess is that the buffer you're testing with is near the top of
>> the address space, within ~1GB of address 0xffffffff, and what
>> you're seeing is due to wraparound.
> Thanks for that analysis--address-wrapping was my first guess too,
> but, I didn't have the time to confirm it: the 1GB was with a
> standalone program; in mbsync itself, the range was much
> smaller--less than 1MB even.

I haven't confirmed it myself.  I don't have an ARM machine running
anything more recent than 4.0.1 (and that much only quite recently - I
found my shark in storage and am only just getting it back in full
operation).  4.0.1 appears to not even _have_ strnlen.  But my reading
of the assembly code I found in 9.1's /usr/src matches the behaviour
you describe far too well for me to think it's entirely coincidence;
I'm fairly confident of my analysis.



Re: can not download IMAP messages with isync/mbsync

2022-11-14 Thread Mouse
> Or is UINT_MAX not guaranteed to fit in size_t

I _think_ there is no guarantee that UINT_MAX fits in a size_t.  But,
upthread, I see...

> Turn out, on ARM, strnlen(3) is written in assembly and this always
> returns `maxlen' for any value of `maxlen' > ~1GB.

Not quite.

I have a guest login on a 9.1 machine, and found the ARM strnlen there.
I am not an ARM expert, but I think I know it well enough to find and
explain the bug.

My guess is that the buffer you're testing with is near the top of the
address space, within ~1GB of address 0xffffffff, and what you're
seeing is due to wraparound.

Here's the relevant code (from 9.1):

        adds    r5, r0, r1      /* get ptr to end of string */
        mov     r4, r1          /* save maxlen */
        ...
.Lmain_loop:
#ifdef STRNLEN
        cmp     r0, r5          /* gone too far? */
        bge     .Lmaxed_out     /*   yes, return maxlen */
#endif

(The code at .Lmaxed_out just returns maxlen, as the comment implies.)

Back-translating loosely into C, what we have here is

strnlen(const char *buf, int maxlen)
{
    const char *end;

    end = buf + maxlen;
    ...
    while (1) {
        if (buf > end)
            return(maxlen);
        ...
    }
}

This back-translation is, of course, broken from a C perspective, but
it's supposed to be illustrative, not precise.  The bug: if buf+maxlen
overflows (at the machine-code level, on ARM32, buf, maxlen, and end
are each just 32-bit integers), then buf>end can be true right from the
start, terminating the loop (and returning maxlen) before it should.

The 9.1 manpage for strnlen says

 The strnlen() function returns either the same result as strlen() or
 maxlen, whichever is smaller.

which makes this a violation of its spec.  The only way it could be
non-broken is if size_t's range and the address space layout
collaborate to ensure that string + maxlen can't wrap around.  Since I
think both are 32-bit on (32-bit) ARM, this isn't so.
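
The wraparound can be modelled on any machine by doing the arithmetic in
uint32_t.  This is only a sketch of the check quoted above, not the real
routine: the function names are made up, >= is used to mirror bge, and
the signed variant assumes the usual two's-complement conversion to
int32_t.

```c
#include <stdint.h>

/*
 * Model of the "gone too far?" test on a 32-bit machine: addresses and
 * maxlen are plain 32-bit integers, so buf + maxlen is computed modulo
 * 2^32.  Returns 1 if the test fires before any byte has been read.
 * Both function names are invented for this illustration.
 */
int
maxed_out_unsigned(uint32_t buf, uint32_t maxlen)
{
    uint32_t end = buf + maxlen;    /* may wrap past 0xffffffff */

    return (buf >= end);
}

/* The same test, but compared as signed values, as bge does. */
int
maxed_out_signed(uint32_t buf, uint32_t maxlen)
{
    uint32_t end = buf + maxlen;

    /* Conversion to int32_t assumes two's complement. */
    return ((int32_t)buf >= (int32_t)end);
}
```

With buf = 0xc0000000 and maxlen = 0x50000000 the sum wraps to
0x10000000, so the unsigned test fires before a single byte is examined.
With the signed compare, buf = 0x7ffffff0 and maxlen = 0x100 is already
enough, since end lands at 0x800000f0 and looks negative.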

Also,

-   uint maxlen = UINT_MAX;
+   uint maxlen = sizeof(buf);

if maxlen is passed unchanged to strnlen, I can't see how the original
code isn't a bug; there's no point in using strnlen if you're passing a
maxlen greater than the space remaining in the buffer your pointer
points into.  I'd have to look at more of the surrounding code to be
sure, though.  (It also depends on nonportable assumptions about the
relative sizes of uint and size_t, but that bug is concealed on 32-bit
NetBSD.)
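
As a sketch of what "the space remaining in the buffer" means in
practice, something like the following keeps the bound honest.  The
helper name remaining_len and its parameters are hypothetical; I have
not looked at what mbsync actually does around that call.

```c
#define _POSIX_C_SOURCE 200809L    /* for strnlen on strict compilers */
#include <string.h>

/*
 * Length of the string starting at buf + off, never looking past the
 * end of a buffer of bufsize bytes.  remaining_len is an illustrative
 * name, not something from mbsync or libc.
 */
size_t
remaining_len(const char *buf, size_t bufsize, size_t off)
{
    if (off >= bufsize)
        return (0);             /* nothing left to scan */
    return (strnlen(buf + off, bufsize - off));
}
```

Everything is kept in size_t here, which also sidesteps the uint versus
size_t question.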



Re: Feed facility/priority to logger(1) via stdin - desirable extension or bad idea?

2022-10-16 Thread Mouse
>> Is there some reason you don't actually syslog() the log messages,
>> then, rather than sending them down a pipe?  [...]

It sounds to me as though there are two reasons: (1) you want
specifying the facility to be separate from specifying the priority and
(2) you're writing in a language, such as sh, which doesn't have direct
access to syslog().

> ❯ echo "alert|This in an alert message"|logger -F local2

This may well be useful, though I have some thoughts on your use case
as sketched.

I am not trying to criticize, here.  I am trying to examine all the
angles, in an attempt to help ensure you get the best fit for your use
case - and, if this ends up with a change to the main tree, that it is
more generally useful, to the extent that that's easy.

> The only point of contact is the naming of the loglevels, which is
> oriented on the syslog standard.

There are actually at least three (or arguably two - the first two are
closely related) syslog standards.

(1) There is the syslog(3) API, which uses names such as LOG_ALERT and
LOG_AUTH.

(2) There is the syslog.conf language, which uses two-part names such
as kern.notice and auth.warning, with support for some further syntax
(eg ftp.*, kern.none).

(3) There is the wire protocol, which uses small integer values in
ASCII decimal (see RFC 3164).

There may be others, but these are the ones I'm aware of.
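
To show how (1) and (3) relate: on the wire, the number in angle
brackets is facility * 8 + severity.  A minimal sketch, assuming the BSD
convention that the LOG_* facility constants in <syslog.h> are already
shifted left by three bits (true on the systems I know of, but check
yours); wire_pri is a made-up name.

```c
#include <stdio.h>
#include <syslog.h>

/*
 * Format the "<PRI>" prefix used by the wire protocol from a
 * syslog(3)-style facility and level.  Assumes facility constants are
 * pre-shifted, so OR-ing in the level gives facility * 8 + severity.
 * Returns what snprintf returns.
 */
int
wire_pri(char *out, size_t outlen, int facility, int level)
{
    return (snprintf(out, outlen, "<%d>", facility | level));
}
```

Under that assumption, LOG_LOCAL2 with LOG_ALERT comes out as 18 * 8 + 1,
that is, "<145>".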

> Otherwise, the only communication is via stderr - a way that is
> available in every conceivable programming language.

But often undesirable, as it means losing the ability to report error
messages in the usual way.  For your use case, this may not be a big
deal, as the alternative is for stderr to end up mailed to wherever
cron sends mail for that job.  (I also could quibble about "every
conceivable programming language", but I think "every practical
language on NetBSD" would probably be fair.)

> The program itself can recognize at runtime whether it writes to a
> terminal or to stderr (pipe).  [...colour when to a tty...]

If you really want.  Personally, I'm having trouble thinking of an
instance I've seen of doing that - changing output based on whether
it's going to a tty - that I don't think would better be eliminated or
controlled some other way.  But your use case is for a very restricted
domain.

However, this is pushing yet more knowledge, in this case knowledge
that "going to a tty" means "going to a tty capable of colour using
these sequences, to be displayed to a user who wants colour", back into
the program.

> Another advantage is that the logger process is started only once
> when the program is started, instead of every time a log line is
> written.

This is the remark from which I infer that you are writing in a
scripting language such as sh rather than something, like C, that has
direct access to syslog(3): if you had syslog(3) or moral equivalent,
you would be starting _zero_ processes per message logged if the
generator were to log them directly.

I also offer two thoughts: 1, if starting another process is expensive
enough to care about, do you really want to be writing in a scripting
language such as sh at all? and 2, if you're logging enough for the
cost of running logger(1) for each log message to be significant,
perhaps your logs are verbose enough that syslog is not the best way to
log them?

And one final thought on the syslog change.  Perhaps the part before
the | delimiter could be facility.priority, with either or both
optionally missing and the -F argument providing defaults used when
they are missing?  Your use case may have no use for providing anything
but the priority with the message, but others may.
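
A rough sketch of how such a prefix might be parsed, purely to make the
suggestion concrete.  Nothing here is from logger(1): the function names,
the tiny name tables, and the choice to read a dotless prefix as a
priority (to match the "alert|..." example) are all assumptions of mine.

```c
#include <string.h>
#include <syslog.h>

struct name_val { const char *name; int val; };

/* Small illustrative subsets; a real tool would list every name. */
static const struct name_val fac_names[] = {
    { "auth", LOG_AUTH }, { "daemon", LOG_DAEMON }, { "user", LOG_USER },
    { "local2", LOG_LOCAL2 }, { NULL, -1 }
};
static const struct name_val pri_names[] = {
    { "alert", LOG_ALERT }, { "err", LOG_ERR }, { "warning", LOG_WARNING },
    { "notice", LOG_NOTICE }, { "info", LOG_INFO }, { NULL, -1 }
};

/* Look up s[0..len) in tab; empty means "use dflt", unknown gives -1. */
static int
lookup(const struct name_val *tab, const char *s, size_t len, int dflt)
{
    if (len == 0)
        return (dflt);
    for (; tab->name != NULL; tab++) {
        if (strlen(tab->name) == len && strncmp(tab->name, s, len) == 0)
            return (tab->val);
    }
    return (-1);
}

/*
 * Parse "[facility][.priority]|message".  On success, store facility
 * and priority (the defaults where a part is omitted) and return a
 * pointer to the message text.  Return NULL if there is no '|' or a
 * name is not recognised.
 */
const char *
parse_prefix(const char *line, int dfac, int dpri, int *fac, int *pri)
{
    const char *bar = strchr(line, '|');
    const char *dot;
    int f, p;

    if (bar == NULL)
        return (NULL);
    dot = memchr(line, '.', (size_t)(bar - line));
    if (dot != NULL) {
        f = lookup(fac_names, line, (size_t)(dot - line), dfac);
        p = lookup(pri_names, dot + 1, (size_t)(bar - dot - 1), dpri);
    } else {
        /* No dot: treat the whole prefix as a priority. */
        f = dfac;
        p = lookup(pri_names, line, (size_t)(bar - line), dpri);
    }
    if (f == -1 || p == -1)
        return (NULL);
    *fac = f;
    *pri = p;
    return (bar + 1);
}
```

Here dfac and dpri stand in for whatever -F (or -p) supplied on the
command line, so "alert|...", "auth.warning|...", ".err|..." and "|..."
would all be accepted.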

> In contrast, I can't think of a practical use case for the comment
> included in the original logger code (parsing the syslog log via
> stdin).

Processing log messages (presumably received from elsewhere, most
likely using the wire protocol) in a scripting language?



Re: Feed facility/priority to logger(1) via stdin - desirable extension or bad idea?

2022-10-07 Thread Mouse
> With a lot of scripts and tools written, I have gotten into the habit
> of logging all logging output to stderr, as well as any form of
> payload to stdout.

What do you do with error messages, then?

> In my productive NetBSD environments, this logging tool is then usually 
> logger(1), which is at the end of the pipeline and sends the output to 
> syslogd.
> 
> logger(1) takes a parameter -p, which I can use to set the facility
> and the priority.  This is where it starts to get a bit
> uncomfortable.  I have no ability to influence the facility /
> priority on the other side of the pipeline.  This means that I lose
> the capabilities of Syslogd, e.g. to use different log files
> depending on the priority.

Is there some reason you don't actually syslog() the log messages,
then, rather than sending them down a pipe?  It sounds to me as though
you are going to have to make your log generator logging-aware, but,
then, I don't see what benefit you get from piping the output to a tool
instead of just logging it directly.  (The obvious (to me) benefit is
that you can control facility and priority with the logging tool
instead of wiring it into the code, but here you're pushing it back
into the log-generation code anyway.)

> It would be nice if one could, for example, optionally omit the -p
> parameter and instead specify the facility and priority via the
> standard input of logger in coded form (coded with angle brackets as
> in the raw syslog protocol)

Regardless of the motivation, this strikes me as a reasonable thing.

> *) You could then decide in your scripts, depending on whether stderr
> is connected to a terminal or a pipe, whether you want to output nice
> coloured terminal logging or logging optimised for syslog with a
> prefix

Speaking as a user, please do not assume that anyone sending logs to a
terminal wants coloured output.  It's not true, and assuming it is
tends to produce annoyingly corrupted output when it's not applicable.
This is one of my bigger beefs with recent Ubuntu and Debian: more and
more tools blindly assume that (a) the user wants colour when the
output is going to a tty and (b) that it knows how to generate colour.
Each of those is false for me.  Typically, the resulting output looks
like this (where I forced colour on on the command line because I've
gone to some lengths to get rid of it by default):

$ ls --color=always /
[0m[01;34mbin[0m   [01;34metc[0m [01;34mlib[0m [01;34mmnt[0m   
[01;34mroot[0m  [01;34mselinux[0m  [30;42mtmp[0m  [01;36mvmlinuz[0m
[01;34mboot[0m  [01;34mhome[0m[01;34mlost+found[0m  [01;34mopt[0m   
[01;34mrun[0m   [01;34msrv[0m  [01;34musr[0m
[01;34mdev[0m   [01;36minitrd.img[0m  [01;34mmedia[0m   [01;34mproc[0m  
[01;34msbin[0m  [01;34msys[0m  [01;34mvar[0m
$ 

It's far worse with other tools.  Modern gdb borders on unusable.
Here's an example, cut-and-pasted from the window I just did a test in:

Breakpoint 1, [33ml_cmp[m ([36mcookie[m=0xb220 , 
[36ma[m=0xb020 , [36mb[m=0xb028 ) at 
[32mtest.c[m:106
106  [01;34mif[m [31m([mcookie [31m!=[m [31m&[mlist[31m)[m 
[01mpanic[m[31m();[m
[?2004h(gdb) l

gdb is blindly assuming I want colour, and furthermore assuming, not
only blindly but in defiance of active evidence that it's not so, that
the ISO 6429 values to X3.64's SGR sequence will generate it.  (The
"active evidence" is that $TERM is a type whose description includes
not only no indication of colour support but no X3.64 at all.)  It's
also generating other sequences, like that peculiar [?2004h, with the
same negative amount of reason to think they'll work.



Re: So it seems "umount -f /nfs/mount" still doesn't work.....

2020-07-09 Thread Mouse
> So, I should have mentioned that "umount -f nfs.server:/remotefs"
> does work ([...]).

> I.e. the problem is in how umount(8) looks up the parameters of the
> mount point.  If it looks at the mount point it hangs, but if it
> looks through the mount table, it works.

Is this a case where -R helps?



Re: Strange semaphore behavior, sem_init() fails with errno 4294967295 (-1)

2019-02-20 Thread Mouse
> In each thread, my software does a fork() followed by an execve().
> If I remove this fork(), I'm unable to reproduce this bug.

I have a fuzzy memory that fork() may do something to semaphores...?

> int
> sem_init(sem_t *sem, int pshared, unsigned int value)
> {
>         intptr_t semid;
>         int error;
> 
>         if (_ksem_init(value, &semid) == -1)
>                 return (-1);
> 
>         if ((error = sem_alloc(value, semid, sem)) != 0) {
>                 _ksem_destroy(semid);
>                 errno = error;
>                 return (-1);
>         }
> 
>         return (0);
> }

> As errno contains an error, I suppose that sem_alloc() returns this
> error, but sem_alloc() can only return ENOSPC or EINVAL...

If _ksem_init is, as the name seems to imply, a kernel call, could it
maybe be setting errno?

What is your basis for saying that sem_alloc can generate only ENOSPC
and EINVAL?  Reading the source, or looking at documentation, or what?
In particular, if it's documentation, don't trust it too much; I've
seen documentation lie far too often.

Also, don't forget that successful calls normally don't touch errno,
though I _think_ that doesn't matter here.
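
For what it's worth, the usual defensive pattern looks something like
the sketch below: consult errno only after the return value has said the
call failed, and copy it out at once.  It uses open(2) rather than the
semaphore calls, and try_open is an invented name, so take it as an
illustration of the pattern and nothing more.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to open a path read-only.  Returns 0 on success, or the errno
 * value captured immediately after the failing call.  errno is looked
 * at only once the return value reports failure, so a stale value left
 * over from some earlier call is never reported.
 */
int
try_open(const char *path)
{
    int fd, saved;

    fd = open(path, O_RDONLY);
    if (fd == -1) {
        saved = errno;  /* grab it before anything else can change it */
        return (saved);
    }
    close(fd);
    return (0);
}
```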
