Re: [Nmh-workers] m_getfld() assert(3) failure with scan of -file /dev/null.

2017-08-08 Thread David Levine
Valdis wrote:

> On Tue, 08 Aug 2017 18:04:56 +0100, Ralph Corderoy said:
>
> > I did git-bisect(1) it. Over time, it homed in on the enabling of
> > assert(3)s by default. :-)
>
> Which points at it *always* having been broken.. 

Indeed, though in this case, that's just since the m_getfld() rework of
4.5 years ago.

Fixed by commit 0d593e1ce1a218332af78b83987543756b0c6cf4.  I also applied
it to the 1.7 release branch; it's low risk and fixes a bug.

David

___
Nmh-workers mailing list
Nmh-workers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/nmh-workers


Re: [Nmh-workers] m_getfld() assert(3) failure with scan of -file /dev/null.

2017-08-08 Thread valdis . kletnieks
On Tue, 08 Aug 2017 18:04:56 +0100, Ralph Corderoy said:

> I did git-bisect(1) it. Over time, it homed in on the enabling of
> assert(3)s by default. :-)

Which points at it *always* having been broken.. 



[Nmh-workers] m_getfld() assert(3) failure with scan of -file /dev/null.

2017-08-08 Thread Ralph Corderoy
Hi,

I've opened https://savannah.nongnu.org/bugs/index.php?51693 to track
this because I don't have time at the moment, but I don't know who gets
those "bug opened" emails so here's a copy.

$ git describe
1.7-branchpoint-10-ga091c28b
$ uip/scan -format '' -file /dev/null
scan: sbr/m_getfld.c:461: read_more: Assertion `retain <= s->readpos - s->msg_buf' failed.
Aborted (core dumped)
$

I did git-bisect(1) it. Over time, it homed in on the enabling of
assert(3)s by default. :-)

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] nmh-1.7-RC1: scan with complex subjects dumps core

2017-08-08 Thread Ralph Corderoy
Hi David,

> > Sunglasses have a width of 1 here, that's why David and I don't see
> > the problem.
>
> I'm surprised that I didn't see the same behavior as Norm, because we
> use the same locale, en_US.utf8.  Any idea why?

I'm en_GB.utf8, but I don't see it either.  It's the wcwidth(3) answer
for a codepoint that matters, and as Unicode continues to add POO WITH
SUNGLASSES, the answers change with the version of one's system's
database.

$ test/getcwidth --ctype | grep 1f576
 1f576   1  -pg--@--

So here it's `print', `graph', and `punct', with a width of 1.  Norm's
gang have a width of -1 as they haven't the foggiest what it is.
http://unicode.org/cldr/utility/character.jsp?a=1f576=Show says its
East Asian width is `Neutral', which is treated as `Narrow', so
getcwidth reporting 1 matches.

Nearby is http://unicode.org/cldr/utility/character.jsp?a=1f57a=Show
that says it's `Wide', but my system here doesn't know anything about
that yet, thankfully.

$ test/getcwidth --ctype | grep 1f57a
 1f57a  -1  
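
As a cross-check that doesn't depend on the local libc, the East Asian
Width property quoted above can be read from Python's bundled Unicode
database.  This is only a sketch: it assumes Python 3.6 or later, it
reports the Unicode property rather than what wcwidth(3) returns on any
given system, and the helper name eaw() is made up for illustration.

```python
import unicodedata

def eaw(codepoint):
    """Return the East Asian Width class of a codepoint, e.g. 'N' or 'W'.

    Hypothetical helper: it consults Python's own Unicode tables,
    not the system locale database that wcwidth(3) uses.
    """
    return unicodedata.east_asian_width(chr(codepoint))

# The two codepoints discussed above.
for cp in (0x1f576, 0x1f57a):
    print('%5x  %s' % (cp, eaw(cp)))
```

`N' is Neutral and `W' is Wide.  A system whose locale data predates a
codepoint can still answer -1 from wcwidth(3), whatever this reports.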

One can poke about the local definitions.

$ test/getcwidth --ctype | awk '{print $2}' |
> sort -n | uniq -c
  57249 -1
   1723 0
  29884 1
  95464 2
$
$ test/getcwidth --ctype | awk '{print $3}' |
> LC_ALL=C sort | uniq -c
  57183 
     14 -psb
  15563 -pg--@--
     10 -pg---dxN---
 107528 -pgaN---
   2167 -pga-l--N---
      6 -pga-l-xN---
   1772 -pgau---N---
      6 -pgau--xN---
      4 -pgaul--N---
     60 c---
      6 c-s-
      1 c-sb
$ 

That says there are four runes that are both upper and lower!

$ printf '%b\n' $(test/getcwidth --ctype |
> awk '$3 ~ /ul/ {print "\\u" $1}')
Dž
Lj
Nj
Dz
$
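
Those four are Unicode's titlecase digraphs, general category Lt, which
is presumably why the locale flags them as both upper and lower.  A
small sketch to check the category, assuming Python 3 and its bundled
Unicode database; the names TITLECASE and categories() are made up for
this example.

```python
import unicodedata

# The four runes printed above: U+01C5, U+01C8, U+01CB, U+01F2.
TITLECASE = (0x01c5, 0x01c8, 0x01cb, 0x01f2)

def categories(codepoints):
    """Map each codepoint to its Unicode general category."""
    return {cp: unicodedata.category(chr(cp)) for cp in codepoints}

for cp, cat in categories(TITLECASE).items():
    print('%04x  %s  %s' % (cp, cat, unicodedata.name(chr(cp))))
```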

And here's the first printable zero-width.

$ test/getcwidth --ctype | grep -m1 ' 0 .*p'
    ad   0  -pg--@--

U+00AD is soft hyphen.  Unicode is said to be an ISO 8859-1 superset,
and 0xAD was soft hyphen in that too, but visible, with a width of 1.
ISO used it at the end of the line to show a word had been broken, but
not by the author, allowing it to be stripped on re-formatting.  Unicode
changed that.  For them, it's a hint from the author to the renderer
that here's a potential point to break the word, thus, when rendered,
it's not visible and has zero width.  Toc toc toc!
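
Unicode's own classification reflects that change.  As a rough
cross-check, assuming Python 3 with its bundled Unicode database (not
the locale data that getcwidth reads), the general category of U+00AD
comes out as a format character rather than dash punctuation.  The
helper is_format() is a name made up for this sketch.

```python
import unicodedata

SOFT_HYPHEN = '\u00ad'
HYPHEN_MINUS = '-'

def is_format(ch):
    """True if ch is a format character (general category Cf)."""
    return unicodedata.category(ch) == 'Cf'

# Compare the soft hyphen with the ordinary ASCII hyphen-minus.
for ch in (SOFT_HYPHEN, HYPHEN_MINUS):
    print('U+%04X  %s  format=%s'
          % (ord(ch), unicodedata.category(ch), is_format(ch)))
```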

Terminals get this wrong.  libvte-based terminals here think it has
width.

$ s="$(printf '\uad')"
$ scan -format "_%4(lit foo)_\n_%4(lit £)_\n_%4(lit $s)_" .
_foo _
_£   _
_­_   [Rune after first _ isn't a space.]
$

Dickey's venerable xterm(1) does better.

$ s="$(printf '\uad')"
$ scan -format "_%4(lit foo)_\n_%4(lit £)_\n_%4(lit $s)_" .
_foo _
_£   _
_    _   [All four are spaces.]
$

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] nmh-1.7-RC1: scan with complex subjects dumps core

2017-08-08 Thread Ralph Corderoy
Hi,

David wrote:
> > inc seems to have a similar, if not identical problem.
>
> It's identical:  inc uses the same code to print the scan line.

I was surprised recently, when plodding through the source, to find that
inc(1) actually uses the scan() function to do the inc-ing, with the
scan as a side effect.  IIRC.  This means scan() has to bear in mind
that inc(1) wants all the email even though the scan listing stopped a
while ago.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
