Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-16 Thread Lyndon Nerenberg

> See docs/README.developers.  We don't have a written convention for
> when to use a branch, so it's a judgment call considering how
> invasive the changes will be, duration, likelihood of success, and
> whatever else.  (I am planning to remove my old merged branches,
> maybe after the release.  We don't do that very often, maybe this
> can give you a clue to how often we create branches.)


That pretty much nails it.  I think most of us just work on private 
local branches until we're ready to merge to the head and push back to the 
main repository.  Public branches generally show up when things need wider 
testing and review before being merged back to the trunk; they show up 
infrequently.


--lyndon


___
Nmh-workers mailing list
Nmh-workers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/nmh-workers


Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread David Levine
Ralph wrote:

> Thanks, that helped quite a bit.  I've pushed a trivial fix to master.
> If I've done anything wrong, e.g. not configured my ID properly, then
> let me know.

Looks fine.

One thing that we should add to README.developers:  the buildbot results
are available at http://orthanc.ca:8010/waterfall .  It polls for commits;
I'm not sure what the interval is, but it seems to be a few minutes.
Green is good.  Only three hosts are active on master now.

David



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread Ralph Corderoy
Hi,

David wrote:
> See docs/README.developers.

Thanks, that helped quite a bit.  I've pushed a trivial fix to master.
If I've done anything wrong, e.g. not configured my ID properly, then
let me know.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread Ken Hornstein
>:-)  Through a cunning bit of social engineering the other day, I did
>apparently gain access to nmh's git repository.  Is there anything that
>documents conventions in using it for the project, e.g. whether to check
>in directly on master or use a branch?

I think David covered the (lack of) rules adequately; we don't have any
real rules for branch vs master, other than "use your best judgement".

The only other thing to be aware of is that if you're building from the
git repo, you need more tools (like the Autotools suite, yacc, and lex)
than you do if you're building from a distribution tarfile.

--Ken



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread David Levine
Ralph wrote:

> Hi David,
>
> :-)  Through a cunning bit of social engineering the other day,

I don't want to know :-)

> I did apparently gain access to nmh's git repository.  Is there
> anything that documents conventions in using it for the project,
> e.g. whether to check in directly on master or use a branch?

See docs/README.developers.  We don't have a written convention for
when to use a branch, so it's a judgment call considering how
invasive the changes will be, duration, likelihood of success, and
whatever else.  (I am planning to remove my old merged branches,
maybe after the release.  We don't do that very often, maybe this
can give you a clue to how often we create branches.)

> > This shouldn't happen very often, so I'd lean toward the simpler code.
>
> By that I guess you mean the status quo?

Yes :-)

David



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread Ralph Corderoy
Hi David,

> Great!  We'll get you to the bleeding edge yet.

:-)  Through a cunning bit of social engineering the other day, I did
apparently gain access to nmh's git repository.  Is there anything that
documents conventions in using it for the project, e.g. whether to check
in directly on master or use a branch?

> > An alternative to sliding down the remaining unprocessed input with
> > memmove(3) and shortening the next fread, I suppose.
>
> This shouldn't happen very often, so I'd lean toward the simpler code.

By that I guess you mean the status quo?

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread David Levine
Ralph wrote:

> Hi David,
>
> > Might Ken's commit adfed5f72bc07ac7de8dfc62188338d4d4f25a38 have fixed
> > this?
>
> Yes, indeed.

Great!  We'll get you to the bleeding edge yet.

> IOW, it seeks to 4Ki and reads 4Ki - 1 so it's left in the right place
> to read the one byte, that we already have, next time.  An alternative
> to sliding down the remaining unprocessed input with memmove(3) and
> shortening the next fread, I suppose.

This shouldn't happen very often, so I'd lean toward the simpler code.

> `strace -e desc mhparam foo' shows many lseek(2)s to get the current
> position on .mh_profile;  always the same.  Triggered by the infamous
> m_getfld()'s ftello(3).  :-)  Must trigger for every header of every
> email processed too.

Yes, m_getfld() could use another rewrite.  The last one wasn't really
a rewrite, though:  I tried to maintain the existing logic to avoid too
many simultaneous changes.

David



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread Ralph Corderoy
Hi David,

> Might Ken's commit adfed5f72bc07ac7de8dfc62188338d4d4f25a38 have fixed
> this?

Yes, indeed.  I get identical output from iconv(1) and mhshow(1) with
the function from
http://git.savannah.gnu.org/cgit/nmh.git/tree/uip/mhshowsbr.c.

> +    if (errno == EINVAL) {
> +        /* middle of multi-byte sequence */
> +        if (write (fd, dest_buffer, outbytes_before - outbytes) < 0) {
> +            advise (dest, "write");
> +        }
> +        fseeko (*fp, -inbytes, SEEK_CUR);

Interestingly, that seeking back by the 1 unprocessed byte of input so
the top-of-loop's fread can take another whole 8KiB triggers

fseeko(0xaf20d0, -1, 1, 0x7f1c02ff3530 <unfinished ...>
SYS_lseek(3, 4096, 0)                = 4096
SYS_read(3, "\315\273\n5)"..., 4095) = 4095
<... fseeko resumed> )               = 0
__fread_chk(0x7fff69f40800, 8192, 1, 8192 <unfinished ...>

IOW, it seeks to 4Ki and reads 4Ki - 1 so it's left in the right place
to read the one byte, that we already have, next time.  An alternative
to sliding down the remaining unprocessed input with memmove(3) and
shortening the next fread, I suppose.
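
For comparison, a minimal sketch of the memmove(3) alternative described
above.  This is not nmh code:  refill_buffer() and its parameters are
hypothetical names chosen for illustration.  It slides the unprocessed tail
to the front of the buffer and asks fread(3) only for the space that
remains, so no seek is involved.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of nmh.
 * Slide the unprocessed tail (leftover, leftover_len) to the start of buf,
 * then top the buffer up from fp.  Returns the number of valid bytes now
 * in buf.  leftover must point inside buf; it is ignored when
 * leftover_len is 0. */
size_t
refill_buffer (char *buf, size_t bufsize, const char *leftover,
               size_t leftover_len, FILE *fp)
{
    assert (leftover_len <= bufsize);

    if (leftover_len > 0) {
        /* Source and destination may overlap, hence memmove, not memcpy. */
        memmove (buf, leftover, leftover_len);
    }

    /* Shortened read: request only the space the carried-over bytes
     * left free. */
    return leftover_len +
        fread (buf + leftover_len, 1, bufsize - leftover_len, fp);
}
```

A caller would pass the input pointer and remaining byte count that
iconv(3) leaves behind after an EINVAL return, then restart the conversion
from the beginning of buf.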

`strace -e desc mhparam foo' shows many lseek(2)s to get the current
position on .mh_profile;  always the same.  Triggered by the infamous
m_getfld()'s ftello(3).  :-)  Must trigger for every header of every
email processed too.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread Ken Hornstein
>> 1.6's mhshow(1) says
>
>Might Ken's commit adfed5f72bc07ac7de8dfc62188338d4d4f25a38
>have fixed this?

I think our general assumption is that UTF-8 is the only multibyte encoding
we have to deal with.  Although I guess that really only matters if we get
an EILSEQ and we're substituting a '?'.
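
To illustrate why the UTF-8 assumption only matters in the EILSEQ case,
here is a small sketch of the resynchronization step.  skip_bad_utf8() is
a hypothetical name, not an nmh function.  After dropping the offending
byte it also drops any UTF-8 continuation bytes (bit pattern 10xxxxxx)
that follow, so that one bad sequence yields one '?' rather than several.
Other multibyte encodings have no such marker, which is why the fallback
there is to skip a single byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of nmh.
 * Given inbytes > 0 bytes at ib, where ib[0] starts a sequence that
 * iconv(3) rejected with EILSEQ, return how many bytes to skip:
 * the first byte plus any UTF-8 continuation bytes after it. */
size_t
skip_bad_utf8 (const char *ib, size_t inbytes)
{
    size_t skipped;

    assert (inbytes > 0);

    skipped = 1;    /* always drop the offending byte itself */
    while (skipped < inbytes &&
           (((unsigned char) ib[skipped]) & 0xc0) == 0x80) {
        skipped++;  /* 10xxxxxx: continuation of the same sequence */
    }
    return skipped;
}
```

The caller would advance its input pointer by the returned count, reduce
the remaining byte count by the same amount, and emit a single '?'.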

--Ken



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread David Levine
Ralph wrote:

> > 1.6's mhshow(1) says
> >
> > mhshow: unable to convert character set to gb2312, continuing...
>
> I meant to draw attention to that.  It was converting *from* gb2312 (to
> UTF-8).

Fixed, thanks.

David



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread David Levine
Ralph wrote:

> 1.6's mhshow(1) says

Might Ken's commit adfed5f72bc07ac7de8dfc62188338d4d4f25a38
have fixed this?

> I took a look at mhshowsbr.c's convert_charset() and I think it's
> failing to handle an EINVAL return.

That commit adds handling of EINVAL and EILSEQ; the relevant portion of
the diff is below.

David

+    if (errno == EINVAL) {
+        /* middle of multi-byte sequence */
+        if (write (fd, dest_buffer, outbytes_before - outbytes) < 0) {
+            advise (dest, "write");
+        }
+        fseeko (*fp, -inbytes, SEEK_CUR);
+        if (end > 0) bytes_to_read += inbytes;
+        /* advise(NULL, "convert_charset: EINVAL"); */
+        continue;
+    }
+    if (errno == EILSEQ) {
+        /* invalid multi-byte sequence */
+        if (fromutf8) {
+            for (++ib, --inbytes;
+                 inbytes > 0 &&
+                     (((unsigned char) *ib) & 0xc0) == 0x80;
+                 ++ib, --inbytes)
+                continue;
+        } else {
+            ib++; inbytes--; /* skip it */
+        }
+        (*ob++) = '?'; outbytes --;
+        /* advise(NULL, "convert_charset: EILSEQ"); */
+        goto iconv_start;
+    }
+    advise (NULL, "convert_charset: errno = %d", errno);
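
The two errno values in the diff can be told apart with a standalone
sketch like the one below.  It is only an illustration, not code from
mhshowsbr.c:  classify_chunk() is a hypothetical name, and the choice of
UTF-8 to UTF-16LE is an assumption made so that the example does not
depend on optional converters being installed.  A chunk that ends partway
through a multibyte character should give EINVAL (more input could
complete it), while a byte that can never be valid should give EILSEQ.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not part of nmh.
 * Feed one UTF-8 chunk to iconv(3) and report how the call ended:
 * 0 if the whole chunk converted, otherwise the errno value set by
 * iconv (EINVAL, EILSEQ or E2BIG).  Returns -1 if the converter
 * could not be opened. */
int
classify_chunk (const char *bytes, size_t len)
{
    char out[256];
    char *ib = (char *) bytes;  /* iconv takes char **, but only reads input */
    char *ob = out;
    size_t inbytes = len;
    size_t outbytes = sizeof out;
    int result = 0;
    iconv_t cd;

    assert (bytes != NULL);

    cd = iconv_open ("UTF-16LE", "UTF-8");  /* to, from */
    if (cd == (iconv_t) -1) {
        return -1;
    }

    if (iconv (cd, &ib, &inbytes, &ob, &outbytes) == (size_t) -1) {
        result = errno;         /* save before iconv_close can change it */
    }

    iconv_close (cd);
    return result;
}
```

With "\xc3" alone (the first half of a two-byte character) the result is
expected to be EINVAL, which is the buffer-straddling case this thread is
about; with "\xff" it is expected to be EILSEQ.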



Re: [Nmh-workers] mhshow(1) iconv(3) Bug if Multibyte Straddles Buffer End.

2016-10-15 Thread Ralph Corderoy
Hi again,

> 1.6's mhshow(1) says
>
> mhshow: unable to convert character set to gb2312, continuing...

I meant to draw attention to that.  It was converting *from* gb2312 (to
UTF-8).

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
