Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-07-07 Thread Robert Elz
Date: Sun, 07 Jul 2019 19:39:47 -0700
From: "Ronald F. Guilmette"
Message-ID:  <42579.1562553...@segfault.tristatelogic.com>


  | I confess I have and did explicitly set within my personal .login file
  | thusly:
  |
  |setenv LANG en_US.UTF-8

That should be fine (I'll pass on the sanity of still using any csh
variety shell in this day and age, as that's unrelated).

  | For the rest, if you think that they are all improper,

No, as David Levine said, and I noted in my subsequent message,
I was just misinterpreting the output you supplied (as if it were
part of a .profile).

  | since I seem to be simply inheriting their
  | common and systemwide defaults for all of these things.

Not quite: since you set LANG (which is reasonable), the others
are just inheriting its value (not being set to anything different),
which is as it should be.   The systemwide defaults (if you did not
have LANG set) would be "C" for everything (except LC_ALL).


  |  *I* do not even have any real clear idea of
  | what any of these envars do, or are supposed to do,

LC_CTYPE sets the character type - defines how characters are encoded
(as in UTF-8 or ISO-8859-1 or BIG5 (Chinese) etc).

LC_COLLATE defines how characters are ordered

LC_TIME says how time (of day) is represented (d/m/y m/d/y, 12 or 24 hour, etc)

LC_NUMERIC is how numbers are represented, including what character
is used for the "decimal point" (aka radix character) and as the
grouping character (and how many digits in a group) etc.   Also how
negative numbers are written.  That is for everything except:

LC_MONETARY does similar for numeric values that are monetary values ($3.75
etc).  That allows -3 to be how a normal negative number is represented,
whereas a $3 debt might be (3) instead (the way accountants write things
sometimes).

LC_MESSAGES defines the language to use for messages from any utilities
that have message catalog files (essentially it gets converted to the file
name of the file which contains the strings (format strings) to be used
for messages for a particular program).

Then LANG provides a default value for all of those, and unless you need
some special effect, is generally the right thing to set to simply say
(I want French as in France, or as in Canada, or I want US English, or
British English, or Australian English) - and then you just get all of
the others being correct for that environment.   I sometimes set LANG
and then LC_TIME to a locale of my own which avoids the ambiguous x/y/z
format dates (which are used, in different ways, in most English locales)
and sets 24 hour time rather than 12 hour am/pm which I simply prefer.

LC_ALL is an override - when set, all the others get ignored, and this
one is used for everything.   Its only appropriate use is when something
needs to ignore whatever the user might have set and operate in a particular
known locale in order to work correctly - things like [a-z] in patterns
give weird results in some locales where 'z' isn't the last letter of the
alphabet (there are many such oddities around).
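
The precedence described above can be sketched in a few lines of shell.
This is only an illustration, not how any C library implements it: the
function name effective_locale and the en_DK.UTF-8 value are made up for
the example, and the final fallback to "C" is an assumption (what happens
with nothing set at all is implementation-defined).

```shell
# Work out which locale name applies to one category.
# Arguments are passed explicitly so nothing in the real environment
# is touched:  $1 = LC_ALL value, $2 = that category's own LC_xxx
# value, $3 = LANG value.  An empty string counts the same as unset.
effective_locale() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # LC_ALL overrides everything
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # then the category's own variable
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"      # then LANG as the general default
    else
        printf '%s\n' "C"       # assumed fallback when nothing is set
    fi
}

# LANG as the baseline with LC_TIME overriding one category:
effective_locale "" "en_DK.UTF-8" "en_US.UTF-8"    # prints en_DK.UTF-8

# The same settings with LC_ALL=C on top: everything else is ignored.
effective_locale "C" "en_DK.UTF-8" "en_US.UTF-8"   # prints C

# What currently applies to LC_TIME in this shell:
effective_locale "${LC_ALL-}" "${LC_TIME-}" "${LANG-}"
```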

kre


-- 
nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers

Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-07-07 Thread Ronald F. Guilmette
In message <13507.1562541...@jinx.noi.kre.to>, you wrote:

>Date: Sun, 07 Jul 2019 12:56:13 -0700
>From: "Ronald F. Guilmette"
>Message-ID:  <41320.1562529...@segfault.tristatelogic.com>
>
>  | But to answer Ralph's inquiry...
>  |
>  |
>  | LANG=en_US.UTF-8
>  | LC_CTYPE="en_US.UTF-8"
>  | LC_COLLATE="en_US.UTF-8"
>  | LC_TIME="en_US.UTF-8"
>  | LC_NUMERIC="en_US.UTF-8"
>  | LC_MONETARY="en_US.UTF-8"
>  | LC_MESSAGES="en_US.UTF-8"
>  | LC_ALL=
>
>That looks wrong ... that last line undoes the effect of all of the
>others, when LC_ALL is set, it overrides everything else.  (Though
>XTERM_LOCALE isn't one of this set and is entirely unrelated).
>Nb: it is not the order that matters, it is having LC_ALL set at all.
>
>In general, LC_ALL should almost never be set except in a usage like
>
>   LC_ALL=C grep ...
>
>in a script, where you want (need) to return temporarily to the default
>locale in order to run one particular command that way (so any other
>locale settings that may be in the environment are temporarily ignored).
>(This can be needed to make things like '[a-z]' work properly in the pattern.)
>
>You *never* want it set in the environment.
>
>Setting LANG along with all the others is also kind of useless, but that
>one is harmless - that's the fallback default for the LC_xxx's that aren't
>set.
>
>The double quotes that are used are also not needed - but those affect
>nothing and are largely an issue of style.


You are telling all of this to the Wrong Guy my friend.

I have not myself taken any affirmative steps to set *any* of these
envars to their indicated values, with the exception of LANG, which
I confess I have and did explicitly set within my personal .login file
thusly:

   setenv LANG en_US.UTF-8

(Note that I use C-shell, not Bourne shell, and that is why I have a
.login file.)

For the rest, if you think that they are all improper, I can only
respectfully suggest that you kindly address these issues to the
maintainers of FreeBSD, since I seem to be simply inheriting their
common and systemwide defaults for all of these things.

If you really feel that these are all improper settings, then please,
you will have a much wider and more beneficial effect if you do
notify the FreeBSD people. I do not feel that *I* can even do that
effectively, since *I* do not even have any real clear idea of
what any of these envars do, or are supposed to do, let alone what
might arguably be wrong with the default FreeBSD settings of any
of them.

In short, I am entirely out of my depth.


Regards,
rfg


Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-07-07 Thread Robert Elz
Date: Sun, 07 Jul 2019 21:43:37 -0400
From: David Levine
Message-ID:  <9828-1562550217.965...@xegg.pdiv.zhwy>


  | That looks like the output of locale(1) rather than variable
  | assignments.

Ah yes - even though that's what Ralph asked for, I didn't bother
to imagine that that was literally what was given ... I should have.

So, it indicates that LANG is set, and nothing else, which should
be fine.
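
For anyone else reading such a listing, a small sketch of how to tell
the two cases apart.  It relies on the POSIX locale(1) output
convention that values which are only implied are written inside double
quotes, while values set directly in the environment are written
without them.  The function name explicitly_set is made up for the
example.

```shell
# Read locale(1)-style output on stdin and print the names of the
# variables that were set directly in the environment.
explicitly_set() {
    while IFS='=' read -r name value; do
        case $value in
            '' | \"*) ;;                    # empty (not set) or quoted (implied)
            *) printf '%s\n' "$name" ;;     # unquoted, non-empty: set directly
        esac
    done
}

# Sample input copied from the listing posted earlier in this thread.
explicitly_set <<'EOF'
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=
EOF
# prints: LANG
```

At a shell prompt the same check would be "locale | explicitly_set".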

kre




Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-07-07 Thread Robert Elz
Date: Sun, 07 Jul 2019 12:56:13 -0700
From: "Ronald F. Guilmette"
Message-ID:  <41320.1562529...@segfault.tristatelogic.com>

  | But to answer Ralph's inquiry...
  |
  |
  | LANG=en_US.UTF-8
  | LC_CTYPE="en_US.UTF-8"
  | LC_COLLATE="en_US.UTF-8"
  | LC_TIME="en_US.UTF-8"
  | LC_NUMERIC="en_US.UTF-8"
  | LC_MONETARY="en_US.UTF-8"
  | LC_MESSAGES="en_US.UTF-8"
  | LC_ALL=

That looks wrong ... that last line undoes the effect of all of the
others, when LC_ALL is set, it overrides everything else.  (Though
XTERM_LOCALE isn't one of this set and is entirely unrelated).
Nb: it is not the order that matters, it is having LC_ALL set at all.

In general, LC_ALL should almost never be set except in a usage like

LC_ALL=C grep ...

in a script, where you want (need) to return temporarily to the default
locale in order to run one particular command that way (so any other
locale settings that may be in the environment are temporarily ignored).
(This can be needed to make things like '[a-z]' work properly in the pattern.)

You *never* want it set in the environment.
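
A small sketch of that one-command usage.  The function name
ascii_lower_lines and the sample words are made up for illustration;
the point is only that the assignment is attached to the single grep
invocation and is never exported into the surrounding shell.

```shell
# Keep only the lines that start with an ASCII lower-case letter.
# LC_ALL=C applies to this one grep process only; the locale settings
# of the calling shell are left exactly as they were.
ascii_lower_lines() {
    LC_ALL=C grep '^[a-z]'
}

# \303\251 is the UTF-8 encoding of an e with an acute accent.
printf 'apple\nZebra\n\303\251clair\nzebra\n' | ascii_lower_lines
# prints:
#   apple
#   zebra
```

In the C locale the range [a-z] covers just the 26 ASCII letters, so
the capitalised and the accented line are both rejected regardless of
what the user has in LANG.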

Setting LANG along with all the others is also kind of useless, but that
one is harmless - that's the fallback default for the LC_xxx's that aren't
set.

The double quotes that are used are also not needed - but those affect
nothing and are largely an issue of style.

kre



Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-07-07 Thread David Levine
Ralph writes:

> What if mhfixmsg could also append a new text/plain after the existing
> one, i.e. with a rank that's `better quality'.  (Assuming RFCs allow two
> multipart/alternative with the same MIME type.)

That's a good idea.  (My read of RFC 2045 is that it is allowed.)

Though in this case, I don't know if it would meet the need of preserving
the original.

David


Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-07-07 Thread Ronald F. Guilmette
In message <20190707172213.aeaff21...@orac.inputplus.co.uk>, 
Ralph Corderoy  wrote:

>> > Here is what I have set. Is this what you are talking about?  Or do
>> > I need to fiddle something else entirely?
>> > 
>> > % env | fgrep LOCALE
>> > XTERM_LOCALE=en_US.UTF-8
>>
>> Yeah, that's correct.
>
>Is it?  That's only xterm(1)'s locale.  I think rfg should show us the
>output of locale(1) at his normal shell prompt to be sure.

My apologies yet again friends.  I know you're all trying to help me,
especially Ken, but I keep on getting involved in other projects and
thus, I keep on failing to come back to this little project of getting
my NMH configuration all properly sorted out at long last.

I still plan to come back to the several helpful emails that Ken and
others have sent me, trying to give me guidance to get this all fixed up
properly, but I am off now chasing down what I think is a serious
cybercriminal so that really has 100% of my attention for now.

But to answer Ralph's inquiry...


LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=



Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-07-07 Thread Ralph Corderoy
Hi Ken,

> > Here is what I have set. Is this what you are talking about?  Or do
> > I need to fiddle something else entirely?
> > 
> > % env | fgrep LOCALE
> > XTERM_LOCALE=en_US.UTF-8
>
> Yeah, that's correct.

Is it?  That's only xterm(1)'s locale.  I think rfg should show us the
output of locale(1) at his normal shell prompt to be sure.

-- 
Cheers, Ralph.


Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-07-07 Thread Ralph Corderoy
Hi,

rfg wrote:
> I have trouble believing that in this day and age, when we have had
> REALLY widespread use of HTML for around a couple of decades now,
> there are still -zero- tools that can quickly render HTML into plain
> text without mucking it up somehow.

I found https://github.com/jaytaylor/html2text#example-usage recently.
Its intended use is to accompany one's text/html email part with a
text/plain equivalent, but it could be the basis of rendering a received
text/html better than lynx(1), links(1), w3m(1), ...

It pulls in others' work for tables, etc.  Might be nice to see
https://en.wikipedia.org/wiki/Select_Graphic_Rendition_(ANSI)#SGR_parameters
supported.  text/sgr anyone?  :-)  less(1) has -R.

-- 
Cheers, Ralph.


Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-07-07 Thread Ralph Corderoy
Hi David,

rfg wrote:
> > All these in combination you end with a reasonable reply to HTML
> > emails.  The downside is that you don't get to keep the original
> > email unless you make a copy of it and it's fairly hacky.
>
> Thanks for all the tips, but this is a non-starter for me.  I need to
> preserve originals.
>
> I would think there should be some way of doing that *and* getting
> nicely TEXTified emails, no?

mhfixmsg(1)'s -replacetextplain replaces the text/plain with a rendering
of the text/html part.  This is useful because the shipped text/plain is
often poor.

What if mhfixmsg could also append a new text/plain after the existing
one, i.e. with a rank that's `better quality'.  (Assuming RFCs allow two
multipart/alternative with the same MIME type.)  Then showing the email
with `-prefer text/plain' would give mhfixmsg's version but the original
would still be there for explicit reference with -part, etc.

-- 
Cheers, Ralph.


Re: [nmh-workers] success using the OAUTH2 with gmail.

2019-07-07 Thread Ralph Corderoy
Hi Ken,

> Let's say in a hypothetical future we support IMAP.  That means that
> nearly every command would take a whole pile of arguments like
> -initialtls, -host, -port, -sasl, and more.  Obviously changing your
> profile for every nmh command would be awful.  So there should be some
> way of handling that.  What I had thought maybe was tying profile
> entries to mailboxes, so if you did "scan my-imap-server:foo" it could
> possibly look in your profile and find:
>
> my-imap-server: -host my.server.com -port imap -tls -sasl
> -saslmech GSSAPI -user me

That seems very specific and introduces a new colon operator that
restricts what's available for other features later.

How about allowing an mh-profile(5) in a folder's directory with its
content having higher priority than the ancestor folders' .mh_profile
and the general ~/.mh_profile.  This could be used for more general
things, e.g. the template used for replies to emails in that folder, or
the preferred format for scanning it.

In the IMAP case, the folder exists locally, as does its .mh_profile, but
the emails are remote, as are sub-folders.
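
One way to picture that lookup order, as a sketch only: nmh does not do
this today, and the function name profile_search_path and the directory
layout are hypothetical.

```shell
# Print the profile files that would be consulted for a folder,
# highest priority first: the folder's own .mh_profile, then each
# ancestor folder's up to the top of the mail tree, then ~/.mh_profile.
#   $1 = folder directory, $2 = top of the mail tree
profile_search_path() {
    dir=$1
    top=$2
    while :; do
        printf '%s\n' "$dir/.mh_profile"
        [ "$dir" = "$top" ] && break
        case $dir in
            "$top"/*) dir=${dir%/*} ;;   # step up to the parent folder
            *) break ;;                  # not under the mail tree: stop
        esac
    done
    printf '%s\n' "$HOME/.mh_profile"    # the general profile comes last
}

# Example: a folder two levels below a mail tree at ~/Mail.
profile_search_path "$HOME/Mail/lists/nmh" "$HOME/Mail"
```

A reader of the profile would take the first file in that list that
defines the component being looked up.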

> You get the idea.  But thinking about this more makes me think that we
> should extend this a bit so it's not tied to folders, but a generic
> connection profile defaults and we could provide ones that work with
> Gmail.  I don't have it all jelled in my head how this would look and
> you'd need to do something to ADD to an existing connection profile so
> you could supply your own username, for example.  But it seems like it
> should be doable.  But I guess my idea is that you should be able to
> do something like
>
> inc -conn gmail -user myu...@gmail.com
>
> and the right stuff should happen.  Make sense?

If `foo: -bar xyzzy...' is in an .mh_profile then often, depending on
the value's complexity, it can be interpolated with «`mhparam foo`» in
the shell.  What if a similar capability existed on the value side of an
.mh_profile's `key: value'.  Except it could cope with the interpolated
value having quotes.

This would allow collections of options to be defined and then
referenced with a shorthand.  Either back quotes copied from sh(1), or
`-use foo' so it can work easily at the shell too.
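
To make the shell-side interpolation concrete, here is a sketch that
does not need nmh installed.  profile_value is a hypothetical stand-in
for mhparam(1), not its implementation: it reads a throwaway file,
ignores continuation lines and treats names case-sensitively.  The
profile entry is the one from Ken's example.

```shell
# Print the value of the first "name: value" line in a profile file.
#   $1 = component name (used as a regular expression), $2 = profile file
profile_value() {
    sed -n "s/^$1:[[:space:]]*//p" "$2" | head -n 1
}

profile=$(mktemp)
cat > "$profile" <<'EOF'
my-imap-server: -host my.server.com -port imap -tls -sasl -saslmech GSSAPI -user me
EOF

# Unquoted on purpose: the shell splits the value into separate words.
# Any quotes inside the value would be passed through as literal
# characters, which is the limitation mentioned above.
set -- $(profile_value my-imap-server "$profile")
echo "$# words, first two: $1 $2"
# prints: 10 words, first two: -host my.server.com

rm -f "$profile"
```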

-- 
Cheers, Ralph.
