Re: How do I do this in Python 3 (string.join())?

2020-08-28 Thread Chris Green
Cameron Simpson  wrote:
[snip]
> 
> >The POP3 processing is solely to collect E-Mail that ends up in the
> >'catchall' mailbox on my hosting provider.  It empties the POP3
> >catchall mailbox, checks for anything that *might* be for me or other
> >family members then just deletes the rest.
> 
> Very strong email policy, that one. Personally I fear data loss, and 
> process everything; anything which doesn't match a rule lands in my 
> "UNKNOWN" mail folder for manual consideration when I'm bored. It is 
> largely spam, but sometimes has a message wanting a new filing rule.
> 
It's not *that* strong, the catchall is for *anything* that is
addressed to either of the two domains hosted there.  I.e. mail for
xhghj...@isbd.net will arrive in the catchall mailbox.  So I just
search the To: address for anything that might be a typo for one of
our names or anything else that might be of interest.  I have an
associated configuration file that specifies the patterns to look for
so I can change things on the fly as it were.
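
A rough sketch of that kind of To: address check, for illustration only. The
thread does not show the configuration file format, so the patterns, the
is_interesting() helper and the example addresses below are all hypothetical.

```python
import re
from email.utils import getaddresses

# Hypothetical patterns; in practice these would be read from a config file.
patterns = [r"^chr[a-z]*@", r"^gre+n@"]
compiled = [re.compile(p, re.IGNORECASE) for p in patterns]


def is_interesting(to_header):
    """Return True if any address in a To: header matches a pattern."""
    for _name, addr in getaddresses([to_header]):
        if any(rx.search(addr) for rx in compiled):
            return True
    return False


print(is_interesting("Chriss <chriss@example.com>"))
print(is_interesting("xhghj@example.com"))
```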

One of the scripts that I'm having trouble converting to Python 3 is
the one that does this catchall management.


> >> >E.g. in this case the only (well the only ready made) way to get a
> >> >POP3 message is using poplib and this just gives you a list of lines
> >> >made up of "bytes as text" :-
> >> >
> >> >popmsg = pop3.retr(i+1)
> >>
> >> Ok, so you have bytes? You need to know.
> >>
> >The documentation says (and it's exactly the same for Python 2 and
> >Python 3):-
> >
> >POP3.retr(which)
> >Retrieve whole message number which, and set its seen flag. Result
> >is in form (response, ['line', ...], octets).
> >
> >Which isn't amazingly explicit unless 'line' implies a string.
> 
> Aye. But "print(repr(a_pop_line))" will tell you. Almost certainly a 
> string-of-bytes, so I would expect bytes. The docs are probably 
> unchanged during the Python2->3 move.
> 
Yes, I added some print statements to my catchall script to find out
and, yes, the returned value is a list of 'byte strings'.  It's a pity
there isn't a less ambiguous name for 'string-of-bytes'! :-)


> >> >I join the lines to feed them into mailbox.mbox() to create a mbox I
> >> >can analyse and also a message which can be sent using SMTP.
> 
> Ah. I like Maildirs for analysis; every message has its own file, which 
> makes adding and removing messages easy, and avoids contention with 
> other things using the Maildir.
> 
> My mailfiler can process Maildirs (scan, add, remove) and add to 
> Maildirs and mboxes.
> 
I've switched to maildir several times in the past and have always
switched back because they have so many 'standards'.  I use mutt as my
MUA and that does handle maildir as well as anything but still doesn't
do it for me.  :-)

-- 
Chris Green
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: How do I do this in Python 3 (string.join())?

2020-08-27 Thread Cameron Simpson
On 27Aug2020 14:36, Chris Green  wrote:
>Cameron Simpson  wrote:
>> I do ok, though most of my message processing happens to messages
>> already landed in my "spool" Maildir by getmail. My setup uses getmail
>> to get messages with POP into a single Maildir, and then I process the
>> message files from there.
>>
>Most of my mail is delivered by SMTP, I run a Postfix SMTP *server*
>on my desktop machine which stays on permanently.

I run postfix on my machines too, including my laptop, but mostly for 
sending - it means I can queue messages while offline, and they'll go 
out later.

I don't receive SMTP on my laptop (which is where my mail lives); I 
receive elsewhere such as the machine hosting my email domain (which 
also runs postfix), and the various external addresses I have (one for 
each ISP of course, and a couple of external email addresses such as a 
GMail one, largely to interact with stuff like Google Groups, which is 
pretty parochial).

So I use getmail to fetch from most of these (GMail just forwards a copy 
of everything "personal" to my primary address) and deliver to a spool 
Maildir on my laptop, and the mailfiler processes the spool Maildir.

>The POP3 processing is solely to collect E-Mail that ends up in the
>'catchall' mailbox on my hosting provider.  It empties the POP3
>catchall mailbox, checks for anything that *might* be for me or other
>family members then just deletes the rest.

Very strong email policy, that one. Personally I fear data loss, and 
process everything; anything which doesn't match a rule lands in my 
"UNKNOWN" mail folder for manual consideration when I'm bored. It is 
largely spam, but sometimes has a message wanting a new filing rule.

>> >E.g. in this case the only (well the only ready made) way to get a
>> >POP3 message is using poplib and this just gives you a list of lines
>> >made up of "bytes as text" :-
>> >
>> >popmsg = pop3.retr(i+1)
>>
>> Ok, so you have bytes? You need to know.
>>
>The documentation says (and it's exactly the same for Python 2 and
>Python 3):-
>
>POP3.retr(which)
>Retrieve whole message number which, and set its seen flag. Result
>is in form (response, ['line', ...], octets).
>
>Which isn't amazingly explicit unless 'line' implies a string.

Aye. But "print(repr(a_pop_line))" will tell you. Almost certainly a 
string-of-bytes, so I would expect bytes. The docs are probably 
unchanged during the Python2->3 move.
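
A small sketch of that check. The popmsg tuple below is a made-up stand-in for
what pop3.retr() returns (no server is contacted), so the values in it are
assumptions.

```python
# Invented stand-in for the (response, lines, octets) result of POP3.retr().
popmsg = (b"+OK", [b"From: someone@example.com", b"", b"body"], 0)
response, lines, octets = popmsg

print(repr(lines[0]))   # b'From: someone@example.com'
print(type(lines[0]))

# True when every line is a bytes object rather than a str.
is_bytes = all(isinstance(line, bytes) for line in lines)
print(is_bytes)
```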

>> >I join the lines to feed them into mailbox.mbox() to create a mbox I
>> >can analyse and also a message which can be sent using SMTP.

Ah. I like Maildirs for analysis; every message has its own file, which 
makes adding and removing messages easy, and avoids contention with 
other things using the Maildir.

My mailfiler can process Maildirs (scan, add, remove) and add to 
Maildirs and mboxes.

Cheers,
Cameron Simpson 


Re: How do I do this in Python 3 (string.join())?

2020-08-27 Thread Chris Green
Cameron Simpson  wrote:
> On 27Aug2020 09:16, Chris Green  wrote:
> >Cameron Simpson  wrote:
> >> But note: joining bytes like strings is uncommon, and may indicate that
> >> you should be working in strings to start with. Eg you may want to
> >> convert popmsg from bytes to str and do a str.join anyway. It depends on
> >> exactly what you're dealing with: are you doing text work, or are you
> >> doing "binary data" work?
> >>
> >> I know many network protocols are "bytes-as-text", but that is
> >> accomplished by implying an encoding of the text, eg as ASCII, where
> >> characters all fit in single bytes/octets.
> >>
> >Yes, I realise that making everything a string before I start might be
> >the 'right' way to do things but one is a bit limited by what the mail
> >handling modules in Python provide.
> 
> I do ok, though most of my message processing happens to messages 
> already landed in my "spool" Maildir by getmail. My setup uses getmail 
> to get messages with POP into a single Maildir, and then I process the 
> message files from there.
> 
Most of my mail is delivered by SMTP, I run a Postfix SMTP *server*
on my desktop machine which stays on permanently.

The POP3 processing is solely to collect E-Mail that ends up in the
'catchall' mailbox on my hosting provider.  It empties the POP3
catchall mailbox, checks for anything that *might* be for me or other
family members then just deletes the rest.

> >E.g. in this case the only (well the only ready made) way to get a
> >POP3 message is using poplib and this just gives you a list of lines
> >made up of "bytes as text" :-
> >
> >popmsg = pop3.retr(i+1)
> 
> Ok, so you have bytes? You need to know.
> 
The documentation says (and it's exactly the same for Python 2 and
Python 3):-

POP3.retr(which)
Retrieve whole message number which, and set its seen flag. Result
is in form (response, ['line', ...], octets).

Which isn't amazingly explicit unless 'line' implies a string.


> >I join the lines to feed them into mailbox.mbox() to create a mbox I
> >can analyse and also a message which can be sent using SMTP.
> >
> >Should I be converting to string somewhere?
> 
> I have not used poplib, but the Python email modules have a BytesParser, 
> which gets you a Message object; I would feed the poplib bytes to that 
> to parse the received message.  A Message object can then be transcribed 
> as text via its .as_string method. Or you can do other things with it.
> 
> I think my main points are:
> 
> - know whether you're using bytes (uninterpreted data) or text (strings 
>   of _characters_); treating bytes _as_ text implies an encoding, and 
>   when that assumption is incorrect you get mojibake[1]
> 
> - look at the email modules' parsers, which return Messages, a 
>   representation of the message in a structure (so that MIME subparts 
>   etc are correctly broken out, and the character sets are _known_, post 
>   parse)

OK, thanks Cameron.
 
-- 
Chris Green


Re: How do I do this in Python 3 (string.join())?

2020-08-27 Thread Cameron Simpson
On 27Aug2020 09:16, Chris Green  wrote:
>Cameron Simpson  wrote:
>> But note: joining bytes like strings is uncommon, and may indicate that
>> you should be working in strings to start with. Eg you may want to
>> convert popmsg from bytes to str and do a str.join anyway. It depends on
>> exactly what you're dealing with: are you doing text work, or are you
>> doing "binary data" work?
>>
>> I know many network protocols are "bytes-as-text", but that is
>> accomplished by implying an encoding of the text, eg as ASCII, where
>> characters all fit in single bytes/octets.
>>
>Yes, I realise that making everything a string before I start might be
>the 'right' way to do things but one is a bit limited by what the mail
>handling modules in Python provide.

I do ok, though most of my message processing happens to messages 
already landed in my "spool" Maildir by getmail. My setup uses getmail 
to get messages with POP into a single Maildir, and then I process the 
message files from there.

>E.g. in this case the only (well the only ready made) way to get a
>POP3 message is using poplib and this just gives you a list of lines
>made up of "bytes as text" :-
>
>popmsg = pop3.retr(i+1)

Ok, so you have bytes? You need to know.

>I join the lines to feed them into mailbox.mbox() to create a mbox I
>can analyse and also a message which can be sent using SMTP.
>
>Should I be converting to string somewhere?

I have not used poplib, but the Python email modules have a BytesParser, 
which gets you a Message object; I would feed the poplib bytes to that 
to parse the received message.  A Message object can then be transcribed 
as text via its .as_string method. Or you can do other things with it.
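
A minimal sketch of that idea, assuming the lines look like what poplib hands
back (bytes without line endings). The message content is invented, and this
uses the email package's modern policy.default rather than whatever the
original script did.

```python
from email import policy
from email.parser import BytesParser

# Invented message lines, in the shape poplib returns them.
lines = [
    b"From: alice@example.com",
    b"To: bob@example.com",
    b"Subject: Test",
    b"",
    b"Hello Bob",
]
raw = b"\n".join(lines) + b"\n"

# Parse the raw bytes into a structured message object.
msg = BytesParser(policy=policy.default).parsebytes(raw)
print(msg["Subject"])
print(msg["To"])

# Transcribe the whole message as text (a str).
text = msg.as_string()
```

The same result is available through email.message_from_bytes(raw,
policy=policy.default), which wraps the parser.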

I think my main points are:

- know whether you're using bytes (uninterpreted data) or text (strings 
  of _characters_); treating bytes _as_ text implies an encoding, and 
  when that assumption is incorrect you get mojibake[1]

- look at the email modules' parsers, which return Messages, a 
  representation of the message in a structure (so that MIME subparts 
  etc are correctly broken out, and the character sets are _known_, post 
  parse)

[1] https://en.wikipedia.org/wiki/Mojibake

Cheers,
Cameron Simpson 


Re: How do I do this in Python 3 (string.join())?

2020-08-27 Thread Chris Green
Cameron Simpson  wrote:
> On 26Aug2020 15:09, Chris Green  wrote:
> >2qdxy4rzwzuui...@potatochowder.com wrote:
> >> Join bytes objects with a byte object:
> >>
> >> b"\n".join(popmsg[1])
> >
> >Aaahhh!  Thank you (and the other reply).
> 
> But note: joining bytes like strings is uncommon, and may indicate that 
> you should be working in strings to start with. Eg you may want to 
> convert popmsg from bytes to str and do a str.join anyway. It depends on 
> exactly what you're dealing with: are you doing text work, or are you 
> doing "binary data" work?
> 
> I know many network protocols are "bytes-as-text", but that is 
> accomplished by implying an encoding of the text, eg as ASCII, where 
> characters all fit in single bytes/octets.
> 
Yes, I realise that making everything a string before I start might be
the 'right' way to do things but one is a bit limited by what the mail
handling modules in Python provide.

E.g. in this case the only (well the only ready made) way to get a
POP3 message is using poplib and this just gives you a list of lines
made up of "bytes as text" :-

popmsg = pop3.retr(i+1)

I join the lines to feed them into mailbox.mbox() to create a mbox I
can analyse and also a message which can be sent using SMTP.
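
For reference, a sketch of adding joined bytes to an mbox in Python 3;
mailbox.mbox.add() accepts bytes directly. The file name and message content
are made up, and a temporary directory is used so nothing real is touched. A
mailbox shared with other programs would also want lock()/unlock() around the
update.

```python
import mailbox
import os
import tempfile

# Invented message lines, in the shape poplib returns them.
lines = [b"To: bob@example.com", b"Subject: Test", b"", b"Hello"]
msgbytes = b"\n".join(lines) + b"\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "catchall.mbox")  # hypothetical file name
    box = mailbox.mbox(path)       # created if it does not exist
    key = box.add(msgbytes)        # bytes are accepted as-is
    box.flush()                    # write pending changes to disk
    subject = box[key]["Subject"]
    count = len(box)
    box.close()

print(subject, count)
```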

Should I be converting to string somewhere?  I guess the POP3 and SMTP
libraries will cope with strings as input.  Can I convert to string
after the join for example?  If so, how?  Can I just do:-

msgbytes = b'\n'.join(popmsg[1])
msgstr = str(msgbytes)

(Yes, I know it can be one line, I was just being explicit).

... or do I need to stringify the lines returned by popmsg() before
joining them together?
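
For what it's worth, a short sketch of the difference: calling str() on a bytes
object with no encoding gives the repr (including the b'...' wrapper), while
decode() or str() with an encoding gives the text. The ASCII encoding and the
errors="replace" choice here are assumptions; real mail may need something
else.

```python
msgbytes = b"\n".join([b"Subject: hi", b"", b"text"])

# No encoding given: this is the repr of the bytes, not the decoded text.
as_repr = str(msgbytes)

# Two equivalent ways of decoding to a real string.
as_text = msgbytes.decode("ascii", errors="replace")
also_text = str(msgbytes, "ascii", errors="replace")

print(as_repr)
print(as_text)
```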


Thank you for all your help and comments!

(I'm a C programmer at heart, preceded by being an assembler
programmer.  I started programming way back in the 1970s, I'm retired
now and Python is for relaxation (?) in my dotage)

-- 
Chris Green


Re: How do I do this in Python 3 (string.join())?

2020-08-26 Thread Cameron Simpson
On 26Aug2020 15:09, Chris Green  wrote:
>2qdxy4rzwzuui...@potatochowder.com wrote:
>> Join bytes objects with a byte object:
>>
>> b"\n".join(popmsg[1])
>
>Aaahhh!  Thank you (and the other reply).

But note: joining bytes like strings is uncommon, and may indicate that 
you should be working in strings to start with. Eg you may want to 
convert popmsg from bytes to str and do a str.join anyway. It depends on 
exactly what you're dealing with: are you doing text work, or are you 
doing "binary data" work?

I know many network protocols are "bytes-as-text", but that is 
accomplished by implying an encoding of the text, eg as ASCII, where 
characters all fit in single bytes/octets.
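
A tiny illustration of what "implying an encoding" means in practice. The
sample word is arbitrary; the point is only that the same bytes decode to
different text depending on the encoding assumed.

```python
raw = "café".encode("utf-8")     # b'caf\xc3\xa9'

right = raw.decode("utf-8")      # the encoding that was actually used
wrong = raw.decode("latin-1")    # wrong assumption: 'cafÃ©'

# ASCII cannot represent these bytes at all, so decoding fails outright.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(right, wrong, ascii_ok)
```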

Cheers,
Cameron Simpson 


Re: How do I do this in Python 3 (string.join())?

2020-08-26 Thread Chris Green
2qdxy4rzwzuui...@potatochowder.com wrote:
> On 2020-08-26 at 14:22:10 +0100,
> Chris Green  wrote:
> 
> > I have the following line in Python 2:-
> > 
> > msgstr = string.join(popmsg[1], "\n")  # popmsg[1] is a list containing the lines of the message
> > 
> > ... so I changed it to:-
> > 
> > s = "\n"
> > msgstr = s.join(popmsg[1])  # popmsg[1] is a list containing the lines of the message
> > 
> > However this still doesn't work because popmsg[1] isn't a list of
> > strings, I get the error:-
> > 
> > TypeError: sequence item 0: expected str instance, bytes found
> > 
> > So how do I do this?  I can see clumsy ways by a loop working through
> > the list in popmsg[1] but surely there must be a way that's as neat
> > and elegant as the Python 2 way was?
> 
> Join bytes objects with a byte object:
> 
> b"\n".join(popmsg[1])

Aaahhh!  Thank you (and the other reply).

-- 
Chris Green


Re: How do I do this in Python 3 (string.join())?

2020-08-26 Thread MRAB

On 2020-08-26 14:22, Chris Green wrote:

> I have the following line in Python 2:-
> 
>  msgstr = string.join(popmsg[1], "\n")  # popmsg[1] is a list containing the lines of the message
> 
> ... so I changed it to:-
> 
>  s = "\n"
>  msgstr = s.join(popmsg[1])  # popmsg[1] is a list containing the lines of the message
> 
> However this still doesn't work because popmsg[1] isn't a list of
> strings, I get the error:-
> 
>  TypeError: sequence item 0: expected str instance, bytes found
> 
> So how do I do this?  I can see clumsy ways by a loop working through
> the list in popmsg[1] but surely there must be a way that's as neat
> and elegant as the Python 2 way was?


In Python 3, bytestring literals require the 'b' prefix:

msgstr = b"\n".join(popmsg[1])


Re: How do I do this in Python 3 (string.join())?

2020-08-26 Thread D'Arcy Cain
On 2020-08-26 09:22, Chris Green wrote:
> I have the following line in Python 2:-
> 
> msgstr = string.join(popmsg[1], "\n")  # popmsg[1] is a list containing the lines of the message
> 
> ... so I changed it to:-
> 
> s = "\n"
> msgstr = s.join(popmsg[1])  # popmsg[1] is a list containing the lines of the message
> 
> However this still doesn't work because popmsg[1] isn't a list of
> strings, I get the error:-
> 
> TypeError: sequence item 0: expected str instance, bytes found
> 
> So how do I do this?  I can see clumsy ways by a loop working through
> the list in popmsg[1] but surely there must be a way that's as neat
> and elegant as the Python 2 way was?

Well, the simple fix is to set s to b"\n" but that may not solve all of your
problems.  The issue is that popmsg[1] is a list of bytes.  You probably
want a list of strings.  I would look further back and think about getting a
list of strings in the first place.  Without knowing how popmsg was created
we can't tell you how to do that.

Of course, if a bytes object is what you want then the above will work.  You
can also convert to string after the join.
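
A sketch of both orders, assuming the lines are plain ASCII (an assumption;
the sample lines are invented). Either way ends up with the same string.

```python
lines = [b"To: someone@example.com", b"", b"Body text"]

# Join the bytes first, then convert the result to a string.
joined_then_decoded = b"\n".join(lines).decode("ascii")

# Or convert each line to a string first, then join the strings.
decoded_then_joined = "\n".join(line.decode("ascii") for line in lines)

print(joined_then_decoded == decoded_then_joined)
```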

Cheers.

-- 
D'Arcy J.M. Cain
Vybe Networks Inc.
A unit of Excelsior Solutions Corporation - Propelling Business Forward
http://www.VybeNetworks.com/
IM:da...@vybenetworks.com VoIP: sip:da...@vybenetworks.com





Re: How do I do this in Python 3 (string.join())?

2020-08-26 Thread 2QdxY4RzWzUUiLuE
On 2020-08-26 at 14:22:10 +0100,
Chris Green  wrote:

> I have the following line in Python 2:-
> 
> msgstr = string.join(popmsg[1], "\n")  # popmsg[1] is a list containing the lines of the message
> 
> ... so I changed it to:-
> 
> s = "\n"
> msgstr = s.join(popmsg[1])  # popmsg[1] is a list containing the lines of the message
> 
> However this still doesn't work because popmsg[1] isn't a list of
> strings, I get the error:-
> 
> TypeError: sequence item 0: expected str instance, bytes found
> 
> So how do I do this?  I can see clumsy ways by a loop working through
> the list in popmsg[1] but surely there must be a way that's as neat
> and elegant as the Python 2 way was?

Join bytes objects with a byte object:

b"\n".join(popmsg[1])


How do I do this in Python 3 (string.join())?

2020-08-26 Thread Chris Green
I have the following line in Python 2:-

msgstr = string.join(popmsg[1], "\n")  # popmsg[1] is a list containing the lines of the message

... so I changed it to:-

s = "\n"
msgstr = s.join(popmsg[1])  # popmsg[1] is a list containing the lines of the message

However this still doesn't work because popmsg[1] isn't a list of
strings, I get the error:-

TypeError: sequence item 0: expected str instance, bytes found

So how do I do this?  I can see clumsy ways by a loop working through
the list in popmsg[1] but surely there must be a way that's as neat
and elegant as the Python 2 way was?

-- 
Chris Green