Re: Do subprocess.PIPE and subprocess.STDOUT sametime

2023-05-09 Thread Eryk Sun
On 5/9/23, Thomas Passin  wrote:
>
> I'm not sure if this exactly fits your situation, but if you use
> subprocess with pipes, you can often get a deadlock because the stdout
> (or stderr, I suppose) pipe has a small capacity and fills up quickly
> (at least on Windows),

The pipe size is relatively small on Windows only because
subprocess.Popen uses the default pipe size when it calls WinAPI
CreatePipe(). The default size is 4 KiB, which actually should be big
enough for most cases. If some other pipe size is passed, the value is
"advisory", meaning that it has to be within the allowed range (but
there's no practical limit on the size) and that it gets rounded up to
an allocation boundary (e.g. a multiple of the system's virtual-memory
page size). For example, here's a 256 MiB pipe:

>>> hr, hw = _winapi.CreatePipe(None, 256*1024*1024)
>>> _winapi.WriteFile(hw, b'a' * (256*1024*1024))
(268435456, 0)
>>> data = _winapi.ReadFile(hr, 256*1024*1024)[0]
>>> len(data) == 256*1024*1024
True

> then it blocks until it is emptied by a read.
> But if you aren't polling, you don't know there is something to read so
> the pipe never gets emptied.  And if you don't read it before the pipe
> has filled up, you may lose data.

If there's just one pipe, then there's no potential for deadlock, and
no potential to lose data. If there's a timeout, however, then
communicate() still has to use I/O polling or a thread to avoid
blocking indefinitely, in order to honor the timeout.

Note that there's a bug in subprocess on Windows. Popen._communicate()
should create a new thread for each pipe. However, it actually calls
stdin.write() on the current thread, which could block and ignore the
specified timeout. For example, in the following case the timeout of 5
seconds is ignored:

>>> cmd = 'python -c "import time; time.sleep(20)"'
>>> t0 = time.time(); p = subprocess.Popen(cmd, stdin=subprocess.PIPE)
>>> r = p.communicate(b'a'*4097, timeout=5); t1 = time.time() - t0
>>> t1
20.2162926197052

There's a potential for deadlock when two or more pipes are accessed
synchronously by two threads (e.g. one thread in each process). For
example, reading from one of the pipes blocks one of the threads
because the pipe is empty, while at the same time writing to the other
pipe blocks the other thread because the pipe is full. However, there
will be no deadlock if at least one of the threads always polls the
pipes to ensure that they're ready (i.e. data is available to be read,
or at least PIPE_BUF bytes can be written without blocking), which is
how communicate() is implemented on POSIX. Alternatively, one of the
processes can use a separate thread for each pipe, which is how
communicate() is implemented on Windows.
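The thread-per-pipe approach can be sketched like this (a minimal sketch, not CPython's actual implementation; the stand-in child that writes to both streams is my own):

```python
import subprocess
import sys
import threading

def drain(pipe, chunks):
    # Read until EOF so the child can never block on a full pipe.
    for line in iter(pipe.readline, b""):
        chunks.append(line)
    pipe.close()

# A stand-in child process that writes to both stdout and stderr.
proc = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)

out, err = [], []
threads = [threading.Thread(target=drain, args=(proc.stdout, out)),
           threading.Thread(target=drain, args=(proc.stderr, err))]
for t in threads:
    t.start()
proc.wait()            # safe: the threads keep both pipes drained
for t in threads:
    t.join()
print(b"".join(out), b"".join(err))
```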

Note that there are problems with the naive implementation of the
reader threads on Windows, in particular if a pipe handle leaks to
descendants of the child process, which prevents the pipe from
closing. A better implementation on Windows would use named pipes
opened in asynchronous mode on the parent side and synchronous mode on
the child side. Just implement a loop that handles I/O completion
using events, APCs, or an I/O completion port.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Problem with accented characters in mailbox.Maildir()

2023-05-09 Thread Peter J. Holzer
On 2023-05-08 23:02:18 +0200, jak wrote:
> Peter J. Holzer ha scritto:
> > On 2023-05-06 16:27:04 +0200, jak wrote:
> > > Chris Green ha scritto:
> > > > Chris Green  wrote:
> > > > > A bit more information, msg.get("subject", "unknown") does return a
> > > > > string, as follows:-
> > > > > 
> > > > >   Subject: 
> > > > > =?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?=
> > [...]
> > > > ... and of course I now see the issue!  The Subject: with utf-8
> > > > characters in it gets spaces changed to underscores.  So searching for
> > > > '(Waterways Continental Europe)' fails.
> > > > 
> > > > I'll either need to test for both versions of the string or I'll need
> > > > to change underscores to spaces in the Subject: returned by msg.get().
[...]
> > > 
> > > subj = email.header.decode_header(raw_subj)[0]
> > > 
> > > subj[0].decode(subj[1])
[...]
> > email.header.decode_header returns a *list* of chunks and you have to
> > process and concatenate all of them.
> > 
> > Here is a snippet from a mail to html converter I wrote a few years ago:
> > 
> > def decode_rfc2047(s):
> >  if s is None:
> >  return None
> >  r = ""
> >  for chunk in email.header.decode_header(s):
[...]
> >  r += chunk[0].decode(chunk[1])
[...]
> >  return r
[...]
> > 
> > I do have to say that Python is extraordinarily clumsy in this regard.
> 
> Thanks for the reply. In fact, I gave that answer because I did
> not understand what the OP wanted to achieve. In addition, the
> OP opened a second thread on the similar topic in which I gave a
> more correct answer (subject: "What do these '=?utf-8?' sequences
> mean in python?", date: "Sat, 6 May 2023 14:50:40 UTC").

Right. I saw that after writing my reply. I should have read all the
messages, not just that thread, before replying.

> the OP, I discovered that the MAME is not the only format used
> to compose the subject.

Not sure what "MAME" is. If it's a typo for MIME, then the base64
variant of RFC 2047 is just as much a part of it as the quoted-printable
variant.

> This made me think that a library could not delegate to the programmer
> the burden of managing all these exceptions,

email.header.decode_header handles both variants, but it produces bytes
sequences which still have to be decoded to get a Python string.


> then I have further investigated to discover that the library also
> provides the conversion function beyond that of coding and this makes
> our labors vain:
> 
> --
> from email.header import decode_header, make_header
> 
> subject = make_header(decode_header(raw_subject))
> --

Yup. I somehow missed that. That's a lot more convenient than calling
decode in a loop (or generator expression). Depending on what you want
to do with the subject you may have to wrap that in a call to str(), but
it's still a one-liner.
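For example, with the subject line quoted earlier in this thread:

```python
from email.header import decode_header, make_header

raw_subject = "=?utf-8?Q?aka_Marne_=C3=A0_la_Sa=C3=B4ne_(Waterways_Continental_Europe)?="

# decode_header splits into (bytes, charset) chunks; make_header
# reassembles them, and str() yields the decoded text.
subject = str(make_header(decode_header(raw_subject)))
print(subject)  # aka Marne à la Saône (Waterways Continental Europe)
```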

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"




Re: Do subprocess.PIPE and subprocess.STDOUT sametime

2023-05-09 Thread jak

Horst Koiner ha scritto:

Hi @all,
I'm running a program which is still in development with subprocess.run (Python
version 3.10), and I need to capture the output of the program in a Python
variable. The program itself runs for about 2 minutes, but it can also freeze
in case of new bugs.

For production I run the program with stdout=subprocess.PIPE and I can fetch
the output later. For just testing whether the program works, I run with
stdout=subprocess.STDOUT and I see all program output on the console, but my
program afterwards crashes since there is nothing captured in the Python
variable. So I think I need to have the functionality of subprocess.PIPE and
subprocess.STDOUT at the same time.

What I tried until now:
1. Poll the output and use Popen instead:

# Start the subprocess
process = subprocess.Popen(['./test.sh'], stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

captured_output = b''
process_running = True
while process_running:
    # poll() returns None while the process is still running
    process_running = (process.poll() is None)
    for pipe in [process.stdout, process.stderr]:
        while line := pipe.readline():
            print(line)
            captured_output += line

print(captured_output)
return_code = process.returncode

=> But this is discouraged by the Python docs, since they say that polling
this way is prone to deadlocks. Instead they propose the use of the
communicate() function.

2. Use communicate() with a timeout.
=> This does not work at all, since when the timeout occurs an exception is
thrown and communicate() returns nothing at all.

3. Use threading instead.
=> To be as simple and universal as subprocess, you would more or less
reimplement subprocess with threading, like it's done in subprocess.py. Just
for a debug output the effort is much too high.

###
Do you have further ideas for implementing such a behavior?
Do you think a feature request should be filed, or am I omitting something
obvious?

Thank you in advance for your suggestions,
Horst.



I agree with @'Thomas Passin' but I solved it in a different way: I made
readline() non-blocking, even though I believe his idea is better than
mine:

os.set_blocking(process.stdout.fileno(), False)
os.set_blocking(process.stderr.fileno(), False)
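A minimal sketch of that approach (assuming a POSIX-like platform where os.set_blocking() works on pipe fds; the stand-in child replaces the OP's ./test.sh):

```python
import os
import subprocess
import sys
import time

proc = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; print('out line'); print('err line', file=sys.stderr)"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Make both pipes non-blocking so reads never stall the polling loop.
os.set_blocking(proc.stdout.fileno(), False)
os.set_blocking(proc.stderr.fileno(), False)

captured = b""
while True:
    finished = proc.poll() is not None
    for pipe in (proc.stdout, proc.stderr):
        try:
            chunk = os.read(pipe.fileno(), 4096)   # b'' means EOF
        except BlockingIOError:
            chunk = b""                            # nothing available right now
        captured += chunk
    if finished:
        break          # the pass above already drained what was left
    time.sleep(0.05)   # don't spin at 100% CPU
print(captured)
```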



Re: Do subprocess.PIPE and subprocess.STDOUT sametime

2023-05-09 Thread Thomas Passin

On 5/9/2023 2:13 PM, Horst Koiner wrote:

Hi @all,
I'm running a program which is still in development with subprocess.run (Python
version 3.10), and I need to capture the output of the program in a Python
variable. The program itself runs for about 2 minutes, but it can also freeze
in case of new bugs.

For production I run the program with stdout=subprocess.PIPE and I can fetch
the output later. For just testing whether the program works, I run with
stdout=subprocess.STDOUT and I see all program output on the console, but my
program afterwards crashes since there is nothing captured in the Python
variable. So I think I need to have the functionality of subprocess.PIPE and
subprocess.STDOUT at the same time.

What I tried until now:
1. Poll the output and use Popen instead:

# Start the subprocess
process = subprocess.Popen(['./test.sh'], stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

captured_output = b''
process_running = True
while process_running:
    # poll() returns None while the process is still running
    process_running = (process.poll() is None)
    for pipe in [process.stdout, process.stderr]:
        while line := pipe.readline():
            print(line)
            captured_output += line

print(captured_output)
return_code = process.returncode

=> But this is discouraged by the Python docs, since they say that polling
this way is prone to deadlocks. Instead they propose the use of the
communicate() function.

2. Use communicate() with a timeout.
=> This does not work at all, since when the timeout occurs an exception is
thrown and communicate() returns nothing at all.

3. Use threading instead.
=> To be as simple and universal as subprocess, you would more or less
reimplement subprocess with threading, like it's done in subprocess.py. Just
for a debug output the effort is much too high.

###
Do you have further ideas for implementing such a behavior?
Do you think a feature request should be filed, or am I omitting something
obvious?


I'm not sure if this exactly fits your situation, but if you use 
subprocess with pipes, you can often get a deadlock because the stdout 
(or stderr, I suppose) pipe has a small capacity and fills up quickly 
(at least on Windows), then it blocks until it is emptied by a read. 
But if you aren't polling, you don't know there is something to read so 
the pipe never gets emptied.  And if you don't read it before the pipe 
has filled up, you may lose data.


I solved that by running communicate() on a separate thread.  Let
communicate() block the thread until the process has completed, then have
the thread send the result back to the main program.  Of course, this
won't work if your process doesn't end, since you won't get results until
the process ends.
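A sketch of that idea (the helper name run_and_capture and the stand-in child are mine, not from the original post):

```python
import queue
import subprocess
import sys
import threading

def run_and_capture(cmd, results):
    # Worker thread: let communicate() block here, not in the main thread.
    # stderr=STDOUT merges stderr into the captured stdout stream.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    results.put((proc.returncode, out))

results = queue.Queue()
cmd = [sys.executable, "-c", "print('hello from the child')"]
worker = threading.Thread(target=run_and_capture, args=(cmd, results),
                          daemon=True)
worker.start()
# ... the main thread is free to do other work here ...
rc, output = results.get(timeout=30)   # result arrives once the child exits
print(rc, output)
```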




Re: Do subprocess.PIPE and subprocess.STDOUT sametime

2023-05-09 Thread Mats Wichmann

On 5/9/23 12:13, Horst Koiner wrote:

Hi @all,
I'm running a program which is still in development with subprocess.run (Python
version 3.10), and I need to capture the output of the program in a Python
variable. The program itself runs for about 2 minutes, but it can also freeze
in case of new bugs.

For production I run the program with stdout=subprocess.PIPE and I can fetch
the output later. For just testing whether the program works, I run with
stdout=subprocess.STDOUT and I see all program output on the console, but my
program afterwards crashes since there is nothing captured in the Python
variable. So I think I need to have the functionality of subprocess.PIPE and
subprocess.STDOUT at the same time.


I'm not sure you quite understood what subprocess.STDOUT is for.  If you
say nothing, stdout is not captured.  STDOUT is used as a value for stderr
to mean "send it to the same place as stdout", which is useful if you set
stdout to something unusual; then you don't have to retype it if you want
stderr going to the same place.  The subprocess module, afaik, doesn't
even have a case for stdout=STDOUT.




What I tried until now:
1. Poll the output and use Popen instead:

# Start the subprocess
process = subprocess.Popen(['./test.sh'], stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

captured_output = b''
process_running = True
while process_running:
    # poll() returns None while the process is still running
    process_running = (process.poll() is None)
    for pipe in [process.stdout, process.stderr]:
        while line := pipe.readline():
            print(line)
            captured_output += line

print(captured_output)
return_code = process.returncode

=> But this is discouraged by the Python docs, since they say that polling
this way is prone to deadlocks. Instead they propose the use of the
communicate() function.

2. Use communicate() with a timeout.
=> This does not work at all, since when the timeout occurs an exception is
thrown and communicate() returns nothing at all.


Well, sure ... if you set a timeout, then you need to be prepared to catch
the TimeoutExpired exception and deal with it. That should be entirely
normal.
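For instance, along these lines (a sketch; the sleeping child stands in for a frozen program, and the 1-second timeout is just for illustration):

```python
import subprocess
import sys

# A child that "freezes" (sleeps far longer than the timeout allows).
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"],
                        stdout=subprocess.PIPE)
try:
    out, _ = proc.communicate(timeout=1)
except subprocess.TimeoutExpired:
    proc.kill()                  # the child is still running: kill it ...
    out, _ = proc.communicate()  # ... then reap it and collect any output
print(proc.returncode, out)
```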




3. Use threading instead.
=> To be as simple and universal as subprocess, you would more or less
reimplement subprocess with threading, like it's done in subprocess.py. Just
for a debug output the effort is much too high.


Not sure I get what this is asking/suggesting.  If you don't want to 
wait for the subprocess to run, you can use async - that's been fully 
implemented.


https://docs.python.org/3/library/asyncio-subprocess.html
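A sketch with asyncio's subprocess support (the 30-second timeout and the stand-in child are my own choices):

```python
import asyncio
import sys

async def main():
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "print('hi from asyncio child')",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT)
    try:
        # Capture output, but give up if the child seems frozen.
        out, _ = await asyncio.wait_for(proc.communicate(), timeout=30)
    except asyncio.TimeoutError:
        proc.kill()          # frozen child: kill it, then reap it;
        await proc.wait()    # partial output is discarded in this branch
        out = b""
    return proc.returncode, out

rc, out = asyncio.run(main())
print(rc, out)
```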





Do subprocess.PIPE and subprocess.STDOUT sametime

2023-05-09 Thread Horst Koiner
Hi @all,
I'm running a program which is still in development with subprocess.run (Python
version 3.10), and I need to capture the output of the program in a Python
variable. The program itself runs for about 2 minutes, but it can also freeze
in case of new bugs.

For production I run the program with stdout=subprocess.PIPE and I can fetch
the output later. For just testing whether the program works, I run with
stdout=subprocess.STDOUT and I see all program output on the console, but my
program afterwards crashes since there is nothing captured in the Python
variable. So I think I need to have the functionality of subprocess.PIPE and
subprocess.STDOUT at the same time.

What I tried until now:
1. Poll the output and use Popen instead:

# Start the subprocess
process = subprocess.Popen(['./test.sh'], stdout=subprocess.PIPE,
stderr=subprocess.PIPE)

captured_output = b''
process_running = True
while process_running:
    # poll() returns None while the process is still running
    process_running = (process.poll() is None)
    for pipe in [process.stdout, process.stderr]:
        while line := pipe.readline():
            print(line)
            captured_output += line

print(captured_output)
return_code = process.returncode

=> But this is discouraged by the Python docs, since they say that polling
this way is prone to deadlocks. Instead they propose the use of the
communicate() function.

2. Use communicate() with a timeout.
=> This does not work at all, since when the timeout occurs an exception is
thrown and communicate() returns nothing at all.

3. Use threading instead.
=> To be as simple and universal as subprocess, you would more or less
reimplement subprocess with threading, like it's done in subprocess.py. Just
for a debug output the effort is much too high.

###
Do you have further ideas for implementing such a behavior?
Do you think a feature request should be filed, or am I omitting something
obvious?

Thank you in advance for your suggestions,
Horst.


Re: Python-pickle error

2023-05-09 Thread Tony Flury via Python-list

Charles,

by your own admission, you deleted your .pkl file,

and your code doesn't write that .pkl file (pickle.dumps(...) doesn't
write a file; it creates a new bytes object, and at no point will it be
written to the file).

What you need is something like this:

import pickle
number = 2
my_pickled_object = pickle.dumps(number)
with open('file.pkl', 'wb') as file:   # 'wb': pickle data is bytes
    file.write(my_pickled_object)
print("this is my pickled object", my_pickled_object)

del number  # you can do this if you really want to test pickle.

with open('file.pkl', 'rb') as file:
    number = pickle.load(file)

my_unpickled_object = pickle.loads(my_pickled_object)
print("this is my unpickled object", my_unpickled_object)

Note: pickle data is binary, so files holding it must be opened in binary
mode ('wb'/'rb'); with pickle.dumps()/pickle.loads() you don't need a file
at all.
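Alternatively, pickle.dump()/pickle.load() write to and read from the (binary-mode) file object directly, skipping the intermediate bytes variable:

```python
import pickle

number = 2
with open('file.pkl', 'wb') as file:   # binary mode, as pickle requires
    pickle.dump(number, file)

del number  # prove the value really comes back from the file

with open('file.pkl', 'rb') as file:
    restored = pickle.load(file)
print(restored)  # 2
```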



On 19/04/2023 17:14, charles wiewiora wrote:

Hello,
I am experiencing problems with the pickle module.
The following code was working before:

import pickle
number=2
my_pickeld_object=pickle.dumps(number)
print("this is my pickled object",{my_pickeld_object},)
with open('file.pkl', 'rb') as file:
 number=pickle.load(file)
my_unpickeled_object=pickle.loads(my_pickeld_object)
print("this is my unpickeled object",{my_unpickeled_object},)

but now I get this error:

Traceback (most recent call last):
   File "C:\Users\lukwi\Desktop\python\tester2.py", line 5, in 
 with open('file.pkl', 'rb') as file:
FileNotFoundError: [Errno 2] No such file or directory: 'file.pkl'

I'm getting this problem after this:
a .pkl file came into my Python script files.
I thought this could be a spare file made by Python, because I was doing
this first:

import pickle
number=2
my_pickeld_object=pickle.dumps(number)
print("this is my pickled object",{my_pickeld_object},)
with open('file.pkl', 'rb') as file:
 number=pickle.load(file)

so I stupidly deleted the file.

Do you know how to fix this?
I reinstalled it, but it didn't work.
This is on Windows, with Python version 3.11.3.

thank you


--
Anthony Flury
email : anthony.fl...@btinternet.com



Re: What do these '=?utf-8?' sequences mean in python?

2023-05-09 Thread Cameron Simpson

On 08May2023 12:19, jak  wrote:
>In reality you should also take into account the fact that if the header
>contains a 'b' instead of a 'q' as the penultimate character, then the
>rest of the payload is encoded in base64:
>
>"=?utf-8?Q?"  --> "=?utf-8?B?"


Aye. Specification:

https://datatracker.ietf.org/doc/html/rfc2047

You should reach for the email.header module jak suggested _before_
parsing the subject line yourself. Details:


https://docs.python.org/3/library/email.header.html#module-email.header
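A small sketch of both encoded-word forms, which decode_header handles identically (the café strings are my own examples):

```python
from email.header import decode_header

# The same word in Q (quoted-printable) and B (base64) encoded-word form.
for raw in ("=?utf-8?Q?caf=C3=A9?=", "=?utf-8?B?Y2Fmw6k=?="):
    decoded = "".join(
        part.decode(charset or "ascii") if isinstance(part, bytes) else part
        for part, charset in decode_header(raw))
    print(decoded)  # café, both times
```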

Cheers,
Cameron Simpson 