Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution

2014-04-28 Thread Matthew Pounsett
On Sunday, 27 April 2014 10:33:38 UTC-4, Chris Angelico  wrote:
> In most contexts, thread unsafe simply means that you can't use the
> same facilities simultaneously from two threads (eg a lot of database
> connection libraries are thread unsafe with regard to a single
> connection, as they'll simply write to a pipe or socket and then read
> a response from it). But processes and threads are, on many systems,
> linked. Just the act of spinning off a new thread and then forking can
> potentially cause problems. Those are the exact sorts of issues that
> you'll see when you switch OSes, as it's the underlying thread/process
> model that's significant. (Particularly of note is that Windows is
> *very* different from Unix-based systems, in that subprocess
> management is not done by forking. But not applicable here.)
 

Thanks, I'll keep all that in mind.  I have to wonder how much of a problem it 
is here though, since I was able to demonstrate a functioning fork inside a new 
thread further up in the discussion.

I have a new development that I find interesting, and I'm wondering if you 
still think it's the same problem.

I have taken that threading object and turned it into a normal function 
definition.  It's still forking the external tool, but it's doing so in the 
main thread, and it has finished executing before any other threads are
created.  And I'm still getting the same error.
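
For readers following along, the restructured flow described above might look roughly like the sketch below. This is not the poster's actual code: `run_tool` is a hypothetical name, and the current interpreter stands in for the external tool so that the example runs anywhere.

```python
import subprocess
import sys
import threading


def run_tool():
    """Run an external command from the calling thread and return its output."""
    # Stand-in for the real external tool: the interpreter printing one line.
    args = [sys.executable, '-c', 'print("converted")']
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            universal_newlines=True)
    return proc.communicate()[0]


def main():
    # The subprocess starts and finishes in the main thread ...
    output = run_tool()
    # ... and only afterwards is the first additional thread created.
    results = []
    worker = threading.Thread(target=results.append, args=(output.strip(),))
    worker.start()
    worker.join()
    return results


if __name__ == '__main__':
    print(main())
```

If the warning still appears with a layout like this, the fork is not racing against any other Python thread, which is what makes the report surprising.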

Turns out it's not coming from the threading module, but from the subprocess 
module instead.  Specifically, line 709 of 
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py
which is this:

    try:
        self._execute_child(args, executable, preexec_fn, close_fds,
                            cwd, env, universal_newlines,
                            startupinfo, creationflags, shell, to_close,
                            p2cread, p2cwrite,
                            c2pread, c2pwrite,
                            errread, errwrite)
    except Exception:

I get the "Warning: No stack to get attribute from" message twice when that 
self._execute_child() call is made.  I've tried stepping into it to narrow it 
down further, but I'm getting weird behaviour from the debugger that I've never 
seen before once I do that.  It's making it hard to track down exactly where 
the error is occurring.

Interestingly, it's not actually raising an exception there.  The except block 
is not being run.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution

2014-04-28 Thread Chris Angelico
On Tue, Apr 29, 2014 at 8:50 AM, Matthew Pounsett
matt.pouns...@gmail.com wrote:
> Thanks, I'll keep all that in mind.  I have to wonder how much of a problem
> it is here though, since I was able to demonstrate a functioning fork inside
> a new thread further up in the discussion.


Yeah, it's really hard to pin down sometimes. I once discovered a
problem whereby I was unable to spin off subprocesses that did certain
things, but I could do a trivial subprocess (I think I fork/exec'd to
the echo command or something) and that worked fine. Turned out to be
a bug in one of my signal handlers, but the error was being reported
at the point of the forking.

> I have a new development that I find interesting, and I'm wondering if you
> still think it's the same problem.
>
> I have taken that threading object and turned it into a normal function
> definition.  It's still forking the external tool, but it's doing so in the
> main thread, and it has finished executing before any other threads are
> created.  And I'm still getting the same error.


Interesting. That ought to eliminate all possibility of
thread-vs-process issues. Can you post the smallest piece of code that
exhibits the same failure?

ChrisA


Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution

2014-04-27 Thread Matthew Pounsett
On Friday, 25 April 2014 10:05:03 UTC-4, Chris Angelico  wrote:
> First culprit I'd look at is the mixing of subprocess and threading.
> It's entirely possible that something goes messy when you fork from a
> thread.

I liked the theory, but I've run some tests and can't reproduce the error that 
way.  I'm using all the elements in my test code that the real code runs, and I 
can't get the same error.  Even when I deliberately break things I'm getting a 
proper exception with stack trace.

import logging
import shlex
import subprocess
import threading

class MyThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        logger = logging.getLogger("thread")
        p1 = subprocess.Popen( shlex.split( 'echo MyThread calls echo.'),
                stdout=subprocess.PIPE, universal_newlines=True)
        logger.debug( p1.communicate()[0].decode('utf-8', 'ignore' ))
        logger.debug( "MyThread runs and exits." )

def main():
    console = logging.StreamHandler()
    console.setFormatter(
            logging.Formatter('%(asctime)s [%(name)-12s] %(message)s', '%T'))
    logger = logging.getLogger()
    logger.addHandler(console)
    logger.setLevel(logging.NOTSET)

    try:
        t = MyThread()
        #t = RTF2TXT("../data/SRD/rtf/", Queue.Queue())
        t.start()
    except Exception as e:
        logger.error( "Failed with {!r}".format(e))

if __name__ == '__main__':
    main()


> Separately: You're attempting a very messy charset decode there. You
> attempt to decode as UTF-8, errors ignored, and if that fails, you log
> an error... and continue on with the original bytes. You're risking
> shooting yourself in the foot there; I would recommend you have an
> explicit fall-back (maybe re-decode as Latin-1??), so the next code is
> guaranteed to be working with Unicode. Currently, it might get a
> unicode or a str.

Yeah, that was a logic error on my part that I hadn't got around to noticing, 
since I'd been concentrating on the stuff that was actively breaking.  That 
should have been in an else: block on the end of the try.
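
A minimal sketch of the try/except/else arrangement mentioned above, so that line splitting only happens after a successful decode. The function name and logger name are hypothetical. Strict decoding is used here instead of the 'ignore' handler, because with 'ignore' undecodable bytes are silently dropped and the except branch would not fire for them.

```python
import logging

logger = logging.getLogger('decode-demo')  # hypothetical logger name


def split_output(raw):
    """Return stripped text lines from raw bytes, or [] if decoding fails."""
    try:
        text = raw.decode('utf-8')
    except UnicodeDecodeError as err:
        logger.error('Failed to decode output: {}'.format(err))
        logger.error('Output was: {!r}'.format(raw))
        return []
    else:
        # Only reached when decoding succeeded, so text is always Unicode.
        return [line.strip() for line in text.split('\n')]
```

Returning an empty list on failure is just one possible policy; the point is that the code after the try never sees undecoded bytes.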



Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution

2014-04-27 Thread Matthew Pounsett
On Friday, 25 April 2014 14:58:56 UTC-4, Ned Deily  wrote:
> FWIW, the Python 2 version of subprocess is known to be thread-unsafe.
> There is a Py2 backport available on PyPI of the improved Python 3
> subprocess module:

Since that's the only thread that calls anything in subprocess, and I'm only 
running one instance of the thread, I'm not too concerned about how threadsafe 
subprocess is.  In this case it shouldn't matter.  Thanks for the info though.. 
that might be handy at some future point.



Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution

2014-04-27 Thread Chris Angelico
On Mon, Apr 28, 2014 at 12:16 AM, Matthew Pounsett
matt.pouns...@gmail.com wrote:
> On Friday, 25 April 2014 10:05:03 UTC-4, Chris Angelico  wrote:
>> First culprit I'd look at is the mixing of subprocess and threading.
>> It's entirely possible that something goes messy when you fork from a
>> thread.
>
> I liked the theory, but I've run some tests and can't reproduce the error
> that way.  I'm using all the elements in my test code that the real code
> runs, and I can't get the same error.  Even when I deliberately break things
> I'm getting a proper exception with stack trace.


In most contexts, thread unsafe simply means that you can't use the
same facilities simultaneously from two threads (eg a lot of database
connection libraries are thread unsafe with regard to a single
connection, as they'll simply write to a pipe or socket and then read
a response from it). But processes and threads are, on many systems,
linked. Just the act of spinning off a new thread and then forking can
potentially cause problems. Those are the exact sorts of issues that
you'll see when you switch OSes, as it's the underlying thread/process
model that's significant. (Particularly of note is that Windows is
*very* different from Unix-based systems, in that subprocess
management is not done by forking. But not applicable here.)
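
To illustrate the first sense of "thread unsafe" described above, here is a small sketch of a write-then-read exchange guarded by a lock. `SharedConnection` is a hypothetical stand-in, not any real database library; the lists merely record the order of operations.

```python
import threading


class SharedConnection(object):
    """Hypothetical stand-in for a connection that is not thread-safe."""

    def __init__(self):
        self._lock = threading.Lock()
        self.log = []

    def query(self, text):
        # Hold the lock across the whole write-then-read exchange so that
        # two threads cannot interleave their requests and responses.
        with self._lock:
            self.log.append(('write', text))
            self.log.append(('read', text))
            return text.upper()


conn = SharedConnection()
threads = [threading.Thread(target=conn.query, args=('q%d' % i,))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A lock like this only addresses simultaneous use of one facility. It does nothing for the second kind of problem, where forking interacts badly with threads at the OS level.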

You may want to have a look at subprocess32, which Ned pointed out. I
haven't checked, but I would guess that its API is identical to
subprocess's, so it should be a drop-in replacement (import
subprocess32 as subprocess). If that produces the exact same results,
then it's (probably) not thread-safety that's the problem.
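
A sketch of that drop-in import, assuming subprocess32 has been installed from PyPI and, as guessed above, exposes the same API for the calls used here. If it is missing, the code falls back to the standard module, so the rest of the program does not change either way.

```python
try:
    # Backport of the Python 3 subprocess module for Python 2 (from PyPI).
    import subprocess32 as subprocess
except ImportError:
    # Not installed (or running on Python 3): use the standard module.
    import subprocess

import sys

# The current interpreter stands in for the real external tool.
proc = subprocess.Popen([sys.executable, '-c', 'print("ok")'],
                        stdout=subprocess.PIPE, universal_newlines=True)
output = proc.communicate()[0]
print(output.strip())
```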

>> Separately: You're attempting a very messy charset decode there. You
>> attempt to decode as UTF-8, errors ignored, and if that fails, you log
>> an error... and continue on with the original bytes. You're risking
>> shooting yourself in the foot there; I would recommend you have an
>> explicit fall-back (maybe re-decode as Latin-1??), so the next code is
>> guaranteed to be working with Unicode. Currently, it might get a
>> unicode or a str.
>
> Yeah, that was a logic error on my part that I hadn't got around to noticing,
> since I'd been concentrating on the stuff that was actively breaking.  That
> should have been in an else: block on the end of the try.


Ah good. Keeping bytes versus text separate is something that becomes
particularly important in Python 3, so I always like to encourage
people to get them straight even in Py2. It'll save you some hassle
later on.

ChrisA


MacOS 10.9.2: threading error using python.org 2.7.6 distribution

2014-04-25 Thread Matthew Pounsett

I've run into a threading error in some code when I run it on MacOS that works 
flawlessly on a *BSD system running the same version of python.  I'm running 
the python 2.7.6 for MacOS distribution from python.org's downloads page.

I have tried to reproduce the error with a simple example, but so far haven't 
been able to find the element of my code that triggers the error.  I'm hoping 
someone can suggest some things to try and/or look at.  Googling for "python" 
and the error returns exactly two pages, neither of which is any help.

When I run it through the debugger, I'm getting the following from inside 
threading.start().  python fails to provide a stack trace when I step into 
_start_new_thread(), which is a pointer to thread.start_new_thread().  It looks 
like threading.__bootstrap_inner() may be throwing an exception which 
thread.start_new_thread() is unable to handle, and for some reason the stack is 
missing so I get no stack trace explaining the error.

It looks like thread.start_new_thread() is in the binary object, so I can't 
actually step into it and find where the error is occurring.

> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py(745)start()
-> _start_new_thread(self.__bootstrap, ())
(Pdb) s
> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py(750)start()
-> self.__started.wait()
(Pdb) Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from
Warning: No stack to get attribute from

My test code (which works) follows the exact same structure as the failing 
code, making the same calls to the threading module's objects' methods:


import threading

class MyThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        print "MyThread runs and exits."

def main():
    try:
        t = MyThread()
        t.start()
    except Exception as e:
        print "Failed with {!r}".format(e)

if __name__ == '__main__':
    main()


The actual thread object that's failing looks like this:

class RTF2TXT(threading.Thread):
    """
    Takes a directory path and a Queue as arguments.  The directory should be
    a collection of RTF files, which will be read one-by-one, converted to
    text, and each output line will be appended in order to the Queue.
    """
    def __init__(self, path, queue):
        threading.Thread.__init__(self)
        self.path = path
        self.queue = queue

    def run(self):
        logger = logging.getLogger('RTF2TXT')
        if not os.path.isdir(self.path):
            raise TypeError, "supplied path must be a directory"
        for f in sorted(os.listdir(self.path)):
            ff = os.path.join(self.path, f)
            args = [ UNRTF_BIN, '-P', '.', '-t', 'unrtf.text',  ff ]
            logger.debug("Processing file {} with args {!r}".format(f, args))
            p1 = subprocess.Popen( args, stdout=subprocess.PIPE,
                    universal_newlines=True)
            output = p1.communicate()[0]
            try:
                output = output.decode('utf-8', 'ignore')
            except Exception as err:
                logger.error("Failed to decode output: {}".format(err))
                logger.error("Output was: {!r}".format(output))

            for line in output.split("\n"):
                line = line.strip()
                self.queue.put(line)
        self.queue.put(EOF)

Note: I only run one instance of this thread.  The Queue object is used to pass 
work off to another thread for later processing.
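
The hand-off described in the note above can be sketched as a producer and a consumer sharing a queue, with an end-of-stream marker. Everything here is illustrative: the function names are hypothetical, and `EOF` is assumed to be a unique sentinel object, since the post does not show how the original EOF value is defined.

```python
import threading

try:
    import Queue as queue  # Python 2
except ImportError:
    import queue  # Python 3

EOF = object()  # assumed sentinel; the original definition is not shown


def producer(q, lines):
    """Put each stripped line on the queue, then signal the end."""
    for line in lines:
        q.put(line.strip())
    q.put(EOF)


def consumer(q, sink):
    """Collect items from the queue until the sentinel arrives."""
    while True:
        item = q.get()
        if item is EOF:
            break
        sink.append(item)


def demo():
    q = queue.Queue()
    received = []
    threads = [
        threading.Thread(target=producer, args=(q, [' one ', 'two\n'])),
        threading.Thread(target=consumer, args=(q, received)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Using a dedicated object rather than a string as the sentinel avoids ending the stream early if a converted file happens to contain a line with the same text.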

If I insert that object into the test code and run it instead of MyThread(), I 
get the error.  I can't see anything in there that should cause problems for 
the threading module though... especially since this runs fine on another 
system with the same version of python.

Any thoughts on what's going on here?





Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution

2014-04-25 Thread Chris Angelico
On Fri, Apr 25, 2014 at 11:43 PM, Matthew Pounsett
matt.pouns...@gmail.com wrote:
> If I insert that object into the test code and run it instead of MyThread(),
> I get the error.  I can't see anything in there that should cause problems
> for the threading module though... especially since this runs fine on another
> system with the same version of python.
>
> Any thoughts on what's going on here?

First culprit I'd look at is the mixing of subprocess and threading.
It's entirely possible that something goes messy when you fork from a
thread.

Separately: You're attempting a very messy charset decode there. You
attempt to decode as UTF-8, errors ignored, and if that fails, you log
an error... and continue on with the original bytes. You're risking
shooting yourself in the foot there; I would recommend you have an
explicit fall-back (maybe re-decode as Latin-1??), so the next code is
guaranteed to be working with Unicode. Currently, it might get a
unicode or a str.
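
A minimal sketch of such an explicit fall-back, assuming the tool's output arrives as a byte string. `to_text` is a hypothetical helper name. It decodes strictly as UTF-8 first, because with the 'ignore' handler bad bytes are dropped silently and no decode error is raised for them.

```python
def to_text(raw):
    """Decode bytes to Unicode: strict UTF-8 first, then Latin-1."""
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode('latin-1')
```

Whichever branch runs, the caller gets Unicode back. The Latin-1 result may be the wrong characters if the data was really in some other encoding, but the type is at least consistent.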

ChrisA


Re: MacOS 10.9.2: threading error using python.org 2.7.6 distribution

2014-04-25 Thread Ned Deily
In article 
captjjmpxuj9n3cdqch0ojavksfvrqjwhh1gst3fafkcgyw5...@mail.gmail.com,
 Chris Angelico ros...@gmail.com wrote:

> On Fri, Apr 25, 2014 at 11:43 PM, Matthew Pounsett
> matt.pouns...@gmail.com wrote:
> > If I insert that object into the test code and run it instead of
> > MyThread(), I get the error.  I can't see anything in there that should
> > cause problems for the threading module though... especially since this
> > runs fine on another system with the same version of python.
>
> > Any thoughts on what's going on here?
>
> First culprit I'd look at is the mixing of subprocess and threading.
> It's entirely possible that something goes messy when you fork from a
> thread.

FWIW, the Python 2 version of subprocess is known to be thread-unsafe.  
There is a Py2 backport available on PyPI of the improved Python 3 
subprocess module:

http://bugs.python.org/issue20318
https://pypi.python.org/pypi/subprocess32/

-- 
 Ned Deily,
 n...@acm.org
