Embedding multiple interpreters

2013-12-05 Thread Garthy


Hi!

I hope I've got the right list here- there were a few to choose from. :}

I am trying to embed Python with multiple interpreters into an existing 
application. I have things working fine with a single interpreter thus 
far. I am running into problems when using multiple interpreters [1] and 
I am presently trying to track down these issues. Can anyone familiar 
with the process of embedding multiple interpreters have a skim of the 
details below and let me know of any obvious problems? If I can get the 
essentials right, then presumably it's just a matter of my tracking down 
any problems with my code.


I am presently using Python 3.3.3.

What I am after:

- Each sub-interpreter will have its own dedicated thread. Each thread 
will have no more than one sub-interpreter. Basically, there is a 
one-to-one mapping between threads and interpreters (some threads are 
unrelated to Python though).


- The default interpreter in the main thread will never be used, 
although I can explicitly use it if it'll help in some way.


- Each thread is created and managed outside of Python. This can't be 
readily changed.


- I have a single internal module I need to be able to use for each 
interpreter.


- I load scripts into __main__ and create objects from it to bootstrap.

- I understand that for the most part only a single interpreter will be 
running at a time due to the GIL. This is unfortunate but not a major 
problem.


- I don't need to share objects between interpreters (if it is even 
possible- I don't know).


- My fallback if I can't do this is to implement each instance in a 
dedicated *process* rather than per-thread. However, there is a 
significant cost to doing this that I would rather not incur.


Could I confirm:

- There is one GIL in a given process, shared amongst all (sub) 
interpreters. There seems to be some disagreement on this one online, although 
I'm fairly confident that there is only the one GIL.


- I am using the mod_wsgi source for inspiration. Is there a better 
source for an example of embedding multiple interpreters?


A query re the doco:

http://docs.python.org/3/c-api/init.html#gilstate

"Python supports the creation of additional interpreters (using 
Py_NewInterpreter()), but mixing multiple interpreters and the 
PyGILState_*() API is unsupported."


Is this actually correct? mod_wsgi seems to do it. Have I misunderstood?
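
For reference, the alternative pattern I understand mod_wsgi-style code uses 
is to manage a PyThreadState per thread explicitly and avoid PyGILState_*() 
entirely. A rough sketch of my understanding only (assuming "state" is the 
thread state returned by Py_NewInterpreter() in the init code below, and 
"tstate" is a name I've made up):

PyThreadState *tstate = PyThreadState_New(state->interp);

PyEval_AcquireThread(tstate);   // takes the GIL and makes tstate current
// ... Python C API calls against the sub-interpreter go here ...
PyEval_ReleaseThread(tstate);   // drops the GIL again

// When this thread is finished with the interpreter:
PyEval_AcquireThread(tstate);
PyThreadState_Clear(tstate);    // GIL must be held for the clear
PyThreadState_DeleteCurrent();  // also releases the GIL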

I've extracted what I have so far from my code into a form that can be 
followed more easily. Hopefully I have not made any mistakes in doing 
so. The essence of what my code calls should be as follows:


=== Global init, run once:

static PyThreadState *mtstate = NULL;

PyImport_AppendInittab("myinternalmodule", PyInit_myinternalmodule);
Py_SetProgramName(L"foo");
Py_InitializeEx(0);
PyEval_InitThreads();
mtstate = PyThreadState_Get();
PyEval_ReleaseThread(mtstate);

=== Global shutdown, run once at end:

Py_Finalize();

=== Per-interpreter init in main thread before launching child thread:

(none thus far)

=== Init in dedicated thread for each interpreter:

// NB: Also protected by a single global non-Python mutex to be sure.

PyGILState_STATE gil = PyGILState_Ensure();
PyThreadState *save_tstate = PyThreadState_Swap(NULL);
state = Py_NewInterpreter();
PyThreadState_Swap(save_tstate);

PyObject *mmodule = PyImport_AddModule("__main__");
Py_INCREF(mmodule);

PyImport_ImportModule("myinternalmodule");

PyGILState_Release(gil);

=== Shutdown in dedicated thread for each interpreter:

// NB: Also protected by the same single global non-Python mutex as in the init.


PyGILState_STATE gil = PyGILState_Ensure();
PyThreadState *save_tstate = PyThreadState_Swap(state);
Py_EndInterpreter(state);
PyThreadState_Swap(save_tstate);
PyGILState_Release(gil);

=== Placed at top of scope where calls made to Python C API:

SafeLock lock;

=== SafeLock implementation:

class SafeLock
{
  public:
    SafeLock() {gil = PyGILState_Ensure();}
    ~SafeLock() {PyGILState_Release(gil);}

  private:
    PyGILState_STATE gil;
};

===
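
For illustration, a typical call site guarded this way would look something 
like the following minimal sketch (run_snippet and its argument are just 
placeholder names of mine):

void run_snippet(const char *snippet)
{
  SafeLock lock;                // PyGILState_Ensure() for this scope
  PyRun_SimpleString(snippet);  // any Python C API calls go here
}                               // PyGILState_Release() as lock destructs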

Does this look roughly right? Have I got the global and per-interpreter 
init and shutdown right? Am I locking correctly in SafeLock- is 
PyGILState_Ensure() and PyGILState_Release() sufficient?


Is there an authoritative summary of the global and per-interpreter init 
and shutdown somewhere that I have missed? Any resource I should be reading?


Cheers,
Garth

[1] It presently crashes in Py_EndInterpreter() after running through a 
series of tests during the shutdown of the 32nd interpreter I create. I 
don't know if this is significant, but the tests pass for the first 31 
interpreters.

--
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Chris (and Michael),

On 06/12/13 15:51, Chris Angelico wrote:

On Fri, Dec 6, 2013 at 4:16 PM, Michael Torrie  wrote:

On 12/05/2013 07:34 PM, Garthy wrote:

- My fallback if I can't do this is to implement each instance in a
dedicated *process* rather than per-thread. However, there is a
significant cost to doing this that I would rather not incur.


What cost is this? Are you speaking of cost in terms of what you the
programmer would have to do, cost in terms of setting things up and
communicating with the process, or the cost of creating a process vs a
thread?  If it's the last, on most modern OS's (particularly Linux),
it's really not that expensive.  On Linux the cost of threads and
processes are nearly the same.


If you want my guess, the cost of going to multiple processes would be
to do with passing data back and forth between them. Why is Python
being embedded in another application? Sounds like there's data moving
from C to Python to C, ergo breaking that into separate processes
means lots of IPC.


An excellent guess. :)

One characteristic of the application I am looking to embed Python in is 
that there are a fairly large number of calls from the app into Python, and 
for each, generally many back to the app. There is a healthy amount of 
data flowing back and forth each time. An implementation with an 
inter-process roundtrip each time (with a different scripting language) 
proved to be too limiting, and needlessly complicated the design of the 
app. As such, more development effort has gone into making things work 
better with components that work well running across thread boundaries 
than process boundaries.


I am confident at this point I could pull things off with a Python 
one-interpreter-per-process design, but I'd then need to visit the IPC 
side of things again and put up with the limitations that arise. 
Additionally, the IPC code has had less attention and isn't as capable. 
I know roughly how I'd proceed if I went with this approach, but it is 
the least desirable outcome of the two.


However, if I could manage to get a thread-based solution going, I can 
put the effort where it is most productive, namely into making sure that 
the thread-based solution works best. This is my preferred outcome and 
current goal. :)


Cheers,
Garth


--
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy

Hi all,

A small update here:

On 06/12/13 13:04, Garthy wrote:
> [1] It presently crashes in Py_EndInterpreter() after running through a
> series of tests during the shutdown of the 32nd interpreter I create. I
> don't know if this is significant, but the tests pass for the first 31
> interpreters.

This turned out to be a red herring, so please ignore this bit. I had a 
code path that failed to call Py_INCREF on Py_None which was held in a 
PyObject that was later Py_DECREF'd. This had some interesting 
consequences, and not surprisingly led to some double-frees. ;)


I was able to get much further with this fix, although I'm still having 
some trouble getting multiple interpreters running together 
simultaneously. Advice and thoughts still very much welcomed on the rest 
of the email. :)


Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Gregory,

On 06/12/13 17:28, Gregory Ewing wrote:
> Garthy wrote:
>> I am running into problems when using multiple interpreters [1] and I
>> am presently trying to track down these issues. Can anyone familiar
>> with the process of embedding multiple interpreters have a skim of the
>> details below and let me know of any obvious problems?
>
> As far as I know, multiple interpreters in one process is
> not really supported. There *seems* to be partial support for
> it in the code, but there is no way to fully isolate them
> from each other.

That's not good to hear.

Is there anything confirming that it's an incomplete API insofar as 
multiple interpreters are concerned? Wouldn't this carry consequences 
for say mod_wsgi, which also does this?


> Why do you think you need multiple interpreters, as opposed
> to one interpreter with multiple threads? If you're trying
> to sandbox the threads from each other and/or from the rest
> of the system, be aware that it's extremely difficult to
> securely sandbox Python code. You'd be much safer to run
> each one in its own process and rely on OS-level protections.

To allow each script to run in its own environment, with minimal chance 
of inadvertent interaction between the environments, whilst allowing 
each script the ability to stall on conditions that will be later met by 
another thread supplying the information, and to fit in with existing 
infrastructure.


>> - I don't need to share objects between interpreters (if it is even
>> possible- I don't know).
>
> The hard part is *not* sharing objects between interpreters.
> If nothing else, all the builtin type objects, constants, etc.
> will be shared.

I understand. To clarify: I do not need to pass any Python objects I 
create or receive back and forth between different interpreters. I can 
imagine some environments would not react well to this.


Cheers,
Garth

PS. Apologies if any of these messages come through more than once. Most 
lists that I've posted to set reply-to meaning a normal reply can be 
used, but python-list does not seem to. The replies I have sent manually 
to [email protected] instead don't seem to have appeared. I'm not 
quite sure what is happening- apologies for any blundering around on my 
part trying to figure it out.


--
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Chris,

On 06/12/13 19:03, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 6:59 PM, Garthy
>   wrote:
>> Hi Chris (and Michael),
>
> Hehe. People often say that to me IRL, addressing me and my brother.
> But he isn't on python-list, so you clearly mean Michael Torrie, yet
> my brain still automatically thought you were addressing Michael
> Angelico :)

These strange coincidences happen from time to time- it's entertaining 
when they do. :)


>> To allow each script to run in its own environment, with minimal chance of
>> inadvertent interaction between the environments, whilst allowing each
>> script the ability to stall on conditions that will be later met by another
>> thread supplying the information, and to fit in with existing
>> infrastructure.
>
> Are the scripts written cooperatively, or must you isolate one from
> another? If you need to isolate them for trust reasons, then there's
> only one solution, and that's separate processes with completely
> separate interpreters. But if you're prepared to accept that one
> thread of execution is capable of mangling another's state, things are
> a lot easier. You can protect against *inadvertent* interaction much
> more easily than malicious interference. It may be that you can get
> away with simply running multiple threads in one interpreter;
> obviously that would have problems if you need more than one CPU core
> between them all (hello GIL), but that would really be your first
> limit. One thread could fiddle with __builtins__ or a standard module
> and thus harass another thread, but you would know if that's what's
> going on.

I think the ideal is completely sandboxed, but it's something that I 
understand I may need to make compromises on. The bare minimum would be 
protection against inadvertent interaction. Better yet would be a setup 
that made such interaction annoyingly difficult, and the ideal would be 
where it was impossible to interfere. My approaching this problem with 
interpreters was based on an assumption that it might provide a 
reasonable level of isolation- perhaps not ideal, but hopefully good enough.


The closest analogy for understanding would be browser plugins: Scripts 
from multiple authors who for the most part aren't looking to create 
deliberate incompatibilities or interference between plugins. The 
isolation is basic, and some effort is made to make sure that one plugin 
can't cripple another trivially, but the protection is not exhaustive.


Strangely enough, the GIL restriction isn't a big one in this case. For 
the application, the common case is actually one script running at a 
time, with other scripts waiting or not running at that time. They do 
sometimes overlap, but this isn't the common case. If it turned out that 
only one script could be progressing at a time, it's an annoyance but 
not a deal-breaker. If it's suboptimal (as seems to be the case), then 
it's actually not a major issue.


With the single interpreter and multiple thread approach suggested, do 
you know if this will work with threads created externally to Python, 
ie. if I can create a thread in my application as normal, and then call 
something like PyGILState_Ensure() to make sure that Python has the 
internals it needs to work with it, and then use the GIL (or similar) to 
ensure that accesses to it remain thread-safe? If the answer is yes I 
can integrate such a thing more easily as an experiment. If it requires 
calling a dedicated "control" script that feeds out threads then it 
would need a fair bit more mucking about to integrate- I'd like to avoid 
this if possible.
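
To make the question concrete, the sort of arrangement I have in mind is 
roughly the following (a sketch only, assuming the global init from my 
earlier mail has run and the main thread has released the GIL; "worker" and 
"spawn_worker" are just placeholder names):

#include <Python.h>
#include <pthread.h>

// Runs in a thread created by the application, not by Python.
static void *worker(void *arg)
{
  PyGILState_STATE gil = PyGILState_Ensure();  // registers this thread
  PyRun_SimpleString("print('hello from an externally created thread')");
  PyGILState_Release(gil);                     // drops the GIL again
  return NULL;
}

// Called from the app once the global init has completed:
void spawn_worker(void)
{
  pthread_t t;
  pthread_create(&t, NULL, worker, NULL);
  pthread_join(t, NULL);
}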


Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Chris,

On 06/12/13 19:57, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 7:21 PM, Garthy
>   wrote:
>> PS. Apologies if any of these messages come through more than once. Most
>> lists that I've posted to set reply-to meaning a normal reply can be used,
>> but python-list does not seem to. The replies I have sent manually to
>> [email protected] instead don't seem to have appeared. I'm not quite
>> sure what is happening- apologies for any blundering around on my part
>> trying to figure it out.
>
> They are coming through more than once. If you're subscribed to the
> list, sending to [email protected] should be all you need to do -
> where else are they going?

I think I've got myself sorted out now. The mailing list settings are a 
bit different from what I am used to and I just need to reply to 
messages differently than I normally do.


The first attempt at each of three emails went to the wrong place, the 
second attempt appeared to have disappeared into the ether and I assumed 
non-delivery, but I was incorrect: they all actually arrived, along with my 
third attempt at each.


Apologies to all for the inadvertent noise.

Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Chris,

On 06/12/13 22:27, Chris Angelico wrote:
> On Fri, Dec 6, 2013 at 8:35 PM, Garthy
>   wrote:
>> I think the ideal is completely sandboxed, but it's something that I
>> understand I may need to make compromises on. The bare minimum would be
>> protection against inadvertent interaction. Better yet would be a setup that
>> made such interaction annoyingly difficult, and the ideal would be where it
>> was impossible to interfere.
>
> In Python, "impossible to interfere" is a pipe dream. There's no way
> to stop Python from fiddling around with the file system, and if
> ctypes is available, with memory in the running program. The only way
> to engineer that kind of protection is to prevent _the whole process_
> from doing those things (using OS features, not Python features),
> hence the need to split the code out into another process (which might
> be chrooted, might be running as a user with no privileges, etc).

Absolutely- it would be an impractical ideal. If it was my highest and 
only priority, CPython might not be the best place to start. But there 
are plenty of other factors that make Python very desirable to use 
regardless. :) Re file and ctypes-style functionality, that is something 
I'm going to have to find a way to limit somewhat. But first things 
first: I need to see what I can accomplish re initial embedding with a 
reasonable amount of work.


> A setup that makes such interaction "annoyingly difficult" is possible
> as long as your users don't think Ruby. For instance:
>
> # script1.py
> import sys
> sys.stdout = open("logfile", "w")
> while True: print("Blah blah")
>
> # script2.py
> import sys
> sys.stdout = open("otherlogfile", "w")
> while True: print("Bleh bleh")
>
>
> These two scripts won't play nicely together, because each has
> modified global state in a different module. So you'd have to set that
> as a rule. (For this specific example, you probably want to capture
> stdout/stderr to some sort of global log file anyway, and/or use the
> logging module, but it makes a simple example.)

Thanks for the example. Hopefully I can minimise the cases where this 
would potentially be a problem. Modifying the basic environment and the 
source is something I can do readily if needed.


Re stdout/stderr, on that subject I actually wrote a replacement log 
catcher for embedded Python a few years back. I can't remember how on 
earth I did it now, but I've still got the code that did it somewhere.
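
I can't vouch for it being the same approach, but one common way to do it is 
to install a small module object with write()/flush() methods as sys.stdout- 
a rough sketch only, with all the names made up and the routing into the 
application's log left as a stub:

#include <Python.h>
#include <stdio.h>

static PyObject *logwriter_write(PyObject *self, PyObject *args)
{
  const char *text;
  if (!PyArg_ParseTuple(args, "s", &text))
    return NULL;
  fprintf(stderr, "[py] %s", text);  // route into the app's log here
  Py_RETURN_NONE;
}

static PyObject *logwriter_flush(PyObject *self, PyObject *args)
{
  Py_RETURN_NONE;
}

static PyMethodDef logwriter_methods[] = {
  {"write", logwriter_write, METH_VARARGS, "write"},
  {"flush", logwriter_flush, METH_VARARGS, "flush"},
  {NULL, NULL, 0, NULL}
};

static struct PyModuleDef logwriter_module = {
  PyModuleDef_HEAD_INIT, "logwriter", NULL, -1, logwriter_methods
};

// Call with the GIL held, after Py_InitializeEx():
void install_log_catcher(void)
{
  PyObject *m = PyModule_Create(&logwriter_module);
  if (m)
  {
    PySys_SetObject("stdout", m);
    PySys_SetObject("stderr", m);
    Py_DECREF(m);
  }
}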


> Most Python scripts
> aren't going to do this sort of thing, or if they do, will do very
> little of it. Monkey-patching other people's code is a VERY rare thing
> in Python.

That's good to hear. :)

>> The closest analogy for understanding would be browser plugins: Scripts from
>> multiple authors who for the most part aren't looking to create deliberate
>> incompatibilities or interference between plugins. The isolation is basic,
>> and some effort is made to make sure that one plugin can't cripple another
>> trivially, but the protection is not exhaustive.
>
> Browser plugins probably need a lot more protection - maybe it's not
> exhaustive, but any time someone finds a way for one plugin to affect
> another, the plugin / browser authors are going to treat it as a bug.
> If I understand you, though, this is more akin to having two forms on
> one page and having JS validation code for each. It's trivially easy
> for one to check the other's form objects, but quite simple to avoid
> too, so for the sake of encapsulation you simply stay safe.

There have been cases where browser plugins have played funny games to 
mess with the behaviour of other plugins (eg. one plugin removing 
entries from the configuration of another). It's certainly not ideal, 
but it comes from the environment being not entirely locked down, and 
one plugin author being inclined enough to make destructive changes that 
impact another. I think the right effort/reward ratio will mean I end up 
in a similar place.


I know it's not the best analogy, but it was one that readily came to 
mind. :)


>> With the single interpreter and multiple thread approach suggested, do you
>> know if this will work with threads created externally to Python, ie. if I
>> can create a thread in my application as normal, and then call something
>> like PyGILState_Ensure() to make sure that Python has the internals it needs
>> to work with it, and then use the GIL (or similar) to ensure that accesses
>> to it remain thread-safe?
>
> Now that's something I can't help with. The only time I embedded
> Python seriously

Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Tim,

On 06/12/13 20:47, Tim Golden wrote:

On 06/12/2013 09:27, Chris Angelico wrote:

On Fri, Dec 6, 2013 at 7:21 PM, Garthy
  wrote:

PS. Apologies if any of these messages come through more than once. Most
lists that I've posted to set reply-to meaning a normal reply can be used,
but python-list does not seem to. The replies I have sent manually to
[email protected] instead don't seem to have appeared. I'm not quite
sure what is happening- apologies for any blundering around on my part
trying to figure it out.


They are coming through more than once. If you're subscribed to the
list, sending to [email protected] should be all you need to do -
where else are they going?



I released a batch from the moderation queue from Garthy first thing
this [my] morning -- ie about 1.5 hours ago. I'm afraid I didn't check
first as to whether they'd already got through to the list some other way.


I had to make a call between re-sending posts that might have gone 
missing, or seemingly not responding promptly when people had taken the 
time to answer my complex query. I made a call to re-send, and it was 
the wrong one. The fault for the double-posting is entirely mine.


Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Gregory,

On 07/12/13 08:53, Gregory Ewing wrote:
> Garthy wrote:
>> The bare minimum would be protection against inadvertent interaction.
>> Better yet would be a setup that made such interaction annoyingly
>> difficult, and the ideal would be where it was impossible to interfere.
>
> To give you an idea of the kind of interference that's
> possible, consider:
>
> 1) You can find all the subclasses of a given class
> object using its __subclasses__() method.
>
> 2) Every class ultimately derives from class object.
>
> 3) All built-in class objects are shared between
> interpreters.
>
> So, starting from object.__subclasses__(), code in any
> interpreter could find any class defined by any other
> interpreter and mutate it.

Many thanks for the excellent example. It was not clear to me how 
readily such a small and critical bit of shared state could potentially 
be abused across interpreter boundaries. I am guessing this would be the 
first in a chain of potential problems I may run into.


> This is not something that is likely to happen by
> accident. Whether it's "annoyingly difficult" enough
> is something you'll have to decide.

I think it'd fall under "protection against inadvertent modification"- 
down the scale somewhat. It doesn't sound like it would be too difficult 
to achieve if the author was so inclined.


> Also keep in mind that it's fairly easy for Python
> code to chew up large amounts of memory and/or CPU
> time in an uninterruptible way, e.g. by
> evaluating 5**1. So even a thread that's
> keeping its hands entirely to itself can still
> cause trouble.

Thanks for the tip. The potential for deliberate resource exhaustion is 
unfortunately something that I am likely going to have to put up with in 
order to keep things in the same process.


Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list


Re: Embedding multiple interpreters

2013-12-06 Thread Garthy


Hi Gregory,

On 07/12/13 08:39, Gregory Ewing wrote:
> Garthy wrote:
>> To allow each script to run in its own environment, with minimal
>> chance of inadvertent interaction between the environments, whilst
>> allowing each script the ability to stall on conditions that will be
>> later met by another thread supplying the information, and to fit in
>> with existing infrastructure.
>
> The last time I remember this being discussed was in the context
> of allowing free threading. Multiple interpreters don't solve
> that problem, because there's still only one GIL and some
> objects are shared.

I am fortunate in my case as the normal impact of the GIL would be much 
reduced. The common case is only one script actively progressing at a 
time- with the others either not running or waiting for external input 
to continue.


But as you point out in your other reply, there are still potential 
concerns that arise from the smaller set of shared objects even across 
interpreters.


> But if all you want is for each plugin to have its own version
> of sys.modules, etc., and you're not concerned about malicious
> code, then it may be good enough.

I wouldn't say I'm entirely unconcerned about it, but on the other hand it 
is not a hard requirement to which all other concerns are secondary.


Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list


Module missing when embedding?

2013-12-11 Thread Garthy


Hi all,

I am attempting to embed Python 3.3.3 into an application.

Following advice which suggests not using multiple interpreters, I am 
experimenting with a *single* interpreter and multiple threads.


So far I have been loading directly into "__main__", ie. something like 
this:


PyObject *main_module = PyImport_AddModule("__main__");

PyObject *dg = PyModule_GetDict(main_module);
PyObject *dl = PyModule_GetDict(main_module);
PyObject *rv = PyRun_String(str, Py_file_input, dg, dl);

A stripped down script looks like this:

import mymodule

class Foo:
  def bar(self):
    mymodule.mycall("a")

mymodule is set up once as:

PyImport_AppendInittab("mymodule", PyInit_mymodule);
Py_SetProgramName(L"foo");
Py_InitializeEx(0);
PyEval_InitThreads();
mtstate = PyThreadState_Get();
PyEval_ReleaseThread(mtstate);

And per thread as:

PyImport_ImportModule("mymodule");

This works, and when an instance of Foo is created, calling bar() on it 
triggers the mymodule mycall call.


I want to load scripts into their own dedicated module- I don't want 
each thread loading into "__main__" if there is only one interpreter! 
Anyway, let's try:


PyObject *module = PyModule_New("hello");

PyObject *dg = PyModule_GetDict(module);
PyObject *dl = PyModule_GetDict(module);
PyObject *rv = PyRun_String(str, Py_file_input, dg, dl);

No good. I get:

"__import__ not found"

on load. Trying again: Let's point dg at the "__main__" module instead:

PyObject *dg = PyModule_GetDict(main_module);
PyObject *dl = PyModule_GetDict(module);
PyObject *rv = PyRun_String(str, Py_file_input, dg, dl);

and it loads. Is this the right way to go about it or have I done 
something foolish?


Anyway, later on, I create an object of type Foo and call bar() on it, 
exactly as I did before. I get:


"global name 'mymodule' is not defined"

Darn. The offending line is:

mymodule.mycall("a")

Now, in the script I load, this line is okay:

import mymodule

and this works if I try it:

from mymodule import mycall

but this does not:

from mymodule import call_that_does_not_exist

As expected. This suggests that Python understands there is a "mymodule" 
module and it contains "mycall", and not "call_that_does_not_exist".


However, as mentioned, when I call bar() on a Foo object, it tries to call:

mymodule.mycall("a")

which worked when it was loaded into "__main__", but now I get:

"global name 'mymodule' is not defined"

With "from mymodule import mycall" in the script, I try:

mycall("a")

instead, and I get:

"global name 'mycall' is not defined"

A trace in PyInit_mymodule confirms it is being run, ie. mymodule is 
being set up. The import calls seem to confirm that "mymodule" exists, 
and "mycall" exists within it. When loaded into __main__, it works as 
expected. When loaded into a different module, it doesn't.
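
A guess at what is going on here (not something confirmed anywhere in this 
thread): functions compiled by PyRun_String() keep dg as their __globals__, 
so when bar() later runs, 'mymodule' is looked up in __main__'s dict rather 
than in the new module; and a dict fresh from PyModule_New() has no 
__builtins__ entry, which is where the "__import__ not found" error comes 
from. A sketch of a variation that would sidestep both, as an untested 
assumption on my part:

PyObject *module = PyModule_New("hello");
PyObject *d = PyModule_GetDict(module);

// Give the fresh module real builtins, then use its dict as both the
// globals and the locals so imports and later global lookups agree.
PyDict_SetItemString(d, "__builtins__", PyEval_GetBuiltins());
PyObject *rv = PyRun_String(str, Py_file_input, d, d);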


I structured a pure Python test that had the main script load one 
module, which imported another module, and called it in the same way. It 
worked.


I'll also point out that whilst I'm set up to use multiple threads, I am 
only using two threads at the point of the errors. I do the global setup 
in the main thread, and then never use it again, and do one lot of 
per-thread setup in a child thread, after which the errors occur. I'm 
being pedantic about GIL locking in any case.


I've had to transcribe the above code by hand. Whilst I've checked and I 
think it's fine, there is a small chance of typos.


Any ideas about what I might be doing wrong? Anything I can try on the 
Python side or the C API side? My Python knowledge is a bit rusty so I 
may have missed something obvious on the Python side. If there are any 
resources online that show something similar to what I am doing, please 
share, and I'll do the legwork. More info available if needed- just ask.


Cheers,
Garth

PS. Finishing off a test suite to illustrate, will post soon. It doesn't 
appear to be a thread issue.


--
https://mail.python.org/mailman/listinfo/python-list


Re: Module missing when embedding?

2013-12-11 Thread Garthy


Hi all,

I've written a minimal set of tests to illustrate the issue, which also 
confirms it is independent of threading. Details below. If you build it, 
you might need to tweak the Makefile to run on your system. The script 
"run" will run all eight tests. They pass only when we load everything 
into "__main__", and fail otherwise. I hope this helps highlight what I 
might be doing wrong.


Cheers,
Garth

= applepy.c:

#include <Python.h>
#include <stdlib.h>

#define SCRIPT \
  "import mymodule\n" \
  "\n"\
  "class ClassA:\n" \
  "  def foo(self):\n"\
  "mymodule.mycall(\"a\")\n"

static int success = 0;
static int tn;

static PyObject *DoMyCall(PyObject *selfr, PyObject *argsr)
{
  //fprintf(stderr, "In DoMyCall\n");
  success = 1;
  Py_INCREF(Py_None);
  return Py_None;
}

static PyMethodDef mycallMethods[] = {
  {"mycall", DoMyCall, METH_VARARGS, "foo"},
  {NULL, NULL, 0, NULL}
};

static struct PyModuleDef mycallModule = {
  PyModuleDef_HEAD_INIT,
  "mymodule", NULL, -1,
  mycallMethods
};

PyMODINIT_FUNC PyInit_mymodule()
{
  return PyModule_Create(&mycallModule);
}

static void check(PyObject *r, const char *msg)
{
  if (!r)
  {
fprintf(stderr, "FAILED: %s.\n", msg);
PyErr_PrintEx(0);
fprintf(stderr, "=== Test %d: Failed.\n", tn);
exit(1);
  }
}

int main(int argc, char *argv[])
{
  if (argc != 2)
  {
fprintf(stderr, "Usage: applepy \n");
exit(1);
  }

  tn = atoi(argv[1]);
  fprintf(stderr, "=== Test %d: Started.\n", tn);
  int test_into_main = tn & 1;
  int test_global_is_module = tn & 2;
  int test_threaded = tn & 4;

  fprintf(stderr, "Load into main:   %s\n", test_into_main ? "y" : "n");
  fprintf(stderr, "Global is module: %s\n", test_global_is_module ? "y" 
: "n");

  fprintf(stderr, "Threaded: %s\n", test_threaded ? "y" : "n");

  PyGILState_STATE gil;

  PyImport_AppendInittab("mymodule", PyInit_mymodule);
  Py_SetProgramName(L"program");
  Py_InitializeEx(0);

  if (test_threaded)
  {
    PyEval_InitThreads();
    PyThreadState *mtstate = PyThreadState_Get();
    PyEval_ReleaseThread(mtstate);
    gil = PyGILState_Ensure();
  }
  PyObject *main_module = PyImport_AddModule("__main__");
  PyObject *module = PyModule_New("hello");

  if (test_into_main)
    module = main_module;

  PyObject *dg = PyModule_GetDict(test_global_is_module ? module : main_module);

  check(dg, "global dict");
  PyObject *dl = PyModule_GetDict(module);
  check(dl, "local dict");
  PyObject *load = PyRun_String(SCRIPT, Py_file_input, dg, dl);
  check(load, "load");

  PyObject *obj = PyObject_CallMethod(module, "ClassA", NULL);
  check(obj, "create object of ClassA");
  PyObject *func = PyObject_GetAttrString(obj, "foo");
  check(func, "obtain foo()");
  PyObject *args = PyTuple_New(0);
  check(args, "args for foo()");
  PyObject *rv = PyObject_CallObject(func, args);
  check(rv, "call foo()");

  if (test_threaded)
    PyGILState_Release(gil);

  if (success)
    fprintf(stderr, "=== Test %d: Success.\n", tn);
  else
    fprintf(stderr, "=== Test %d: FAILED (completed but mycall() not called).\n", tn);

  return 0;
}

= Makefile:

PYINCLUDE=/opt/python/include/python3.3m
PYLIBS=/opt/python/lib

CPPFLAGS=-g -Wall
CC=gcc
LINK=gcc

applepy: applepy.o
	$(LINK) -o applepy applepy.o $(CPPFLAGS) -L$(PYLIBS) -lpython3.3m -ldl -lm -lpthread -lutil

applepy.o: applepy.c
	$(CC) -c -o applepy.o applepy.c $(CPPFLAGS) -I$(PYINCLUDE)

clean:
	rm -f applepy applepy.o

= run:

#!/bin/sh
make clean
make applepy || exit 1
for t in 0 1 2 3 4 5 6 7; do
  ./applepy $t
done

= Sample output:

rm -f applepy applepy.o
gcc -c -o applepy.o applepy.c -g -Wall -I/opt/python/include/python3.3m
gcc -o applepy applepy.o -g -Wall -L/opt/python/lib -lpython3.3m -ldl -lm -lpthread -lutil

=== Test 0: Started.
Load into main:   n
Global is module: n
Threaded: n
FAILED: call foo().
=== Test 0: Failed.
=== Test 1: Started.
Load into main:   y
Global is module: n
Threaded: n
=== Test 1: Success.
=== Test 2: Started.
Load into main:   n
Global is module: y
Threaded: n
FAILED: load.
=== Test 2: Failed.
=== Test 3: Started.
Load into main:   y
Global is module: y
Threaded: n
=== Test 3: Success.
=== Test 4: Started.
Load into main:   n
Global is module: n
Threaded: y
FAILED: call foo().
=== Test 4: Failed.
=== Test 5: Started.
Load into main:   y
Global is module: n
Threaded: y
=== Test 5: Success.
=== Test 6: Started.
Load into main:   n
Global is module: y
Threaded: y
FAILED: load.
=== Test 6: Failed.
=== Test 7: Started.
Load into main:   y
Global is module: y
Threaded: y
=== Test 7: Success.
--
https://mail.python.org/mailman/listinfo/python-list


Re: Module missing when embedding?

2013-12-11 Thread Garthy


Hi all,

The output I collected skipped the error messages for some reason. 
Updated test output with full errors follows.


Cheers,
Garth

---

rm -f applepy applepy.o
gcc -c -o applepy.o applepy.c -g -Wall -I/opt/python/include/python3.3m
gcc -o applepy applepy.o -g -Wall -L/opt/python/lib -lpython3.3m -ldl -lm -lpthread -lutil

=== Test 0: Started.
Load into main:   n
Global is module: n
Threaded: n
FAILED: call foo().
Traceback (most recent call last):
  File "", line 5, in foo
NameError: global name 'mymodule' is not defined
=== Test 0: Failed.
=== Test 1: Started.
Load into main:   y
Global is module: n
Threaded: n
=== Test 1: Success.
=== Test 2: Started.
Load into main:   n
Global is module: y
Threaded: n
FAILED: load.
Traceback (most recent call last):
  File "", line 1, in 
ImportError: __import__ not found
=== Test 2: Failed.
=== Test 3: Started.
Load into main:   y
Global is module: y
Threaded: n
=== Test 3: Success.
=== Test 4: Started.
Load into main:   n
Global is module: n
Threaded: y
FAILED: call foo().
Traceback (most recent call last):
  File "", line 5, in foo
NameError: global name 'mymodule' is not defined
=== Test 4: Failed.
=== Test 5: Started.
Load into main:   y
Global is module: n
Threaded: y
=== Test 5: Success.
=== Test 6: Started.
Load into main:   n
Global is module: y
Threaded: y
FAILED: load.
Traceback (most recent call last):
  File "", line 1, in 
ImportError: __import__ not found
=== Test 6: Failed.
=== Test 7: Started.
Load into main:   y
Global is module: y
Threaded: y
=== Test 7: Success.
--
https://mail.python.org/mailman/listinfo/python-list


Re: Module missing when embedding?

2013-12-17 Thread Garthy


Hi all,

On 12/12/13 18:03, Garthy wrote:
> I am attempting to embed Python 3.3.3 into an application.

...

> Any ideas about what I might be doing wrong? Anything I can try on the
> Python side or the C API side? My Python knowledge is a bit rusty so I
> may have missed something obvious on the Python side. If there are any
> resources online that show something similar to what I am doing, please
> share, and I'll do the legwork. More info available if needed- just ask.

Thanks to anyone who may have looked at this or the subsequent test(s) I 
posted. The problem has since been solved by loading modules through 
another method.
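
For anyone finding this thread later: the exact method isn't spelled out 
above, but one way to load a script into its own named module is to compile 
the source and hand it to the import machinery. A sketch only- the module 
name and the source string "str" are placeholders, and this is not 
necessarily the method that was actually used:

PyObject *code = Py_CompileString(str, "plugin_hello.py", Py_file_input);
if (code)
{
  // Creates the module with proper __builtins__/__name__, executes the
  // code in it, and registers it as sys.modules["plugin_hello"].
  PyObject *module = PyImport_ExecCodeModule("plugin_hello", code);
  Py_DECREF(code);
  if (module)
  {
    // ... look up classes on this module and create objects as before ...
    Py_DECREF(module);
  }
  else
    PyErr_Print();
}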


Cheers,
Garth
--
https://mail.python.org/mailman/listinfo/python-list