Re: [pypy-dev] Compiling PyPy interpreter without GC

2015-03-18 Thread Maciej Fijalkowski
Hi John.

Can you describe the microVM and its capabilities? Chances are it
captures things at the wrong level (I have a longer response in mind,
but I'll wait for you to describe it, in case I'm plain wrong)

What do you mean by provides a GC? Does it mean you just call malloc
and you never have to call free?

Generally speaking we don't suggest you translate pypy as a first
step, but instead write tests (equivalent to what's in
translator/c/test) and check aspects of translation one bit at a time.
That said, a dependency on rweakref even when it is disabled is a bug. Can you
post a full traceback?

Cheers,
fijal





On Wed, Mar 18, 2015 at 2:01 AM, John Zhang u5157...@uds.anu.edu.au wrote:
 Hi all,
 I'm working on developing a MicroVM backend for PyPy. It's a virtual
 machine under active research and development by my colleagues in ANU. It
 aims to capture GC, threading and JIT in the virtual machine, and frees up
 the burden of the language implementers.

 Since MicroVM provides GC, I need to remove GC from the PyPy
 interpreter. As I was trying to compile it with the following command:
 pypy $PYPY/rpython/bin/rpython \
   -O0 \
   --gc=none \
   --no-translation-rweakref \
   --annotate \
   --rtype \
   --translation-backendopt-none \
   $PYPY/pypy/goal/targetpypystandalone.py
 It gives off an error during annotation stage, saying that it's not able
 to find a module called '_rweakref'.
 Does anyone know what the problem might be, and how one might go and
 solve it?

 Appreciate greatly,
 John Zhang
 ___
 pypy-dev mailing list
 pypy-dev@python.org
 https://mail.python.org/mailman/listinfo/pypy-dev


Re: [pypy-dev] GSoC 2015: Implement copy-on-write list slicing

2015-03-18 Thread Armin Rigo
Hi Tushar,

In the past few years, we have found GSoC to be tricky to handle.  As
a result of this, we're likely to have a high bar for student
acceptance this year.  The main criteria will be whether you have
already contributed to PyPy in a significant way.  If you only come up
with a GSoC proposal, no matter how cool, it will likely be rejected.

The first step is to get involved with the community, which means showing
up on irc (#pypy on irc.freenode.net).  This is what we say in general
to people that want to contribute to PyPy.  We can move on to discuss
GSoC only after we have seen concrete results from this first step.  I
fear that it might be late for getting involved in a significant way,
though...


A bientôt,

Armin.


[pypy-dev] Compiling PyPy interpreter without GC

2015-03-18 Thread John Zhang

Hi all,
I'm working on developing a MicroVM backend for PyPy. MicroVM is a
virtual machine under active research and development by my colleagues
at ANU. It aims to capture GC, threading and JIT in the virtual machine,
relieving language implementers of that burden.


Since MicroVM provides GC, I need to remove GC from the PyPy 
interpreter. As I was trying to compile it with the following command:

pypy $PYPY/rpython/bin/rpython \
  -O0 \
  --gc=none \
  --no-translation-rweakref \
  --annotate \
  --rtype \
  --translation-backendopt-none \
  $PYPY/pypy/goal/targetpypystandalone.py
It fails with an error during the annotation stage, saying that it cannot
find a module called '_rweakref'.
Does anyone know what the problem might be, and how I might go about
solving it?


Much appreciated,
John Zhang


Re: [pypy-dev] GSoC ideas -- AArch64 JIT backend?

2015-03-18 Thread Dima Tisnek
Please take with a grain of salt, as I'm not a pypy dev.

In general:
yes it's a great idea!
arm64 definitely fits the bill in terms of hardware and arm64 devices
could really use pypy.
I have some reservations in terms of how far this can be pushed in a
single summer, but hey, it's good to be ambitious!

In terms of development and testing:
Given enough host RAM, you could try to translate pypy under arm64
qemu directly.
It's going to be slow, by about a factor of 10, but setup is much easier.

Other thoughts:
The ARM memory [synchronisation] model is different from x86's; I bet the
difference is present in the 64-bit version too.
Given that most popular arm hardware is now multicore, the issue
cannot be avoided.
You'd need a core dev to provide guidance here.

d.

On 16 March 2015 at 18:42, Manuel Jacob m...@manueljacob.de wrote:
 Hi,

 I decided that this year is a good time to apply for GSoC.  Because I never
 worked on anything JIT-related, this could be a chance to get started with
 it.  There was some discussion about possible improvements for the ARM
 backend.  Also, some newer ARM processors support a 64-bit execution mode
 called AArch64.  I think it makes sense to implement a JIT backend for
 AArch64 sooner or later.  There are just a few affordable AArch64
 development boards available at the moment, but I was able to cross-compile
 a non-jitted version of PyPy and run it in QEMU.

 Do you think that implementing an AArch64 JIT backend is a good GSoC
 project?  If not, can you think of other JIT-related projects that fit
 better in the scope of GSoC?

 -Manuel


Re: [pypy-dev] UTF8 string passing in cffi and PyPy internal string optimizations

2015-03-18 Thread Amaury Forgeot d'Arc
Hi,

2015-03-17 18:27 GMT+01:00 Eleytherios Stamatogiannakis est...@gmail.com:

 Hello,

 I'm sending the following here as they involve both cffi and PyPy.

 For the last few years i have been trying to find the most efficient way
 to pass UTF8 strings between PyPy and C code using cffi.

 Right now when PyPy receives a utf8 string (from a C function) it has to
 do 2 copies:

 1. convert the cdata string to a pypy byte string via ffi.string
 2. convert ffi.string to a unicode string

 When pypy sends a utf8 string it also does 2 copies:

 1. convert pypy unicode string to utf8-encoded byte string
 2. copy the byte string into a cdata string.
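The two copies in each direction can be sketched as follows. This is a minimal illustration, not cffi itself: it uses the standard-library ctypes module as a stand-in so that it runs anywhere, with ctypes.string_at playing the role of ffi.string and ctypes.create_string_buffer playing the role of allocating a cdata char[]. The helper names receive_utf8 and send_utf8 are made up for this example.

```python
import ctypes


def receive_utf8(c_buf):
    """C -> Python: two copies, mirroring the receive steps above."""
    raw = ctypes.string_at(c_buf)   # copy 1: C buffer -> Python byte string
    return raw.decode("utf-8")      # copy 2: byte string -> unicode string


def send_utf8(text):
    """Python -> C: two copies, mirroring the send steps above."""
    encoded = text.encode("utf-8")               # copy 1: unicode -> utf8 bytes
    return ctypes.create_string_buffer(encoded)  # copy 2: bytes -> C buffer (NUL added)


# Round trip: a stand-in for a string handed to C and read back again.
c_buf = send_utf8("h\u00e9llo")
round_trip = receive_utf8(c_buf)
```

In both helpers the intermediate byte string exists only to be copied again, which is the overhead the proposal is trying to remove.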

 From what i understand, there is a cffi optimization dealing with windows
 unicode (via set_unicode) where on windows platforms and when using the
 native windows unicode strings, cffi avoids doing one of the copies in both
 of above cases.

 On linux where the default unicode format for C libraries nowadays is
 UTF8, there is no such optimization, so we have to do the two copies in all
 string passing.

 PyPy at some point was going towards using utf8 string internally, but i
 don't know if this is still the plan or not. Using utf8 strings would
 optimize away one of the two copies on the linux platform (utf8
 encoding/decoding would become a nop operator).

 All of the above is the current status of cffi and pypy string handling as
 i understand it. So my proposal to reduce the string copies to a minimum is
 this:

 1. If PyPy doesn't go towards using utf8 strings internally, maybe we need
 some special C type that denotes that the string is utf8 and pypy/cffi
 should do the conversion from-to it automatically. Something like wchar_t
 in windows but denoting a utf8 string. CFFI can define a special type
 (__utf8char_t?) for these strings.


This is a first step towards SWIG's typemaps:
http://www.swig.org/Doc3.0/Typemaps.html#Typemaps_nn4

That's also something I wanted to have in other projects: automatic
conversion to PYTHON_HANDLE, for example.

But typemaps are a tough thing, and they would likely differ between
CPython and PyPy.
Armin, what do you think?


 Alternatively, an encoding parameter could be added in ffi.string, so that
 it'll do both the cdata and encoding conversions in one step.

 2. If PyPy does go towards using utf8 string internally. Then it could
 call C functions that do not mutate the pypy strings and do not store
 pointers to them, by passing the strings directly. This could be
 accomplished by using a cffi annotation for these kind of
 non-string-mutating C functions.


Even if utf8 is used internally (that is already the case in the py3k branch,
as a cached attribute), I'm not sure I would like functions like strlen() to
silently accept unicode strings...



 Above ideas are based on my understanding of the current status and the
 future directions of PyPy. If i have misunderstood something i would be
 glad to be set right :).

 Kind regards,

 l.





-- 
Amaury Forgeot d'Arc


[pypy-dev] Some summary and questions about the 'small function' problem

2015-03-18 Thread 黄若尘
Hi Fijal, 

   This is Ruochen Huang. I would like to start writing my proposal, and I think
there is not much time left. I have tried to summarize what I have understood
so far, along with the questions I still have. Please feel free to point out
anything incorrect in my summary. As for the questions, if you think one is
meaningless you can just skip it, or point me to a document or source code
path if it would take too much time to explain.

   As far as I understand, the 'small function' problem occurs when one trace
tries to call another trace. At the source level, this is the situation where,
inside a loop, there is a call to a function that contains another loop.

Take the small example we discussed before: function g() calls function
f(a,b,c,d) in a big loop, and there is another loop inside f(a,b,c,d).
In the current version of PyPy, two traces are generated:

1. The trace for the loop in g(); let me call it T1. g() tries to inline
   f(a,b,c,d), but since there is a loop in f, T1 inlines only the first
   iteration of that loop. Say f is split into f1 (the first iteration) and
   f' (the remaining iterations). What T1 does is then: start the loop in g -
   do f1 - allocate a PyFrame (in order to call f') - call_assembler for f'.
2. The trace for the loop in f'; let me call it T2. T2 first unpacks the
   PyFrame prepared by T1, then does the preamble work, which means f' is once
   again split into f2 (the 1st iteration of f', which is the 2nd iteration of
   the original f) and f'' (the 3rd iteration through the last). f2 and f''
   each start with a label. So T2 consists of 3 parts: T2a (PyFrame
   unpacking), T2b (with label1, does f2) and T2c (with label2, does f'').

So we have T1 - T2a - T2b - T2c. From the viewpoint of the loop in f, f is
distributed as T1(f1) - T2a - T2b(f2) - T2c(f''), which means the loop in f is
peeled twice, so T2b might not be needed. Furthermore, the PyFrame work before
call_assembler in T1 and the unpacking work in T2a are wasted. I don't fully
understand why they are wasted, but I guess it is because T2c(f'') does much
the same thing as f1 in T1 (or because T2c is already *inside* the loop). In
any case, T2b is not needed either, so we want T1 - T2c, and once the PyFrame
work in T2a is eliminated, the PyFrame allocation in T1 can be eliminated too.
So ideally we want T1' (without PyFrame allocation) - T2c.
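For concreteness, the g()/f() situation described above has roughly the following shape. The loop bodies are invented for illustration (the example discussed earlier is not reproduced in this message); only the structure matters: a loop in g() that calls f(a, b, c, d), which itself contains a loop. The comments reuse the T1/T2 naming from the summary.

```python
def f(a, b, c, d):
    """Small function with its own loop; the JIT gives this loop a trace (T2)."""
    total = 0
    for i in range(a):          # inner loop: first iteration can be inlined into T1,
        total += b * i + c - d  # the remaining iterations run in T2
    return total


def g(n):
    """Big outer loop (T1) that calls f() on every iteration."""
    acc = 0
    for j in range(n):          # outer loop: traced as T1
        acc += f(4, j, 2, 1)    # call site where inlining stops at f's loop
    return acc


result = g(3)
```

Run under PyPy with a large n, a function pair of this shape is the kind of code where the T1 - T2a - T2b - T2c split would be expected to show up.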

Some questions so far:

1. What is the bridge you mentioned? To be honest, I have only a vague
   understanding of bridges. I know a bridge is executed when some guard
   fails, but as far as I know, in a normal tracing JIT compiler only one path
   of a loop is traced, and any guard failure makes execution leave the native
   code and return to the VM. I guess the bridge is a special kind of trace
   (native code); is that right?
2. Could you please explain more about why T2b is not needed? I guess the
   answer may be related to the "virtualizable" optimization for PyFrame, so
   what if PyFrame were not virtualizable? In that situation, does the problem
   disappear, or become easier to solve?
3. What are the difficulties in solving this problem? I'm sorry that I'm not
   very familiar with the details of the RPython JIT, but in my opinion we
   just need to make the JIT know that:
   a. when it tries to inline a function and encounters a loop, so that
      inlining has to stop, it is time to do optimization O;
   b. what O does is delete the PyFrame allocation instructions before
      call_assembler, and then tell call_assembler to jump to the 2nd label of
      the target trace (T2c in our example).
   So it may not seem so difficult to solve.

Best Regards,
Ruochen Huang




Re: [pypy-dev] Compiling PyPy interpreter without GC

2015-03-18 Thread John Zhang
Hi Carl,
Great! It worked!
So the option disables all modules, and IO as well?

Cheers,
John Zhang
 On 19 Mar 2015, at 4:18 am, Carl Friedrich Bolz cfb...@gmx.de wrote:
 
 On 18/03/15 01:01, John Zhang wrote:
 Hi all,
 I'm working on developing a MicroVM backend for PyPy. It's a
 virtual machine under active research and development by my colleagues
 in ANU. It aims to capture GC, threading and JIT in the virtual machine,
 and frees up the burden of the language implementers.
 
 Since MicroVM provides GC, I need to remove GC from the PyPy
 interpreter. As I was trying to compile it with the following command:
 pypy $PYPY/rpython/bin/rpython \
   -O0 \
   --gc=none \
   --no-translation-rweakref \
   --annotate \
   --rtype \
   --translation-backendopt-none \
   $PYPY/pypy/goal/targetpypystandalone.py
 
 Hey John,
 
 Try the following:
 rpython -O0 --gc=none --no-translation-rweakref --annotate --rtype 
 --translation-backendopt-none targetpypystandalone.py --no-allworkingmodules 
 --withoutmod-_io
 
 Cheers,
 
 Carl
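Note the placement in the command above: the translation options come before targetpypystandalone.py, while --no-allworkingmodules and --withoutmod-_io come after it, which, as far as I understand, is how rpython tells options for the translator apart from options for the target. A small Python sketch that assembles the same argument list makes the split explicit. The function name build_rpython_command and the /opt/pypy checkout path are made up for illustration; the flags and relative paths are the ones quoted in this thread, and nothing is executed.

```python
import os


def build_rpython_command(pypy_root, interpreter="pypy"):
    """Return the argv list for the translation run suggested in this thread."""
    # Options for the translator itself: they go before the target file.
    translation_opts = [
        "-O0",
        "--gc=none",
        "--no-translation-rweakref",
        "--annotate",
        "--rtype",
        "--translation-backendopt-none",
    ]
    # Options for the PyPy target: they go after the target file.
    target_opts = ["--no-allworkingmodules", "--withoutmod-_io"]

    rpython = os.path.join(pypy_root, "rpython", "bin", "rpython")
    target = os.path.join(pypy_root, "pypy", "goal", "targetpypystandalone.py")
    return [interpreter, rpython] + translation_opts + [target] + target_opts


# Hypothetical checkout location, used only to show the resulting list.
command = build_rpython_command("/opt/pypy")
```

The resulting list could be handed to subprocess.call() once the checkout path is adjusted.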
