On Wednesday, October 5, 2011, mark florisson wrote:

> On 5 October 2011 08:16, Stefan Behnel <[email protected]> wrote:
> > mark florisson, 04.10.2011 23:19:
> >>
> >> So I propose that after fused types gets merged we try to move as many
> >> utility codes as possible to their utility code files (unless they are
> >> used in pending pull requests or other branches). Preferably this will
> >> be done in one or a few commits. How should we split up the work
> >
> > I would propose that new utility code gets moved out into utility files
> > right away (if doable, given the current state of the infrastructure), and
> > that existing utility code gets moved when it gets modified or when someone
> > feels like it. Until we really get to the point of wanting to create a
> > separate shared library etc., there's no need to hurry with the move.
> >
> >
> >> We could actually move things before fused types get merged, as long
> >> as we don't touch binding_cfunc_utility_code.
> >
> > Another reason not to hurry, right?
> >
> >
> >> Before we go there, Stefan, do we still want to implement the header
> >> .ini style which can list dependencies and such?
> >
> > I think we'll eventually need that, but that also depends a bit on the
> > question whether we want to (or can) build a shared library or not. See
> > below.
> >
> >
> >> Another issue is that Cython compile time is increasing with the
> >> addition of control flow and cython utilities. If you use fused types
> >> you're also going to combinatorially add more compile time.
> >
> > I don't see that locally - a compiled Cython is hugely fast for me. In
> > comparison, the C compiler literally takes ages to compile the result. An
> > external shared library may or may not help with both - in particular, it is
> > not clear to me what makes the C compiler slow.
> > If the compile time is
> > dominated by the number of inlined functions (which is not unlikely), a
> > shared library + header file will not make a difference.
> >
>
> Have you tried with the memoryviews merged? e.g. if I have this code:
>
> from libc.stdlib cimport malloc
> cdef int[:] slice = <int[:10]> <int *> malloc(sizeof(int) * 10)
>
> [0] [14:45] ~ ➤ time cython test.pyx
> cython test.pyx 2.61s user 0.08s system 99% cpu 2.695 total
> [0] [14:45] ~ ➤ time zsh compile
> zsh compile 1.88s user 0.06s system 99% cpu 1.946 total
>
> where 'compile' is the script that invoked the same gcc command
> distutils uses. As you can see it took more than 2.5 seconds to
> compile this code (simply because the memoryview utilities get
> included). The C compiler does it quite a lot faster here. This
> obviously depends largely on your code, you can probably have it the
> other way around as well.
>
Anything we can do to cache/dedupe things here would be great.

> >> I'm sure
> >> this came up earlier, but I really think we should have a libcython
> >> and a cython.h. libcython (a shared library) should contain any common
> >> Cython-specific code not meant to be inlined, and cython.h any types,
> >> macros and inline functions etc.
> >
> > This has a couple of implications though. In order to support this on the
> > user side, we have to build one shared library per installed package in
> > order to avoid any Cython versioning issues. Just installing a versioned
> > "libcython_x.y.z.so" globally isn't enough, especially during development,
> > but also at deployment time. Different packages may use different CFLAGS or
> > Cython options, which may have an impact on the result. Encoding all
> > possible factors in the file name will be cumbersome and may mean that we
> > still end up with a number of installed Cython libraries that correlates
> > with the number of installed Cython based packages.
>
> Hm, I think the CFLAGS are important so long as they are compatible
> with Python. When the user compiles a Cython extension module with
> extra CFLAGS, this doesn't affect libpython. Similarly, the Cython
> utilities are really not the user's responsibility, so libcython
> doesn't need to be compiled with the same flags as the extension
> module. If still wanted, the user could either recompile python with
> different CFLAGS (which means libcython will get those as well), or
> not use libcython at all. CFLAGS should really only pertain to user
> code, not to the Cython library, which the user shouldn't be concerned
> about.
>
> > Next, we may not know at build time which set of Cython modules is in the
> > package.
> > This may be less of an issue if we rely on "cythonize()" in
> > setup.py to compile all modules beforehand (assuming that the user doesn't
> > call it twice, once for *.pyx, once for *.py, for example), but even if we
> > know all modules, we'd still have to figure out the complete set of utility
> > code used by all modules in order to build an adapted library with only the
> > necessary code used in the package. So we'd always end up with a complete
> > library with all utility code, which is only really interesting for larger
> > packages with several Cython modules.
> >
> > I agree with Robert that a CEP would be needed for this, both for clearing
> > up the implications and actual use cases (I know that Sage is a reasonable
> > use case, but it's also a rather special case).
> >
> >
> >> This will decrease Cython and C
> >> compile time, and will also make executables smaller.
> >
> > I don't see how this actually impacts executables. However, a self-contained
> > executable is a value in itself.
> >
> >
> >> This could be
> >> enabled using a command line option to Cython, as well as with
> >> distutils, eventually we may decide to make it the default (let's
> >> figure that out later). Preferably libcython.so would be installed
> >> alongside libpython.so and cython.h inside the Python include
> >> directory.
> >
> > I don't see this happening. It's easy for Python (there is only one Python
> > running at a time, with one libpython loaded), but it's a lot less safe for
> > different versions of a Cython library that are used by different modules
> > inside of the running Python. For example, we'd have to version all visible
> > symbols in operating systems with flat namespaces, in order to support
> > loading multiple versions of the library.
> >
> >
> >> Lastly, I think we also should figure out a way to serialize Entry
> >> objects from CythonUtilities, which could easily and swiftly be loaded
> >> when creating the cython scope.
> >> It's quite a pain to declare all
> >> entries for utilities you write manually
> >
> > Why would you declare them manually? I thought everything would be moved out
> > into the utility code files?
> >
>
> Right, the code is in the utility files. However, the cython scope
> needs to have the entries of the classes and functions of the
> utilities. e.g. the user may write
>
> cimport cython
>
> cdef cython.array myobject
>
> For this to work, we need an 'array' entry, which we don't have yet,
> as the utility code will be parsed at code generation time if an entry
> of that utility code (which doesn't exist yet!) is used.
>
> >> so what I mostly did was
> >> parse the utility up to and including AnalyseDeclarationsTransform,
> >> and then retrieve the entries from there.
> >
> > Sounds like a drawback regarding the processing time, but may still be a
> > reasonable way to do it. I would expect that it won't be hard to pickle the
> > resulting dict of entries into a cache file and rebuild it only when one of
> > the utility files changes.
>
> Exactly. I'm not sure about pickle though, but the details don't
> matter. Pickle is certainly easy as long as you don't change your
> interface (which we most certainly will, though).
>

We can version the cache to handle this.

- Robert
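To make the versioned-cache idea concrete, here is a minimal sketch in plain Python. It is not Cython's actual implementation: `CACHE_VERSION`, `fingerprint`, `load_entries` and the `build_entries` callable are all hypothetical names, and `build_entries` merely stands in for the expensive step of parsing the utilities and collecting their entries. The cache is rebuilt when a utility file changes or when the version number is bumped after an interface change.

```python
import hashlib
import os
import pickle

# Hypothetical constant: bump it whenever the pickled entry layout changes,
# so caches written by an older interface are ignored.
CACHE_VERSION = 1


def fingerprint(paths):
    """Hash the names and contents of all utility files in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(path.encode("utf-8"))
        with open(path, "rb") as f:
            digest.update(f.read())
    return digest.hexdigest()


def load_entries(utility_paths, cache_path, build_entries):
    """Return the entries dict, rebuilding only when the cache is stale.

    build_entries is a hypothetical callable standing in for the slow
    parse-and-analyse step; it receives the list of utility file paths.
    """
    key = (CACHE_VERSION, fingerprint(utility_paths))
    try:
        with open(cache_path, "rb") as f:
            cached_key, entries = pickle.load(f)
        if cached_key == key:
            return entries
    except Exception:
        # Missing, truncated or incompatible cache: fall through and rebuild.
        pass

    entries = build_entries(utility_paths)
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump((key, entries), f, protocol=pickle.HIGHEST_PROTOCOL)
    # Replace the old cache in one step so readers never see a partial file.
    os.replace(tmp_path, cache_path)
    return entries
```

Storing the version and the file fingerprint together as the cache key means a single comparison covers both concerns raised above: changed utility files and a changed pickle interface.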
_______________________________________________ cython-devel mailing list [email protected] http://mail.python.org/mailman/listinfo/cython-devel
