Re: [Cython] RFC: an inline_ function that dumps c/c++ code to the code emitter

Robert Bradshaw Mon, 22 Aug 2016 21:45:33 -0700

On Mon, Aug 22, 2016 at 4:48 PM, Jason Newton <nev...@gmail.com> wrote:
>
> Bear in mind that was then, and while I'm not Cython developer level I do
> have a better idea of how Cython works now, possibly better than your
> average user.  I've thought about this problem and the state of the overall
> C++ python binding capability a lot as I've had to reevaluate it possibly
> more than most, and I've used them more deeply than playcode.


Good to know.

>> > My point was that multifile multi-level wrapper that I mentioned earlier
>> > -
>> > if you're saying that those projects did Cython extensions wrong, then
>> > I'm
>> > incorrect at faulting Cython and should fault the libraries using it.  I
>> > didn't say staying close to python caused $blurb.
>> >
>> > I don't know in a situation as confusing as to all the binding projects
>> > if
>> > this should be taken as validation of philosophy either - I think it is
>> > reasonable to consider the attrition of these projects as a function of
>> > manpower, number of early on project supporters/authors, and if a
>> > project
>> > (like sage) indirectly, through dependency,  kept the project alive.
>> > And
>> > good old fashioned luck.  I noted most of them don't use distutils and
>> > something custom but less capable instead which maybe plays a role in
>> > how
>> > mature/usable/smalltime they were/are.
>>
>> Certainly the success of a project depends on many external factors,
>> and even raw luck plays a part. But the approach and philosophy taken
>> to attack a problem and guard its API (and the users/contributors that
>> such decisions attract or repel) can't be discounted either,
>> especially when taken over a long timeframe.
>>
>> But if anyone wants to believe that Cython's become popular because of
>> pure luck despite wrongheaded guiding principles or philosophies,
>> it'll be difficult to persuade them otherwise.
>
> You snidely play with extremes, I never said pure luck was the main factor -
> I listed several factors of very high significance.  Further, Cython
> branched off of pyrex - why do you think that project failed (you might
> actually know)?  Did your philosophy separate the two or was it
> because you needed things to work (for sage), had a larger set of smart
> developers and improve it faster than pyrex would allow?  I think it's the
> bustling of life that got Cython where it is, I don't believe it's
> philosophy in this case - I think it was lack of feature competitiveness and
> Cython probably worked better (more bugs/cases fixed).

I wouldn't say that Pyrex failed--it gave birth to Cython. At a high
level where I'd say they diverged is that Pyrex was viewed as a tool
to write C extensions, whereas Cython was viewed as a Python compiler.
The latter meant prioritizing things like being an optimizing
compiler, full coverage of the Python API (e.g. generators),
numpy/buffer integration, and Python 3 support, as opposed to these
being non-goals for Pyrex.

But the key insight of using Python, or at least Python-inspired,
syntax to declare and invoke C constructs lives on. Which is quite
relevant to this thread.

The fact that Greg didn't seem to have the time/interest in corralling
an open source project factored in here as well. If you have anything
to add, Greg, always happy to have you chime in.

>> >> I agree that any effort to parse C++ without building on
>> >> an actual compiler is fraught with danger. That's not the case with
>> >> generating C++ code, which is the direction we're going. In
>> >> particular, our goal is to understand C++ enough to invoke it, which
>> >> allows us to be much less pedantic.
>> >
>> > I understand and agree with the logic in stating it's a less complicated
>> > goal but what comparable success stories exist?  I strongly think
>> > "devils in
>> > the details" in correctly making that work and that they will be tough
>> > solvable problems.  And then you're going and promising on unfamiliar
>> > territory.  But what the ultimate takeaway for me is that you won't have
>> > it
>> > ready in any near term.  Do you have the skills and resources to
>> > implement
>> > this in under 2 years?  And then the other question is are you and the
>> > team
>> > reasonably confident you will have it working and usable by then.
>> > Otherwise
>> > you are not being pragmatic.
>> >
>> > On the other hand, if it was reasonably simple as many of your other
>> > points
>> > in future emails point out, I'd really like to know why you hadn't
>> > addressed
>> > them earlier.
>>
>> Other higher priority items for limited resources. And non-type
>> template args are not necessarily that simple given the way things are
>> structured now.
>
> There's always limited resources and priorities but then it sounds like it's
> going to be a very (ever)? long time until everything is working great
> unless the need is becoming a clear driver or this email thread has
> reassigned priorities.  Do you think you can sketch out a time table for the
> known issues?  Maybe not in this thread but I'd appreciate a back of the
> envelope calculation - realistic time of usefulness is part of being
> pragmatic.

For a small project like this run completely by volunteers, the only
way to set a timetable is to do the work yourself :). However, this
discussion has re-motivated me to iron some of these issues out, and I
might (surprisingly) have some time on my hands to do so.

>> You're missing the point of (3). The fact that we're generating C
>> code, and not fortran or directly assembly, should mostly be an
>> implementation detail. It's not realistic to embed snippets without
>> caring about the surrounding context in all but the simplest of cases.
>> And there'd be feature creep here--you're in the middle of a C snippet
>> and want to report an error, or access a Python object, or ...
>
> I know you took a different philosophy than pyrex, but I don't understand
> why.  In that sage days presentation, you mentioned pyrex chose to
> distinguish itself as a way to write c extensions, emphasizing it (I don't
> know to the degree if it offered what wish cython offered, doubted).  It's
> pretty clear to me you wanted mostly accelerated python with some
> c-trampolining capability through externally defined functions.  I *really*
> think you should remember your root's goals though.

Yep. "C extension" is an implementation detail. (One can't always get
away from implementation details leaking though, but I rather like to
avoid prominently baking them in.)

>> The issue here that you want to inline C code snippets into your
>> inlined Cython code snippet?
>
> Multiple files is more a PITA to making the mako-templated strategy work - I
> didn't say impossible, just more a PITA. I learned the JIT-module
> compilation technique from OpenCL/PyCUDA and found it makes a metric ton of
> sense to do, the performance boost can be massive from using -DDEFINES and
> embedding runtime variables/constants right into C (recall the point of
> Cython seems to be for speed :-). This is one of the reasons why I'd like to
> stick with a python driven way of doing C++ bindings (which hopefully can
> use C++ directly).   It's a lot easier for a single file to work and simply
> pass a string to something like inline which does all the file writing, and
> module placement.

I just had one idea to ease the pain--cython.inline[_module] could
take an "extra C source" parameter that would be #included in the file.
I'm much more comfortable with this over making changes to the
language itself.
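
To make that idea a bit more concrete, here is a rough sketch of how
such a parameter might be handled. Everything in it is an assumption
for illustration: build_inline_module_source, its arguments, and the
header name are hypothetical and are not part of Cython's API. The
only real Cython feature it relies on is that a `cdef extern from
"file.h"` block makes the generated C file #include that header.

```python
from pathlib import Path
import tempfile


def build_inline_module_source(cython_code, extra_c_source, workdir,
                               declarations=(), header_name="inline_extra.h"):
    """Hypothetical helper (not Cython API).

    Writes the extra C source to a header inside workdir and returns
    Cython source that includes it through a `cdef extern from` block.
    The caller must still put workdir on the C include path at build time.
    """
    header_path = Path(workdir) / header_name
    header_path.write_text(extra_c_source)

    # Cython needs declarations for any C names the snippet uses;
    # with none given, `pass` still triggers the #include.
    body = list(declarations) or ["pass"]
    extern_block = 'cdef extern from "{}":\n'.format(header_name)
    extern_block += "".join("    {}\n".format(line) for line in body)
    return extern_block + "\n" + cython_code


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = build_inline_module_source(
            "def twice(int x):\n    return twice_c(x)\n",
            "static int twice_c(int x) { return 2 * x; }\n",
            tmp,
            declarations=["int twice_c(int x)"],
        )
        print(source)
```

The point of the sketch is that the C code stays in its own unit and
Cython only sees ordinary extern declarations, so nothing in the
language itself has to change.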

>> I'm not saying that these projects died because of the string-based
>> approach they took, rather that if this "embed strings" approach were
>> so critical, so superior, it should at least have kept one of them alive.
>> Or a new project could have formed around this approach (e.g. letting
>> all the executable code be C++, with Python syntax for the structure,
>> could be an interesting point in the design space). It has its pros
>> and cons. I think for Cython it's the wrong direction. But I'm in
>> favor of letting many flowers bloom--and we're in luck that these are
>> all open source to boot.
>
>
> I think it's because of their as-is-limitations, not of the technique they
> implemented but as a whole system.  I looked at the source for pyinline the
> other day and it looked like it only supports primitive types and would
> choke on C++ for instance and doesn't really offer convenience to extract
> data from python objects like Cython can so easily (the whole system,
> including its custom C (no ++) lexer is ~700 lines).  I tend to think of a
> few/most of them are just past the working proof of concept stage and were
> significantly lacking in overall capability.  Most python projects (on PyPI)
> are single contributors with short lifespans too which seems to be the case
> for those projects.  I'm not sure when/how Pyrex started out (1 guy at Uni
> who set up a full website?) but how did it happen with Cython?
>
>>
>> > I watched one of your old talks for Sage Days 29 and at the bottom of a
>> > slide you have "Cython is a very pragmatic project, driven by user
>> > needs".
>> > I'm calling foul.  Go watch that video again and tell me what's changed
>> > since 2011.
>>
>> Embedding code is not a need, it's a means to an end. This is why I've
>> been asking for concrete features that are the most important to try
>> to support.
>
> I have another idea/iteration to run by you then.  One of your chief
> quibbles, although I don't think it's your underlying one, is Cython must
> understand what's going on.  So how about we support a block of C/C++ code
> as a proper construct.  Same name but now, I guess braces maybe get involved
> (since semicolons do too).  Then to transform (mangle) the nested
> identifiers in accordance with scoping/shadowing rules we pull in the python
> bindings to llvm/Clang to parse that complicated code, work with the AST (or
> further down the rabbit-hole) appropriately and spit it back out.  Obviously
> you could detect and analyze embedded statements to your heart's content to
> decide to do something smart about that (you mentioned exception handling
> and return statement as concerning items).
>
> The potential hazards here are that LLVM/Clang is brought in (LLVM is
> already on ~ everybody's linux boxes as a graphics shader dependency) and
> its [python] api changes somewhat often.  I've used it before with success
> for python driven C++ POD boiler plate (IO, size checks, visitors methods
> etc) autocoding.
>
> If you can understand the underlying C/C++ and get over that Cython will
> always run on a C/C++ compiler in our universe,  and make that a feature to
> embrace, not a detail to hide (who actually benefits in the latter case?) -
> are you still against the significant convenience - and potentially only way
> to make things work when Cython is not supporting a C++ way of doing things
> (say things aren't finished/working yet)?

That would help solve the analysis, though at enormous cost in
complexity: one now has two ASTs that must be understood and
reconciled, and a huge amount of C++ knowledge must be baked in to
understand the AST (or even lower levels). It'd be nigh impossible to
do "partial analysis" for features we don't yet understand, probably
harder than putting them into Cython itself, modulo parsing. Making
LLVM/Clang a dependency for Cython is probably a non-starter as well.

But you're right, it's not my primary quibble, which is that a tight
alternating of C++ and Python statements seems, well, not very
Pythonic and would IMHO lead to hard to read and debug code. That and
I think the interface (including expectations and dependencies,
possibly non-local) between the two would grow a lot messier than a
question of how to resolve name mangling. I don't think snippets would
resolve fundamental differences such as the scoping discrepancies
between Python and C++ (e.g. for RAII). And it would be a step back
for the nice auto-conversion facilities you care about, e.g.

    py_func(c_func(py_value))

would become

    c_value = py_value
    cdef some_type c_ret
    inline_cpp "c_ret = c_func(c_value)"
    py_func(c_ret)

It's just choosing a smaller unit (line vs. file) to swap between languages.

- Robert
_______________________________________________
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel
