Re: [Cython] Cython 0.15 release candidate
Arfrever Frehtes Taifersar Arahesis, 25.07.2011 08:08:
> There are 4 NumPy-related test errors with Python 3.1 and 3.2.

I assume that's NumPy 1.5? Could you provide the C compiler output from the test logs? This is on 64bit Linux, right? What gcc version?

Stefan
___
Cython-dev mailing list
Cython-dev@codespeak.net
http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Control flow graph
Vitja Makarov, 15.02.2011 10:59: 2011/2/15 Stefan Behnelstefan...@behnel.de: Robert Bradshaw, 15.02.2011 08:21: On Mon, Feb 14, 2011 at 9:49 PM, Vitja Makarov wrote: 2011/2/15 Robert Bradshaw: On Sun, Feb 13, 2011 at 11:40 PM, Vitja Makarov wrote: Hi! In order to implement reaching definitions algorithm. I'm now working on control-flow (or data-flow) graph. Here is funny picture made with graphviz ;) http://piccy.info/view3/1099337/ca29d7054d09bd0503cefa25f5f49420/1200/ Cool. Any plans on handling exceptions? Sure, but I don't have much time for this :( Linear block inside try...except body should be split by assignments and each subblock should point to exception handling entry point. Would every possible failing sub-expression have to point to the exception handling point(s)? Not sure here as now graph node is list of NameNode ref and AssignmentNode. I'm not sure but it seems that every sub-expression could raise exception now, is there any flag? Well, in most cases (especially the interesting ones), this will be the function exit point, so it'll be easy. And in some cases, we may be able to infer that a specific exception that an expression (e.g. arithmetics or a 'raise' statement) can raise will not get caught by a given except clause (although that's certainly a tricky optimisation). But in general, I think any subexpression that potentially raises an exception must point to the next exception handling point. Right, and each exception handling point should have reference to the next one or function exit point. I suppose it depends on whether you'll be handling more than assignment tracking. We *may* get away with a statement-level graph in that case, but I somehow doubt it already. For example, list comprehensions leak their variable in Py2 code, so it's important to know if they are executed or not, and they may appear in any kind of expression. This should be special case for scoped expression node. 
> Now I build the graph from scratch; it includes only name node references and assignments, plus position marks to draw it with graphviz. Maybe the node tree should be translated into a graph, and then the local variables graph could be created from it. On the other hand, a CreateAbstractGraph transformation could be used to create a specialized graph.

Note that building that graph is only a smaller part of the work. It needs to be efficiently queryable. These graphs can easily get huge, so if the graph needs to get traversed for each little piece of information, that'll seriously slow things down.

The current (limited) support for control flow analysis initially crashed for a beautiful code example that had lots of if-blocks in a row, because it was using recursive traversal. I refactored it back then, but we have to make sure the new implementation stays scalable.

It's worth reading a bit of literature here. AFAIR, I posted a couple of sources to the list a while back.

Stefan
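To make the scalability point concrete, here is a minimal sketch of a reaching-definitions pass that uses a worklist instead of recursion, so a long row of if-blocks cannot hit the interpreter's recursion limit. This is not Cython's implementation: the block names, the dict-based graph layout and the helper `chain_of_ifs()` are all made up for illustration.

```python
from collections import deque


def reaching_definitions(blocks, successors):
    """Compute the definitions that reach the entry of each basic block.

    blocks:     {block name: [names assigned in that block]}   (hypothetical layout)
    successors: {block name: [names of successor blocks]}
    Returns {block name: set of (variable, defining block)}.
    """
    predecessors = {name: [] for name in blocks}
    for name, succs in successors.items():
        for succ in succs:
            predecessors[succ].append(name)

    reach_in = {name: set() for name in blocks}
    reach_out = {name: set() for name in blocks}
    worklist = deque(blocks)      # explicit queue, no recursive traversal
    queued = set(blocks)
    while worklist:
        name = worklist.popleft()
        queued.discard(name)
        incoming = set()
        for pred in predecessors[name]:
            incoming |= reach_out[pred]
        reach_in[name] = incoming
        killed = set(blocks[name])
        outgoing = {d for d in incoming if d[0] not in killed}
        outgoing |= {(var, name) for var in blocks[name]}
        if outgoing != reach_out[name]:
            # Only revisit successors when something actually changed.
            reach_out[name] = outgoing
            for succ in successors.get(name, ()):
                if succ not in queued:
                    queued.add(succ)
                    worklist.append(succ)
    return reach_in


def chain_of_ifs(count):
    """Build 'x = ...' followed by `count` consecutive 'if cond: x = ...' blocks."""
    blocks = {"entry": ["x"]}
    successors = {"entry": ["cond0"]}
    for i in range(count):
        following = "cond%d" % (i + 1) if i + 1 < count else "exit"
        blocks["cond%d" % i] = []
        blocks["then%d" % i] = ["x"]
        successors["cond%d" % i] = ["then%d" % i, following]
        successors["then%d" % i] = [following]
    blocks["exit"] = []
    successors["exit"] = []
    return blocks, successors


blocks, successors = chain_of_ifs(2000)   # deeper than the default recursion limit
at_exit = reaching_definitions(blocks, successors)["exit"]
```

In this toy graph every `then` block may or may not run, so the definition from `entry` and each of the 2000 conditional assignments all reach `exit`. A real implementation would also need the exception edges discussed above.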
Re: [Cython] switching mailing lists
Robert Bradshaw, 10.02.2011 20:30: On Thu, Feb 10, 2011 at 1:39 AM, Stefan Behnel wrote: Dag Sverre Seljebotn, 10.02.2011 10:08: How about librelist.com? Here's a blog post with some more background: http://zedshaw.com/blog/2009-12-03.html I like their philosophy and I like their archives. I also like that you can actually *get* their archives. Yes, this is nice. They do seem to have a couple of lists there already, including some Python or Ruby related lists. That doesn't guarantee librelist.com is there to stay forever, but I wouldn't mind giving them a try. BTW, we should first set up the list, then re-subscribe gmane, then announce the new list to have people subscribe (or bulk subscribe them, in case that works). Google groups works well for my workflow, but I know you don't like it. I could go with librelist. The interface seems nice and clean and it's fast enough. Most of the lists it hosts are empty or dead... Do you know what the moderation options are like? Hopefully they'll be around for a while, as this is the 2nd (3rd?) time we've had to move lists... One caveat I found: you can't name a list cython-...@librelist.com, because they don't allow hyphens in list names (reserved as command separators). They only allow dots as special characters, which is somewhat unusual IMHO. Then the list would become cython@librelist.com, or cythondev@... or simply cython@ The latter would be fine with me, given that we have a separate cython-users mailing list anyway. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] switching mailing lists
Nathaniel Smith, 11.02.2011 19:25: On Fri, Feb 11, 2011 at 10:00 AM, Robert Bradshaw wrote: I think we should definitely have dev or devel in the name. Very strange that they don't allow a hyphen: not a blocker, but as we have not commitment yet it's worth keeping our eyes open for an equivalent service that does. Of course you could also just find mailman hosting somewhere -- I host some lists myself, and it works well and people are familiar with it. It's also easy to import archives[1] and the default archive pages have links to download the raw archive. Perhaps the scipy.org folks would be willing to help? They already host a number of lists, not just for scipy[2]. [1] http://wiki.list.org/pages/viewpage.action?pageId=4030624 [2] http://mail.scipy.org/mailman/listinfo That's a good idea. SciPy is related enough to ask them, and I doubt that scipy.org is going away any time soon. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] switching mailing lists
Robert Bradshaw, 11.02.2011 19:50: On Fri, Feb 11, 2011 at 10:25 AM, Nathaniel Smith wrote: On Fri, Feb 11, 2011 at 10:00 AM, Robert Bradshaw rober...@math.washington.edu wrote: I think we should definitely have dev or devel in the name. Very strange that they don't allow a hyphen: not a blocker, but as we have not commitment yet it's worth keeping our eyes open for an equivalent service that does. Of course you could also just find mailman hosting somewhere -- I host some lists myself, and it works well and people are familiar with it. It's also easy to import archives[1] and the default archive pages have links to download the raw archive. Perhaps the scipy.org folks would be willing to help? They already host a number of lists, not just for scipy[2]. [1] http://wiki.list.org/pages/viewpage.action?pageId=4030624 [2] http://mail.scipy.org/mailman/listinfo I like that idea a lot better than going with a random, un-related list hosting service. (I had thought about python.org too, would that be an appropriate fit?) SciPy is certainly very related, but I don't want to pigeon-hole ourselves as being only for scientific computation (even if this is our biggest userbase). :) You read my mind here. However, my first thought about python.org was that python-dev and cython-dev are pretty close, especially given that python is often referred to as cpython. This may trigger some confusion. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] switching mailing lists
Robert Bradshaw, 11.02.2011 20:02: On Fri, Feb 11, 2011 at 10:58 AM, Stefan Behnel wrote: Robert Bradshaw, 11.02.2011 19:50: On Fri, Feb 11, 2011 at 10:25 AM, Nathaniel Smith wrote: On Fri, Feb 11, 2011 at 10:00 AM, Robert Bradshaw rober...@math.washington.eduwrote: I think we should definitely have dev or devel in the name. Very strange that they don't allow a hyphen: not a blocker, but as we have not commitment yet it's worth keeping our eyes open for an equivalent service that does. Of course you could also just find mailman hosting somewhere -- I host some lists myself, and it works well and people are familiar with it. It's also easy to import archives[1] and the default archive pages have links to download the raw archive. Perhaps the scipy.org folks would be willing to help? They already host a number of lists, not just for scipy[2]. [1] http://wiki.list.org/pages/viewpage.action?pageId=4030624 [2] http://mail.scipy.org/mailman/listinfo I like that idea a lot better than going with a random, un-related list hosting service. (I had thought about python.org too, would that be an appropriate fit?) SciPy is certainly very related, but I don't want to pigeon-hole ourselves as being only for scientific computation (even if this is our biggest userbase). :) You read my mind here. However, my first thought about python.org was that python-dev and cython-dev are pretty close, especially given that python is often referred to as cpython. This may trigger some confusion. We could go with cython-devel. The fact that the first letter is different is more comforting to me than a different suffix in these days of address auto-completion. Should I go ahead and ask? +1 Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Fwd: [codespeak-ann] termination of codespeak.net services end FEB 2011
Dag Sverre Seljebotn, 10.02.2011 10:08: How about librelist.com? Here's a blog post with some more background: http://zedshaw.com/blog/2009-12-03.html Ok, here's a good reason *not* to use librelist.com: http://librelist.com/browser/meta/2011/2/11/hyphens-in-list-names/ (note that the archives are still catching up, I left the last words to Zed Shaw here) Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] switching mailing lists
Robert Bradshaw, 11.02.2011 22:42: We now have cython-de...@python.org I mass-subscribed the current regular subscribers to the new list, and invited the digest subscribers. Sorry for any inconvenience due to the mailing list switch. The migrated subscriptions also include the archives, however, I guess that at least mail-archive.com will consider it a new list as it receives all mail under a single address (and then likely splits it up by list header entries). Let's see what Gmane does. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
[Cython] Fwd: [codespeak-ann] termination of codespeak.net services end FEB 2011
Whoopsa ...

Original Message
Subject: [codespeak-ann] termination of codespeak.net services end FEB 2011
Date: Thu, 10 Feb 2011 09:45:09 +0100
From: holger krekel holger(at)merlinux.eu
To: codespeak-ann(at)codespeak.net

hi codespeak.net users,

(sorry if you get mail twice, i wanted to make sure ...)

after 8 years of operation codespeak.net services are bound to terminate, starting END OF FEBRUARY 2011

Background: one of the original codespeak purposes was to offer subversion (then in version 0.17) for the PyPy and other projects but today this is not too interesting given the plethora of VCS hosting solutions. Also, there aren't too many admins besides me, the hosting is costing money, PyPy's repository has moved to Bitbucket and i am re-shuffling my priorities preparing for my soon to emerge fatherhood. After February 2011 i probably won't be able to help much with any transition issues or questions. The host will keep on running for a while but i give no guarantees.

Some remarks regarding termination with respect to the FEB 2011 deadline:

* the subversion repo will turn read-only (and will eventually be switched off).

* Shell accounts will be restricted to those people who need it *and* mail me about it. Some time later they will be gone as well.

* Mailing lists will be terminated as well unless i get a mail asking me to postpone termination for a specific time. You can go to your respective mailman admin page and extract a list of members. If you mail me i can also provide a list of members.

* Any remaining web docs/pages will probably continue to exist for a while but i also prefer them to be moved away by end Feb 2011.

Note that the codespeak svn repository contains a lot of projects. For migration you have two options: do a flat import just of your project checkout directory into a new version system. This is super-simple, obviously. If you want to preserve history for your project please mail me and i either provide you a full dump or a filtered dump only containing your project.

So long and I hope you all had a good time and enjoyed the services and also have a good transition now.

see you in other places,
holger krekel
Re: [Cython] cython+pure python mode+mac+universal binary concept
Prashant Saxena, 10.02.2011 09:13: I have successfully built C-extensions using pure python mode on windows and linux platform. Now I have port same app on Mac. (Leopard 10.5.5 Intel, Xcode 3.1.4). Q. When creating c-extensions using cython on mac (details above) are there any specific instructions to follow to make sure that extensions work on both architecture (ppc and i686)? I would be using universal build of python as well as wxpython. Hi, note that this is a question about distutils, not so much about Cython. In any case, the Cython core developers list is the wrong place to discuss this. I think you should ask your question on the general Python list (i.e. the comp.lang.python newsgroup). Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
[Cython] switching mailing lists
Dag Sverre Seljebotn, 10.02.2011 10:08: (I think top-posting is OK in this one instance...) You're excused. ;) How about librelist.com? Here's a blog post with some more background: http://zedshaw.com/blog/2009-12-03.html I like their philosophy and I like their archives. I also like that you can actually *get* their archives. Maybe we can even manage to drop our existing archive there, but it seems like we'd have to ask if/how that would work. They do seem to have a couple of lists there already, including some Python or Ruby related lists. That doesn't guarantee librelist.com is there to stay forever, but I wouldn't mind giving them a try. BTW, we should first set up the list, then re-subscribe gmane, then announce the new list to have people subscribe (or bulk subscribe them, in case that works). Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] switching mailing lists
Stefan Behnel, 10.02.2011 10:39: I also like that you can actually *get* their archives. Maybe we can even manage to drop our existing archive there, but it seems like we'd have to ask if/how that would work. I asked: no import support. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Problem with generate_stararg_init_code()
Robert Bradshaw, 08.02.2011 22:01:
> On Tue, Feb 8, 2011 at 2:21 AM, Stefan Behnel wrote:
>> Vitja Makarov, 08.02.2011 10:16:
>>> Trying to merge latest changes in argument parsing code I found that it still uses direct returns
>>>
>>> https://github.com/cython/cython/blob/master/Cython/Compiler/Nodes.py#L2624
>>>
>>> Like this:
>>>
>>>     if self.starstar_arg:
>>>         self.starstar_arg.entry.xdecref_cleanup = 0
>>>         code.putln('%s = PyDict_New(); if (unlikely(!%s)) return %s;' % (
>>>             self.starstar_arg.entry.cname,
>>>             self.starstar_arg.entry.cname,
>>>             self.error_value()))
>>>         code.put_gotref(self.starstar_arg.entry.cname)
>>>
>>> Or this:
>>>
>>>     if self.starstar_arg:
>>>         code.putln()
>>>         code.putln("if (unlikely(!%s)) {" % self.star_arg.entry.cname)
>>>         code.put_decref_clear(self.starstar_arg.entry.cname, py_object_type)
>>>         code.putln('return %s;' % self.error_value())
>>>         code.putln('}')
>>>     else:
>>>         code.putln("if (unlikely(!%s)) return %s;" % (
>>>             self.star_arg.entry.cname, self.error_value()))
>>>
>>> That's not good because current scope and refnanny context is already created and should be freed.
>>
>> These aren't really critical bugs as they only deal with memory problems. Unless you want to rework them now, I think this is something that we should clean up as part of the DefNode/CFuncDefNode refactoring during the workshop.
>
> They are certainly bugs, so please file a ticket or add a note in the code at least.

I agree that they are bugs. The (safe) quick fix (that could go into 0.14.2) would be to insert code to clean up the refnanny and the closure context before the return statements. This can be cleaned up for 0.15.

Stefan
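As a rough illustration of the quick fix proposed above, the sketch below emits cleanup statements before the `return` instead of a bare direct return. It is not the actual patch: `CodeWriter` is a hypothetical stand-in for Cython's code writer, `put_error_return()` is an invented helper, and the cleanup statements passed in (including the scope variable name) are assumptions about what the generated C would need.

```python
class CodeWriter:
    """Hypothetical stand-in for Cython's code writer; it only collects lines."""

    def __init__(self):
        self.lines = []

    def putln(self, line=""):
        self.lines.append(line)


def put_error_return(code, failed_cname, error_value, cleanup_statements):
    """Emit an error branch that runs the cleanup code before returning."""
    code.putln("if (unlikely(!%s)) {" % failed_cname)
    for statement in cleanup_statements:
        code.putln("    %s" % statement)
    code.putln("    return %s;" % error_value)
    code.putln("}")


code = CodeWriter()
put_error_return(
    code,
    failed_cname="__pyx_v_kwargs",          # assumed variable name
    error_value="NULL",
    cleanup_statements=[
        "__Pyx_DECREF(((PyObject *)__pyx_cur_scope));",  # assumed closure cleanup
        "__Pyx_RefNannyFinishContext();",
    ],
)
generated = "\n".join(code.lines)
```

Collecting the cleanup in one helper keeps every early exit consistent, which is presumably also what a later refactoring would centralise.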
Re: [Cython] Problem with generate_stararg_init_code()
Vitja Makarov, 08.02.2011 10:16:
> Trying to merge latest changes in argument parsing code I found that it still uses direct returns
>
> https://github.com/cython/cython/blob/master/Cython/Compiler/Nodes.py#L2624
>
> Like this:
>
>     if self.starstar_arg:
>         self.starstar_arg.entry.xdecref_cleanup = 0
>         code.putln('%s = PyDict_New(); if (unlikely(!%s)) return %s;' % (
>             self.starstar_arg.entry.cname,
>             self.starstar_arg.entry.cname,
>             self.error_value()))
>         code.put_gotref(self.starstar_arg.entry.cname)
>
> Or this:
>
>     if self.starstar_arg:
>         code.putln()
>         code.putln("if (unlikely(!%s)) {" % self.star_arg.entry.cname)
>         code.put_decref_clear(self.starstar_arg.entry.cname, py_object_type)
>         code.putln('return %s;' % self.error_value())
>         code.putln('}')
>     else:
>         code.putln("if (unlikely(!%s)) return %s;" % (
>             self.star_arg.entry.cname, self.error_value()))
>
> That's not good because current scope and refnanny context is already created and should be freed.

These aren't really critical bugs as they only deal with memory problems. Unless you want to rework them now, I think this is something that we should clean up as part of the DefNode/CFuncDefNode refactoring during the workshop.

Stefan
Re: [Cython] Switching from Py_UNICODE to Py_UCS4
Robert Bradshaw, 04.02.2011 19:50: On Sat, Jan 29, 2011 at 2:35 AM, Stefan Behnel wrote: Robert Bradshaw, 29.01.2011 10:01: On Fri, Jan 28, 2011 at 11:37 PM, Stefan Behnel wrote: there is a recent discussion on python-dev about a new memory layout for the unicode type in CPython 3.3(?), proposed by Martin von Löwis (so it's serious ;) http://comments.gmane.org/gmane.comp.python.devel/120784 That's an interesting PEP, I like it. Yep, after some discussion, I started liking it too. Even if it means I'll have to touch a lot of code in Cython again. ;) If nothing else, it gave me a new view on Py_UCS4 (basically a 32bit unsigned int), which I had completely lost from sight. It's public and undocumented and has been there basically forever, but it's a much nicer type to support than Py_UNICODE, which changes size based on build time options. Py_UCS4 is capable of representing any Unicode code point on any platform. So, I'm proposing to switch from the current Py_UNICODE support to Py_UCS4 internally (without breaking user code which can continue to use either of the two explicitly). This means that loops over unicode objects will infer Py_UCS4 as loop variable, as would indexing. It would basically become the native C type that 1 character unicode strings would coerce to and from. Coercion from Py_UCS4 to Py_UNICODE would raise an exception if the value is too large in the given CPython runtime, as would write access to unicode objects (in case anyone really does that) outside of the platform specific Py_UNICODE value range. Writing to unicode buffers will be dangerous and tricky anyway if the above PEP gets accepted. I am a bit concerned about the performance overhead of the Py_UCS4 to Py_UNICODE coercion (e.g. if constructing a Py_UNICODE* by hand), but maybe that's both uncommon and negligible. I think so. If users deal with Py_UNICODE explicitly, they'll likely type their respective variables anyway, so that there won't be an intermediate step through Py_UCS4. 
And on 32bit Unicode builds this isn't an issue at all.

Coming back to this once more: if the PEP gets implemented, we will only know at C compile time (Py>=3.3 or not) if the result of indexing (including for-loop iteration) is Py_UCS4 or Py_UNICODE. For Cython's type inference, Py_UCS4 is therefore the more correct guess. So my proposal stands to always infer Py_UCS4 instead of Py_UNICODE for indexing, even if we ignore surrogate pairs in narrow Python builds. I will implement this for now, so that we can see what it gives.

One open question that I see is whether we should handle surrogate pairs automatically. They are basically a split of large Unicode code point values (>65535) into two code points in specific ranges that are safe to detect. So we could allow a 2 'character' surrogate pair in a unicode string to coerce to one Py_UCS4 character and coerce that back into a surrogate pair string if the runtime uses 16 bit for Py_UNICODE. Note that this would only work for single characters, not for looping or indexing (without the PEP, that is). So it's somewhat inconsistent. It would work well for literals, though. Also, we'd have to support it for 'in' tests, as a Py_UCS4 value may simply not be in a Py_UNICODE buffer, even though the character is in the string.

No, I don't think we should handle surrogate pairs automatically, at least without making it optional--this could be a significant performance impact with little benefit for most users. Using these higher characters is rare, but using them on a non-UCS4 build is probably even rarer.

Well, basically they are the only way to use 'wide' Unicode characters on 16bit Unicode builds. I think a unicode string of length 2 should be able to coerce into a Py_UCS4 value at runtime instead of raising the current exception because it's too long.

Sure, that's fine by me.

This is now implemented for narrow builds.
For the opposite direction, integer to unicode string, you already get a string of length 2 on narrow builds, that's how unichr()/chr() work in Python 2/3. So, in a way, it's actually more consistent with how narrow builds work today. OK. The only reason this isn't currently working in Cython is that Py_UNICODE is too small on narrow builds to represent the larger Unicode code points. If we switched to Py_UCS4, the problem would go away in narrow builds now and code could be written today that would easily continue to work efficiently in a post-PEP CPython as it wouldn't rely on the deprecated (and then inefficient) Py_UNICODE type anymore. What about supporting surrogate pairs in 'in' tests only on narrow platforms? I mean, we could simply duplicate the search code for that, depending on how large the code point value really is at runtime. That code will become a lot more involved anyway when the PEP gets implemented. Sure. This shouldn't have non-negligible performance overhead for the simple case
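The coercion between a length-2 string and a single Py_UCS4 value discussed in this message comes down to the standard UTF-16 surrogate arithmetic. The following is a minimal sketch of that arithmetic on plain integers; the function names are invented here and this is not the code Cython generates.

```python
def split_surrogates(code_point):
    """Return the 16-bit code units a narrow build would store for a code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (code_point,))
    if code_point <= 0xFFFF:
        return (code_point,)
    offset = code_point - 0x10000
    return (0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF))


def join_surrogates(units):
    """Coerce one code unit, or one high/low surrogate pair, to a single value."""
    if len(units) == 1:
        return units[0]
    if len(units) == 2:
        high, low = units
        if 0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF:
            return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    raise ValueError("only a single character or one surrogate pair can be coerced")


pair = split_surrogates(0x1F600)        # a code point above 65535
round_trip = join_surrogates(pair)
```

Any other string of length 2, such as two ordinary characters, still fails, which matches the idea that only a genuine surrogate pair should coerce to one character value.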
Re: [Cython] Switching from Py_UNICODE to Py_UCS4
Stefan Behnel, 06.02.2011 09:45:
> Robert Bradshaw, 04.02.2011 19:50:
>> On Sat, Jan 29, 2011 at 2:35 AM, Stefan Behnel wrote:
>>> Robert Bradshaw, 29.01.2011 10:01:
>>>> Also, this would be inconsistent with python-level slicing, indexing, and range, right?
>>>
>>> Yes, it does not match well with slicing and indexing. That's the problem with narrow builds in both CPython and Cython. Only the PEP can fix that by basically dropping the restrictions of a narrow build.
>>
>> Let's let indexing do what indexing does.
>
> Ok. So you'd continue to get whatever CPython returns for indexing, i.e. Py_UNICODE in Py<=3.2 and Py_UCS4 in Python versions that implement the PEP. That includes separate code points for surrogate pairs on narrow builds.

... on narrow builds of pre-PEP Python versions, that is.

Stefan
Re: [Cython] Switching from Py_UNICODE to Py_UCS4
Robert Bradshaw, 06.02.2011 10:14:
On Sun, Feb 6, 2011 at 12:45 AM, Stefan Behnel <stefan...@behnel.de> wrote:
Robert Bradshaw, 04.02.2011 19:50:
On Sat, Jan 29, 2011 at 2:35 AM, Stefan Behnel wrote:

I am a bit concerned about the performance overhead of the Py_UCS4 to Py_UNICODE coercion (e.g. if constructing a Py_UNICODE* by hand), but maybe that's both uncommon and negligible.

I think so. If users deal with Py_UNICODE explicitly, they'll likely type their respective variables anyway, so that there won't be an intermediate step through Py_UCS4. And on 32bit Unicode builds this isn't an issue at all.

Coming back to this once more: if the PEP gets implemented, we will only know at C compile time (Py>=3.3 or not) if the result of indexing (including for-loop iteration) is Py_UCS4 or Py_UNICODE. For Cython's type inference, Py_UCS4 is therefore the more correct guess. So my proposal stands to always infer Py_UCS4 instead of Py_UNICODE for indexing, even if we ignore surrogate pairs in narrow Python builds. I will implement this for now, so that we can see what it gives.

Yes, that makes sense.

Done.

Also, this would be inconsistent with python-level slicing, indexing, and range, right?

Yes, it does not match well with slicing and indexing. That's the problem with narrow builds in both CPython and Cython. Only the PEP can fix that by basically dropping the restrictions of a narrow build.

Let's let indexing do what indexing does.

Ok. So you'd continue to get whatever CPython returns for indexing, i.e. Py_UNICODE in Py<=3.2 and Py_UCS4 in Python versions that implement the PEP. That includes separate code points for surrogate pairs on narrow builds.

Yep, exactly. Note that indexing taking into account surrogate pairs can be O(n) rather than O(1) as well.

Sure, that was almost certainly the reason why the way indexing works wasn't changed when surrogate pair support was implemented in the codecs, in print etc.
Stefan
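The O(n) remark can be illustrated with a small sketch: once surrogate pairs count as one character, the position of the n-th character is no longer known without scanning from the start. The buffer of 16-bit code units is modelled as a plain list of integers and `code_point_at()` is a hypothetical helper, not an existing CPython or Cython function.

```python
def code_point_at(units, index):
    """Return the index-th code point of a UTF-16 code unit sequence.

    Has to walk the buffer from the start, because each earlier character
    may occupy either one or two code units.
    """
    position = 0
    seen = 0
    while position < len(units):
        unit = units[position]
        is_pair = (
            0xD800 <= unit <= 0xDBFF
            and position + 1 < len(units)
            and 0xDC00 <= units[position + 1] <= 0xDFFF
        )
        if seen == index:
            if is_pair:
                low = units[position + 1]
                return 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
            return unit
        position += 2 if is_pair else 1
        seen += 1
    raise IndexError("code point index out of range")


# 'a', U+1F600 stored as a surrogate pair, 'b'
narrow_buffer = [0x61, 0xD83D, 0xDE00, 0x62]
plain_index = narrow_buffer[2]                 # O(1), but yields half a pair
aware_index = code_point_at(narrow_buffer, 2)  # O(n), yields 'b'
```

Plain indexing keeps returning individual code units, which is the behaviour the thread settles on for narrow builds.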
[Cython] We missed a crash bug in 0.14.1
Hi, sadly, we missed the opportunity to fix a crash bug in 0.14.1. It's been there for ages, so it's not a regression, but the recent runs of the Py2.6 regression tests should have been red enough to tell us that it was there... http://trac.cython.org/cython_trac/ticket/658 Basically, when you put *args or **kwargs into a closure, and anything bad happens during argument unpacking (such as an illegal argument type), the closure fields will get DECREF-ed but not set back to 0, so that the deallocation code crashes afterwards. I don't think it's that uncommon to have args and kwargs in a closure, just think of a generic wrapper function. I opened a milestone 0.14.2 because I think this is worth fixing soon, maybe within a week or two, in case we want to wait for other problems to turn up. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
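For readers who want to picture the pattern, this is a minimal sketch of a generic wrapper whose `*args` and `**kwargs` end up in a closure, together with an argument unpacking failure of the kind mentioned above. It is plain Python with made-up names; the crash described in the ticket concerns the C code Cython generated for such functions, which this example does not reproduce.

```python
def bind_arguments(*args, **kwargs):
    """Capture arbitrary call arguments for later use."""
    def call(func):
        # 'args' and 'kwargs' are read by the inner function, so they are
        # stored in the closure of bind_arguments().
        return func(*args, **kwargs)
    return call


def join(left, right, sep):
    return "%s%s%s" % (left, sep, right)


call = bind_arguments(2, 3, sep="-")
joined = call(join)

# Something going wrong while the arguments are unpacked: keyword names
# must be strings, so this call fails before the function body runs.
try:
    bind_arguments(**{1: 2})
    unpacking_failed = False
except TypeError:
    unpacking_failed = True
```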
Re: [Cython] Fwd: sage.math sysadmin
Robert Bradshaw, 30.01.2011 11:45: This is relevant to us, as we use the sage.math infrastructure. Thanks Joanna, and William, for supporting and letting us use your equipment. Big +1 from me. It continues to be a great help. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Switching from Py_UNICODE to Py_UCS4
Robert Bradshaw, 29.01.2011 10:01: On Fri, Jan 28, 2011 at 11:37 PM, Stefan Behnel wrote: there is a recent discussion on python-dev about a new memory layout for the unicode type in CPython 3.3(?), proposed by Martin von Löwis (so it's serious ;) http://comments.gmane.org/gmane.comp.python.devel/120784 That's an interesting PEP, I like it. Yep, after some discussion, I started liking it too. Even if it means I'll have to touch a lot of code in Cython again. ;) If nothing else, it gave me a new view on Py_UCS4 (basically a 32bit unsigned int), which I had completely lost from sight. It's public and undocumented and has been there basically forever, but it's a much nicer type to support than Py_UNICODE, which changes size based on build time options. Py_UCS4 is capable of representing any Unicode code point on any platform. So, I'm proposing to switch from the current Py_UNICODE support to Py_UCS4 internally (without breaking user code which can continue to use either of the two explicitly). This means that loops over unicode objects will infer Py_UCS4 as loop variable, as would indexing. It would basically become the native C type that 1 character unicode strings would coerce to and from. Coercion from Py_UCS4 to Py_UNICODE would raise an exception if the value is too large in the given CPython runtime, as would write access to unicode objects (in case anyone really does that) outside of the platform specific Py_UNICODE value range. Writing to unicode buffers will be dangerous and tricky anyway if the above PEP gets accepted. I am a bit concerned about the performance overhead of the Py_UCS4 to Py_UNICODE coercion (e.g. if constructing a Py_UNICODE* by hand), but maybe that's both uncommon and negligible. I think so. If users deal with Py_UNICODE explicitly, they'll likely type their respective variables anyway, so that there won't be an intermediate step through Py_UCS4. And on 32bit Unicode builds this isn't an issue at all. 
One open question that I see is whether we should handle surrogate pairs automatically. They are basically a split of large Unicode code point values (>65535) into two code points in specific ranges that are safe to detect. So we could allow a 2 'character' surrogate pair in a unicode string to coerce to one Py_UCS4 character and coerce that back into a surrogate pair string if the runtime uses 16 bit for Py_UNICODE. Note that this would only work for single characters, not for looping or indexing (without the PEP, that is). So it's somewhat inconsistent. It would work well for literals, though. Also, we'd have to support it for 'in' tests, as a Py_UCS4 value may simply not be in a Py_UNICODE buffer, even though the character is in the string.

No, I don't think we should handle surrogate pairs automatically, at least without making it optional--this could be a significant performance impact with little benefit for most users. Using these higher characters is rare, but using them on a non-UCS4 build is probably even rarer.

Well, basically they are the only way to use 'wide' Unicode characters on 16bit Unicode builds. I think a unicode string of length 2 should be able to coerce into a Py_UCS4 value at runtime instead of raising the current exception because it's too long. For the opposite direction, integer to unicode string, you already get a string of length 2 on narrow builds, that's how unichr()/chr() work in Python 2/3. So, in a way, it's actually more consistent with how narrow builds work today.

The only reason this isn't currently working in Cython is that Py_UNICODE is too small on narrow builds to represent the larger Unicode code points. If we switched to Py_UCS4, the problem would go away in narrow builds now and code could be written today that would easily continue to work efficiently in a post-PEP CPython as it wouldn't rely on the deprecated (and then inefficient) Py_UNICODE type anymore.
What about supporting surrogate pairs in 'in' tests only on narrow platforms? I mean, we could simply duplicate the search code for that, depending on how large the code point value really is at runtime. That code will become a lot more involved anyway when the PEP gets implemented.

Also, this would be inconsistent with python-level slicing, indexing, and range, right?

Yes, it does not match well with slicing and indexing. That's the problem with narrow builds in both CPython and Cython. Only the PEP can fix that by basically dropping the restrictions of a narrow build.

Stefan
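The idea of duplicating the search code for 'in' tests, depending on how large the code point is at runtime, might look roughly like the sketch below. The narrow buffer is again modelled as a list of 16-bit integers and `contains_code_point()` is a hypothetical name; the C code Cython would emit is not shown in this thread.

```python
def contains_code_point(units, code_point):
    """'in' test for one code point against a buffer of 16-bit code units."""
    if code_point <= 0xFFFF:
        # Simple case: the value fits into one code unit, plain search.
        return code_point in units
    # Large value: look for the matching high/low surrogate pair instead.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return any(
        units[i] == high and units[i + 1] == low
        for i in range(len(units) - 1)
    )


# 'a', U+1F600 stored as a surrogate pair, 'b'
narrow_buffer = [0x61, 0xD83D, 0xDE00, 0x62]
found_wide = contains_code_point(narrow_buffer, 0x1F600)
found_other = contains_code_point(narrow_buffer, 0x1F601)
```

The branch is decided once per test, so the common case of a small code point pays only for the comparison against 0xFFFF.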
Re: [Cython] Sage problem after fixing #654
Stefan Behnel, 27.01.2011 09:40: Robert Bradshaw, 27.01.2011 07:39: On Wed, Jan 26, 2011 at 10:29 PM, Stefan Behnel wrote: there is this code in Sage (wrapper_rr.pyx/.pxd): cdef class Wrapper_rr(Wrapper): cdef int _n_args cdef mpfr_t* _args ... # offending line: mpfr_init2(self._args[i], self.domain.prec()) The signature of mpfr_init2() is ctypedef __mpfr_struct* mpfr_t void mpfr_init2 (mpfr_t x, mp_prec_t prec) [...] sage/ext/interpreters/wrapper_rr.c:2737: error: incompatible types in assignment Any idea what might be going wrong here? mpfr_t is actually an __mpfr_struct[1]. (It's a clever hack to have stack-allocated, pass-by-reference semantics.) We only need to pull non-simple arguments into temp variables. Ah, that would explain it. But this also means that there may be cases where we can't help breaking existing code with such a change. Ok, this particular code really can't be worked around by Cython. From the POV of the compiler, the second function argument (a Python function call result) may potentially have side effects on the first one, so both must be coerced to a temp in the right order to be correct. 
    /* sage/ext/interpreters/wrapper_rr.pyx:56
     * if self._args == NULL: raise MemoryError
     * for i in range(count):
     *     mpfr_init2(self._args[i], self.domain.prec()) # <<<<<<<<<<<<<<
     * val = args['constants']
     * self._n_constants = len(val)
     */
    __pyx_t_8 = (((struct __pyx_obj_4sage_3ext_12interpreters_10wrapper_rr_Wrapper_rr *)__pyx_v_self)->_args[__pyx_v_i]);
    __pyx_t_2 = PyObject_GetAttr(((PyObject *)((struct __pyx_obj_4sage_3ext_12interpreters_10wrapper_rr_Wrapper_rr *)__pyx_v_self)->domain), __pyx_n_s__prec); if /*...*/
    __Pyx_GOTREF(__pyx_t_2);
    __pyx_t_3 = PyObject_Call(__pyx_t_2, ((PyObject *)__pyx_empty_tuple), NULL); if /*...*/
    __Pyx_GOTREF(__pyx_t_3);
    __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
    __pyx_t_9 = __Pyx_PyInt_from_py_mp_prec_t(__pyx_t_3); if /*...*/
    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;
    mpfr_init2(__pyx_t_8, __pyx_t_9);

And given that Cython cannot know that the pointer is actually not a pointer, it generates the expected code here which only the C compiler can detect as invalid. This needs fixing in Sage in one way or another. Stefan
Re: [Cython] Cython 0.14.1 release candidate
Dominic Sacré, 29.01.2011 00:38: On Fri, Jan 28, 2011 at 11:40 AM, Robert Bradshaw wrote: New candidate up at http://cython.org/release/Cython-0.14.1rc3.tar.gz Is it just me, or do the release candidates break isinstance() when testing against a tuple of multiple types? When I do isinstance(x, (Foo, Bar)) a Foo object will be recognized, but a Bar object will not. The same code used to work fine in older versions, including 0.14. Yes, looks like a bug:

    cdef class A: pass
    cdef class B: pass
    cdef class C: pass

    def test_custom_tuple(obj):
        """
        >>> test_custom_tuple(A())
        True
        >>> test_custom_tuple(B())
        True
        >>> test_custom_tuple(C())
        False
        """
        return isinstance(obj, (A,B))

Fails for B(), the second type check is simply dropped. I'm looking into it. Stefan
Re: [Cython] Cython 0.14.1 release candidate
Robert Bradshaw, 29.01.2011 01:55: On Fri, Jan 28, 2011 at 3:38 PM, Dominic Sacré wrote: On Fri, Jan 28, 2011 at 11:40 AM, Robert Bradshaw wrote: New candidate up at http://cython.org/release/Cython-0.14.1rc3.tar.gz Is it just me, or do the release candidates break isinstance() when testing against a tuple of multiple types? When I do isinstance(x, (Foo, Bar)) a Foo object will be recognized, but a Bar object will not. The same code used to work fine in older versions, including 0.14. Thanks for the report. I did fix one isinstance bug, perhaps I introduced another one (though I thought I tested this...). I'll make sure this works before release. My bad. There was code in Optimize.py, line 2003:

    if type_check_function not in tests:
        tests.append(type_check_function)
        test_nodes.append(...)

Basically, it executes each type check function only once, but regardless of the type it is testing against. Works well for builtin types, doesn't work for extension types. https://github.com/cython/cython/commit/c14533e4a00e789df8d800fa9f4cc099faabb67e Hmm, I'm not sure how to merge commits over git branches with hg-git... Stefan
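The effect of that deduplication can be modelled in a few lines of plain Python. This is only a toy sketch, not the code from Optimize.py: the helper names and the mapping are invented, and it assumes (my reading of the description above) that each builtin type has its own check function while all extension types share one generic check that takes the type as an argument.

```python
# Toy model: builtin types map to dedicated check functions, everything
# else falls back to one shared, generic check (an assumption for this sketch).
BUILTIN_CHECKS = {list: "PyList_Check", dict: "PyDict_Check"}


def check_function_for(tp):
    return BUILTIN_CHECKS.get(tp, "PyObject_TypeCheck")


def collect_buggy(types):
    """Deduplicate on the check function alone, as in the snippet above."""
    tests, test_nodes = [], []
    for tp in types:
        check = check_function_for(tp)
        if check not in tests:
            tests.append(check)
            test_nodes.append((check, tp))
    return test_nodes


def collect_fixed(types):
    """Deduplicate on the (check function, type) pair instead."""
    tests, test_nodes = [], []
    for tp in types:
        key = (check_function_for(tp), tp)
        if key not in tests:
            tests.append(key)
            test_nodes.append(key)
    return test_nodes


class Foo: pass
class Bar: pass

print(len(collect_buggy((Foo, Bar))))  # the check against Bar is dropped
print(len(collect_fixed((Foo, Bar))))
```

For builtin types the two variants agree, because a repeated check function there always means a repeated type.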
[Cython] Switching from Py_UNICODE to Py_UCS4
Hi, there is a recent discussion on python-dev about a new memory layout for the unicode type in CPython 3.3(?), proposed by Martin von Löwis (so it's serious ;) http://comments.gmane.org/gmane.comp.python.devel/120784 If nothing else, it gave me a new view on Py_UCS4 (basically a 32bit unsigned int), which I had completely lost from sight. It's public and undocumented and has been there basically forever, but it's a much nicer type to support than Py_UNICODE, which changes size based on build time options. Py_UCS4 is capable of representing any Unicode code point on any platform. So, I'm proposing to switch from the current Py_UNICODE support to Py_UCS4 internally (without breaking user code which can continue to use either of the two explicitly). This means that loops over unicode objects will infer Py_UCS4 as loop variable, as would indexing. It would basically become the native C type that 1 character unicode strings would coerce to and from. Coercion from Py_UCS4 to Py_UNICODE would raise an exception if the value is too large in the given CPython runtime, as would write access to unicode objects (in case anyone really does that) outside of the platform specific Py_UNICODE value range. Writing to unicode buffers will be dangerous and tricky anyway if the above PEP gets accepted. One open question that I see is whether we should handle surrogate pairs automatically. They are basically a split of large Unicode code point values (> 65535) into two code points in specific ranges that are safe to detect. So we could allow a 2 'character' surrogate pair in a unicode string to coerce to one Py_UCS4 character and coerce that back into a surrogate pair string if the runtime uses 16 bit for Py_UNICODE. Note that this would only work for single characters, not for looping or indexing (without the PEP, that is). So it's somewhat inconsistent. It would work well for literals, though.
Also, we'd have to support it for 'in' tests, as a Py_UCS4 value may simply not be in a Py_UNICODE buffer, even though the character is in the string. Comments? Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
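To make the surrogate pair arithmetic concrete, here is a minimal Python sketch of the split and the reverse join. The function names split_surrogates() and join_surrogates() are made up for illustration and are not part of Cython or CPython; the constants are the standard UTF-16 surrogate ranges (0xD800-0xDBFF for the high half, 0xDC00-0xDFFF for the low half).

```python
def split_surrogates(cp):
    """Split a code point above 0xFFFF into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = cp - 0x10000              # 20 significant bits remain
    high = 0xD800 + (offset >> 10)     # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # lower 10 bits
    return high, low


def join_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = split_surrogates(0x1F600)
print(hex(high), hex(low))
print(hex(join_surrogates(high, low)))
```

Because the two halves live in disjoint ranges, a pair can be detected without context, which is what makes the coercion in both directions safe to do at runtime.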
Re: [Cython] Sage problem after fixing #654
Robert Bradshaw, 27.01.2011 07:39: On Wed, Jan 26, 2011 at 10:29 PM, Stefan Behnel wrote: ticket #654 describes a problem with the order in which function call arguments are evaluated. I fixed it and it broke Sage. Thanks for fixing this, I never got around to it last night. I'm wondering how many other similar situations may be lurking around. I could imagine that there are cases where comparisons of coerced types run into the same issue. And other kinds of expressions may also suffer from this. Of course we could store every sub-expression in temporary variables, but that would be less than ideal. We will be getting close to that, though, look at this:

    cdef object X = 1

    cdef redefine_global():
        global X
        x,X = X,2
        return x

    cdef call3(object x1, int o, object x2):
        return (x1, o, x2)

    def test_global_redefine():
        """
        >>> test_global_redefine()
        (1, 1, 2)
        """
        return call3(X, redefine_global(), X)

The same applies to any non-local or closure variable. NameNode currently just says this:

    def is_simple(self):
        # If it's not a C variable, it'll be in a temp.
        return 1

True when you interpret "simple" as "side-effect free", but false if you take it to mean "has no external dependencies". Maybe we need something like coerce_to_local() or coerce_to_owned()? The problem was that in cases where some arguments are simple and others are not, the non-simple arguments are stuffed into a temp beforehand, thus being evaluated before the other arguments. This can break side-effects of simple arguments, which include C function calls. My fix was to put all arguments into temps if any of them are in temps anyway. However, there is this code in Sage (wrapper_rr.pyx/.pxd):

    cdef class Wrapper_rr(Wrapper):
        cdef int _n_args
        cdef mpfr_t* _args
        ...
        # offending line:
        mpfr_init2(self._args[i], self.domain.prec())

The signature of mpfr_init2() is

    ctypedef __mpfr_struct* mpfr_t
    void mpfr_init2 (mpfr_t x, mp_prec_t prec)

Cython generates this code:

    mpfr_t __pyx_t_7;
    ...
    __pyx_t_7 = (((struct ...)__pyx_v_self)->_args[__pyx_v_i]);
    ...
    mpfr_init2(__pyx_t_7, __pyx_t_8);

Looks reasonable at first sight. However, gcc complains about the temp assignment: sage/ext/interpreters/wrapper_rr.c:2737: error: incompatible types in assignment Any idea what might be going wrong here? mpfr_t is actually an __mpfr_struct[1]. (It's a clever hack to have stack-allocated, pass-by-reference semantics.) We only need to pull non-simple arguments into temp variables. Ah, that would explain it. But this also means that there may be cases where we can't help breaking existing code with such a change. Stefan
Re: [Cython] unbound variables
Vitja Makarov, 27.01.2011 08:25: 2011/1/25 Stefan Behnel: Vitja Makarov, 25.01.2011 10:01: 2011/1/25 Stefan Behnel: def x(): do_some_stuff() return # disable rest of code, e.g. during debugging unreachable_code() Cython shouldn't bother generating dead code here (even if the C compiler would drop it anyway). That should be rather easy to do: remove all the nodes in StatList after: break, continue, return, raise, something else? Careful, this may involve recursive propagation through helper nodes. The tree isn't always as simple as that. I think an attribute is_terminator on Nodes might do the job. It's set to False by default and to True on all nodes you mentioned above, and is inherited by StatListNode if its last node is a terminator (while dropping its remaining child nodes at the same time) and by all helper nodes that contain StatListNodes. This could be done in analyse_types() (or maybe earlier?). Ok. I've moved it into ParseTreeTransforms and created branch: https://github.com/vitek/cython/commit/a8e957ec29f0448ee7c43bd3969012772d09b236 Interesting:

    +is_terminator = 0
    +is_terminator = 1
    +is_terminator = True

Also, as I said, this is just the very first step, the StatListNode (and other nodes) should inherit the flag from their last child. Some error tests do fail because nodes are removed and the code generation time error is omitted. That should be fixable in most cases. We could also use a compiler option that disables dead code removal, and then use it in the error tests. Stefan
Re: [Cython] unbound variables
Vitja Makarov, 27.01.2011 10:02: 2011/1/27 Stefan Behnel: Vitja Makarov, 27.01.2011 08:25: 2011/1/25 Stefan Behnel: Vitja Makarov, 25.01.2011 10:01: 2011/1/25 Stefan Behnel: def x(): do_some_stuff() return # disable rest of code, e.g. during debugging unreachable_code() Cython shouldn't bother generating dead code here (even if the C compiler would drop it anyway). That should be rather easy to do: remove all the nodes in StatList after: break, continue, return, raise, something else? Careful, this may involve recursive propagation through helper nodes. The tree isn't always as simple as that. I think an attribute is_terminator on Nodes might do the job. It's set to False by default and to True on all nodes you mentioned above, and is inherited by StatListNode if its last node is a terminator (while dropping its remaining child nodes at the same time) and by all helper nodes that contain StatListNodes. This could be done in analyse_types() (or maybe earlier?). Ok. I've moved it into ParseTreeTransforms and created branch: https://github.com/vitek/cython/commit/a8e957ec29f0448ee7c43bd3969012772d09b236 the StatListNode (and other nodes) should inherit the flag from their last child. This could be done simply in RemoveUnreachableCode node.is_terminator = True Sure. I don't actually understand where that could be used later? Well, if you recursively propagate the flag through nodes that support it, you may end up in another StatListNode that strips its trailing dead code. Look through UtilNodes.py, there are a couple of nodes that can wrap Stat(List)Nodes. Maybe check Nodes.py also, not sure if there aren't any other nodes that can safely assume that a terminator at the end of their last child makes them a terminator as well. Some error tests do fail because nodes are removed and the code generation time error is omitted. That should be fixable in most cases. We could also use a compiler option that disables dead code removal, and then use it in the error tests.
Hmm, not sure here. I think it should be better to move checks outside code generation. Ah, ok, I misunderstood you then. Yes, errors should no longer occur at code generation time. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] unbound variables
Vitja Makarov, 27.01.2011 08:25: Ok. I've moved it into ParseTreeTransforms and created branch: https://github.com/vitek/cython/commit/a8e957ec29f0448ee7c43bd3969012772d09b236 Hmm, currently, it runs after type analysis: _check_c_declarations, AnalyseExpressionsTransform(self), RemoveUnreachableCode(self), This /may/ make a difference, considering that there may be declarations in the dead code that may have an impact on the remaining code. However, I would prefer running the dead code removal as early as possible, even before analysing the declarations. If users rely on dead code making live code work, I'd consider that a bug. That would also simplify the removal operation, as the tree tends to be a lot simpler in earlier stages (e.g. before optimisations, WithTransform, etc.). Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] unbound variables
Vitja Makarov, 27.01.2011 10:49: def x(): do_some_stuff() return # disable rest of code, e.g. during debugging unreachable_code() Cython shouldn't bother generating dead code here (even if the C compiler would drop it anyway). That should be rather easy to do: remove all the nodes in StatList after: break, continue, return, raise, something else? Careful, this may involve recursive propagation through helper nodes. The tree isn't always as simple as that. I think an attribute is_terminator on Nodes might do the job. the StatListNode (and other nodes) should inherit the flag from their last child. I don't actually understand where that could be used later? Well, if you recursively propagate the flag through nodes that support it, you may end up in another StatListNode that strips its trailing dead code. Look through UtilNodes.py, there are a couple of nodes that can wrap Stat(List)Nodes. Maybe check Nodes.py also, not sure if there aren't any other nodes that can safely assume that a terminator at the end of their last child makes them a terminator as well. return and loop-controls are very different here

    if cond:
        return 1
    else:
        return 0
    # dead code follows

We can set is_terminator here if all the clauses are terminators

    for i in a:
        if i > 0:
            continue
        else:
            break
        # dead code here

    for i in a:
        return
    # not dead

    for i in a:
        return
    else:
        return
    # dead

It seems to me that each case should be handled its own way. We can't simply say: this node has a child with is_terminator set, so it's a terminator too. That's what I meant, yes. You may also consider making is_terminator a method instead of a simple attribute. That would allow nodes to reimplement it (even recursively) and make decisions based on their context. Is there a way to write tests for warnings? If not, I think we should create one. Not yet, it wasn't really a high priority back when I wrote the error test support. Some error tests do fail because nodes are removed and the code generation time error is omitted.
That should be fixable in most cases. We could also use a compiler option that disables dead code removal, and then use it in the error tests. Hmm, not sure here. I think it should be better to move checks outside code generation. Ah, ok, I misunderstood you then. Yes, errors should no longer occur at code generation time. Think that's whole lot of work. Steps in that direction have been taken several times already. Hopefully, they will eventually converge to the expected state. When implementing generators I've moved error checking into ParseTreeTransforms and left 'yield not supported' in YieldExprNode. That could be later replaced with: raise InternalError Hmm? YieldExprNode is used and generates code, doesn't it? Or did you mean to raise InternalError on errors that should have been caught before? I don't think there's a need to retest inside of the node. If someone disables that transform, I'm fine with seeing Cython crash. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
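The "is_terminator as a method" idea can be sketched on a toy tree. The classes below are invented for this example and only loosely mirror the names in Nodes.py; the rules encode the cases listed above (if/else with all clauses terminating, loop controls inside a loop body, for with and without an else clause) and are a sketch under those assumptions, not Cython's actual implementation.

```python
class Node:
    def is_terminator(self):
        """True if control never reaches the statement after this node."""
        return False

    def has_break(self):
        """True if a reachable 'break' for the enclosing loop is inside."""
        return False


class Simple(Node):
    """Any ordinary, non-terminating statement."""


class Return(Node):
    def is_terminator(self):
        return True


class Continue(Node):
    def is_terminator(self):
        return True


class Break(Node):
    def is_terminator(self):
        return True

    def has_break(self):
        return True


class StatList(Node):
    def __init__(self, *stats):
        self.stats = list(stats)

    def live_stats(self):
        # Everything after the first terminator is dead code.
        for index, stat in enumerate(self.stats):
            if stat.is_terminator():
                return self.stats[:index + 1]
        return self.stats

    def is_terminator(self):
        live = self.live_stats()
        return bool(live) and live[-1].is_terminator()

    def has_break(self):
        return any(stat.has_break() for stat in self.live_stats())


class If(Node):
    def __init__(self, body, orelse=None):
        self.body, self.orelse = body, orelse

    def is_terminator(self):
        # Only a terminator if every clause is one, so an 'else' is required.
        return (self.orelse is not None and self.body.is_terminator()
                and self.orelse.is_terminator())

    def has_break(self):
        return self.body.has_break() or (
            self.orelse is not None and self.orelse.has_break())


class For(Node):
    def __init__(self, body, orelse=None):
        self.body, self.orelse = body, orelse

    def is_terminator(self):
        # The body may run zero times, so only a terminating 'else' clause
        # that cannot be skipped by 'break' makes the whole loop a terminator.
        return (self.orelse is not None and self.orelse.is_terminator()
                and not self.body.has_break())

    def has_break(self):
        # A 'break' in the body belongs to this loop, not to an outer one.
        return self.orelse is not None and self.orelse.has_break()
```

With this, If(StatList(Return()), StatList(Return())) and For(StatList(Return()), StatList(Return())) report themselves as terminators, while For(StatList(Return())) does not, matching the "dead" and "not dead" comments in the examples above.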
Re: [Cython] unbound variables
Vitja Makarov, 27.01.2011 11:19: https://github.com/vitek/cython/blob/master/Cython/Compiler/ExprNodes.py#L4995

    def analyse_types(self, env):
        if not self.label_num:
            error(self.pos, "'yield' not supported here")

This error message should be replaced with an assertion on self.label_num or an internal error. Yes, if handled by the transform already. Stefan
Re: [Cython] Sage problem after fixing #654
Stefan Behnel, 27.01.2011 09:40: Robert Bradshaw, 27.01.2011 07:39: On Wed, Jan 26, 2011 at 10:29 PM, Stefan Behnel wrote: ticket #654 describes a problem with the order in which function call arguments are evaluated. I fixed it and it broke Sage. Thanks for fixing this, I never got around to it last night. I'm wondering how many other similar situations may be lurking around. [...]this also means that there may be cases where we can't help breaking existing code with such a change. My attempts to fix this properly seem to uncover brittle code in several other places. Given that the particular problem that #654 describes is less likely to break code than my current attempts to fix it, I would suggest releasing 0.14.1 without my latest changes, i.e. from Mark's last commit fe17af96655e7ab0310a. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] unbound variables
Vitja Makarov, 28.01.2011 07:19: 2011/1/27 Stefan Behnel: Vitja Makarov, 27.01.2011 11:19: https://github.com/vitek/cython/blob/master/Cython/Compiler/ExprNodes.py#L4995

    def analyse_types(self, env):
        if not self.label_num:
            error(self.pos, "'yield' not supported here")

This error message should be replaced with an assertion on self.label_num or an internal error. Yes, if handled by the transform already. I tried to handle IfStatNode terminator here: https://github.com/vitek/cython/commits/uninitialized About tests, I think the easiest way is to add a compiler directive -Werror We already have an "errors are fatal" option, but I like this one. And add it in the cython header comment # cython: werror=True Or rather warnings_are_errors=True. Stefan
Re: [Cython] unbound variables
Vitja Makarov, 28.01.2011 07:19: 2011/1/27 Stefan Behnel: Vitja Makarov, 27.01.2011 11:19: https://github.com/vitek/cython/blob/master/Cython/Compiler/ExprNodes.py#L4995

    def analyse_types(self, env):
        if not self.label_num:
            error(self.pos, "'yield' not supported here")

This error message should be replaced with an assertion on self.label_num or an internal error. Yes, if handled by the transform already. I tried to handle IfStatNode terminator here: https://github.com/vitek/cython/commits/uninitialized Yes, that looks good. Stefan
Re: [Cython] Bug report: includ_dirs not handled correctly
Almar Klein, 26.01.2011 16:52: Clearly, there should be a space between the -I and the C:\Pro Mmm, I jumped to conclusions here, because the style seemed weird to me. While trying older versions, including the version I had before this (0.12.1), I realized that this is how it always happened. I will dig into this and see if I can find the source of my problem. There should be some more interesting compiler error output somewhere above the failure line that you posted. Stefan
[Cython] Sage problem after fixing #654
Hi, ticket #654 describes a problem with the order in which function call arguments are evaluated. I fixed it and it broke Sage. The problem was that in cases where some arguments are simple and others are not, the non-simple arguments are stuffed into a temp beforehand, thus being evaluated before the other arguments. This can break side-effects of simple arguments, which include C function calls. My fix was to put all arguments into temps if any of them are in temps anyway. However, there is this code in Sage (wrapper_rr.pyx/.pxd):

    cdef class Wrapper_rr(Wrapper):
        cdef int _n_args
        cdef mpfr_t* _args
        ...
        # offending line:
        mpfr_init2(self._args[i], self.domain.prec())

The signature of mpfr_init2() is

    ctypedef __mpfr_struct* mpfr_t
    void mpfr_init2 (mpfr_t x, mp_prec_t prec)

Cython generates this code:

    mpfr_t __pyx_t_7;
    ...
    __pyx_t_7 = (((struct ...)__pyx_v_self)->_args[__pyx_v_i]);
    ...
    mpfr_init2(__pyx_t_7, __pyx_t_8);

Looks reasonable at first sight. However, gcc complains about the temp assignment: sage/ext/interpreters/wrapper_rr.c:2737: error: incompatible types in assignment Any idea what might be going wrong here? Stefan
Re: [Cython] Sage problem after fixing #654
Stefan Behnel, 27.01.2011 07:29: ticket #654 describes a problem with the order in which function call arguments are evaluated. I fixed it and it broke Sage. The problem was that in cases where some arguments are simple and others are not, the non-simple arguments are stuffed into a temp before hand, thus being evaluated before the other arguments. This can break side-effects of simple arguments, which include C function calls. My fix was to put all arguments into temps if any of them are in temps anyway. BTW, I wonder if a better fix would be to declare C function calls non-simple in general. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] unbound variables
Robert Bradshaw, 25.01.2011 10:00: On Mon, Jan 24, 2011 at 11:33 PM, Stefan Behnel wrote: Vitja Makarov, 25.01.2011 08:15: I want to write simple code to find unbound variables. I'm assuming you mean unassigned (as opposed to unbound in, e.g., a closure)? [...] It probably wouldn't be too hard to walk the tree to discover this kind of information, recording on each NameNode as one goes along what its possible states are. A general NameNode dependency walk of the tree could give us *loads* of information, also for better type inference. Knowing that a Python variable is definitely not None from a given point on, or that it has a specific type at one point and a different type at another would be really cool. Basically, we could build a dependency tree for each NameNode (at its specific point in the tree, not just through its symtab Entry) that references its assignment RHSs, but including a representation of relevant branches in the code. But I also agree that loops are evil. :-/ Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] unbound variables
Vitja Makarov, 25.01.2011 10:01: 2011/1/25 Stefan Behnel: def x(): do_some_stuff() return # disable rest of code, e.g. during debugging unreachable_code() Cython shouldn't bother generating dead code here (even if the C compiler would drop it anyway). That should be rather easy to do: remove all the nodes in StatList after: break, continue, return, raise, something else? Careful, this may involve recursive propagation through helper nodes. The tree isn't always as simple as that. I think an attribute is_terminator on Nodes might do the job. It's set to False by default and to True on all nodes you mentioned above, and is inherited by StatListNode if its last node is a terminator (while dropping its remaining child nodes at the same time) and by all helper nodes that contain StatListNodes. This could be done in analyse_types() (or maybe earlier?). Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
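The simplest form of the idea, cutting a statement list after its first terminator, might look like the following toy sketch. Statements are modelled as (kind, text) tuples and the names TERMINATORS and strip_dead_code() are made up here; the real tree would of course need the recursive propagation through helper nodes mentioned above.

```python
# Statement kinds after which nothing in the same list can execute.
TERMINATORS = ("return", "break", "continue", "raise")


def strip_dead_code(statements):
    """Return (live_statements, is_terminator) for a flat statement list.

    Each statement is a (kind, text) tuple; everything after the first
    terminator is dropped, and the flag tells the parent node whether
    the list as a whole ends in a terminator.
    """
    for index, (kind, _text) in enumerate(statements):
        if kind in TERMINATORS:
            return statements[:index + 1], True
    return list(statements), False


body = [
    ("expr", "do_some_stuff()"),
    ("return", "return"),
    ("expr", "unreachable_code()"),
]
live, terminates = strip_dead_code(body)
print(live, terminates)
```

Returning the flag alongside the trimmed list is what lets a parent node inherit it, as described for StatListNode.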
Re: [Cython] Cython workshop
Stefan Behnel, 23.01.2011 11:50: Dag Sverre Seljebotn, 23.01.2011 10:12: On 01/23/2011 09:53 AM, Robert Bradshaw wrote: On Sat, Jan 22, 2011 at 12:06 AM, Stefan Behnel wrote: Dag Sverre Seljebotn, 20.01.2011 12:26: Starting new thread in case somebody did not see the hijacking. I've created this wiki page: http://wiki.cython.org/workshop1 If you're interested in coming to a Cython workshop, please fill in your details in the table to help decide when and where a workshop should be held. Current status so far: 4 core developers have signed in, 5 people in total, all from Europe. Robert said he'd like to join in, too. Have there been any further off-list replies so far? I've spoken to two people, both of whom won't be able to do much the first half of this year. Still, I think close to now is a very good time to do it, because 0.14.x is nearing stability and thus leaving the focus but needs a serious docsprint, 0.15 is in the pipeline and worth putting some work into and 1.0 is getting in sight and worth discussing. Certainly enough to fill four days. I added a couple of development topics that could benefit from a hacking workshop. Depending on how nontrivial the NSF funding is - maybe we could split it into two events? One more planning/core developer oriented workshop now and one geared towards code sprinting later during the year? Stefan
Re: [Cython] unbound variables
Vitja Makarov, 25.01.2011 08:15: I want to write simple code to find unbound variables. There is some code that does (part of) that already, in a transform IIRC. It's used to detect if we need to initialise a local variable to None or can safely set it to NULL. As it's hard to detect them in the common case it will mark them as:

- bound (definitely bound)

    def foo():
        a = 1
        print a

  Bound variables should be initialized to NULL

- unbound (definitely unbound)

    def foo():
        print a
        a = 1

  Unbound variables should be initialized to Py_None. And the user may be warned (should depend on command line flags)

- don't know (not sure here)

    def foo(x):
        if x:
            a = 1
        [else: # optional
            a = 2]
        print a

  The algorithm is too stupid, it doesn't know much about branches, so it's not sure here.

Well, it *can't* know what will happen at runtime. That's not being stupid at all, it's just the correct way to do it. These ones will be initialized to Py_None. To be correct, they'd have to get initialised to NULL and a check needs to get generated on read access as long as we don't know for sure if it has been initialised or not. CPython raises an exception on unbound locals, so Cython should do the same and should do it efficiently. When we get to the point that we safely know which variables are being initialised and for which only the runtime behaviour can tell us if they are or not, I think it's fine to add the little performance penalty of checking for NULL in exactly the unsafe cases. Also I would like to check for unused variables and unreachable code (this should be removed). Some unreachable code is getting dropped during constant folding, usually stuff like if False: ..., but I agree that there's always more that can be done. Think of this: def x(): do_some_stuff() return # disable rest of code, e.g. during debugging unreachable_code() Cython shouldn't bother generating dead code here (even if the C compiler would drop it anyway).
Knowing where a function returns and if it has trailing code with a default None return would also be good to know. It would be a step towards inferring the return type of functions automatically. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
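The CPython behaviour that the "don't know" case has to reproduce is easy to observe from plain Python. The sketch below uses the foo(x) example from this thread (with a return instead of print so that it runs on any Python version); the helper is_unbound() is just an illustrative wrapper made up for this example.

```python
def foo(x):
    if x:
        a = 1
    return a   # 'a' is only bound if the branch above was taken


def is_unbound(func, *args):
    """Report whether calling func(*args) hits an unbound local variable."""
    try:
        func(*args)
    except UnboundLocalError:
        return True
    return False


print(foo(True))
print(is_unbound(foo, False))
```

Only the runtime value of x decides whether the read succeeds, which is why the NULL check on read access is needed in exactly these cases and can be skipped where the variable is known to be bound.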
Re: [Cython] Cython workshop
Dag Sverre Seljebotn, 23.01.2011 10:12: On 01/23/2011 09:53 AM, Robert Bradshaw wrote: On Sat, Jan 22, 2011 at 12:06 AM, Stefan Behnel wrote: Dag Sverre Seljebotn, 20.01.2011 12:26: Starting new thread in case somebody did not see the hijacking. I've created this wiki page: http://wiki.cython.org/workshop1 If you're interested in coming to a Cython workshop, please fill in your details in the table to help decide when and where a workshop should be held. Current status so far: 4 core developers have signed in, 5 people in total, all from Europe. Robert said he'd like to join in, too. Have there been any further off-list replies so far? I've spoken to two people, both of whom won't be able to do much the first half of this year. Still, I think close to now is a very good time to do it, because 0.14.x is nearing stability and thus leaving the focus but needs a serious docsprint, 0.15 is in the pipeline and worth putting some work into and 1.0 is getting in sight and worth discussing. Certainly enough to fill four days. In case we actually want to advertise and run this before the end of April (or even March, and Vitja's visa will likely take a while to get), it would be good if the remaining core developers and other interested parties could speak up soon, say, within a week's time, so that we can decide on date and location. The first half of March doesn't work for me, and April would be tricky, but I might be able to do March 30-April 3. More ideal would be May-June-July, but that conflicts directly with Stefan's availability--are there any weekends in there that would work or are those months right out? Tricky for me. The first weekend in June might work, but I'd have to check that with my employer. End of June may work better. I think 4 days is about the right amount of time. Within the months I listed there's a good chance of things working for me; although April-June is better than July. March 30 - April 3 should work.
Assuming there's never a date that won't be tricky for any of us, it seems that the first weekend in April could make it. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython workshop
Dag Sverre Seljebotn, 20.01.2011 12:26: Starting new thread in case somebody did not see the hijacking. I've created this wiki page: http://wiki.cython.org/workshop1 If you're interested in coming to a Cython workshop, please fill in your details in the table to help decide when and where a workshop should be held. Current status so far: 4 core developers have signed in, 5 people in total, all from Europe. Robert said he'd like to join in, too. Have there been any further off-list replies so far? In case we actually want to advertise and run this before the end of April (or even March, and Vitja's visa will likely take a while to get), it would be good if the remaining core developers and other interested parties could speak up soon, say, within a week's time, so that we can decide on date and location. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython workshop
Vitja Makarov, 22.01.2011 10:47: 2011/1/22 Stefan Behnel: In case we actually want to advertise and run this before the end of April (or even March, and Vitja's visa will likely take a while to get), it would be good if the remaining core developers and other interested parties could speak up soon, say, within a week's time, so that we can decide on date and location. My visa is valid until 5 apr, new one could take 2-3 weeks. Ok, just to throw in a couple of dates here that work well for me: March 2-6 or March 30-April 3rd. Is four days ok (assuming that people leave during the Sunday) or is it too short? Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython workshop
Stefan Behnel, 18.01.2011 11:31: William Stein, 18.01.2011 10:46: I recently got some new nontrivial NSF funding for a Cython workshop. A docathon would be a great activity at such a workshop. When some Cython developers are ready to all get flown somewhere cool to work together on Cython for a few days, all expenses paid, let me know. I'm serious. Hmm, in case the funding allows non-US territory - anyone interested in coming to Munich? Maybe not the coolest place in the world but not too bad either. And the (tiny) local PUG is based in the local University, so I should be able to get some infrastructure up for a weekend and likely a couple of days around that. We could advertise a couple of public talks to increase their interest. Quick follow-up on this: I asked hand-waving and it seems like we could get a place to gather at the University for basically as many days as we like if we can manage to organise it before the end of April (lectures restart on May 2nd). Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython workshop
Dag Sverre Seljebotn, 20.01.2011 12:26: I've created this wiki page: http://wiki.cython.org/workshop1 I restricted access to that page to authenticated users as we are collecting personal information there. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython workshop
Dag Sverre Seljebotn, 20.01.2011 15:36: On 01/20/2011 03:34 PM, Dag Sverre Seljebotn wrote: On 01/20/2011 03:26 PM, Stefan Behnel wrote: Dag Sverre Seljebotn, 20.01.2011 12:26: I've created this wiki page: http://wiki.cython.org/workshop1 I restricted access to that page to authenticated users as we are collecting personal information there. I think that is counterproductive, because we want to develop that page into the poster for the workshop that anyone can visit to see if they're interested in coming. Sounds like a different page to me. How about we drop listing people's location and instead say where they prefer the conference to be held? And people don't have to list when they're available if they don't want to. And I guess we'd need to recreate the page to delete editing history Not as long as it's access restricted. :) If it's the URL you want to keep, then yes. if it's your location that's at stake. We've started putting lots of information on that page already. I think it's better to have a planning page and a separate advertising page. Stefan
Re: [Cython] Cython workshop
Dag Sverre Seljebotn, 20.01.2011 16:09: On 01/20/2011 03:48 PM, Stefan Behnel wrote: Dag Sverre Seljebotn, 20.01.2011 15:36: On 01/20/2011 03:34 PM, Dag Sverre Seljebotn wrote: On 01/20/2011 03:26 PM, Stefan Behnel wrote: Dag Sverre Seljebotn, 20.01.2011 12:26: I've created this wiki page: http://wiki.cython.org/workshop1 I restricted access to that page to authenticated users as we are collecting personal information there. I think that is counterproductive, because we want to develop that page into the poster for the workshop that anyone can visit to see if they're interested in coming. Sounds like a different page to me. For this workshop there's a very gray area between announcing and planning. E.g., here's an email I just sent to numpy-discussion. Anyone who's interested enough to read it can always register and it is just a minor annoyance, so I guess I'm fine with it. As we have funding for it, we're talking about organizing a Cython workshop sometime this year (possibly in Munich, Germany, though it's not decided yet). It's still not clear how user-centric vs. developer-centric the workshop will be, or how strong a role numerical computation will have vs. more general language features. We're just getting in touch with people potentially interested in joining the workshop, and then we'll take it from there. Respond on the wiki page or on cython-dev. http://wiki.cython.org/workshop1 http://wiki.cython.org/workshop1#preview Not as long as it's access restricted. :) Anyone can register, can't they? No, they can't. You still have to send an htpasswd entry to one of us. So this is just for robots/search engines? Most of all, yes. We've started putting lots of information on that page already. I think it's better to have a planning page and a separate advertising page. Like Mark I don't see what the lots of information is supposed to be, and the thought didn't strike me at all until you mentioned it.
Then again, academics want their name and location spread around as much as possible anyway :-) I just followed the pattern Sage set up which is reasonably efficient, e.g. http://wiki.sagemath.org/days10 BUT, I can agree that the less privacy-conscious should yield to the ones who are more so, so I'm fine with following your lead here. I've replaced the page with a new page and moved the complete (restricted) page to http://wiki.cython.org/workshop2011 Stefan
Re: [Cython] Cython workshop
Stefan Behnel, 20.01.2011 16:21: Dag Sverre Seljebotn, 20.01.2011 16:09: Anyone can register, can't they? No, they can't. You still have to send a htpasswd entry to one of us. Hmm, I guess I was wrong here. So you actually just need to create an account then. I think that's enough of a privacy 'protection'. So this is just for robots/search engines? Most of all, yes. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython 0.14.1 release candidate
Arfrever Frehtes Taifersar Arahesis, 18.01.2011 18:50: IOError: [Errno 2] No such file or directory: '/var/tmp/portage/dev-python/cython-0.14.1_rc1/work/Cython-0.14.1rc1/BUILD/Cy3/Cython/Debugger/Tests/codefile' I had added the file to setup.py's package_data when I noticed that it was missing, but forgot to add it to MANIFEST.in as well. Mark has fixed that now. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] [OT] Python like lanugages [was Re: After C++, what with Python?]
Terry Reedy, 17.01.2011 22:38: A few other details confuse me, but enough for now. It's helpful to be reminded from time to time that the documentation shows the usual features of an underfinanced OpenSource project. Part of it was inherited from the original Pyrex documentation and suffers from bit rot, and some other parts were copied over from the Wiki or from talk notes and continue to be badly integrated with the rest. Maybe we should call out a docathon on the cython-users list to see if we can't get some of our happy users to give something back by fixing up the obvious problems in the documentation. Some have done so in the past already, even without being asked. My main interest at the moment is whether Cython is a viable third method (versus swig and ctypes) for wrapping C library code for access from Python. It seems to sit in between somewhat. Regarding a comparison with SWIG, you might find this interesting: http://www.mail-archive.com/cython-dev@codespeak.net/msg01354.html I do not consider SWIG a real alternative, except when it can play its single joker, i.e. you want to generate and maintain a substantial set of identical wrappers for different languages. You might want to do that if you are the author of the library you wrap, but it's a lot less likely that you want to do it if you are a user of the library. I consider ctypes a viable alternative with two main advantages: a) it's part of CPython, which b) makes it plain Python to work with. Even PyPy supports it (IIRC), and Jython has at least been thinking about it for a while. Cython will always depend on CPython (*), on a C compiler, and on the ability to install and use binary extension modules. However, once you cross that barrier, the main advantages of Cython strike immediately: it has a much more natural way to deal with external C code than ctypes (assuming the same level of C knowledge that you also need for ctypes), and it's substantially faster. 
Stefan (*) or at least on its C-API - there's also work being done on a port to IronPython ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
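To make the ctypes side of this comparison concrete, here is a minimal sketch of calling C's sin() through ctypes from plain Python. It assumes a POSIX-like system where the C math library can be located (or is already loaded into the process); the helper names load_libm and c_sin are made up for this illustration.

```python
import ctypes
import ctypes.util
import math


def load_libm():
    """Load the C math library (assumes a POSIX-like system)."""
    # find_library() may return None; CDLL(None) then exposes the symbols
    # already loaded into the running process on POSIX systems.
    return ctypes.CDLL(ctypes.util.find_library("m"))


libm = load_libm()

# ctypes assumes int arguments and int results unless told otherwise,
# so the C signature `double sin(double)` has to be declared by hand.
libm.sin.argtypes = [ctypes.c_double]
libm.sin.restype = ctypes.c_double


def c_sin(x):
    """Call C's sin() through ctypes."""
    return libm.sin(x)
```

No compiler is needed here, which is the advantage described above. Every call still goes through run-time argument conversion in Python, though, which is one plausible reason a compiled Cython call can end up faster.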
Re: [Cython] Cython workshop
William Stein, 18.01.2011 10:46: I recently got some new nontrivial NSF funding for a Cython workshop. A docathon would be a great activity at such a workshop. When some Cython developers are ready to all get flown somewhere cool to work together on Cython for a few days, all expenses paid, let me know. I'm serious. Hmm, in case the funding allows non-US territory - anyone interested in coming to Munich? Maybe not the coolest place in the world but not too bad either. And the (tiny) local PUG is based in the local University, so I should be able to get some infrastructure up for a weekend and likely a couple of days around that. We could advertise a couple of public talks to increase their interest. Bonus: last I heard, it's a lot easier to get into Germany than into the US part of North America. :) Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython workshop
Stefan Behnel, 18.01.2011 11:31: William Stein, 18.01.2011 10:46: I recently got some new nontrivial NSF funding for a Cython workshop. A docathon would be a great activity at such a workshop. When some Cython developers are ready to all get flown somewhere cool to work together on Cython for a few days, all expenses paid, let me know. I'm serious. Hmm, in case the funding allows non-US territory - anyone interested in coming to Munich? Maybe not the coolest place in the world but not too bad either. Just in case: the Wikipedia page has a couple of pretty representative pictures (at least of what München wants to show you). http://en.wikipedia.org/wiki/Munich (but please don't ask me to organise the trip to Neuschwanstein for you ;) Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython workshop
mark florisson, 18.01.2011 16:07: On 18 January 2011 15:44, Stefan Behnel wrote: Stefan Behnel, 18.01.2011 11:31: William Stein, 18.01.2011 10:46: I recently got some new nontrivial NSF funding for a Cython workshop. A docathon would be a great activity at such a workshop. When some Cython developers are ready to all get flown somewhere cool to work together on Cython for a few days, all expenses paid, let me know. I'm serious. Hmm, in case the funding allows non-US territory - anyone interested in coming to Munich? Maybe not the coolest place in the world but not too bad either. Just in case: the Wikipedia page has a couple of pretty representative pictures (at least of what München wants to show you). http://en.wikipedia.org/wiki/Munich Looks nice :) Do funds also extend to the debugger guy? I do hope so. You sure want to give a tutorial, right? Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] decorators and arithmetic operators
Willem Jan Palenstijn, 17.01.2011 15:04: When using a decorator on an arithmetic operator such as __mod__ in a cdef class, the method __mod__ is properly replaced, but the corresponding operator doesn't seem to be. See below for an example. Am I doing something wrong, or is this indeed not supported? It's not supported. For cdef classes, Cython generates a C function of the appropriate signature and stuffs it into the C slot of the Python type. Since we can't know what the decorator actually does with the function, we can't just call it and expect that the return value is compatible with that slot. It should be possible to fake this feature, i.e. to actually generate a wrapper function for the slot instead that simply calls whatever the decorator actually returned. However, that requires a dedicated implementation and nothing has been done in that direction. I created a ticket for this: http://trac.cython.org/cython_trac/ticket/648 It would be nice if cython could raise a warning or error in this situation. Absolutely. http://trac.cython.org/cython_trac/ticket/649 Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
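The difference between filling the slot directly and the wrapper idea from this reply can be modelled in plain Python. This is only an analogy, not Cython's generated code: logged, raw_mod, direct_slot and trampoline_slot are hypothetical names, and the "slot" is just a module-level variable.

```python
calls = []  # records which functions the hypothetical decorator has seen


def logged(func):
    """Hypothetical decorator: note each call, then delegate."""
    def wrapper(self, other):
        calls.append(func.__name__)
        return func(self, other)
    return wrapper


def raw_mod(self, other):
    """Stands in for the C function generated for __mod__."""
    return self.value % other


class Number:
    def __init__(self, value):
        self.value = value

    # The attribute holds whatever the decorator returned.
    __mod__ = logged(raw_mod)


# Model 1: the "slot" points straight at the generated function,
# so the decorator's result is bypassed.
direct_slot = raw_mod


def trampoline_slot(self, other):
    """Model 2: the slot looks up and calls the decorated attribute."""
    return type(self).__mod__(self, other)
```

Calling direct_slot(Number(7), 3) leaves calls empty, while trampoline_slot(Number(7), 3) records the call. That matches the behaviour reported in the question and the wrapper-function idea sketched in the answer.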
Re: [Cython] [OT] Python like lanugages [was Re: After C++, what with Python?]
Hi Terry, thanks a lot for this huge list of feedback. I'm forwarding it to the two Cython mailing lists here. We're about to release 0.14.1 real soon, so this is the perfect time to go through the documentation and fix it up. Thanks! Stefan
Original Message
Subject: Re: [OT] Python like lanugages [was Re: After C++, what with Python?]
Date: Mon, 17 Jan 2011 16:38:34 -0500
From: Terry Reedy tjre...@udel.edu
To: Stefan Behnel stefan...@behnel.de
So seriously need to take a look at Cython. http://cython.org
OK. Some comments...
http://docs.cython.org/src/quickstart/build.html
Building a Cython module using distutils
This section jumps from that topic to Sage notebook use. It seems that a section heading is missing.
http://docs.cython.org/src/quickstart/cythonize.html
Typing Variables
Simply compiling this in Cython merely gives a 35% speedup.
With what arguments? Suggest inserting for integrate_f(0,2,.1) or whatever.
same page, Typing Functions
cdef double f(double) except? -2: return x**2-x
I presume '...f(double x)...'
The except? -2 means that an error will be checked for if -2 is returned (though the ? indicates that -2 may also be used as a valid return value).
Huh? The minimum value is -.25 at .5, so why '-2' versus -10 or any other impossible return?
Speedup: 150 times over pure Python.
I presume this refers back to non-cdefed integrate_f. Perhaps change this to: Integrate_f(...) is now 150 times as fast as the original pure Python version.
http://docs.cython.org/src/tutorial/external.html
It is perfectly OK to do from math import sin to use Python’s sin() function. However, calling C’s own sin() function is substantially faster, especially in tight loops.
This and some of the following seem to be missing a bit of context. My suggested expansion: Suppose in the previous integration example we change f to return sin(x*x) instead of x**2-x. We could do from math import sin to use Python's sin() function.
However, this reintroduces a slow Python call within the main for loop. Calling C’s own sin() is substantially faster. http://docs.cython.org/src/tutorial/clibraries.html Using C libraries As seen for the C string decoding functions above, ??? This is the beginning of the second sentence and the only thing 'above' is the integrate example and C sin function. it is actually trivial to call C functions directly in the code. The following describes what needs to be done to use an external C library in Cython code. This seems like it is drawing a false distinction, in that calling the C sin function *is* use of the external math library where it resides. (Or am I missing an important distinction?) Thus is seems to unnecessarily disconnect what the reader already knows from the following much more complicated example. Could cdef extern from math.h: double sin(double) be put in a separate .pxd file, that is then imported? If so, I suggest showing that variation of the easy example and then go to the new and much harder example. cimport python_exc This is, I believe, the first mention of 'python_exc' and there is no explanation thereafter. My guess by analogy: it is a Cython-supplied .pxd (like cqueue.pxd) that (re)defines the c-api to builtin Python exceptions. (Reading further I see 'python_exc.PyErr_NoMemory()', so my guess is not completely right.) PYTHONPATH=. python -c 'import queue.Queue as Q; Q()' Is this missing a ; or something? ...the pop() method is almost identical. if cqueue.queue_is_empty(self._c_queue): raise IndexError(Queue is empty) If the queue had one item (0) before the pop, would not this be true *after* the pop, and thus the exception erroneous? Unlike with peeking, I think you need to unconditionally check before the pop. def __nonzero__(self): Adding '# __bool__ in Python 3' would be nice. The following listing shows the complete implementation that uses cpdef methods where possible. 
This feature is obviously not available for the extend() method, as the method signature is incompatible with Python argument types.
I would like the example to show how to deal with this to make the class completely usable from Python. Could one not rename 'extend' as 'extend_c' and define extend with Python types? Would
    cpdef extend(self, it):
        for i in it:
            self.append(i)
work? Would append raise an error for non-ints? Or how about
    cpdef extend(self, it):
        for i in it:
            if isinstance(i, int) and lo <= i <= hi:
                if not cqueue.queue_push_tail(self._c_queue, <void*>value):
                    python_exc.PyErr_NoMemory()
            else:
                raise ValueError('bad value')
with 'lo' and 'hi' replaced with appropriate expressions.
http://docs.cython.org/src/tutorial/cdef_classes.html
Extension types
Cython supports a second kind of class: extension types, We have, of course
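Terry's concern about pop() in the forwarded message can be illustrated with a pure-Python stand-in. This is a sketch under assumptions, not the tutorial's code: IntQueue is hypothetical, and _c_pop_head models a C call that returns 0 when the queue is empty, which is what makes a stored 0 ambiguous.

```python
from collections import deque


class IntQueue:
    """Toy stand-in for the tutorial's C queue wrapper."""

    def __init__(self, items=()):
        self._items = deque(items)

    def _c_pop_head(self):
        # Assumption: like the C call, return 0 when nothing is stored.
        return self._items.popleft() if self._items else 0

    def pop_check_after(self):
        value = self._c_pop_head()
        # The emptiness test runs after the item has been removed, so a
        # queue that held only a stored 0 now looks as if it had been empty.
        if value == 0 and not self._items:
            raise IndexError("Queue is empty")
        return value

    def pop_check_before(self):
        # Test first, then pop: a stored 0 is returned normally.
        if not self._items:
            raise IndexError("Queue is empty")
        return self._c_pop_head()
```

With a queue holding the single item 0, pop_check_after() raises IndexError even though an item was present, whereas pop_check_before() returns 0. This is the unconditional check before the pop that the message asks for.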
Re: [Cython] string literal parsing problem
Robert Bradshaw, 15.01.2011 22:51: I think the primary motivation of the -2 flag is so that, eventually, we can make -3 the default without providing a recourse for people who don't want to change their code right away. Exactly. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
[Cython] Stable ABI mode?
Hi, Python 3.2 comes with a stable ABI for extension modules. http://www.python.org/dev/peps/pep-0384/ It would be nice if Cython provided an option to restrict the C-API usage to what the ABI considers fixed and stable. That would disable excessive optimisations like list.pop(), but also disallow several other assumptions that we use all over the place in the generated C code. I'd figure it would be a somewhat involved change, though... Stefan
Re: [Cython] string literal parsing problem
Vitja Makarov, 15.01.2011 15:02:
Looking into pyregr test log, I found that this code crashes cython compiler: print('\uXX')
Here is traceback:
File "Cython/Compiler/Parsing.py", line 788, in p_string_literal
chrval = int(systr[2:], 16)
ValueError: invalid literal for int() with base 16: ''
Hmm, right, the scanner notices that '\uXX' is not a valid Unicode escape sequence and reads it as '\u' + 'XX'. Good catch, I'll fix it. http://trac.cython.org/cython_trac/ticket/647
Stefan
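The crash in the traceback can be reproduced outside the compiler. In the sketch below, naive_decode mirrors the failing int(systr[2:], 16) pattern, and checked_decode shows one possible way to validate the digits first. Both function names are invented for this illustration, and this is not the fix that went into Cython.

```python
import re

# A complete escape is a backslash, "u", and exactly four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def naive_decode(escape):
    """Mirror the failing pattern: slice off the backslash-u and convert."""
    return chr(int(escape[2:], 16))


def checked_decode(escape):
    """Validate the digits first and report a targeted error."""
    match = _U_ESCAPE.fullmatch(escape)
    if match is None:
        raise ValueError("invalid \\u escape: %r" % escape)
    return chr(int(match.group(1), 16))
```

When the scanner hands over only the two characters backslash and u, naive_decode ends up calling int('', 16), which raises the ValueError shown in the traceback. checked_decode rejects the same input with a message that names the escape.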
Re: [Cython] string literal parsing problem
Vitja Makarov, 15.01.2011 17:21:
2011/1/15 Stefan Behnel:
Vitja Makarov, 15.01.2011 15:02:
Looking into pyregr test log, I found that this code crashes cython compiler: print('\uXX')
Here is traceback:
File "Cython/Compiler/Parsing.py", line 788, in p_string_literal
chrval = int(systr[2:], 16)
ValueError: invalid literal for int() with base 16: ''
Hmm, right, the scanner notices that '\uXX' is not a valid Unicode escape sequence and reads it as '\u' + 'XX'. Good catch, I'll fix it. http://trac.cython.org/cython_trac/ticket/647
The same applies to hex sequences, BTW.
Please notice that '\u' is valid string but not unicode string,
I know, I've written (and rewritten) most of that code. ;-)
so it's valid in py2 and not py3.
Nope, it's valid in byte strings but not in unicode strings. Py2/Py3 is not an issue here. Invalid hex sequences should trigger an error in both cases.
Stefan
Re: [Cython] string literal parsing problem
Stefan Behnel, 15.01.2011 17:29: Vitja Makarov, 15.01.2011 17:21: Please notice that '\u' is valid string but not unicode string, so it's valid in py2 and not py3. Nope, it's valid in byte strings but not in unicode strings. Py2/Py3 is not an issue here. Hmm, actually, you are right. Py2/Py3 *is* an issue here, because unprefixed strings in Py2 code become unicode strings in Py3. So, if we find an invalid unicode escape sequence, the string must become a plain byte string, and the parsed unicode string must be dropped. I think a Py3 incompatibility warning would be good in that case. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] string literal parsing problem
Vitja Makarov, 15.01.2011 17:44: print \xff /tmp/foo.pyx:1:6: String decoding as 'UTF-8' failed. Consider using a byte string or unicode string explicitly, or adjust the source code encoding. That will go away with the fix. Invalid hex escapes are illegal for both byte strings and unicode strings. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] string literal parsing problem
Stefan Behnel, 15.01.2011 17:48: Stefan Behnel, 15.01.2011 17:29: Vitja Makarov, 15.01.2011 17:21: Please notice that '\u' is valid string but not unicode string, so it's valid in py2 and not py3. Nope, it's valid in byte strings but not in unicode strings. Py2/Py3 is not an issue here. Hmm, actually, you are right. Py2/Py3 *is* an issue here, because unprefixed strings in Py2 code become unicode strings in Py3. So, if we find an invalid unicode escape sequence, the string must become a plain byte string, and the parsed unicode string must be dropped. I think a Py3 incompatibility warning would be good in that case. Actually, no, I think an error is perfectly ok. If a byte string is wanted, use a byte string in the first place. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] string literal parsing problem
Stefan Behnel, 15.01.2011 17:52: Vitja Makarov, 15.01.2011 17:44: print \xff /tmp/foo.pyx:1:6: String decoding as 'UTF-8' failed. Consider using a byte string or unicode string explicitly, or adjust the source code encoding. That will go away with the fix. Invalid hex escapes are illegal for both byte strings and unicode strings. Argl. I misread '\xff' as '\xXX', sorry. I'll give it a second try. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] string literal parsing problem
Vitja Makarov, 15.01.2011 17:44: When I say about py3 I mean that strings are unicode by default. For this code '\u' Cython now gives you an error just like you'd get when pasting the above into Python 3. That's how 'str' works in Cython. How about this kind of errors: Error converting Pyrex file to C: ... print \xff ^ /tmp/foo.pyx:1:6: String decoding as 'UTF-8' failed. Consider using a byte string or unicode string explicitly, or adjust the source code encoding. This error has been gone for good for quite a while now. Where did you see it? Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] string literal parsing problem
Vitja Makarov, 15.01.2011 18:46:
2011/1/15 Stefan Behnel:
Vitja Makarov, 15.01.2011 17:44:
When I talk about py3, I mean that strings are unicode by default. For this code '\u'
Cython now gives you an error just like you'd get when pasting the above into Python 3. That's how 'str' works in Cython.
That's ok to me, but will break some pyregr tests.
I know, but that's what was decided (as the result of a huge amount of discussion). It's there to make the life of the users easier.
What will it print for '\u1234'?
Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56) [GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> '\u'
'\\u'
>>> '\u1234'
'\\u1234'
I think that '\u' should be translated into '\\u' for python2
That's what it does, yes. This works because we actually parse unprefixed strings in parallel as byte strings and unicode strings. However, now that I tried it, I actually get the same result in Py3, although it should have parsed the string correctly. Not sure if we discussed this problem before, but it looks like a bug to me.
How about this kind of error:
Error converting Pyrex file to C:
...
print \xff
^
/tmp/foo.pyx:1:6: String decoding as 'UTF-8' failed. Consider using a byte string or unicode string explicitly, or adjust the source code encoding.
This error has been gone for good for quite a while now. Where did you see it?
Hmm, it's still there:
vitja@vitja-laptop:~/work/cython.git$ cat foo.pyx
print '\xFF'
vitja@vitja-laptop:~/work/cython.git$ python cython.py foo.pyx
Error compiling Cython file:
...
print '\xFF'
^
foo.pyx:1:6: Decoding unprefixed string literal from 'UTF-8' failed. Consider using a byte string or unicode string explicitly, or adjust the source code encoding.
Weird, I didn't get that. Anyway, different error message, different place to look then. This is related to the above problem. If both string representations get written into the C file correctly, this is no longer necessary.
Stefan
Re: [Cython] string literal parsing problem
Stefan Behnel, 15.01.2011 19:13:
Vitja Makarov, 15.01.2011 18:46:
What will it print for '\u1234'?
Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56) [GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> '\u'
'\\u'
>>> '\u1234'
'\\u1234'
I think that '\u' should be translated into '\\u' for python2
That's what it does, yes. This works because we actually parse unprefixed strings in parallel as byte strings and unicode strings. However, now that I tried it, I actually get the same result in Py3, although it should have parsed the string correctly. Not sure if we discussed this problem before, but it looks like a bug to me.
Thinking about this some more, it's inconsistent either way.
1) If the literal string semantics should be fixed at compile time, you shouldn't get a unicode string in Python 3 in the first place.
2) If the literal should become a byte string in Py2 and a unicode string in Py3, then the unicode string should be what you'd get if you ran your code in Py3, i.e. the unescaped unicode literal.
Given that 1) is out of the question, 2) should be fixed, IMHO.
Stefan
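The two readings of an unprefixed literal discussed here can be imitated in Python 3 with the standard unicode_escape codec. This is only a sketch of the idea of parsing the same source text twice; parse_as_bytes and parse_as_unicode are hypothetical helpers, not functions from the Cython compiler.

```python
def parse_as_bytes(source_text):
    """Byte-string reading: \\u is not an escape, so every character stays."""
    return source_text.encode("ascii")


def parse_as_unicode(source_text):
    """Unicode reading: \\uXXXX collapses into a single code point."""
    return source_text.encode("ascii").decode("unicode_escape")


literal_body = r"\u1234"  # the six characters between the quotes
```

For this literal body, the byte-string reading keeps all six characters, which corresponds to the '\\u1234' shown in the Python 2 session. The unicode reading yields the single character U+1234, which is what option 2) says Py3 code should see.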
Re: [Cython] string literal parsing problem
Vitja Makarov, 15.01.2011 19:29:
2011/1/15 Stefan Behnel:
Stefan Behnel, 15.01.2011 19:13:
Vitja Makarov, 15.01.2011 18:46:
What will it print for '\u1234'?
Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56) [GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> '\u'
'\\u'
>>> '\u1234'
'\\u1234'
I think that '\u' should be translated into '\\u' for python2
That's what it does, yes. This works because we actually parse unprefixed strings in parallel as byte strings and unicode strings. However, now that I tried it, I actually get the same result in Py3, although it should have parsed the string correctly. Not sure if we discussed this problem before, but it looks like a bug to me.
Thinking about this some more, it's inconsistent either way.
1) If the literal string semantics should be fixed at compile time, you shouldn't get a unicode string in Python 3 in the first place.
2) If the literal should become a byte string in Py2 and a unicode string in Py3, then the unicode string should be what you'd get if you ran your code in Py3, i.e. the unescaped unicode literal.
Given that 1) is out of the question, 2) should be fixed, IMHO.
Can't we rely on -[23] cython switch? In -2 mode strings are always byte string and -3 always unicode?
With -3, unprefixed strings *are* unicode strings. As for -2, there isn't currently any change in behaviour when you use that switch, and I feel reluctant to change that. For one, I doubt that anyone would seriously use it. The problem that unicode escapes in unprefixed strings behave differently in Python 2 and Python 3 is unlikely to create problems in real world code, i.e. outside of CPython's regression test suite.
self.assertEqual(audioop.lin2alaw(data[0], 1), '\xd5\xc5\xf5')
That's a different problem. You will notice that this code has been fixed to use the 'b' prefix in the Py3 test suite. This is a problem that cannot be solved automatically.
For Python 2 code, the compiler cannot know if the user intended an unprefixed literal to be a (binary) byte string or a unicode (text) string. Only a human brain can disambiguate the code here. Remember that Python 2 will also try to decode the above binary bytes literal if it happens to be concatenated with a unicode string for some reason. String handling is structurally hard to get right in Python 2, we have to live with that (and hope that Py2 will die out soon). I think it's a great feature of Cython that it fails fast and thus tells you that your code is ambiguous and requires changes to work in Python 3. It perfectly found the problems in the above code, for one. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
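The ambiguity described above shows up directly in Python 3, where the same escapes produce different objects depending on the prefix. The snippet below is a small illustration using the literal from the quoted test; concat_fails is a helper invented for this sketch.

```python
binary = b"\xd5\xc5\xf5"   # three bytes
text = "\xd5\xc5\xf5"      # three code points: U+00D5, U+00C5, U+00F5


def concat_fails():
    """Return True if mixing bytes and text raises TypeError."""
    try:
        binary + text
    except TypeError:
        return True
    return False
```

Only an explicit encoding links the two: text.encode("latin-1") gives back the three bytes, while text.encode("utf-8") produces six. Python 3 refuses to guess and raises TypeError on the concatenation, which is the same fail-fast behaviour the message praises in Cython.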
Re: [Cython] string literal parsing problem
Robert Bradshaw, 15.01.2011 21:26: Whether it's the -2 flag, or something else, we should at least have a mode that handles things exactly as they would be handled in Python 2. Otherwise people won't be able to just compile their existing code without worrying about subtle issues like this. Hmm. I wouldn't mind having a mode that compiles Python 2 code and fails fast on compilation under Python 3 with a C #error - if someone has enough interest in this to implement it. I certainly don't. While I agree that these issues are subtle, they certainly aren't common enough to really worry about them. Broken code is best worked around by fixing the code. Even the Python 2.x line hasn't always bowed to the holy cow of backwards compatibility. Anyway, given that -3 doesn't prevent code from working in Python 2, the -2 flag doesn't seem like a good match. It should be something clear as in --python2-only. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython 0.14.1?
Robert Bradshaw, 14.01.2011 10:05: On Fri, Jan 14, 2011 at 12:02 AM, Robert Bradshaw wrote: On Thu, Jan 13, 2011 at 11:53 PM, Stefan Behnel wrote: mark florisson, 13.01.2011 12:05: In any case, there are virtually no changes when the --gdb flag is inactive. These last additions would be very welcome [...] Ok, here's a general statement. I personally consider the debugging support new, experimental and not a critical feature for compiler or language. You are the current maintainer of that part anyway, so I'm fine with merging the changes for 0.14.1, even unseen, as long as it doesn't break anything that's more important. I'm actually looking at that code right now. I've merged this. The build complains because it cannot find a file called Cython/Debugger/do_repeat.pyx Is that a missing commit? Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython 0.14.1?
mark florisson, 14.01.2011 12:08: On 14 January 2011 12:06, Stefan Behnelstefan...@behnel.de wrote: Robert Bradshaw, 14.01.2011 10:05: On Fri, Jan 14, 2011 at 12:02 AM, Robert Bradshaw wrote: On Thu, Jan 13, 2011 at 11:53 PM, Stefan Behnel wrote: mark florisson, 13.01.2011 12:05: In any case, there are virtually no changes when the --gdb flag is inactive. These last additions would be very welcome [...] Ok, here's a general statement. I personally consider the debugging support new, experimental and not a critical feature for compiler or language. You are the current maintainer of that part anyway, so I'm fine with merging the changes for 0.14.1, even unseen, as long as it doesn't break anything that's more important. I'm actually looking at that code right now. I've merged this. The build complains because it cannot find a file called Cython/Debugger/do_repeat.pyx I don't have it, how does it complain? https://sage.math.washington.edu:8091/hudson/job/cython-devel-build-py2-trunk/744/console setup.py says: compiled_modules = [Cython.Plex.Scanners, Cython.Plex.Actions, ..., Cython.Runtime.refnanny, Cython.Debugger.do_repeat,] So, you're suggesting that this line can be removed? Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython 0.14.1?
Robert Bradshaw, 14.01.2011 12:12: On Fri, Jan 14, 2011 at 3:11 AM, mark florisson wrote: On 14 January 2011 12:08, mark florisson wrote: On 14 January 2011 12:06, Stefan Behnel wrote: The build complains because it cannot find a file called Cython/Debugger/do_repeat.pyx Apparently something left that should have been removed, apologies. You can remove it from the setup.py in the list on line 104. Done. Ok, that looks better. Now that the debugger support is an official feature, should we have a Hudson job that builds Cython and all tests with gdb support enabled? Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython 0.14.1?
mark florisson, 14.01.2011 12:45: is it possible that my branch will make it to Hudson? [...] If that's a hassle I wouldn't mind mainline branch access either, I promise I'll be good :) I'm ok with giving you write access to the main branch, so that you can merge your changes over yourself. Stefan
[Cython] Cython 0.14.1?
Hi, I looked through the tickets that are scheduled for 0.14.1 but couldn't see any that are truly critical, especially no regressions. http://trac.cython.org/cython_trac/query?status=assigned&status=new&status=reopened&order=priority&col=id&col=summary&col=status&col=type&col=priority&col=milestone&col=component&milestone=0.14.1 I think we fixed all known bugs that appeared with 0.14 by now, so I'd say it's time for at least a release candidate. Stefan
Re: [Cython] Cython 0.14.1?
mark florisson, 13.01.2011 11:42: On 13 January 2011 11:39, Stefan Behnel wrote: I looked through the tickets that are scheduled for 0.14.1 but couldn't see any that are truly critical, especially no regressions. http://trac.cython.org/cython_trac/query?status=assigned&status=new&status=reopened&order=priority&col=id&col=summary&col=status&col=type&col=priority&col=milestone&col=component&milestone=0.14.1 I think we fixed all known bugs that appeared with 0.14 by now, so I'd say it's time for at least a release candidate. Great, I presume the pending pull requests will be merged for this release? The test runner change by Vitja should be ok. Can't comment on the debugging changes. Stefan
Re: [Cython] Cython 0.14.1?
Dag Sverre Seljebotn, 13.01.2011 11:56: As we use Trac to coordinate fixes with releases I think it'd be good if we got into the habit of always creating a small Trac ticket for each pull request. +1. My immediate thought was "oh great, one more place to look before a release". Stefan
Re: [Cython] Regression in cython trunk
Sébastien Sablé, 13.01.2011 13:45: Thanks for Cython, it is a great tool that I use every day in my work. I wanted to test cython 0.14.1 before its release and I found a regression related to type inference compared to 0.13 that impacts my project. The following code will not correctly compile: #cython: infer_types=True cdef get_field_flags(): flags = [] append = flags.append append('fetched') return flags ... with the obvious (and much faster) work-around being to not use the bound method and instead call append() directly on the list object. cythoning magnum/magpy.pyx to magnum/magpy.c Error compiling Cython file: ... #cython: infer_types=True cdef get_field_flags(): flags = [] append = flags.append append('fetched') ^ magnum/magpy.pyx:6:10: Call with wrong number of arguments (expected 2, got 1) The same code would work great with cython 0.13 and it would work with 0.14.1 if I remove the infer_types=True line. Thanks for the report. It's quite possible that the type inference mechanism now assumes that the bound method is actually the C-API function replacement, i.e. the C function PyList_Append(). That's the wrong thing to infer here as it cannot be made to work. Stefan
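For readers following the thread, the two spellings discussed in this report can be compared in plain Python. This is only a sketch of the language semantics, not Cython output: it assumes ordinary CPython, where a bound method such as flags.append already carries its list, which is why the call takes one argument rather than the two that the C function PyList_Append() expects. The function names are made up for illustration.

```python
def get_field_flags_bound():
    # Pattern from the report: cache the bound method, then call it.
    flags = []
    append = flags.append  # bound method, already tied to `flags`
    append('fetched')      # one argument; the list is implicit
    return flags


def get_field_flags_direct():
    # Work-around from the report: call append() on the list itself.
    flags = []
    flags.append('fetched')
    return flags


print(get_field_flags_bound())
print(get_field_flags_direct())
```

Both functions return the same list, so the two forms are interchangeable at the Python level; the difference in the thread is only in what the compiler infers for the cached name.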
Re: [Cython] Regression in cython trunk
Stefan Behnel, 13.01.2011 15:53: Sébastien Sablé, 13.01.2011 13:45: #cython: infer_types=True cdef get_field_flags(): flags = [] append = flags.append append('fetched') ^ magnum/magpy.pyx:6:10: Call with wrong number of arguments (expected 2, got 1) The same code would work great with cython 0.13 and it would work with 0.14.1 if I remove the infer_types=True line. Thanks for the report. It's quite possible that the type inference mechanism now assumes that the bound method is actually the C-API function replacement, i.e. the C function PyList_Append(). That's the wrong thing to infer here as it cannot be made to work. Here's another related problem: # cython: infer_types=True cdef int func(int x): return x+2 def inferred(): f = func 'f' is inferred as C function, not as function pointer. http://trac.cython.org/cython_trac/ticket/643 http://trac.cython.org/cython_trac/ticket/644 Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Regression in cython trunk
Stefan Behnel, 13.01.2011 16:32: Stefan Behnel, 13.01.2011 15:53: Sébastien Sablé, 13.01.2011 13:45: #cython: infer_types=True cdef get_field_flags(): flags = [] append = flags.append append('fetched') ^ magnum/magpy.pyx:6:10: Call with wrong number of arguments (expected 2, got 1) The same code would work great with cython 0.13 and it would work with 0.14.1 if I remove the infer_types=True line. Thanks for the report. It's quite possible that the type inference mechanism now assumes that the bound method is actually the C-API function replacement, i.e. the C function PyList_Append(). That's the wrong thing to infer here as it cannot be made to work. Here's another related problem: # cython: infer_types=True cdef int func(int x): return x+2 def inferred(): f = func 'f' is inferred as C function, not as function pointer. http://trac.cython.org/cython_trac/ticket/643 http://trac.cython.org/cython_trac/ticket/644 Fixes are here: https://github.com/cython/cython/commit/3ac5fb86833de2831e72bd368fab6b6f1fb03ec2 https://github.com/cython/cython/commit/48177ec6f603d7ae88ee6302bfe5c4c299cd350a Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Gambit-C problem
David Dreisigmeyer, 13.01.2011 20:38: I've been having problems trying to use Gambit-C Scheme in Python. Everything works fine if I compile Gambit-C alone and run it. But if I try to combine it with Cython to make an extension it doesn't work. Hi, please note that the right place for help requests regarding Cython is the cython-users mailing list, not the core developers mailing list. Stefan
Re: [Cython] Cython 0.14.1?
mark florisson, 13.01.2011 12:05: In any case, there are virtually no changes when the --gdb flag is inactive. These last additions would be very welcome [...] Ok, here's a general statement. I personally consider the debugging support new, experimental and not a critical feature for compiler or language. You are the current maintainer of that part anyway, so I'm fine with merging the changes for 0.14.1, even unseen, as long as it doesn't break anything that's more important. And, no, that's not a green card for back doors. ;-) Stefan
Re: [Cython] raw string problem
Vitja Makarov, 12.01.2011 19:29: It seems that cython parses raw strings as usual strings: Try this: print r'\' That's slightly exaggerated. The only broken cases are '\' and '\'', everything else works AFAICT. Stefan
Re: [Cython] raw string problem
Vitja Makarov, 12.01.2011 22:08: 2011/1/12 Stefan Behnel: Vitja Makarov, 12.01.2011 19:29: It seems that cython parses raw strings as usual strings: Try this: print r'\' That's slightly exaggerated. The only broken cases are '\' and '\'', everything else works AFAICT. That breaks StringEncoding._to_escape_sequence btw... What do you mean? When I fix the parser, it works for me. Stefan
Re: [Cython] raw string problem
Robert Bradshaw, 12.01.2011 22:16: On Wed, Jan 12, 2011 at 12:51 PM, Stefan Behnel wrote: Vitja Makarov, 12.01.2011 19:29: It seems that cython parses raw strings as usual strings: Try this: print r'\' That's slightly exaggerated. The only broken cases are '\' and '\'', everything else works AFAICT. That is comforting, and given that it's a tiny corner case I think it's worth fixing, and safe to fix, in 0.14.1. I pushed a fix. Stefan
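As a reference point for what a fixed parser should produce, here is how CPython itself treats backslashes in raw strings. The archive appears to have dropped some backslashes from the quoted literals, so the two literals below are assumptions chosen to illustrate the rule, not necessarily the exact cases from the thread.

```python
# In a raw string, a backslash is kept literally, including the one
# that precedes a quote character or another backslash.
two_backslashes = r'\\'
backslash_quote = r'\''

# Both literals are two characters long.
print(len(two_backslashes), len(backslash_quote))

# The same values written as ordinary (non-raw) string literals.
print(two_backslashes == '\\\\', backslash_quote == "\\'")
```

Note that a raw string cannot end in an odd number of backslashes, because the final backslash would then protect the closing quote.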
Re: [Cython] General generator expressions
Robert Bradshaw, 07.01.2011 19:17: On Fri, Jan 7, 2011 at 4:27 AM, Vitja Makarov wrote: I've added GeneratorDefNode and GeneratorBodyDefNode; all the generator stuff is now handled there. Yay! I'm pretty busy this weekend, but have been following your branch from a distance and so reviewing this shouldn't take too long. I'm thinking we should get a 0.14.1 bug fix out ASAP (e.g. once those Windows and type inference bugs are in) and then 0.15 will follow with generators. Absolutely. I should find some time for a review early next week, but more eyes are always better. Stefan
Re: [Cython] Cython 0.14 Bugreport
Daniel Norberg, 06.01.2011 12:20: I'm getting a curious error in Cython 0.14 when trying to compile this: def bar(foo): qux = foo quux = foo[qux.baz] The error message: $ cython bar.py Error compiling Cython file: ... def bar(foo): qux = foo quux = foo[qux.baz] ^ /Users/daniel/Desktop/cython-test/bar.py:3:15: Object of type 'unspecified' has no attribute 'baz' Cython 0.13 compiles this just fine. I also tried the latest revision of cython-devel (b816b03ff502) and it fails. I can reproduce this. From a quick test, it seems like the type inference machinery processes 'quux' and 'qux' in the wrong order, i.e. 'quux' before 'qux'. Anyone interested in taking a closer look? Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
[Cython] Fixing #602 - type inference for byte string literals
Hi, I've been working on a fix for ticket #602, negative indexing for inferred char*. http://trac.cython.org/cython_trac/ticket/602 Currently, when you write this: s = b'abc' s is inferred as char*. This has several drawbacks. For one, we lose the length information, so len(s) becomes O(n) instead of O(1). Negative indexing fails completely because it will use pointer arithmetic, thus leaving the allocated memory area of the string. Also, code like the following is extremely inefficient because it requires multiple conversions from a char* of unknown length to a Python bytes object: s = b'abc' a = s1 + s b = s2 + s I came to the conclusion that the right fix is to stop letting byte string literals start off as char*. This immediately fixes these issues and improves Python compatibility while still allowing automatic coercion, but it also comes with its own drawbacks. In nogil blocks, you will have to explicitly declare a variable as char* when assigning a byte string literal to it, otherwise you'd get a compile time error for a Python object assignment. I think this is a minor issue as most users would declare their variables anyway when using nogil blocks. Given that there isn't much you can do with a Python string inside of a nogil block, we could also honour nogil blocks during type inference and automatically infer char* for literals here. I don't think it would hurt anyone to do that. The second drawback is that it impacts type inference for char loops. Previously, you could write s = b'abc' for c in s: print c and Cython would infer 'char' for c and print integer byte values. When s is inferred as 'bytes', c will be inferred as 'Python object' because Python 2 returns 1-byte strings and Python 3 returns integers on iteration. Thus the loop will run entirely in Python code and return different things in Py2 and Py3. I do not expect that this is a major issue either. 
Iteration over literals should be rare, after all, and if the byte string is constructed in any way, the type either becomes a bytes object through Python operations (like concatenation) or is explicitly provided, e.g. as a return type of a function call. But it is a clear behavioural change for the type inference in an area where Cython's (and Python's) semantics are tricky anyway. Personally, I think that the advantages outweigh the disadvantages here. Most common use cases won't notice the change because coercion will not be impacted, and most existing code (IMHO) either uses explicit typing or expects a Python bytes object anyway. So my preferred change would be to make byte string literals 'bytes' by default, except in nogil blocks. Opinions? Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
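The Python-level behaviour that this proposal aims to preserve can be checked directly. The sketch below assumes Python 3, where iterating a bytes object yields integers; under Python 2 the same loop yields 1-byte strings, which is the difference described in the message.

```python
s = b'abc'

# A bytes object knows its own length, so len() needs no scan for a
# terminating NUL the way strlen() on a char* would.
print(len(s))

# Negative indices count from the end of the object.
print(s[-1])

# Iteration yields integers on Python 3 (1-byte strings on Python 2).
print([c for c in s])

# Concatenation works on bytes objects directly.
a = b'x' + s
print(a)
```

None of these operations is well defined on a bare char* without extra length bookkeeping, which is the motivation given for inferring 'bytes' instead.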
Re: [Cython] Fixing #602 - type inference for byte string literals
Dag Sverre Seljebotn, 03.01.2011 19:10: On 01/03/2011 06:35 PM, Robert Bradshaw wrote: On Mon, Jan 3, 2011 at 4:01 AM, Lisandro Dalcin <dalc...@gmail.com> wrote: On 3 January 2011 04:41, Stefan Behnel <stefan...@behnel.de> wrote: Hi, I've been working on a fix for ticket #602, negative indexing for inferred char*. http://trac.cython.org/cython_trac/ticket/602 Currently, when you write this: s = b'abc' s is inferred as char*. This has several drawbacks. For one, we lose the length information, so len(s) becomes O(n) instead of O(1). Negative indexing fails completely because it will use pointer arithmetic, thus leaving the allocated memory area of the string. Also, code like the following is extremely inefficient because it requires multiple conversions from a char* of unknown length to a Python bytes object: s = b'abc' a = s1 + s b = s2 + s I came to the conclusion that the right fix is to stop letting byte string literals start off as char*. This immediately fixes these issues and improves Python compatibility while still allowing automatic coercion, but it also comes with its own drawbacks. In nogil blocks, you will have to explicitly declare a variable as char* when assigning a byte string literal to it, otherwise you'd get a compile time error for a Python object assignment. I think this is a minor issue as most users would declare their variables anyway when using nogil blocks. Given that there isn't much you can do with a Python string inside of a nogil block, we could also honour nogil blocks during type inference and automatically infer char* for literals here. I don't think it would hurt anyone to do that. The second drawback is that it impacts type inference for char loops. Previously, you could write s = b'abc' for c in s: print c and Cython would infer 'char' for c and print integer byte values. When s is inferred as 'bytes', c will be inferred as 'Python object' because Python 2 returns 1-byte strings and Python 3 returns integers on iteration. 
Thus the loop will run entirely in Python code and return different things in Py2 and Py3. I do not expect that this is a major issue either. Iteration over literals should be rare, after all, and if the byte string is constructed in any way, the type either becomes a bytes object through Python operations (like concatenation) or is explicitly provided, e.g. as a return type of a function call. But it is a clear behavioural change for the type inference in an area where Cython's (and Python's) semantics are tricky anyway. Personally, I think that the advantages outweigh the disadvantages here. Most common use cases won't notice the change because coercion will not be impacted, and most existing code (IMHO) either uses explicit typing or expects a Python bytes object anyway. So my preferred change would be to make byte string literals 'bytes' by default, except in nogil blocks. +1 +1 I might say it should even be required in nogil blocks for consistency. +1 to not making nogil blocks a special case, the disadvantage of another special case to remember outweighs the advantage of syntactic brevity IMO. Ok, then it's the proposed change without a special case for nogil. That's better anyway because I just noticed that type inference doesn't know about nogil environments. They are not determined before the subsequent type analysis step. https://github.com/cython/cython/commit/342eb45a2fd19869273ec038144c71ac6e49db0e Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] General generator expressions
Vitja Makarov, 30.12.2010 18:53: Why did you revert the decorator stuff? https://github.com/vitek/cython/commit/e0d366d9409680849e6f429992ac9724e2ad1016 Because I didn't get it finished, it broke the Sage build, and cython-devel should stay cleanly working before the release. I added a link to the build failure log to the trac ticket but didn't even manage to come up with a suitable test case before I had to stop working on it. What I implemented so far works in most cases but there are issues with functions at the module scope. I didn't check the history but I think I recall from the code comments that Robert worked in this area (method binding) a while ago, and the comments hint at an unfinished refactoring that would help with this change. Stefan
Re: [Cython] Cython 0.14.1
Arfrever Frehtes Taifersar Arahesis, 26.12.2010 12:58: 2010-12-26 09:46:48 Stefan Behnel wrote: regarding the next release, I think we should fix the ref-counting bug that Martijn Meijers found (ticket #633) and then release 0.14.1 ASAP from the current git tip, which already has a couple of other bug fixes. It would be nice if compatibility with Numpy 1.5 was fixed (ticket #630). IMHO, that's not the same level of criticality as a serious crash bug, so unless someone provides a ready-to-review-and-merge patch for #630, I'd prefer getting 0.14.1 out without waiting for a fix. Stefan
Re: [Cython] Hg to git
Robert Bradshaw, 21.12.2010 18:32: On Tue, Dec 21, 2010 at 3:10 AM, Stefan Behnelstefan...@behnel.de wrote: Robert Bradshaw, 21.12.2010 07:12: Anyone object to making http://hg.cython.org/ read-only, and making https://github.com/cython/cython/ (and its forks) the live devel branch? I think it's a good time to do this now that 0.14 is out. Yep. Done. Hudson also builds from the git repo now. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev
Re: [Cython] Cython 0.14.1
Robert Bradshaw, 27.12.2010 17:36: On Mon, Dec 27, 2010 at 12:07 AM, Stefan Behnel wrote: Arfrever Frehtes Taifersar Arahesis, 26.12.2010 12:58: 2010-12-26 09:46:48 Stefan Behnel wrote: regarding the next release, I think we should fix the ref-counting bug that Martijn Meijers found (ticket #633) and then release 0.14.1 ASAP from the current git tip, which already has a couple of other bug fixes. My thoughts exactly. It would be nice if compatibility with Numpy 1.5 was fixed (ticket #630). IMHO, that's not the same level of criticality as a serious crash bug, so unless someone provides a ready-to-review-and-merge patch for #630, I'd prefer getting 0.14.1 out without waiting for a fix. I agree. There's a fair number of fixes in the tree already, so I was actually thinking about doing a 0.14.1 release soon anyways. Of course something big like this necessitates a quicker release, but how about we see what fixes we can get in this next release and release first thing in January? +1, that should be close enough. Stefan
[Cython] Cython 0.14.1
Hi, regarding the next release, I think we should fix the ref-counting bug that Martijn Meijers found (ticket #633) and then release 0.14.1 ASAP from the current git tip, which already has a couple of other bug fixes. I'll see if I can free some time to work on a fix next week, but wouldn't mind if someone beat me to it. Stefan
Re: [Cython] Cython 0.14: KeyError during compilation of Yasm
Robert Bradshaw, 20.12.2010 20:35: On Sun, Dec 19, 2010 at 11:20 AM, Stefan Behnel wrote: The __new__ method must be called __cinit__. Without closer investigation, I suspect that the relevant change in Cython 0.14 is that it knows about __new__ being a staticmethod. However, for cdef classes it silently translates __new__ into __cinit__ internally. It seems like both don't work together. Using __new__ instead of __cinit__ has been deprecated for years now, how about turning it into a clean error (which would fix this bug)? +1, that's what I thought, too. Stefan
Re: [Cython] Hg to git
Robert Bradshaw, 21.12.2010 07:12: Anyone object to making http://hg.cython.org/ read-only, and making https://github.com/cython/cython/ (and its forks) the live devel branch? I think it's a good time to do this now that 0.14 is out. Stefan
Re: [Cython] [PATCH] fix broken PyBuffer_Release declaration
W. Trevor King, 21.12.2010 14:56: On Sun, Dec 19, 2010 at 07:28:32PM +0100, Stefan Behnel wrote: W. Trevor King, 19.12.2010 14:26: Here's a patch bringing the PyBuffer_Release declaration up to speed with current Python headers. Applied, thanks. The buffer PEP and its implementation were still in flux even shortly before the release of Py3. Actually, the declaration in PEP 3118 has not been updated [1]. Is that a PEP bug? Looks like it (and wouldn't be the first time ;). That was the time when the buffer.pxd was written, and I guess it hasn't been fixed since because most people just use the builtin buffer support in Cython. Hmm, perhaps I'm doing things the wrong way then ;). I'm calling an external library that needs: fn(unsigned short *array, int len) Is there another way to get that pointer? Admittedly, it's a bit hidden in the docs if you don't use NumPy. Does this help? http://docs.cython.org/src/tutorial/numpy.html http://wiki.cython.org/enhancements/buffer I'm surprised I didn't find any more extensive documentation of this feature in the core documentation. Stefan
Re: [Cython] [PATCH] fix broken PyBuffer_Release declaration
W. Trevor King, 19.12.2010 14:26: Here's a patch bringing the PyBuffer_Release declaration up to speed with current Python headers. Applied, thanks. The buffer PEP and its implementation were still in flux even shortly before the release of Py3. That was the time when the buffer.pxd was written, and I guess it hasn't been fixed since because most people just use the builtin buffer support in Cython. Stefan
Re: [Cython] Cython 0.14: KeyError during compilation of Yasm
Arfrever Frehtes Taifersar Arahesis, 19.12.2010 15:39: $ wget http://www.tortall.net/projects/yasm/releases/yasm-1.1.0.tar.gz ... $ tar -xzf yasm-1.1.0.tar.gz $ cd yasm-1.1.0 $ ./configure --enable-python --enable-python-bindings ... $ make ... /usr/bin/python -c from Cython.Compiler.Main import main; main(command_line=1) \ -o yasm_python.c yasm.pyx Traceback (most recent call last): [...] File /usr/lib64/python2.7/site-packages/Cython/Compiler/Nodes.py, line 3249, in analyse_declarations self.body.analyse_declarations(scope) File /usr/lib64/python2.7/site-packages/Cython/Compiler/Nodes.py, line 346, in analyse_declarations stat.analyse_declarations(env) File /usr/lib64/python2.7/site-packages/Cython/Compiler/Nodes.py, line 1999, in analyse_declarations self.analyse_signature(env) File /usr/lib64/python2.7/site-packages/Cython/Compiler/Nodes.py, line 2097, in analyse_signature arg.hdr_type = sig.fixed_arg_type(i) File /usr/lib64/python2.7/site-packages/Cython/Compiler/TypeSlots.py, line 100, in fixed_arg_type return self.format_map[self.fixed_arg_format[i]] KeyError: 'T' This problem doesn't occur with Cython 0.13. Is it a bug in Cython 0.14 or in Yasm? Both, I'd say. The crash is in line 100 of yasm.pyx, where it says: cdef class __assoc_data_callback: cdef yasm_assoc_data_callback *cb def __new__(self, destroy, print_): # === HERE self.cb = <yasm_assoc_data_callback *> \ malloc(sizeof(yasm_assoc_data_callback)) self.cb.destroy = <void (*) (void *)>PyCObject_AsVoidPtr(destroy) #self.cb.print_ = <void (*) (void *, FILE *, int)> \ PyCObject_AsVoidPtr(print_) The __new__ method must be called __cinit__. Without closer investigation, I suspect that the relevant change in Cython 0.14 is that it knows about __new__ being a staticmethod. However, for cdef classes it silently translates __new__ into __cinit__ internally. It seems like both don't work together. Stefan
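The remark about __new__ being a staticmethod mirrors plain Python, where a __new__ defined in a class body is wrapped as a static method automatically. The class below is a hypothetical pure-Python stand-in, not the Yasm code; in a cdef class the allocation logic would go into __cinit__ instead, as the reply says.

```python
class AssocDataCallback(object):
    # Hypothetical pure-Python class, for illustration only.
    def __new__(cls, destroy, print_):
        # The first argument is the class, not an instance.
        self = super(AssocDataCallback, cls).__new__(cls)
        self.destroy = destroy
        self.print_ = print_
        return self


# Python stores __new__ as a staticmethod even without a decorator.
print(type(vars(AssocDataCallback)['__new__']).__name__)

obj = AssocDataCallback('d', 'p')
print(obj.destroy, obj.print_)
```

This is why a signature written as __new__(self, ...) is misleading: at the Python level the first parameter receives the class.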
Re: [Cython] ANN: Cython 0.14 released
Lisandro Dalcin, 15.12.2010 20:38: On 15 December 2010 09:02, Arfrever Frehtes Taifersar Arahesis wrote: == ERROR: compiling (c) and running numpy_bufacc_T155 -- FAILED (errors=11) === Got errors: === 10:9: 'ndarray' is not a type identifier 173:49: mode is not a buffer option The problem here is that the dictionary keys of __cythonbufferdefaults__ are parsed as BytesLiteral. There is code in Buffer.py that checks "if not mode in buffer_options", and it fails because mode is 'bytes' while buffer_options has 'str' keys... Stefan, could you take a look at this? Yes, I know. IIRC, NumPy 1.4 *requires* bytes values here, though. I don't have any NumPy versions installed, BTW, especially not in a Py3 build. I think the right way to parse the dict keys here is as identifiers. I usually do d = dict(a=1, b=5) to get this behaviour. Cython transforms this to a literal dict with identifier keywords internally. Not sure if this works in a .pxd... We could also override the type of the dict keys as identifiers explicitly when handling __cythonbufferdefaults__. Stefan
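The mismatch described in this message is easy to reproduce in plain Python 3, where bytes and str never compare equal. This is a sketch only: buffer_options here is a stand-in tuple with assumed values, not the actual contents of Buffer.py.

```python
# Stand-in for the option names the compiler accepts (assumed values).
buffer_options = ('dtype', 'ndim', 'mode')

# Keys parsed as byte strings do not match str option names on Python 3.
parsed_as_bytes = {b'ndim': 1, b'mode': b'strided'}
print([key in buffer_options for key in parsed_as_bytes])

# Keyword syntax always produces str (identifier) keys.
defaults = dict(ndim=1, mode='strided')
print(all(key in buffer_options for key in defaults))
```

The dict(a=1, b=5) spelling suggested in the reply avoids the problem because keyword names can only ever be identifiers.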
Re: [Cython] failing test: py_unicode_type
Lisandro Dalcin, 14.12.2010 00:46: No clue about what's going on here... $ python runtests.py py_unicode_type Python 2.6.4 (r264:75706, Jun 4 2010, 18:20:16) [GCC 4.4.4 20100503 (Red Hat 4.4.4-2)] Running tests against Cython 0.14.rc0 py_unicode_type.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation. That was one of the three broken tests I mentioned earlier. Seems to be fixed now. Stefan
Re: [Cython] failure with MSVC
Lisandro Dalcin, 14.12.2010 00:37: builtin_type_inheritance_T608.c(371) : error C2016: C requires that a struct or union has at least one member builtin_type_inheritance_T608.c(384) : error C2016: C requires that a struct or union has at least one member builtin_type_inheritance_T608.c(397) : error C2016: C requires that a struct or union has at least one member I'm not sure what's the best way to fix this... That must seriously be the first time MSVC actually discovered a bug. ;-) The problem is that the types in this test inherit from builtin types that have builtin methods declared on them. These methods are considered a reason to generate a vtable for the type, even though the type itself does not define any methods, and the builtin methods do not need a vtable. I'll check if this can be special cased somehow. Stefan ___ Cython-dev mailing list Cython-dev@codespeak.net http://codespeak.net/mailman/listinfo/cython-dev