Re: [Python-Dev] New calling convention to avoid temporarily tuples when calling functions
Hi,
I pushed the most basic implementation of _PyObject_FastCall(), it
doesn't support keyword parameters yet:
https://hg.python.org/cpython/rev/a1a29d20f52d
https://bugs.python.org/issue27128
Then I patched a lot of call sites calling PyObject_Call(),
PyObject_CallObject(), PyEval_CallObject(), etc. with a temporary
tuple. Just one example:
-    args = PyTuple_Pack(1, match);
-    if (!args) {
-        Py_DECREF(match);
-        goto error;
-    }
-    item = PyObject_CallObject(filter, args);
-    Py_DECREF(args);
+    item = _PyObject_FastCall(filter, &match, 1, NULL);
The next step is to support keyword parameters. In fact, it's already
supported in all cases except for Python functions:
https://bugs.python.org/issue27809
Supporting keyword parameters will make it possible to patch much more
code to avoid temporary tuples, but it is also required for a much more
interesting change:
https://bugs.python.org/issue27810
"Add METH_FASTCALL: new calling convention for C functions"
I propose to add a new METH_FASTCALL calling convention. The example
using METH_VARARGS | METH_KEYWORDS:
PyObject* func(DirEntry *self, PyObject *args, PyObject *kwargs)
becomes:
PyObject* func(DirEntry *self, PyObject **args, int nargs, PyObject *kwargs)
Later, Argument Clinic will be modified to *generate* code using the
new METH_FASTCALL calling convention. Code written with Argument
Clinic will only need to be updated by Argument Clinic to get the new
faster calling convention (avoid the creation of a temporary tuple for
positional arguments).
Victor
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] File system path encoding on Windows
Nick Coghlan writes:
> On 21 August 2016 at 06:31, Steve Dower wrote:
> > My biggest concern is that it then falls onto users to know how
> > to start Python with that flag.

The users I'm most worried about belong to organizations where
concerted effort has been made to "purify" the environment so that
they *can* use bytes-oriented code. That is, getfilesystemencoding()
== getpreferredencoding() == what is actually used throughout the
system. Such organizations will be able to choose the flag correctly,
and implement it organization-wide, I'm pretty sure. I doubt that all
will choose UTF-8 at this point in time, though I wish they would.

> Not necessarily, as this is one of the areas where commercial
> redistributors can earn their revenue stream - by deciding that
> flipping the default behaviour is the right thing to do for *their*
> user base (which is inevitably only a subset of the overall Python
> user base).

This assumes that the Python applications are the mission-critical
ones for their clients. What if they're not? I think the commercial
redistributors will have to make their decisions on a client-by-client
basis, too. They may be in a better position to do so, but why buy
trouble? They'll be quite conservative (unless they're basically
the monopoly IT supplier to a whole organization, but they'll still have
to face potential problems from existing files on users' storage, and
perhaps applications that they supply but don't "own").

I have real trouble seeing trying to force UTF-8 as a good idea until
the organization mandates UTF-8. :-( This really is an organizational
decision, to be implemented with client resources. We can't do it for
them, redistributors can't do it for them. It needs to be an option
in Python.

Python itself is already ready for UTF-8, except that on Windows
getfilesystemencoding and getpreferredencoding can't honestly return
'utf-8', AIUI. I understand that that is exactly what Steve wants to
change, but "honestly" is the rub. What happens if Python 3.6 is only
part of a bytes-oriented system, receives a filename forced to
UTF-8-encoded bytes, and passes that over a pipe or in shared memory or
in a file to a non-Python-3.6 application that trusts the system
defaults? "Boom!", no?

Is there any experience anywhere in any implementation language with
systems used on Windows that use this approach of pretending the
Windows world is UTF-8? If not, why is it a good idea for Python to
go first?

> Making that possible doesn't mean redistributors will actually follow
> through, but it's an option worth keeping in mind, as while it does
> increase the ecosystem complexity in the near term (since default
> behaviour may vary based on how you obtained your Python runtime), in
> the longer term it can allow for better informed design decisions at
> the reference interpreter level. (For business process wonks, it's
> essentially like running through a deliberate divergence/convergence
> cycle at the level of the entire language ecosystem:
> http://theagilepirate.net/archives/1392 )

It's worse than "the entire language ecosystem" -- it's your whole
business.[1] If the proposed change to getfilesystemencoding and file
system APIs creates issues at all, it matters because files on disk,
or other applications that receive bytes from Python, refer to
filenames encoded in the preferred encoding != UTF-8. It's unlikely
in the extreme that all such files are exclusively used by Python,
which at best means individual users will need to manage encodings
file by file. At worst, some of the filenames so encoded will be
shared with applications that expect the preferred encoding, and then
you've got a war on your hands.

> > On the other hand, having code opt-in or out of the new handling
> > requires changing code (which is presumably not going to happen,
> > or we wouldn't consider keeping the old behaviour and/or letting
> > the user control it),

I don't understand why this argument doesn't cut both ways equally.
If you believe that, you should also believe that the same people who
won't change code to opt in also won't use a Python containing fix #1,
and may not install it at all. Doesn't that matter?

> I think you'll want to escalate this to a PEP as well

+1 for the reasons Nick gives. The conclusions of this discussion
need a canonical URL.

Footnotes:
[1] I'm assuming that readers are going to associate "language" <-->
"Python". The blog post Nick refers to is about the whole business.
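The "Boom!" scenario in the message above can be reproduced without Windows. What follows is only a minimal sketch under stated assumptions: "cp1252" stands in for whatever code page the receiving application trusts, the file name is made up, and the helper name as_seen_by_codepage_consumer is hypothetical. It shows what a consumer sees when it decodes UTF-8 path bytes with a code page.

```python
import locale
import sys


def as_seen_by_codepage_consumer(name, codepage="cp1252"):
    """Encode a name as UTF-8, then decode it the way a consumer
    that trusts `codepage` would (hypothetical helper)."""
    utf8_bytes = name.encode("utf-8")
    return utf8_bytes.decode(codepage, errors="replace")


# The two values the thread compares; on a "purified" system they agree.
print(sys.getfilesystemencoding(), locale.getpreferredencoding(False))

# "cafe.txt" with an acute accent on the e, spelled with an escape so
# the example does not depend on the source or terminal encoding.
name = "caf\u00e9.txt"
garbled = as_seen_by_codepage_consumer(name)

print(garbled == name)          # False: the accented letter became two characters
print(len(name), len(garbled))  # 8 9
```

Pure-ASCII names pass through unchanged, which is why this kind of mismatch can go unnoticed until a non-ASCII file name shows up.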
Re: [Python-Dev] Anyone know Brendan Scott, author of 'Python for Kids'?
On 22 August 2016 at 07:22, Terry Reedy wrote:
> So, if you agree with me, please either write Brendan personally if you know
> him, or just leave your own comment on the blog.

Brendan spoke at the inaugural PyCon Australia Education Seminar last
year, so I've contacted him (cc you) to suggest making the fix.

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] File system path encoding on Windows
On 22Aug2016 0247, Stephen J. Turnbull wrote:
> Nick Coghlan writes:
> > On 21 August 2016 at 06:31, Steve Dower wrote:
> > > My biggest concern is that it then falls onto users to know how
> > > to start Python with that flag.
>
> The users I'm most worried about belong to organizations where
> concerted effort has been made to "purify" the environment so that
> they *can* use bytes-oriented code. That is, getfilesystemencoding()
> == getpreferredencoding() == what is actually used throughout the
> system. Such organizations will be able to choose the flag correctly,
> and implement it organization-wide, I'm pretty sure. I doubt that all
> will choose UTF-8 at this point in time, though I wish they would.

I think that these are also the people who are likely to read a PEP
and enable an environment variable to preserve the current behaviour.
I'm more concerned about uncontrolled environments where a library
breaks on a random user's machine because random user downloaded a
file from a foreign website.

I don't recall whether I mentioned an environment variable (i.e.
PYTHONUSELEGACYENCODING or similar) to switch back to mbcs:ignore by
default, but it was always my intent to have one.

> Python itself is already ready for UTF-8, except that on Windows
> getfilesystemencoding and getpreferredencoding can't honestly return
> 'utf-8', AIUI. I understand that that is exactly what Steve wants to
> change, but "honestly" is the rub. What happens if Python 3.6 is only
> part of a bytes-oriented system, receives a filename forced to
> UTF-8-encoded bytes, and passes that over a pipe or in shared memory or
> in a file to a non-Python-3.6 application that trusts the system
> defaults? "Boom!", no?
>
> Is there any experience anywhere in any implementation language with
> systems used on Windows that use this approach of pretending the
> Windows world is UTF-8? If not, why is it a good idea for Python to
> go first?

The Windows world is Unicode. Mostly represented in UTF-16, but UTF-8
is entirely equivalent. All MSVC users have been pushed towards
Unicode for many years. The .NET Framework has defaulted to UTF-8 its
entire existence. The use of code pages has been discouraged for
decades. We're not going first :)

> > > On the other hand, having code opt-in or out of the new handling
> > > requires changing code (which is presumably not going to happen,
> > > or we wouldn't consider keeping the old behaviour and/or letting
> > > the user control it),
>
> I don't understand why this argument doesn't cut both ways equally.
> If you believe that, you should also believe that the same people who
> won't change code to opt in also won't use a Python containing fix #1,
> and may not install it at all. Doesn't that matter?

People already do this (e.g. Python 2.7). I don't think it should
matter enough to prevent us from making changes in new versions of
Python. Otherwise, why would we ever release new versions?

So I guess the question here is: for organisations who have already
(incorrectly) assumed that the file system encoding and the active
code page are always the same, have built solid infrastructure around
this using bytes (including ensuring that their systems never
encounter external paths in glob/listdir/etc.), are currently using
3.5 and want to migrate to 3.6 - is an environment variable to change
back to mbcs sufficient to meet their needs?

Cheers,
Steve
Re: [Python-Dev] File system path encoding on Windows
On Mon, Aug 22, 2016 at 3:58 PM, Steve Dower wrote:
> All MSVC users have been pushed towards Unicode for many years. The .NET
> Framework has defaulted to UTF-8 its entire existence. The use of code pages
> has been discouraged for decades. We're not going first :)

I just wrote a simple function to enumerate the 822 system locales on
my Windows box (using EnumSystemLocalesEx and GetLocaleInfoEx, which
are Unicode-only functions), and 36.7% of them lack an ANSI codepage.
They're Unicode-only locales. UTF-8 is the only way to support these
locales with a bytes API.
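The point about Unicode-only locales can be illustrated without calling any Windows API. The sketch below is not the enumeration function mentioned in the message; it only checks whether a file name survives being encoded with a few byte encodings. The Hindi file name is an illustrative choice, made on the assumption that Hindi is among the locales without an ANSI code page, and the helper name encodable is hypothetical.

```python
def encodable(name, encoding):
    """Return True if `name` can be encoded losslessly with `encoding`."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# A Devanagari file name, spelled with escapes so the example does not
# depend on the source or terminal encoding.
name = "\u0939\u093f\u0902\u0926\u0940.txt"

# Two common ANSI code pages versus UTF-8.
results = {enc: encodable(name, enc) for enc in ("cp1252", "cp932", "utf-8")}
for enc, ok in results.items():
    print(enc, ok)

# UTF-8 round-trips the name exactly.
round_tripped = name.encode("utf-8").decode("utf-8")
print(round_tripped == name)
```

Neither code page has any slot for these characters, so a bytes path API built on a code page can only fail or substitute; UTF-8 is the one bytes representation here that keeps the name intact.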
[Python-Dev] Deprecating invalid escape sequences: review request
Hello Python-dev,

Some time ago I went ahead and implemented a patch to deprecate the
invalid escape sequences (e.g. \c, \g, \h, etc.) in str and bytes
literals. The change itself is pretty straightforward, and shouldn't
be hard to review.

The change was split in two patches; one which does the actual
deprecation and adds relevant tests, and the other fixes all invalid
escape sequences in the entire CPython distribution (this resulted in
a substantial patch file of over 2000 lines).

I'd like to get this reviewed and merged in time, so I'm asking here.
Thanks in advance!

http://bugs.python.org/issue27364

-Emanuel
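For readers who want to see what the deprecation looks like from user code, here is a small sketch. It compiles a few string literals and records which warning categories are raised. The helper name escape_warnings is hypothetical, and the exact category is deliberately not asserted, on the assumption that it differs between Python versions (and that a future version might reject such literals outright).

```python
import warnings


def escape_warnings(literal_source):
    """Compile a string literal; return names of the warning categories raised.

    Hypothetical helper. A SyntaxError is reported by name as well, in case
    an interpreter rejects the literal instead of warning about it.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            compile(literal_source, "<example>", "eval")
        except SyntaxError:
            return ["SyntaxError"]
    return [w.category.__name__ for w in caught]


# Two valid ways to write a backslash followed by "d" ...
print(escape_warnings(r"r'\d'"))   # raw string literal
print(escape_warnings(r"'\\d'"))   # doubled backslash

# ... and the spelling the patch deprecates.
print(escape_warnings(r"'\d'"))

# Both valid spellings produce the same two-character string.
print(eval(r"r'\d'") == eval(r"'\\d'"))  # True
```

On interpreters that include the change, the third call should report a warning; the first two spellings are the usual fixes, with raw strings being the common choice for regular expressions.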
Re: [Python-Dev] File system path encoding on Windows
Steve Dower writes:

> The Windows world is Unicode. Mostly represented in UTF-16, but UTF-8 is
> entirely equivalent.

Sort of, yes, and not for present purposes.

AFAICS, the Windows world is mostly application/* media that require
substantial developer effort to extract text from; character encoding
is a minor annoyance. These are not Unicode, even if the embedded
text uses the Unicode coded character set. When it comes to text/*
media (including file system names), my personal experience is that
non-Unicode encodings are used often, even where they're forbidden
(and, ironically enough, where forbidden only by Windows users[1]).

As far as the UTF in use, I concede your expertise.

UTF-8 is absolutely not equivalent to UTF-16 from the point of view of
developers. Passing it to Windows APIs requires decoding to UTF-16
(or from a Python developer's point of view, decoding to str and use
of str APIs). That fact is what got you started on this whole
proposal!

> All MSVC users have been pushed towards Unicode for many years.

But that "push" is due to the use of UTF-16-based *W APIs and
deprecation of ACP-based *A APIs, right? The input to *W APIs must be
decoded from all text/* content "out there", including UTF-8 content.
I don't see evidence that users have been pushed toward *UTF-8* in
that statement; they may be decoding from something else. Unicode !=
UTF-8 for our purposes!

In any case, I suspect a lot of people use Python to avoid C, and so
existing Python users may not be affected by MSVC "pressure".

> The .NET Framework has defaulted to UTF-8

Default != enforce, though. Do you know that almost nobody changes
the default, and that behavior is fairly uniform across different
classes of organization (specifically by language)? Or did you mean
"enforce"?

> its entire existence. The use of code pages has been discouraged
> for decades. We're not going first :)

The fact that a framework, which by definition provides a
world-within-a-world, can insist on UTF-8 from the start is very
different from a generic programming language, which has deliberately
provided multiscript capability for decades. People who buy in to
.NET do so because the disadvantages (which may include character
encoding conversion at the boundary, or "purification" of the
environment to use only UTF-8) are outweighed by both the individual
features of the framework and their packaging into a consistent whole.
This is closely related to my idea about "effective monopoly IT
providers".

On the contrary, people who use Python may very well have done so to
*avoid* the Unicode strictures of .NET (or at least consider it a
convenience compared to changing user behavior to conform to .NET),
perhaps "localized" to a particular department or use case.

I believe I've mentioned that my employers' various downloadable
database queries (course catalog, student rosters) are mostly
structured as CSV files, with the option to encode as UTF-8 or
Shift-JIS. I suspect that is very common in Japanese universities
because of the popularity of Macs among educators, professionals, and
students. I don't know about business and government, which is very
Windows-oriented. There, I suspect Shift-JIS is the rule for text/*
media, but Excel for data tables and Word, Powerpoint, and PDF for
"rich text" may be used almost exclusively, so text/* may not be
relevant in information interchange.

> > I don't understand why this argument doesn't cut both ways
> > equally. If you believe that, you should also believe that the
> > same people who won't change code to opt in also won't use a
> > Python containing fix #1, and may not install it at all. Doesn't
> > that matter?
>
> People already do this (e.g. Python 2.7). I don't think it should
> matter enough to prevent us from making changes in new versions of
> Python.

Of course it shouldn't, for the generic idea of change. But the
argument you made is that "if we don't *force* UTF-8, users who won't
change code won't get the benefit of UTF-8". My rebuttal is that "if
we *do* force UTF-8, those same users lose the benefit of both Python
3.6 and UTF-8." It matters how many are in that situation, but
unfortunately we'll just have to guess about that.

> So I guess the question here is: for organisations who have already
> (incorrectly) assumed that the file system encoding and the active
> code page are always the same,

Stop bashing the users, please! This "users are stupid, we know
better" is the attitude that scares me about this proposal. In the
enterprises I'm talking about, that is an organizational decision, not
an assumption. (It is likely to be "close enough" to true in some
cases that lack such a policy, too.) Or are you telling me that
Windows will change the active code page behind the users' backs even
if it's told not to do so?

Now, you can argue that few organizations actually have such policies,
and you may be right. I don't know, and you don't know. The damag
Re: [Python-Dev] socket.setsockopt() with optval=NULL
Another option would be to add a setalg method with whatever (nice,
pythonic) API we want. Emulating the crummy C-level API needn't be a
goal, I think.

On Sun, Aug 21, 2016, at 05:37, Christian Heimes wrote:
> Hi,
>
> the socket.setsockopt(level, optname, value) method has two calling
> variants. When it is called with a buffer-like object as value, it calls
> the C API function setsockopt() with optval=buffer.buf and
> optlen=buffer.len. When value is an integer, setsockopt() packs it as
> int32 and sends it with optlen=4.
>
> ---
> # example.py
> import socket
> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR,
>                 b'\x00\x00\x00\x00')
> sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
> ---
>
> $ strace -e setsockopt ./python example.py
> setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [0], 4) = 0
> setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>
>
> For AF_ALG (Linux Kernel crypto) I need a way to call the C API function
> setsockopt() with optval=NULL and optlen as any arbitrary number. I have
> been playing with multiple ideas. So far I liked the idea of
> value=(None, int) most.
>
> setsockopt(socket.SOL_ALG, socket.ALG_SET_AEAD_AUTHSIZE, (None, taglen))
>
> What do you think?
>
> Christian
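The two existing calling variants described in the quoted message can be checked by reading the option back with getsockopt(). This is only a sketch of the current behaviour; it does not attempt the proposed optval=NULL form. The function name reuseaddr_roundtrip is hypothetical, and struct.pack("i", ...) is used on the assumption that the option is a native C int, as the strace output in the message suggests.

```python
import socket
import struct


def reuseaddr_roundtrip():
    """Set SO_REUSEADDR via both calling variants and read it back.

    Hypothetical helper; returns (after_int, after_bytes_off, after_bytes_on).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        level, opt = socket.SOL_SOCKET, socket.SO_REUSEADDR

        # Variant 1: an integer value, packed by the socket module itself.
        sock.setsockopt(level, opt, 1)
        after_int = sock.getsockopt(level, opt)

        # Variant 2: a bytes-like value, passed through as optval/optlen.
        sock.setsockopt(level, opt, struct.pack("i", 0))
        after_bytes_off = sock.getsockopt(level, opt)

        sock.setsockopt(level, opt, struct.pack("i", 1))
        after_bytes_on = sock.getsockopt(level, opt)

    return after_int, after_bytes_off, after_bytes_on


on_int, off_bytes, on_bytes = reuseaddr_roundtrip()
# Some platforms report an enabled option as a nonzero value other than 1,
# so only truthiness is compared here.
print(bool(on_int), bool(off_bytes), bool(on_bytes))
```

Neither variant has a way to say "no buffer, but this length", which is the gap both the (None, int) idea and the separate-method suggestion are trying to fill.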
