Re: [Python-3000] Making more effective use of slice objects in Py3k
Greg Ewing <[EMAIL PROTECTED]> writes:

> It might be possible to represent it in a narrower format,
> however. Perhaps there should be an explicit operation for
> re-packing a string into the narrowest possible format?

I suppose it's better to always normalize a polymorphic string
representation. And always normalize bignums to fixnums (long->int).

It increases chances of using the more compact representation.
It doesn't add any asymptotic cost, it's done when the whole object
is to be allocated anyway (these are immutable objects). It simplifies
equality comparison. The narrow formats should be statistically more
common than wide formats anyway. Programmers should not be expected
to care about explicitly calling a normalization function.

-- 
   __("<         Marcin Kowalczyk
   \__/       [EMAIL PROTECTED]
    ^^     http://qrnik.knm.org.pl/~qrczak/
___
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe: http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com
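To make the "always normalize" idea concrete, here is a minimal sketch of how a polymorphic string implementation might pick the narrowest storage width at allocation time. The function name `narrowest_width` and the 1/2/4-byte widths are assumptions chosen for illustration, not something specified in this thread.

```python
def narrowest_width(text: str) -> int:
    """Return the bytes per character (1, 2 or 4) that the widest
    code point in `text` would need. Hypothetical helper."""
    # One pass over the data, done once when the immutable object is
    # created, so it adds no asymptotic cost to construction.
    widest = max(map(ord, text), default=0)
    if widest < 0x100:
        return 1
    if widest < 0x10000:
        return 2
    return 4
```

If every string is stored in its narrowest width, two strings of different widths can never be equal, which is one way to read the remark that normalization simplifies equality comparison.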
[Python-3000] encoding hell
i'm quite finished with the base of iostack (streams and layers), and
have moved to implementing the adapters layer (especially the dreaded
TextAdapter).

as was discussed earlier, streams and layers work with bytes, while
adapters may work with arbitrary objects (be it struct-style records,
serialized objects, characters and whatnot).

the question that arises is -- how far should we stretch this abstraction?
for example, the TextAdapter reads and writes characters to the
stream, after they go through encoding or decoding, so from the
programmer's point of view, he's working with *characters*, not *bytes*.
that means the programmer need not be aware of how the characters
are "physically" stored in the underlying stream.

that's all very nice, but what do we do when it comes to seek()ing?
do you want to seek by character position or by byte position?
logically you are working with characters, but character-based seeking
would be impossible to implement without first decoding the entire
stream in-memory... which is unacceptable of course.

and if seek()ing is byte-oriented, then you must somehow seek
only to the beginning of a multibyte character sequence... how
would you do that?

my solution would be completely leaving seek() and tell() out of the
3rd layer -- it's a byte-level operation.

does anyone think differently? if so, what's your solution?

- - - -

you can find the latest sources here (note: i haven't tested it yet,
many things are likely to be broken, it's still being redesigned):
http://sebulbasvn.googlecode.com/svn/trunk/iostack/
http://sebulbasvn.googlecode.com/svn/trunk/sock2/

-tomer
[Python-3000] The future of exceptions
While looking at the changes necessary to implement the exception-related
syntax changes (except ... as ..., raise without type),
I came across some more substantial things that I think must be discussed.

* How should exceptions be represented in C code? Should there still be
  a (type, value, traceback) triple?

* Could the traceback be made an attribute of the exception?

* What about exception chaining?

  Something like this comes to mind::

      try:
          whatever
      except ValueError as err:
          raise CustomException("Something went wrong", prev=err)

  With tracebacks becoming part of the exception, that could be::

      raise CustomException(*args, prev=err, tb=traceback)

  (`prev` and `tb` would be keyword-only arguments)

With that, all exception info would be contained in one object,
so sys.exc_info() could be renamed to sys.last_exc().

cheers,
Georg
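The chaining proposal above could look roughly like this if written out in full. This is only a sketch: `CustomException`, its keyword-only `prev` and `tb` arguments, and the `convert` helper are illustrative names taken from or invented for the proposal, not an existing API.

```python
import sys


class CustomException(Exception):
    """Hypothetical exception that carries its predecessor and traceback."""

    def __init__(self, *args, prev=None, tb=None):
        # `prev` and `tb` are keyword-only, as suggested in the proposal.
        super().__init__(*args)
        self.prev = prev
        self.tb = tb


def convert(text):
    """Made-up helper that turns a ValueError into a CustomException."""
    try:
        return int(text)
    except ValueError as err:
        # Grab the traceback of the exception currently being handled.
        tb = sys.exc_info()[2]
        raise CustomException("Something went wrong", prev=err, tb=tb)


try:
    convert("not a number")
except CustomException as exc:
    caught = exc  # everything about the failure now lives on one object
```

With this shape, a handler can walk `caught.prev` to find the original `ValueError` without consulting any global state.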
Re: [Python-3000] The future of exceptions
Georg Brandl <[EMAIL PROTECTED]> writes:

> * Could the traceback be made an attribute of the exception?
>
> * What about exception chaining?
>
>   Something like this comes to mind::
>
>       try:
>           whatever
>       except ValueError as err:
>           raise CustomException("Something went wrong", prev=err)

In my language the traceback is materialized from the stack only
if needed (typically when an exception escapes from the toplevel),
and it includes the history of other exceptions thrown from exception
handlers, intermingled with source locations.

The stack is not physically unwound until an exception handler
completes successfully, so the data is available until then.

For example the above (without storing prev) would include:
- locations of active functions leading to whatever
- the location of whatever when the value error is raised
- exception: the ValueError instance
- the location of raise CustomException
- exception: the CustomException instance

Printing the stack trace recognizes when the same exception object
is reraised again, and prints this as a propagation instead of
repeating the exception description.

Of course this design is suitable only if the previous exception
is used merely for printing the stack trace, not for unpacking and
examining by the program.

I don't know how Python stack traces are implemented, so I have no
idea whether this would be practical for Python, assuming it would
be desirable at all.

-- 
   __("<         Marcin Kowalczyk
   \__/       [EMAIL PROTECTED]
    ^^     http://qrnik.knm.org.pl/~qrczak/
Re: [Python-3000] encoding hell
tomer filiba wrote:
> i'm quite finished with the base of iostack (streams and layers), and
> have moved to implementing the adapters layer (especially the dreaded
> TextAdapter).
>
> as was discussed earlier, streams and layers work with bytes, while
> adapters may work with arbitrary objects (be it struct-style records,
> serialized objects, characters and whatnot).
>
> the question that arises is -- how far should we stretch this abstraction?
> for example, the TextAdapter reads and writes characters to the
> stream, after they go through encoding or decoding, so from the
> programmer's point of view, he's working with *characters*, not *bytes*.
> that means the programmer need not be aware of how the characters
> are "physically" stored in the underlying stream.
>
> that's all very nice, but what do we do when it comes to seek()ing?
> do you want to seek by character position or by byte position?
> logically you are working with characters, but it would be impossible
> to implement without first decoding the entire stream in-memory...
> which is unacceptable of course.
>
> and if seek()ing is byte-oriented, then you must somehow seek
> only to the beginning of a multibyte character sequence... how
> would you do that?
>
> my solution would be completely leaving seek() and tell() out of the
> 3rd layer -- it's a byte-level operation.
>
> anyone thinks differently? if so, what's your solution?

Well, for comparison with other APIs:

The .NET equivalent, System.IO.TextReader, does not have a "seek" method
at all.

The Java version, java.io.BufferedReader, has a "skip()" method which
only allows seeking forward.

Sounds to me like copying the Java model would work.

-- Talin
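A forward-only skip in the spirit of the Java model can be sketched as "read and discard". The function name `skip`, its `chunk` parameter, and the use of `io.StringIO` as a stand-in for a character-level adapter are assumptions made for this illustration.

```python
import io


def skip(reader, count, chunk=4096):
    """Advance `reader` by up to `count` characters by reading and
    discarding them. Returns the number of characters actually skipped."""
    skipped = 0
    while skipped < count:
        data = reader.read(min(chunk, count - skipped))
        if not data:  # end of stream reached before `count` characters
            break
        skipped += len(data)
    return skipped


# Example with an in-memory text stream standing in for a text adapter.
reader = io.StringIO("héllo wörld")
moved = skip(reader, 6)
rest = reader.read()
```

Because the skip is expressed in characters and only ever moves forward, the adapter never has to map a character offset back to a byte offset.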
Re: [Python-3000] The future of exceptions
On 9/2/06, Georg Brandl <[EMAIL PROTECTED]> wrote:
> While looking at the changes necessary to implement the exception-related
> syntax changes (except ... as ..., raise without type),
> I came across some more substantial things that I think must be discussed.

You have read Ping's PEP 344, right?

> * How should exceptions be represented in C code? Should there still be
>   a (type, value, traceback) triple?
>
> * Could the traceback be made an attribute of the exception?

The problem with this is that it keeps the frame alive. This is why this
and exception chaining were considered a design issue in Ping's PEP since
that is a lot of stuff to keep alive.

> * What about exception chaining?
>
>   Something like this comes to mind::
>
>       try:
>           whatever
>       except ValueError as err:
>           raise CustomException("Something went wrong", prev=err)
>
>   With tracebacks becoming part of the exception, that could be::
>
>       raise CustomException(*args, prev=err, tb=traceback)
>
>   (`prev` and `tb` would be keyword-only arguments)
>
> With that, all exception info would be contained in one object,
> so sys.exc_info() could be renamed to sys.last_exc().

Right, which is why the original suggestion came up in the first place.
It would be nice to compartmentalize exceptions entirely, but the worry of
keeping a ton of memory alive for it needs to be addressed, especially if
exceptions are to be kept lightweight and usable for things other than
flagging errors.

-Brett
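The keep-alive concern can be observed directly in the Python 3 that eventually shipped, where exceptions carry a `__traceback__` attribute. The sketch below assumes CPython's reference counting; the `Payload` class, `probes` list and `fail` function are made up for the demonstration.

```python
import gc
import weakref


class Payload:
    """Stand-in for a large object held in a local variable."""


probes = []  # weak references, so the probe itself keeps nothing alive


def fail():
    payload = Payload()
    probes.append(weakref.ref(payload))
    raise ValueError("boom")


try:
    fail()
except ValueError as err:
    saved = err  # keep the exception past the end of the handler

# The exception references its traceback, the traceback references the
# frame of fail(), and that frame still holds `payload`.
alive_while_held = probes[0]() is not None

# Dropping the traceback releases the frames and everything they hold.
saved.__traceback__ = None
gc.collect()
alive_after_drop = probes[0]() is not None
```

On CPython the payload should stay alive for as long as the stored exception keeps its traceback, which is the memory cost being weighed in this message.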
Re: [Python-3000] encoding hell
[Talin]
> The Java version, java.io.BufferedReader, has a "skip()" method which
> only allows seeking forward.
> Sounds to me like copying the Java model would work.

then there's no need for it at all... just read() and discard the return
value. we don't need a special API for that.

on the other hand, the .NET version has a BaseStream attribute holding
the underlying stream over which the StreamReader operates... this means
you *can* change the position if the underlying stream supports seeking.

i read through the msdn but found no explicit definition for what happens
in the case of seeking in text-encoded streams, but they noted somewhere
they use a "best fit" decoder, which, to the best of my understanding,
may skip some bytes until it's in synch with the stream.

that's a *horrible* design, imho, but that's microsoft. i say let's leave
it below layer 3, at the byte level.

if users find seeking very important, we can come up with a layer-2
ReSyncLayer, which will attempt to come in synch with a specified
encoding. for example:

    f = TextAdapter(
            ReSyncLayer(
                BufferedLayer(
                    FileStream("blah", "r")
                ),
                encoding = "utf8"
            ),
            encoding = "utf8"
        )

    # read 3 UTF8 *characters*
    f.read(3)

    # this will seek by AT LEAST 7 *bytes*, until resynched
    f.substream.seekby(7)

    # we can resume reading of UTF8 *characters*
    f.read(3)

heck, i even like this idea :)

thanks for the pointers.

-tomer
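For UTF-8 specifically, the resynchronization step is cheap, because continuation bytes always have the bit pattern 10xxxxxx and can never start a character. The sketch below shows one way a ReSyncLayer might find the next character boundary at or after a requested byte offset; `resync_utf8` is a hypothetical name, and other encodings would need their own logic.

```python
def resync_utf8(data: bytes, pos: int) -> int:
    """Return the smallest offset >= pos that does not point into the
    middle of a UTF-8 multibyte sequence (or len(data) at the end)."""
    end = len(data)
    pos = min(max(pos, 0), end)
    # Continuation bytes look like 0b10xxxxxx; skip forward past them.
    while pos < end and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


# "a", "é" (2 bytes), "€" (3 bytes), "b"
encoded = "aé€b".encode("utf-8")
boundary = resync_utf8(encoded, 4)   # offset 4 is inside "€"
tail = encoded[boundary:].decode("utf-8")
```

This matches the "seek by AT LEAST n bytes" behaviour described above: the layer moves to the requested offset and then advances until decoding can resume cleanly.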
Re: [Python-3000] encoding hell
tomer filiba wrote:

> my solution would be completely leaving seek() and tell() out of the
> 3rd layer -- it's a byte-level operation.

That's what I'd recommend, too. Seeking doesn't make
sense when the underlying units aren't fixed-length.

The best you could do would be to return some kind of
opaque object from tell() that could be passed back
to seek(). But I'm far from convinced that would be
worth the trouble.

--
Greg
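The opaque-token idea can be tried out with the `io` module that later shipped in the Python standard library, whose text layer documents `tell()` as returning an opaque number that is only meaningful when handed back to `seek()`. A minimal sketch, assuming an in-memory UTF-8 stream; the sample string is made up.

```python
import io

# A byte stream containing multibyte UTF-8 characters.
raw = io.BytesIO("naïve café".encode("utf-8"))
text = io.TextIOWrapper(raw, encoding="utf-8")

head = text.read(3)      # three *characters*, four bytes
cookie = text.tell()     # opaque position; not a character index
tail = text.read()

text.seek(cookie)        # only values obtained from tell() are passed back
replayed = text.read()
```

The caller never does arithmetic on `cookie`, so the adapter is free to encode byte offsets and decoder state in it however it likes.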
Re: [Python-3000] The future of exceptions
On 9/2/06, Brett Cannon <[EMAIL PROTECTED]> wrote:
> Right, which is why the original suggestion came up in the first place. It
> would be nice to compartmentalize exceptions entirely, but the worry of
> keeping a ton of memory alive for it needs to be addressed, especially if
> exceptions are to be kept lightweight and usable for things other than
> flagging errors.
>
> -Brett

So, the issue is that attaching tracebacks to exceptions keeps too much
alive and thus makes exceptions too heavy? If the traceback was passed
to the exception constructor and then held as an attribute of the
exception, any exception meant for "light" work (i.e., not normal error
flagging) could simply decide not to include the traceback, and so it
would be destroyed, removing the weight from the exception. Similarly,
tracebacks could have some lean() method to drop references to the
frames.
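Something close to the suggested lean() can be approximated today with `traceback.clear_frames()` from the standard library, which clears the local variables of the frames in a traceback while leaving the traceback itself in place. The `lean` wrapper, `Payload` class and `fail` function below are hypothetical names for this sketch, and the immediate release of the payload assumes CPython's reference counting.

```python
import traceback
import weakref


class Payload:
    """Stand-in for a large object held in a local variable."""


probes = []


def fail():
    payload = Payload()
    probes.append(weakref.ref(payload))
    raise ValueError("boom")


def lean(exc):
    """Hypothetical helper: keep the traceback for reporting, but drop
    the locals held by its finished frames."""
    traceback.clear_frames(exc.__traceback__)
    return exc


try:
    fail()
except ValueError as err:
    saved = lean(err)

has_traceback = saved.__traceback__ is not None
payload_freed = probes[0]() is None
report = "".join(
    traceback.format_exception(type(saved), saved, saved.__traceback__)
)
```

The file names and line numbers needed for a report survive, while the objects that the failed call held in its locals are released.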
Re: [Python-3000] The future of exceptions
On 9/2/06, Calvin Spealman <[EMAIL PROTECTED]> wrote:
> On 9/2/06, Brett Cannon <[EMAIL PROTECTED]> wrote:
> > Right, which is why the original suggestion came up in the first place. It
> > would be nice to compartmentalize exceptions entirely, but the worry of
> > keeping a ton of memory alive for it needs to be addressed, especially if
> > exceptions are to be kept lightweight and usable for things other than
> > flagging errors.
> >
> > -Brett
>
> So, at issue is attaching tracebacks to exceptions keeps too much alive
> and thus makes exceptions too heavy?

Basically. Memory usage goes up if you do this as it stands now.

> If the traceback was passed
> to the exception constructor and then held as an attribute of the
> exception, any exception meant for "light" work (i.e., not normal error
> flagging) could simply decide not to include the traceback, and so it
> would be destroyed, removing the weight from the exception. Similarly,
> tracebacks could have some lean() method to drop references to the
> frames.

Problem with that is you then lose any API guarantees of the traceback
being there, which would mean you would still need to keep around
sys.exc_info().

-Brett