Re: [Python-3000] Making more effective use of slice objects in Py3k

2006-09-02 Thread Marcin 'Qrczak' Kowalczyk
Greg Ewing <[EMAIL PROTECTED]> writes:

> It might be possible to represent it in a narrower format,
> however. Perhaps there should be an explicit operation for
> re-packing a string into the narrowest possible format?

I suppose it's better to always normalize a polymorphic string
representation. And always normalize bignums to fixnums (long->int).

It increases chances of using the more compact representation.
It doesn't add any asymptotic cost, it's done when the whole
object is to be allocated anyway (these are immutable objects).
It simplifies equality comparison.

The narrow formats should be statistically more common than wide
formats anyway.

Programmers should not be expected to care about explicitly calling
a normalization function.
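
A minimal sketch of what "always normalize" could mean for a polymorphic
string. The function name `narrowest_width` and the 1/2/4-byte widths are
assumptions made for illustration; the thread does not fix the formats.

```python
def narrowest_width(text: str) -> int:
    """Return the bytes per character (1, 2 or 4) needed for the widest code point."""
    # An empty string has no code points, so it fits the narrowest format.
    widest = max(map(ord, text), default=0)
    if widest < 0x100:
        return 1
    if widest < 0x10000:
        return 2
    return 4


# The width is computed once, when the immutable string is built, so the
# caller never has to ask for a re-pack explicitly.
print(narrowest_width("abc"))         # ASCII only
print(narrowest_width("\u03a9"))      # GREEK CAPITAL LETTER OMEGA
print(narrowest_width("\U0001F600"))  # a code point outside the BMP
```

The scan is a single pass over data that has to be copied into the new
object anyway, which is why it adds no asymptotic cost.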

-- 
   __("< Marcin Kowalczyk
   \__/   [EMAIL PROTECTED]
^^ http://qrnik.knm.org.pl/~qrczak/
___
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe: 
http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com


[Python-3000] encoding hell

2006-09-02 Thread tomer filiba
i'm quite finished with the base of iostack (streams and layers), and
have moved to implementing the adapters layer (especially the dreaded
TextAdapter).

as was discussed earlier, streams and layers work with bytes, while
adapters may work with arbitrary objects (be it struct-style records,
serialized objects, characters and whatnot).

the question that arises is -- how far should we stretch this abstraction?
for example, the TextAdapter reads and writes characters to the
stream, after they go through encoding or decoding, so from the programmer's
point of view, he's working with *characters*, not *bytes*.
that means the programmer need not be aware of how the characters
are "physically" stored in the underlying stream.

that's all very nice, but what do we do when it comes to seek()ing?
do you want to seek by character position or by byte position?
logically you are working with characters, but it would be impossible
to implement without first decoding the entire stream in-memory...
which is unacceptable of course.
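
A small Python illustration of the mismatch, assuming UTF-8 as the
encoding. The helper `byte_offset` is hypothetical and is not part of
iostack; it only shows that a character position cannot be turned into a
byte position without looking at everything before it.

```python
text = "na\u00efve caf\u00e9"   # two of these characters take 2 bytes each in UTF-8
data = text.encode("utf-8")


def byte_offset(text: str, char_index: int) -> int:
    """Byte position of a character index; needs the whole prefix to compute."""
    return len(text[:char_index].encode("utf-8"))


# Character 3 is 'v', but it does not start at byte 3.
print(byte_offset(text, 3))

# Seeking to byte 3 lands in the middle of a multibyte sequence.
try:
    data[3:].decode("utf-8")
    landed_mid_character = False
except UnicodeDecodeError:
    landed_mid_character = True
print(landed_mid_character)
```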

and if seek()ing is byte-oriented, then you must somehow seek
only to the beginning of a multibyte character sequence... how
would you do that?

my solution would be completely leaving seek() and tell() out of the
3rd layer -- it's a byte-level operation.

anyone thinks differently? if so, what's your solution?

- - - -

you can find the latest sources here (note: i haven't tested it yet,
many things are likely to be broken, it's still being redesigned):
http://sebulbasvn.googlecode.com/svn/trunk/iostack/
http://sebulbasvn.googlecode.com/svn/trunk/sock2/


-tomer


[Python-3000] The future of exceptions

2006-09-02 Thread Georg Brandl
While looking at the changes necessary to implement the exception
related syntax changes (except ... as ..., raise without type),
I came across some more substantial things that I think must be discussed.

* How should exceptions be represented in C code? Should there still
  be a (type, value, traceback) triple?

* Could the traceback be made an attribute of the exception?

* What about exception chaining?

Something like this comes to mind::

    try:
        whatever
    except ValueError as err:
        raise CustomException("Something went wrong", prev=err)

With tracebacks becoming part of the exception, that could be::

    raise CustomException(*args, prev=err, tb=traceback)

(`prev` and `tb` would be keyword-only arguments)

With that, all exception info would be contained in one object,
so sys.exc_info() could be renamed to sys.last_exc().
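
For comparison, a sketch of the same idea in the spelling that Python 3
later shipped with (PEP 3134): chaining uses `raise ... from` instead of a
`prev` keyword, and the traceback lives in the `__traceback__` attribute.
`CustomException` and `parse` are placeholder names for this example.

```python
class CustomException(Exception):
    pass


def parse(value):
    try:
        return int(value)
    except ValueError as err:
        # 'from err' records the original exception on the new one.
        raise CustomException("Something went wrong") from err


try:
    parse("not a number")
except CustomException as exc:
    caught = exc  # keep a reference past the end of the except block

# Everything about the failure is reachable from the one object.
print(type(caught.__cause__).__name__)
print(caught.__traceback__ is not None)
```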

cheers,
Georg



Re: [Python-3000] The future of exceptions

2006-09-02 Thread Marcin 'Qrczak' Kowalczyk
Georg Brandl <[EMAIL PROTECTED]> writes:

> * Could the traceback be made an attribute of the exception?
>
> * What about exception chaining?
>
> Something like this comes to mind::
>
>     try:
>         whatever
>     except ValueError as err:
>         raise CustomException("Something went wrong", prev=err)

In my language the traceback is materialized from the stack only
if needed (typically when an exception escapes from the toplevel),
and it includes the history of other exceptions thrown from exception
handlers, intermingled with source locations. The stack is not
physically unwound until an exception handler completes successfully,
so the data is available until then.

For example the above (without storing prev) would include:
- locations of active functions leading to whatever
- the location of whatever when the value error is raised
- exception: the ValueError instance
- the location of raise CustomException
- exception: the CustomException instance

Printing the stack trace recognizes when the same exception object is
reraised again, and prints this as a propagation instead of repeating
the exception description.

Of course this design is suitable only if the previous exception
is used merely for printing the stack trace, not for unpacking and
examining by the program.

I don't know how Python stack traces are implemented, so I have no
idea whether this would be practical for Python, assuming it would be
desirable at all.

-- 
   __("< Marcin Kowalczyk
   \__/   [EMAIL PROTECTED]
^^ http://qrnik.knm.org.pl/~qrczak/


Re: [Python-3000] encoding hell

2006-09-02 Thread Talin
tomer filiba wrote:
> i'm quite finished with the base of iostack (streams and layers), and
> have moved to implementing the adapters layer (especially the dreaded
> TextAdapter).
> 
> as was discussed earlier, streams and layers work with bytes, while
> adapters may work with arbitrary objects (be it struct-style records,
> serialized objects, characters and whatnot).
> 
> the question that arises is -- how far should we stretch this abstraction?
> for example, the TextAdapter reads and writes characters to the
> stream, after they go through encoding or decoding, so from the programmer's
> point of view, he's working with *characters*, not *bytes*.
> that means the programmer need not be aware of how the characters
> are "physically" stored in the underlying stream.
> 
> that's all very nice, but what do we do when it comes to seek()ing?
> do you want to seek by character position or by byte position?
> logically you are working with characters, but it would be impossible
> to implement without first decoding the entire stream in-memory...
> which is unacceptable of course.
> 
> and if seek()ing is byte-oriented, then you must somehow seek
> only to the beginning of a multibyte character sequence... how
> would you do that?
> 
> my solution would be completely leaving seek() and tell() out of the
> 3rd layer -- it's a byte-level operation.
> 
> anyone thinks differently? if so, what's your solution?

Well, for comparison with other APIs:

The .Net equivalent, System.IO.TextReader, does not have a "seek" method 
at all.

The Java version, java.io.BufferedReader, has a "skip()" method which
only allows seeking forward.

Sounds to me like copying the Java model would work.

-- Talin


Re: [Python-3000] The future of exceptions

2006-09-02 Thread Brett Cannon
On 9/2/06, Georg Brandl <[EMAIL PROTECTED]> wrote:
> While looking at the changes necessary to implement the exception
> related syntax changes (except ... as ..., raise without type),
> I came across some more substantial things that I think must be discussed.

You have read Ping's PEP 344, right?

> * How should exceptions be represented in C code? Should there still
>   be a (type, value, traceback) triple?
>
> * Could the traceback be made an attribute of the exception?

The problem with this is that it keeps the frame alive.  This is why this
and exception chaining were considered a design issue in Ping's PEP, since
that is a lot of stuff to keep alive.

> * What about exception chaining?
>
> Something like this comes to mind::
>
>     try:
>         whatever
>     except ValueError as err:
>         raise CustomException("Something went wrong", prev=err)
>
> With tracebacks becoming part of the exception, that could be::
>
>     raise CustomException(*args, prev=err, tb=traceback)
>
> (`prev` and `tb` would be keyword-only arguments)
>
> With that, all exception info would be contained in one object,
> so sys.exc_info() could be renamed to sys.last_exc().

Right, which is why the original suggestion came up in the first place.  It
would be nice to compartmentalize exceptions entirely, but the worry of
keeping a ton of memory alive for it needs to be addressed, especially if
exceptions are to be kept lightweight and usable for things other than
flagging errors.
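
A small sketch of the concern, written against the `__traceback__`
attribute that Python 3 exceptions eventually gained. `Payload` and `fail`
are made-up names; the weak reference is only there to observe the
object's lifetime.

```python
import weakref


class Payload:
    """Stand-in for a large object that lives in a local variable."""


def fail():
    payload = Payload()
    fail.ref = weakref.ref(payload)  # lets us watch the object from outside
    raise ValueError("boom")


try:
    fail()
except ValueError as err:
    saved = err  # 'err' itself is unbound when the except block ends

# fail() is long gone, yet its local is still alive: the exception holds the
# traceback, the traceback holds the frame, and the frame holds the locals.
still_alive = fail.ref() is not None
print(still_alive)
```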
-Brett


Re: [Python-3000] encoding hell

2006-09-02 Thread tomer filiba
[Talin]
> The Java version, java.io.BufferedReader, has a "skip()" method which
> only allows seeking forward.
> Sounds to me like copying the Java model would work.

then there's no need for it at all... just read() and discard the return value.
we don't need a special API for that.

on the other hand, the .NET version has a BaseStream attribute holding
the underlying stream over which the StreamReader operates... this
means you *can* change the position if the underlying stream supports
seeking.

i read through the msdn but found no explicit definition for what happens
in the case of seeking in text-encoded streams, but they noted
somewhere they use a "best fit" decoder, which, to the best of my
understanding, may skip some bytes until it's in synch with the stream.

that's a *horrible* design, imho, but that's microsoft. i say let's leave it
below layer 3, at the byte level. if users find seeking very important,
we can come up with a layer-2 ReSyncLayer, which will attempt to
come in synch with a specified encoding.

for example:

f = TextAdapter(
    ReSyncLayer(
        BufferedLayer(
            FileStream("blah", "r")
        ),
        encoding = "utf8"
    ),
    encoding = "utf8"
)

# read 3 UTF8 *characters*
f.read(3)

# this will seek by AT LEAST 7 *bytes*, until resynched
f.substream.seekby(7)

# we can resume reading of UTF8 *characters*
f.read(3)
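
A rough sketch of the resynchronization step for the UTF-8 case only. The
function name `resync_utf8` is an assumption and this is not the iostack
ReSyncLayer; it relies on the fact that UTF-8 continuation bytes always
have the bit pattern 10xxxxxx, and it assumes the data is valid UTF-8.

```python
def resync_utf8(data: bytes, pos: int) -> int:
    """Move pos forward past continuation bytes to the next character start."""
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


data = "na\u00efve".encode("utf-8")   # the third character takes two bytes

# Byte 3 is the second half of that character, so we skip ahead by one.
start = resync_utf8(data, 3)
print(start)
print(data[start:].decode("utf-8"))
```

Other encodings need their own rule, and some stateful encodings have no
cheap way to find a character boundary at all, so a real layer would have
to be written per encoding.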

heck, i even like this idea :)
thanks for the pointers.


-tomer



Re: [Python-3000] encoding hell

2006-09-02 Thread Greg Ewing
tomer filiba wrote:

> my solution would be completely leaving seek() and tell() out of the
> 3rd layer -- it's a byte-level operation.

That's what I'd recommend, too. Seeking doesn't make
sense when the underlying units aren't fixed-length.

The best you could do would be to return some kind
of opaque object from tell() that could be passed
back to seek(). But I'm far from convinced that
would be worth the trouble.
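
For what it's worth, the opaque-token approach can be shown with the
`io.TextIOWrapper` class from today's Python standard library, whose
tell() is documented as returning an opaque number that is only meaningful
when handed back to seek(). This is a sketch of the idea, not a statement
about how iostack should do it.

```python
import io

raw = io.BytesIO("na\u00efve caf\u00e9".encode("utf-8"))
f = io.TextIOWrapper(raw, encoding="utf-8")

first = f.read(3)   # three *characters*, four bytes underneath
cookie = f.tell()   # opaque: not a character count, not promised to be a byte count
rest = f.read()

f.seek(cookie)      # the only supported use of the value
again = f.read()
print(again == rest)
```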

--
Greg



Re: [Python-3000] The future of exceptions

2006-09-02 Thread Calvin Spealman
On 9/2/06, Brett Cannon <[EMAIL PROTECTED]> wrote:
> Right, which is why the original suggestion came up in the first place.  It
> would be nice to compartmentalize exceptions entirely, but the worry of
> keeping a ton of memory alive for it needs to be addressed, especially if
> exceptions are to be kept lightweight and usable for things other than
> flagging errors.
>
> -Brett

So, the issue is that attaching tracebacks to exceptions keeps too much
alive and thus makes exceptions too heavy? If the traceback was passed
to the exception constructor and then held as an attribute of the
exception, any exception meant for "light" work (i.e., not normal error
flagging) could simply decide not to include the traceback, and so it
would be destroyed, removing the weight from the exception. Similarly,
tracebacks could have some lean() method to drop references to the
frames.
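
Both options can be sketched with what current Python provides. There is
no lean() method; the closest standard-library call is
`traceback.clear_frames()`, which clears the locals of the frames in a
traceback, and dropping the traceback altogether is a plain attribute
assignment. `Payload`, `fail` and `innermost_frame` are names invented for
this example.

```python
import traceback


class Payload:
    """Stand-in for a large object held in a local variable."""


def fail():
    payload = Payload()
    raise ValueError("boom")


def innermost_frame(exc):
    """Return the frame in which the exception was raised."""
    tb = exc.__traceback__
    while tb.tb_next is not None:
        tb = tb.tb_next
    return tb.tb_frame


try:
    fail()
except ValueError as err:
    saved = err

before = "payload" in innermost_frame(saved).f_locals

# "Lean" variant: the traceback (files, line numbers) stays, the locals go.
traceback.clear_frames(saved.__traceback__)
after = "payload" in innermost_frame(saved).f_locals
print(before, after)

# "Light" variant: drop the traceback entirely.
saved.__traceback__ = None
```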


Re: [Python-3000] The future of exceptions

2006-09-02 Thread Brett Cannon
On 9/2/06, Calvin Spealman <[EMAIL PROTECTED]> wrote:
> On 9/2/06, Brett Cannon <[EMAIL PROTECTED]> wrote:
> > Right, which is why the original suggestion came up in the first place.  It
> > would be nice to compartmentalize exceptions entirely, but the worry of
> > keeping a ton of memory alive for it needs to be addressed, especially if
> > exceptions are to be kept lightweight and usable for things other than
> > flagging errors.
> >
> > -Brett
>
> So, at issue is attaching tracebacks to exceptions keeps too much
> alive and thus makes exceptions too heavy?

Basically.  Memory usage goes up if you do this as it stands now.

> If the traceback was passed
> to the exception constructor and then held as an attribute of the
> exception, any exception meant for "light" work (i.e., not normal error
> flagging) could simply decide not to include the traceback, and so it
> would be destroyed, removing the weight from the exception. Similarly,
> tracebacks could have some lean() method to drop references to the
> frames.

Problem with that is you then lose any API guarantees of the traceback
being there, which would mean you would still need to keep around
sys.exc_info().

-Brett