Re: [Python-Dev] interning

2009-10-24 Thread Antoine Pitrou
Alexander Belopolsky writes:
>
> AB> I disagree with Martin. I think interning is a set
> AB> operation and it is unfortunate that set API does not
> AB> support it directly.
>
> ML> I disagree with Alexander's last remark in several respects: [...]
> ML> The operation "give me the me

Re: [Python-Dev] Interning string subtype instances

2007-02-14 Thread Josiah Carlson
Greg Ewing <[EMAIL PROTECTED]> wrote:
>
> Josiah Carlson wrote:
>
> > Assuming that dictionaries and the hash algorithm for strings is not
> > hopelessly broken, I believe that one discovers quite quickly when two
> > strings are not equal.
>
> You're probably right, since if there's a hash col

Re: [Python-Dev] Interning string subtype instances

2007-02-14 Thread Greg Ewing
Larry Hastings wrote:
> If I understand your question correctly, you're saying "why doesn't
> string comparison take advantage of interned strings?"
No, I understand that it takes advantage of it when the strings are equal. I was wondering about the case where they're not equal. But as has been
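The distinction drawn here (a shortcut when both operands are the same object, versus the work still needed when they differ) can be sketched in Python. This is only an illustration of the idea, not the code in stringobject.c; the function name `sketch_str_eq` is hypothetical.

```python
def sketch_str_eq(a: str, b: str) -> bool:
    """Illustrative equality check; NOT CPython's actual implementation."""
    if a is b:
        # Same object (for example, both names refer to one interned string):
        # the answer is known without looking at any characters.
        return True
    if len(a) != len(b):
        # Different lengths can never be equal.
        return False
    # Otherwise compare contents; a mismatch usually shows up early.
    for x, y in zip(a, b):
        if x != y:
            return False
    return True
```

Note that the identity test only answers the "equal" case quickly. When the two objects differ, this sketch still has to look at lengths and contents, which is the case being asked about.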

Re: [Python-Dev] Interning string subtype instances

2007-02-14 Thread Greg Ewing
Josiah Carlson wrote:
> Assuming that dictionaries and the hash algorithm for strings is not
> hopelessly broken, I believe that one discovers quite quickly when two
> strings are not equal.
You're probably right, since if there's a hash collision, the colliding strings probably differ in the fir

Re: [Python-Dev] Interning string subtype instances

2007-02-14 Thread Martin v. Löwis
Larry Hastings wrote:
> If I understand your question correctly, you're saying "why doesn't
> string comparison take advantage of interned strings?" If so, the
> answer is "it does". Examine string_richcompare() in stringobject.c,
> and PyUnicode_compare() in unicodeobject.c. Both functions

Re: [Python-Dev] Interning string subtype instances

2007-02-14 Thread Larry Hastings
Greg Ewing wrote:
> Can anyone shed any light on this? It seems to me that
> by not using this information, only half the benefit of
> interning is being achieved.
If I understand your question correctly, you're saying "why doesn't string comparison take advantage of interned strings?" If so, the

Re: [Python-Dev] Interning string subtype instances

2007-02-14 Thread Josiah Carlson
Greg Ewing <[EMAIL PROTECTED]> wrote:
>
> Josiah Carlson wrote:
> > def intern(st):
> > ...
> >
> > If I remember the implementation of intern correctly, that's more or
> > less what happens under the covers.
>
> That doesn't quite give you everything that real interning
> does, though. The

Re: [Python-Dev] Interning string subtype instances

2007-02-14 Thread Martin v. Löwis
Greg Ewing wrote:
> It's certainly possible to tell very easily whether
> a string is interned -- there's a PyString_CHECK_INTERNED
> macro that tests a field in the string object header.
> But, I can't find anywhere that it's used in the core,
> apart from the interning code itself and the strin

Re: [Python-Dev] Interning string subtype instances

2007-02-13 Thread Greg Ewing
Martin v. Löwis wrote:
> Greg Ewing wrote:
>
> > The string comparison method knows when both
> > strings are interned
>
> No, it doesn't - see stringobject.c:string_richcompare.
Well, I'm surprised and confused. It's certainly possible to tell very easily whether a string is interned -- the

Re: [Python-Dev] Interning string subtype instances

2007-02-13 Thread Greg Ewing
Martin v. Löwis wrote:
> OTOH, in an application that needs unique strings, you normally know
> what the scope is (i.e. where the strings come from, and when they
> aren't used anymore).
That's true -- if you know that all relevant strings have been interned using the appropriate method, you can

Re: [Python-Dev] Interning string subtype instances

2007-02-13 Thread Hrvoje Nikšić
On Wed, 2007-02-07 at 15:39 +0100, Hrvoje Nikšić wrote:
> The patch could look like this. If there is interest in this, I can
> produce a complete patch.
The complete patch is now available on SourceForge: http://sourceforge.net/tracker/index.php?func=detail&aid=1658799&group_id=5470&atid=305470

Re: [Python-Dev] Interning string subtype instances

2007-02-13 Thread Martin v. Löwis
Greg Ewing wrote:
> That doesn't quite give you everything that real interning
> does, though. The string comparison method knows when both
> strings are interned, so it can compare them quickly
> whether they are equal or not.
No, it doesn't - see stringobject.c:string_richcompare. If they ar

Re: [Python-Dev] Interning string subtype instances

2007-02-13 Thread Martin v. Löwis
Hrvoje Nikšić wrote:
> Another reason is that Python's interning mechanism is much better than
> such a simple implementation: it stores the interned state directly in
> the PyString_Object structure, so you can find out that a string is
> already interned without looking it up in the dictionary.

Re: [Python-Dev] Interning string subtype instances

2007-02-13 Thread Hrvoje Nikšić
On Mon, 2007-02-12 at 12:29 -0800, Josiah Carlson wrote:
> Hrvoje Nikšić <[EMAIL PROTECTED]> wrote:
> >
> > I propose modifying PyString_InternInPlace to better cope with string
> > subtype instances.
>
> Any particular reason why the following won't work for you?
[... snipped a simple intern imp

Re: [Python-Dev] Interning string subtype instances

2007-02-12 Thread Greg Ewing
Josiah Carlson wrote:
> def intern(st):
> ...
>
> If I remember the implementation of intern correctly, that's more or
> less what happens under the covers.
That doesn't quite give you everything that real interning does, though. The string comparison method knows when both strings are intern

Re: [Python-Dev] Interning string subtype instances

2007-02-12 Thread Josiah Carlson
Hrvoje Nikšić <[EMAIL PROTECTED]> wrote:
>
> I propose modifying PyString_InternInPlace to better cope with string
> subtype instances.
Any particular reason why the following won't work for you?
_interned = {}
def intern(st):
    #add optional type checking here
    if st not in _interned:
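The snippet above is cut off by the archive. A minimal, self-contained sketch of the same dictionary-based idea might look like the following; how the original function continued is an assumption, and the names `intern_via_dict` and `_interned_table` are hypothetical, chosen to avoid shadowing any built-in.

```python
# Hypothetical module-level table mapping each string value to the single
# canonical object that represents it.
_interned_table = {}

def intern_via_dict(st):
    """Return the canonical object equal to st, registering st if it is new.

    Assumption: this is one plausible completion of the truncated snippet,
    not the original author's exact code.
    """
    # setdefault stores st only when no equal key exists yet, and in either
    # case returns the object already held in the table.
    return _interned_table.setdefault(st, st)

# Usage: two equal strings built separately map to the same object.
first = intern_via_dict("".join(["inter", "ning"]))
second = intern_via_dict("".join(["inter", "ning"]))
print(first is second)
```

As later replies in the thread point out, a table like this keeps every string alive for as long as the table exists and records nothing on the string object itself, so it is not a full substitute for the built-in mechanism.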

Re: [Python-Dev] Interning string subtype instances

2007-02-12 Thread Martin v. Löwis
Hrvoje Nikšić wrote:
> The patch could look like this. If there is interest in this, I can
> produce a complete patch.
I can't see a problem with that (although I do wonder why people create string subtypes in the first place). Regards, Martin

Re: [Python-Dev] Interning string subtype instances

2007-02-12 Thread Martin v. Löwis
Mike Klaas wrote:
>> cause problems for other users of the interned string. I agree with the
>> reasoning, but propose a different solution: when interning an instance
>> of a string subtype, PyString_InternInPlace could simply intern a copy.
>
> Interning currently requires an external referen

Re: [Python-Dev] Interning string subtype instances

2007-02-12 Thread Mike Klaas
On 2/12/07, Hrvoje Nikšić <[EMAIL PROTECTED]> wrote:
> cause problems for other users of the interned string. I agree with the
> reasoning, but propose a different solution: when interning an instance
> of a string subtype, PyString_InternInPlace could simply intern a copy.
Interning currently r

[Python-Dev] Interning string subtype instances

2007-02-12 Thread Hrvoje Nikšić
I propose modifying PyString_InternInPlace to better cope with string subtype instances. The current implementation of PyString_InternInPlace does nothing and simply returns when passed an instance of a subtype of PyString_Type. This is a problem for applications that need to support string subtypes, but also
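The alternative raised in this thread, interning a plain copy when handed a subtype instance, can be approximated at the Python level. This is a sketch under stated assumptions, not the proposed C patch: it targets Python 3, where the built-in is `sys.intern`, and the names `Tagged` and `intern_copy` are hypothetical.

```python
import sys

class Tagged(str):
    """Hypothetical str subtype, used only for this illustration."""

def intern_copy(s):
    """Intern s, first converting a subtype instance to an exact str copy.

    Assumption: this mirrors the 'intern a copy' idea from the thread at the
    Python level; the actual proposal concerned PyString_InternInPlace in C.
    """
    if type(s) is not str:
        # str.__str__ yields an exact str with the same contents and
        # bypasses any __str__ override defined on the subtype.
        s = str.__str__(s)
    return sys.intern(s)

# Usage: the subtype instance and the plain string share one interned object.
a = intern_copy(Tagged("config_key"))
b = intern_copy("config_key")
print(a is b, type(a) is str)
```

The caller gets back an exact string rather than the subtype instance, so any extra attributes or behaviour attached to the subtype are not carried over to the interned object.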