On 13 Jun, 2012, at 2:22, Ken Thomases wrote:

> I can't claim to follow all of the details of how PyObjC works, but here are 
> my impressions:
> 
> On Jun 12, 2012, at 10:13 AM, Ronald Oussoren wrote:
> 
>> trim_chars = u"*"
>> #trim_chars = Cocoa.CFStringCreateWithCString(None, trim_chars, Cocoa.kCFStringEncodingASCII)
>> 
>> Cocoa.CFStringTrim(s, trim_chars)
> 
> Here, there are two paths, depending on whether that line above is commented 
> out.  Either a unicode object is implicitly converted to a CFString/NSString, 
> or it's an objc.pyobjc_unicode object wrapping what's already a 
> CFString/NSString.  What are the code paths used by the two cases?  A lot of 
> it is implicit and I'm not able to follow it.

Case 1:  trim_chars = u"*"   -> CFStringTrim's second argument is a Python 
unicode object, which is proxied as an OC_PythonUnicode object, and that proxy 
is passed to the C function.
Case 2: trim_chars = Cocoa.CFString...    -> CFStringTrim's second argument is 
an instance of pyobjc_unicode, and the embedded NSString* value is passed to 
the C function.

pyobjc_unicode is used to proxy an Objective-C string to Python. I'd prefer 
not to use a custom class for that, but dropping it would increase friction: 
you'd have to explicitly convert NSString instances to a Python unicode object 
when using numerous Python APIs implemented in C.  pyobjc_unicode shouldn't 
really be an issue for this problem though: the only instance of that class in 
the script is "s", and that's released as soon as "s.nsstring()" is called.

> 
>> For me the problem only occurs when I run this code with a 64-bit build of 
>> python ("arch -x86_64 python2.7 ...") and works fine in a 32-bit build 
>> ("arch -i386 python2.7 ..."). I have only tested on OSX 10.7, I'm currently 
>> traveling and cannot easily test on other releases.  To make life even more 
>> interesting, the problem only occurs when "PyObjC_UNICODE_FAST_PATH" is 
>> active.
> 
> I note that PyObjC_UNICODE_FAST_PATH affects PyObjCUnicode_New() as well as 
> the implementation of OC_PythonUnicode.  I recommend that, instead of 
> disabling it globally by tweaking the header, you disable it in each 
> translation unit independently to isolate which is the problem.  I'm sort of 
> suspecting it's PyObjCUnicode_New() rather than OC_PythonUnicode, especially 
> given that you tried disabling the implementation of -getCharacters:range:.

I've in effect disabled PyObjC_UNICODE_FAST_PATH for OC_PythonUnicode at least 
for now, and that fixes the issue as well. PyObjCUnicode_New should be in the 
clear.

My new implementation for OC_PythonUnicode always uses the __realObject__ 
trick, and only optimizes the implementation of __realObject__: when 
sizeof(unichar) == sizeof(Py_UNICODE) I create the NSString with the "NoCopy" 
variant of the NSString initializer.  This avoids duplicating the string 
contents, although there's still the unnecessary object that eats more memory.
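To make the size check above concrete, here is a minimal sketch in plain 
Python (it is not PyObjC code and does not touch NSString). The names 
UNICHAR_SIZE, can_share_buffer and to_utf16_units are hypothetical; the only 
assumption carried over is that unichar is a 16-bit UTF-16 code unit. It shows 
when a buffer could be reused as-is and what a copying fallback would have to 
produce otherwise:

```python
import struct

# Assumption: Foundation's unichar is a 16-bit UTF-16 code unit.
UNICHAR_SIZE = 2


def can_share_buffer(py_char_size):
    """Return True when Python's per-character storage matches unichar,
    i.e. when a "NoCopy" style initializer could reuse the buffer."""
    return py_char_size == UNICHAR_SIZE


def to_utf16_units(text):
    """Copying fallback: re-encode text as a list of UTF-16 code units."""
    raw = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


if __name__ == "__main__":
    print(can_share_buffer(2))    # build with a 16-bit Py_UNICODE
    print(can_share_buffer(4))    # build with a 32-bit Py_UNICODE
    print(to_utf16_units(u"*"))   # the trim_chars value from the script
```

With a 4-byte Py_UNICODE the buffer cannot be shared, and characters outside 
the BMP have to be expanded into surrogate pairs, which is why the copy is 
unavoidable in that configuration.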

I'll probably revisit this in the future; I'd prefer to get rid of the 
additional object where possible.  For now my current implementation works, and 
there are more important things to work on right now (not least getting an 
actual release out).

> 
> I also note that PyObjCUnicode_New() uses the deprecated method 
> -getCharacters: (no range) when the Unicode fast path is enabled.  Other than 
> that, I see no obvious problems with either code.

Good catch. This should be harmless, but -getCharacters:range: is safer. I'll 
update the code.

> 
> Given the 32-/64-bit difference, I was for a while suspecting that Py_UNICODE 
> might be 4-byte UCS4 under 64-bit, but I see that PyObjC_UNICODE_FAST_PATH 
> would not be enabled in that case. *shrug*

Luckily the size of Py_UNICODE is a configure-time constant, and 16 bits in the 
default builds of Python. In Python 3.3 and later Py_UNICODE is always wchar_t 
(UCS4 on OS X), and Python's unicode object then uses UCS1, UCS2 or UCS4 as 
the backing store as appropriate.  After some optimization, Python 3.3's 
unicode strings are now as memory efficient and fast as Python 2.7's byte 
strings for most code (according to discussions on python-dev; I haven't done 
any benchmarking myself).
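
The UCS1/UCS2/UCS4 backing store can be observed from Python itself. The 
following is a rough sketch, assuming CPython 3.3 or later (sys.getsizeof 
reports implementation-specific sizes, so other interpreters may differ); the 
helper name bytes_per_char is made up for this example. It compares the sizes 
of a short and a long string built from the same character, so the fixed 
object header cancels out:

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate per-character storage by comparing two string sizes.

    Both strings contain only `ch`, so they share the same internal
    representation and the fixed header drops out of the difference.
    """
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n


if __name__ == "__main__":
    samples = [("ASCII", "a"), ("BMP", "\u20ac"), ("non-BMP", "\U0001F600")]
    for label, ch in samples:
        print(label, bytes_per_char(ch))
```

On such an interpreter the three samples should report 1, 2 and 4 bytes per 
character respectively, matching the three storage widths mentioned above.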

Anyway, thanks for your help.

I've also talked to an Apple engineer at WWDC, and the custom NSString subclass 
should just work. At one point I was worried that CFString's APIs weren't 
guaranteed to work with custom NSString subclasses.

Ronald



_______________________________________________
Pythonmac-SIG maillist  -  Pythonmac-SIG@python.org
http://mail.python.org/mailman/listinfo/pythonmac-sig
unsubscribe: http://mail.python.org/mailman/options/Pythonmac-SIG