On Mar 14, 2009, at 3:29, Paul Sanders wrote:

How about perl instead? (I don't think egrep is a fair test, it
doesn't have to 'do anything' with the results, like create a new
string from them). This is a rough perl equivalent of my original
problem:

I guess that's the point I was trying to get across - the overhead of
creating all those strings (and whatever other temporary objects have to be
created behind the scenes) might be significant.

Right on the money.

http://xml.sys-con.com/node/250512 talks about this a bit in the context of XML, and I commented in http://www.metaobject.com/blog/2008_10_01_archive.html in the context of Cocoa and Objective-C.


The way to solve this problem is to (a) use C whenever you're dealing with bulk data that can be expressed in C, (b) create only the final object representation and avoid intermediate object representations (yes, I am looking at you, XML DOM and Cocoa Property List), and (c) optimize the heck out of any intermediate objects you cannot actually avoid.
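To make (a) and (b) a bit more concrete, here is a minimal sketch in plain C. It is not taken from Objective-XML; the names `slice_t`, `scan_tokens`, `match_ctx` and `count_matches` are made up for illustration. It walks a character buffer once and hands each whitespace-delimited token to a callback as a pointer-plus-length view, so no per-token string object is ever created. Only the callback decides whether a final object is worth building.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A token is only a view into the original buffer: no copy, no allocation. */
typedef struct {
    const char *start;
    size_t      length;
} slice_t;

/* Hypothetical callback type: receives each token as a slice. */
typedef void (*token_fn)(slice_t token, void *context);

static int is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* Scan buf once, call fn for every whitespace-delimited token,
   and return the number of tokens found. */
size_t scan_tokens(const char *buf, size_t len, token_fn fn, void *context)
{
    size_t count = 0;
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        while (i < len && is_space(buf[i])) i++;    /* skip separators */
        size_t start = i;
        while (i < len && !is_space(buf[i])) i++;   /* consume one token */
        if (i > start) {
            slice_t tok = { buf + start, i - start };
            if (fn) fn(tok, context);
            count++;
        }
    }
    return count;
}

/* Example consumer: counts tokens equal to a keyword without
   ever turning a token into a string object. */
typedef struct {
    const char *keyword;
    size_t      hits;
} match_ctx;

void count_matches(slice_t tok, void *context)
{
    match_ctx *m = (match_ctx *)context;
    size_t klen = strlen(m->keyword);

    if (tok.length == klen && memcmp(tok.start, m->keyword, klen) == 0)
        m->hits++;
}
```

In Cocoa code the callback is the place where you would create an NSString, and then only for the tokens you actually intend to keep.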

In Objective-XML, (a) is an unrolled character-scanning state machine. Item (b) is accomplished by using various NSString-based equivalence maps to construct a mapping layer that goes straight from the C characters that make up the original XML to semantically relevant message sends. Finally, (c) is achieved by re-using objects as much as possible with an object-/thread-local object cache, turning object alloc/dealloc cycles into a couple of pointer manipulations without any of the locking or other synchronization that is usually necessary in memory ops.

(c) is not currently possible under GC, because it requires the retain count to see whether object references have escaped.
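A rough sketch of the idea behind (c), again in plain C and under my own assumptions rather than as the actual Objective-XML code: a small per-thread ring of objects in which the cache holds one reference to each slot. The names `obj_t`, `obj_cache`, `cache_get` and `CACHE_SLOTS` are hypothetical. If a slot's retain count is still 1 when the ring comes back around, nobody else kept the object and it can be reused in place; otherwise the reference has escaped, so the cache gives up its claim and allocates a replacement. Callers get a borrowed object and must retain it if they want it to outlive the next `CACHE_SLOTS` requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SLOTS 4

typedef struct {
    int  retain_count;      /* 1 means only the cache references it */
    char payload[32];       /* stand-in for the object's real state */
} obj_t;

/* Intended to be one cache per thread, so no locking is needed. */
typedef struct {
    obj_t  *slots[CACHE_SLOTS];
    size_t  next;           /* next slot to hand out */
    size_t  heap_allocs;    /* how often malloc was actually called */
} obj_cache;

void obj_retain(obj_t *o)
{
    o->retain_count++;
}

void obj_release(obj_t *o)
{
    assert(o->retain_count > 0);
    if (--o->retain_count == 0)
        free(o);
}

/* Return a borrowed object; reuse the slot's occupant if it has not escaped. */
obj_t *cache_get(obj_cache *c)
{
    size_t i = c->next;
    obj_t *o = c->slots[i];

    c->next = (i + 1) % CACHE_SLOTS;

    if (o != NULL && o->retain_count == 1)
        return o;           /* not escaped: reuse costs a few pointer ops */

    if (o != NULL)
        obj_release(o);     /* escaped: drop the cache's reference only */

    o = malloc(sizeof *o);
    c->slots[i] = o;
    if (o == NULL)
        return NULL;
    o->retain_count = 1;    /* the cache's own reference */
    c->heap_allocs++;
    return o;
}
```

The `retain_count == 1` test is the whole trick, and it is exactly the piece of information a garbage-collected runtime does not expose.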

As Michael Ash has been kind enough to document, object instantiation is
quite expensive:

http://www.mikeash.com/?page=pyblog/performance-comparisons-of-common-operations.html

Yes, object allocation in Objective-C is rather expensive. If you are using objects to represent C-level constructs such as characters or numbers, you are looking at a roughly 100x overhead, which is difficult to recover from if you're dealing with any significant amount of data (it obviously doesn't matter if you are representing a single number or text string in the UI).

But of course, if you need to break your original text up into NSStrings,
there's not much you can do about it.

Yep. However, with a little bit of creativity or pre-existing libraries that encapsulate that creativity, you can actually do around 10-20x better.

Marcel
