Glenn Linderman wrote:
> It does seem, however, that there could be a large class of users of A::Z that are read-only, and that solutions within A::Z would be better overall than building caching structures on the outside.
>
> Steffen, you mention "the array implementation" and replacing it with a "hash-based" implementation. Is the root problem, then, some O(n^2) algorithm for lookups by name searching through "the array implementation" to find the name? Or what?
>
> I don't understand where "this bites if somebody does a $member->fileName("NewName").", nor why. A little more verbose explanation of the problem might result in more ideas being generated.
>
> And David mentions caching internally... but only for readonly use... my point, perhaps already thought of and discarded for some valid reason, is that perhaps it would be possible for A::Z to do the caching internally, but support updates to the cache for modification operations? Caches don't necessarily need to be read-only, they can be updatable... if there is a central place where the "array" data structure is maintained, could not that code be modified to also update the (internal) cache, to allow higher performance interfaces?

The problem is that A::Z implements every match on Zip member file names as a linear search through the member array. That's O(n) per lookup, so looking up many names adds up to quadratic work. O(n) might seem decent, but if you only want to know whether a specific file is in the Zip, you want an O(1) hash lookup. On top of that, A::Z calls ->fileName() on every file object each time it looks for an object by its file name. That's immensely costly.
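To make the cost difference concrete, here is a minimal sketch. It does not use A::Z at all: ToyMember is a hypothetical stand-in class with a fileName() accessor, and find_by_scan() is my own illustration of an array-style lookup, not the actual A::Z code. It only counts how many accessor calls each style of lookup needs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a zip member; NOT the real A::Z member class.
package ToyMember;
our $calls = 0;                          # counts fileName() invocations
sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}
sub fileName {
    my $self = shift;
    $calls++;
    $self->{name} = shift if @_;         # setter form: fileName($newName)
    return $self->{name};
}

package main;

my @members = map { ToyMember->new("file$_.txt") } 1 .. 1000;

# Array-style lookup: walk the list, calling fileName() on each member.
sub find_by_scan {
    my ($name) = @_;
    for my $m (@members) {
        return $m if $m->fileName() eq $name;
    }
    return;
}

# Hash-style lookup: build the index once, then each query is one hash access.
my %by_name;
$by_name{ $_->fileName() } = $_ for @members;

$ToyMember::calls = 0;
my $hit_scan   = find_by_scan('file1000.txt');
my $scan_calls = $ToyMember::calls;      # accessor calls needed by the scan

$ToyMember::calls = 0;
my $hit_hash   = $by_name{'file1000.txt'};
my $hash_calls = $ToyMember::calls;      # accessor calls needed by the hash

print "scan: $scan_calls fileName() calls, hash: $hash_calls\n";
```

Both lookups return the same object here. The scan pays one accessor call per member visited, while the hash lookup pays none once the index exists.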

I first replaced the array with a file_name => $file_obj hash. That was a problem because the A::Z tests and possibly some user code rely on the order of the zip members, and a plain hash doesn't preserve order. Then I hacked it into a Tie::IxHash (ordered hash) implementation. That worked until the following kicked in:

$file_obj has a file name! Associating it with a static copy of that name outside of the object itself breaks its encapsulation, because the user can do "$file_obj->fileName($newName)" to set it. That doesn't change the association in the Archive object (which is now a hash/IxHash), so the key goes stale. Working around that case requires that a $file_obj has a reference to its "parent" Archive. But I'm not even sure that a $file_obj can't belong to several archives, etc. That's really, really tedious and error-prone to implement. (I'm not going to do it.)
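Here is the stale-key case as a small sketch. Again, this is not A::Z code: ToyMember is a hypothetical stand-in with a fileName() getter/setter, and %by_name plays the role of the archive-side hash.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a zip member; NOT the real A::Z member class.
package ToyMember;
sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}
sub fileName {
    my $self = shift;
    $self->{name} = shift if @_;         # setter form: fileName($newName)
    return $self->{name};
}

package main;

my $member = ToyMember->new('old.txt');

# Archive-side index, keyed by a *copy* of the member's name.
my %by_name;
$by_name{ $member->fileName() } = $member;

# The user renames the member directly; the index is never told.
$member->fileName('new.txt');

my $found_by_new_name = exists $by_name{'new.txt'} ? 1 : 0;   # lookup by current name
my $found_by_old_name = exists $by_name{'old.txt'} ? 1 : 0;   # lookup by stale key
my $name_behind_old   = $by_name{'old.txt'}->fileName();      # what the stale key points at

# A scan that asks each object for its name still sees the rename.
my ($scan_hit) = grep { $_->fileName() eq 'new.txt' } values %by_name;

print "hash finds new name: $found_by_new_name, old key still present: $found_by_old_name\n";
print "object under old key is now called: $name_behind_old\n";
```

The hash misses the member under its current name and still answers for the old one. The slow scan stays correct precisely because it asks each object for its name every time.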

Does that make more sense?

(David: Laugh at me now for not realizing the encapsulation issue earlier!)

Steffen
