# from David Nicol
# on Wednesday 23 May 2007 02:36 pm:

>but perl syntax is expressive enough, that I have trouble imagining
>when a data structure would be something to put in a module, rather
>than create as needed

I used to have the "don't need no stinkin' objects" attitude (which may
or may not be what you're saying here), but large codebases seem to
have beaten it out of me.

I do still *deeply* appreciate that Perl has a spectrum of formality
available.  This means I can bang out the simplest thing that could
possibly work and then step back to look at it and say "I'm an idiot"
when I see something simpler.  All this without creating 20 classes or
whatever -- but once I see the implementation start to leak outside of
a single subroutine, it really starts to look like an object would be
more robust.

>What do you mean by "data-structure module?"

Trees, ordered hashes, and sets, just to name a few.  The vocabulary
quickly gets pretty vague, and I've asked "what do I call this
structure/pattern" kinds of questions here more than once before.
(That's precisely why I wonder if we need something besides a standard
search scheme specifically for data structures -- because they tend to
travel under many aliases.)

Whenever the form of the data starts to have rules, it's quite likely
that somebody is going to come along and break the rules unless you
encapsulate it in some way.  Note that these rules aren't necessarily
about what data/types can and cannot be stored or whether swear words
are allowed (though those things can sometimes be a valid use of
encapsulation.)  In this particular case, it is just a matter of
maintaining the data integrity and managing pseudo-asynchronous access
to it.

Testing and code reuse are also good reasons to modularize a data
structure.  Anything non-trivial should be tested against regression,
but focused testing just isn't possible if the structure is implemented
as ad-hoc scattered bits of code juggling the contents of a reference.
Even typo'd hash keys can be a hair-pulling problem when code grows
beyond a few hundred lines.  Having well-tested objects handling all of
the juggly bits means you spend your time chasing bugs in your code and
not your data structures.

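To make the typo'd-hash-key point concrete, here is a minimal sketch.  The `Item` package, its `title` and `position` fields, and the `get`/`set` methods are all hypothetical names invented for illustration, not code from any real module.  It contrasts a plain hash, which silently accepts a misspelled key, with a small object that refuses it.

```perl
use strict;
use warnings;

# Hypothetical record, used only for illustration.
my %item = (title => 'first', position => 3);

# A typo'd key on a plain hash is silent: this just creates a new entry
# and leaves 'position' untouched.
$item{postion} = 4;

# Assumed minimal wrapper: only known fields are accepted.
package Item;
use Carp;

my %FIELDS = map { $_ => 1 } qw(title position);

sub new {
    my ($class, %args) = @_;
    for my $key (keys %args) {
        croak "unknown field '$key'" unless $FIELDS{$key};
    }
    return bless {%args}, $class;
}

sub set {
    my ($self, $key, $value) = @_;
    croak "unknown field '$key'" unless $FIELDS{$key};
    $self->{$key} = $value;
    return $self;
}

sub get {
    my ($self, $key) = @_;
    croak "unknown field '$key'" unless $FIELDS{$key};
    return $self->{$key};
}

package main;

my $obj = Item->new(title => 'first', position => 3);

# The same typo now fails loudly at the point where it happens.
my $set_ok = eval { $obj->set(postion => 4); 1 };
print "caught: $@" unless $set_ok;
```

Because every access goes through `get` and `set`, a focused test file can exercise those two methods once instead of auditing each place in the program that touches the hash.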
In this particular case, the array-of-arrays has a few important
characteristics:

  1.  Each entry is identifiable.  (The ids need to be unique within
      the object and therefore should not have to be user-supplied --
      adding an item should return this id for future reference.)

  2.  There is a "current" item.  (By convention, the end of the list,
      but I might want to extend this to an arbitrary "cursor" concept
      -- at which point encapsulation becomes important because
      removing an item now impacts not only the array state, but also
      possibly requires the cursor to be adjusted.)

  3.  An item may be deleted from any position (not just the end.)

So, it is desirable to make an object and encapsulate the
data-management.  Attempting to access missing data throws an error,
the cursor is always right, etc.  I get to trust that the object will
"just do its job" (delegation.)  It just makes the code more concise,
robust, and workable.

Spreading data-structure code around in an ad-hoc way quickly becomes
unmanageable once there are a few data structures involved in the same
code block.  Taking that sort of practice to an extreme, you end up
with monolithic code in one package using implicit global variables
with no hope of ever saying 'use strict' without a complete rewrite.

The converse of all this is of course when you spend time looking for
said code on CPAN and don't find it.

--Eric
-- 
Turns out the optimal technique is to put it in reverse and gun it.
--Steven Squyres (on challenges in interplanetary robot navigation)
---------------------------------------------------
    http://scratchcomputing.com
---------------------------------------------------
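
For illustration, the three characteristics listed in this message (generated ids, a "current" item at the end of the list, and deletion from any position) could be sketched roughly as follows.  This is only a sketch under stated assumptions: the `ItemList` package and its `add`, `get`, `current_id`, and `remove` methods are hypothetical names, and the arbitrary-cursor extension is left out.

```perl
use strict;
use warnings;

package ItemList;    # hypothetical name, for illustration only
use Carp;

sub new {
    my ($class) = @_;
    # items: array of [id, data] pairs; next_id: source of unique ids
    return bless { items => [], next_id => 1 }, $class;
}

# Append an item and return its generated id (never user-supplied).
sub add {
    my ($self, $data) = @_;
    my $id = $self->{next_id}++;
    push @{ $self->{items} }, [ $id, $data ];
    return $id;
}

# Find the array position of an id; croak if it is not present.
sub _index_of {
    my ($self, $id) = @_;
    my $items = $self->{items};
    for my $i (0 .. $#$items) {
        return $i if $items->[$i][0] == $id;
    }
    croak "no item with id $id";
}

sub get {
    my ($self, $id) = @_;
    return $self->{items}[ $self->_index_of($id) ][1];
}

# By convention, the "current" item is the end of the list.
sub current_id {
    my ($self) = @_;
    my $items = $self->{items};
    croak "list is empty" unless @$items;
    return $items->[-1][0];
}

# Remove an item from any position and return its data.
sub remove {
    my ($self, $id) = @_;
    my ($pair) = splice @{ $self->{items} }, $self->_index_of($id), 1;
    return $pair->[1];
}

package main;

my $list = ItemList->new;
my $a_id = $list->add('alpha');
my $b_id = $list->add('beta');
my $c_id = $list->add('gamma');

$list->remove($b_id);                         # delete from the middle
print $list->get($list->current_id), "\n";    # prints "gamma"
$list->remove($c_id);                         # delete the current item
print $list->get($list->current_id), "\n";    # prints "alpha"

my $get_ok = eval { $list->get($b_id); 1 };   # missing data throws
print "error: $@" unless $get_ok;
```

Since "current" is derived from the array on every call, it cannot go stale after a removal.  A stored cursor would need adjusting inside `remove`, which is exactly the kind of bookkeeping that is easier to keep right in one place.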