[For folks who aren't aware, we just had an intense three-day hackathon in Oslo during which about a dozen of us tried to hash out new TAP extensions and write some sort of well-formed spec. We got a lot done, but didn't have quite the clean resolution I was hoping for. Afterwards I had a number of revelations about the TAP development process which I'd like to share.]
If we're all in agreement about how a thing should be used, I get worried. We saw this with the nested TAP syntax in Oslo. At 7pm the first day we thought we had it all figured out in 15 minutes. Next morning we found the flaw. It all fell apart and we still haven't put the pieces together.

We're a very homogeneous group. We're Unix people. We're Perl programmers. We've all grown up with one particular way of testing and just two primary libraries to do it (Test::More and Test::Harness). If we all agree that something is only going to be used one way, you can bet that we haven't thought it through well enough.

This is the Test ANYTHING Protocol, but it's being designed by a group with very narrow experience. We don't have voices from other languages, communities, and testing systems to provide a healthy mix of use cases. So we must step very carefully, not so much in what we allow as in what we disallow.

Something I observed about the discussions in the TAP room was that when we had clear, existing use cases to follow we did quite well. We agreed on what needed to be done and it was just a matter of making it work within the confines of the protocol. We were just supporting existing best practice. Paving the cow paths [1].

When we strayed off the cow paths, when we pushed into unknown territory, when there was not yet a clear best practice or burgeoning user need, when we didn't have any clear information about how a feature was going to be used, the process fell apart. We argued, not about technical details but about details of use. "The feature will get used like this!" "No no, it'll get used like this!" "People want to read it like this." "No, I never do it that way, I do it like this." The arguments became quite emotional and frustrating.

Two points in particular stand out. The first was the test "contexts" [2], which got quite far along but broke down in the details because we had never given the idea much thought before.
After a lot of arguing that pulled in several other groups at the hackathon, we eventually decided that we don't have enough information and haven't given it enough thought, so we'll shelve it until we do.

The other was user-defined YAML keys. What should we allow? What should we disallow? What should we reserve? Initially it started out with reserving lower case and leaving upper case to users. Then edge cases came up. People worried about what would happen if users did crazy things with the keys. What happens if they name a key "#&(!#*("? Or they use an ambiguously cased Hungarian i? Or if they use a font that doesn't distinguish upper case from lower case well? Or if their TAP producer has a bug and spits out a lower case key as upper case (or vice versa)? They should be able to spot that!

Quite rapidly everyone shifted over to thinking that we should only allow "X-foo" for user keys because it's unambiguous. Then we don't have to worry about characters that don't have an upper/lower case concept. And we can eyeball a user vs reserved key slip. And it looks like mail headers and we're all used to reading mail headers. And we can always allow a wider use later. Etc...

Seems like a fine solution. Everyone agreed but me. It seemed like I was just being a sore loser, and maybe I am, but I don't often dig in my heels unless I think it's really, really important to get it right. The last time that happened was the business about Test::Harness 3 merging STDOUT and STDERR, which took months to resolve.

I don't really care so much about doing "Foo" or "X-Foo". It's all an aesthetic choice. What worries me is that we're encoding an aesthetic choice at all. That we're proscribing behaviors because we think they might be ugly or hard to read or harmful or stupid or redundant or difficult to specify. We have too narrow a vision to make that decision. All we can truthfully say about the future is that our predictions will be wrong.
If we proscribe what we think might be bad then, because we're going to be wrong, we also proscribe what might be good. If we proscribe what is bad now then, because things change, we also proscribe what might be good later.

If we write parsers now which are proscriptive, which complain if they see something we don't like such as a "Foo" key instead of "X-foo", we paint the protocol into a corner. Any relaxing of the protocol later becomes a parser error, which is a roadblock to change. We saw it happen several times in the past and in Oslo. If you disallow non-TAP lines you make it impossible to extend the language without upgrading all the parsers in lock-step with the producers. If we make an unrecognized lower-cased "reserved" diagnostic key a parser error then we can't add more keys without another lock-step parser/producer upgrade. If parsers puke when they see a future version of TAP we make it difficult to add new, otherwise backwards compatible features.

This is why we should be descriptive instead of proscriptive. Descriptive means to specify only what you need and leave the rest open. It provides a playground for users to fool around in and try things out that we'd never have thought of. It provides the cracks into which really clever people can wedge radical new ideas to advance in wild new directions. It's the flexibility that allows a language to survive for 20 years and all the unpredictable changes that come. Perl survives and grows that way, TAP should too.

Yes, it means we allow people to do silly things, but what's silly is often subjective. Yes, it makes parsing a bit more difficult and the spec a bit more complicated, but we've always weighted TAP towards simplicity of human reading and machine writing. Yes, it means we can't lock down the meaning of everything, but that's ok. A little chaos is healthy. A little chaos will allow us to extend the protocol more without breaking everything.
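
To make the difference concrete, here is a minimal sketch in Python of what a descriptive parser might look like. It is an illustration only, not Test::Harness or any existing TAP library: the names parse_tap, split_diagnostic_keys and KNOWN_KEYS, the two "reserved" key names, and the default supported version of 13 are all assumptions made up for this example. The point is just that lines, keys and versions the parser doesn't recognize are kept and passed along instead of being turned into errors.

```python
import re

# All names here are hypothetical; this is not any real TAP library.
VERSION = re.compile(r"^TAP version (\d+)$")
PLAN = re.compile(r"^1\.\.(\d+)")
TEST = re.compile(r"^(not )?ok\b\s*(\d+)?\s*-?\s*(.*)$")

# Made-up set of reserved diagnostic keys, for illustration only.
KNOWN_KEYS = {"message", "severity"}


def parse_tap(text, supported_version=13):
    """Classify the lines we recognize and keep the ones we don't."""
    result = {"version": None, "plan": None, "tests": [],
              "unknown": [], "warnings": []}
    for line in text.splitlines():
        if m := VERSION.match(line):
            result["version"] = int(m.group(1))
            if result["version"] > supported_version:
                # A newer version is noted, not treated as fatal.
                result["warnings"].append(
                    f"TAP version {m.group(1)} is newer than {supported_version}")
        elif m := PLAN.match(line):
            result["plan"] = int(m.group(1))
        elif m := TEST.match(line):
            result["tests"].append({
                "ok": m.group(1) is None,
                "number": int(m.group(2)) if m.group(2) else None,
                "description": m.group(3),
            })
        elif line.startswith("#"):
            continue  # comment line, nothing to record in this sketch
        else:
            # Not TAP as we know it today: keep it rather than die.
            result["unknown"].append(line)
    return result


def split_diagnostic_keys(diag, known=KNOWN_KEYS):
    """Separate keys we understand from those we don't, rejecting neither."""
    recognized = {k: v for k, v in diag.items() if k in known}
    passthrough = {k: v for k, v in diag.items() if k not in known}
    return recognized, passthrough
```

Fed a stream that declares a newer version and contains a line it has never seen, parse_tap still reports the plan and both test results, and the stray line ends up in result["unknown"] where a harness or a curious user can look at it. Likewise split_diagnostic_keys hands back "Foo" and "X-bar" untouched, so the choice between them stays with the producer.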
TAP is 20 years old and growing because it says so little about how you do your testing. It's worked because it's always been about what people need to do, not about preventing them from doing what we think they shouldn't.

That's why I dug in my heels on the user keys. Why I don't just give in when something doesn't feel right but I'm "out voted". I couldn't articulate it properly then but I hope I've made it clear now. We were violating a very important TAP design principle. We had strayed off the cow paths. We were proscribing use based upon our own narrow experiences, what we like and what makes sense to us today.

Worse, we were walling off users from carving their own cow paths for us to follow. If you disallow anything but "X-foo" you can never learn what people might have done otherwise. We were filling in the cracks that future users may use to extend the protocol past anything we'd ever thought of.

[1] Or sleigh tracks in the case of Oslo
[2] link to test contexts proposal

-- 
s7ank: i want to be one of those guys that types "s/j&jd//.^$ueu*///djsls/sm." and it's a perl script that turns dog crap into gold.