C E Macfarlane wrote:
Thinking about this a bit more, I wouldn't wish to claim that a spurious hit is more likely with no upper limit, but I would still regard it as better programming practice to have one. With normal written English, the potential for spurious hits would be low, and any hit would be delimited quickly by the next space. But if you were trawling raw HTML or similar code, which might contain long strings of pseudo-random characters (not just PIDs, but also GUIDs, session keys, and the like), the potential for spurious hits would be very much increased. More would be found, and in the interests of program efficiency you'd want them to be delimited sooner rather than later.
This is reasonable. The regexp without an upper limit, sourced from the BBC's code, is used to confirm that a given string consists only of characters from the set permitted in a PID. In most cases the string passed in has been explicitly extracted from the request URL, as the application in question is a server-side, web-based one. For that purpose I think the lack of an upper limit is completely acceptable, but if you're writing code to extract a valid PID from text of unknown length or complexity, the regexp is probably not very efficient.
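To make the distinction concrete, here is a minimal Python sketch of the two uses. The actual BBC regexp is not quoted in this thread, so the character set (digits and lowercase consonants), the minimum length of 8, and the upper limit of 15 are all assumptions chosen for illustration, as are the function names.

```python
import re

# ASSUMPTION: a PID is made of digits and lowercase consonants.
# This character set is illustrative, not taken from the BBC's code.
PID_CHARS = r"[0-9b-df-hj-np-tv-z]"

# Validation: the input is already an isolated string (e.g. taken from a
# request URL), so an open-ended quantifier is harmless when the whole
# string must match.
VALIDATE_PID = re.compile(rf"{PID_CHARS}{{8,}}")

# Extraction: a hypothetical upper limit of 15, plus lookarounds, so that
# long pseudo-random tokens (GUIDs, session keys) are rejected outright.
EXTRACT_PID = re.compile(rf"(?<![0-9a-z]){PID_CHARS}{{8,15}}(?![0-9a-z])")


def is_valid_pid(candidate: str) -> bool:
    """Return True if the whole string consists of PID characters."""
    return VALIDATE_PID.fullmatch(candidate) is not None


def extract_pids(text: str) -> list:
    """Return every bounded-length PID-like token found in free text."""
    return EXTRACT_PID.findall(text)
```

The validating pattern relies on `fullmatch` to anchor both ends, which is why it can get away without an upper limit. The extracting pattern has no such anchoring, so the bound and the lookarounds do the delimiting instead.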
-- James Scholes http://twitter.com/JamesScholes

