On Wed, Mar 06, 2002 at 09:45:18AM -0800, Rich Morin wrote:
> Going a bit further afield, I also started thinking about the general
> nature of printf.  I've been using this basic syntax since 1970 (in
> the form of Fortran's FORMAT statements :-), so I'm pretty comfortable
> with it.  OTOH, I don't like the fact that the format specifications
> can become widely separated from the variables they reference.

I agree that this is a horribly "illiterate" aspect of printf....

> With all of Larry's talk about making "x" mode the standard in REs and
> having more "pair-based" syntax here and there, I started thinking
> about a replacement for printf, as:
>
>   printx(
>     'The value of $foo is %f7.3; ', $foo,
>     'the value of $bar is %f7.3.%n', $bar
>   );

.... But why propose such an off-the-wall solution?  Wouldn't it make
more sense to make it more like interpolation?  Eg,

  printx 'The value of $foo is %f7.3{foo}', { foo => $foo };
                                     ^^^ key of the following hash

Tangent:

One crucial but often overlooked aspect of designing "format string"
schemes is that they can, with some care, facilitate
internationalization.  C format strings are actually pretty good for
this, due to the following characteristics:

- The translator doesn't have to touch code, only format strings.
  This is obviously desirable.

- C format strings are fairly "safe", in that a format string isn't
  likely to break a program.  This is far from strictly true, however,
  due to things like %n and the ability to access more arguments than
  are passed, which is undefined in C.  This might or might not be
  exploitable if your translator is malicious!

- C format strings give the translator reasonable flexibility: Eg,
  they can reorder, repeat, or omit placeholders with the %m$ syntax.

- Some localization can be "automagic", eg number formatting
  punctuation.

However, if you don't keep internationalization in mind, it is easy to
lose these characteristics.  For example, your proposal seems to
encourage

  printx 'Your little dog %s ', $dog,
         'attacks the evil %s.', $monster;

In this example, the format string is split into fragments whose order
is fixed by the code, so the translator is unable to change the
sentence structure (unless he can change the code).

So I would humbly advise anyone thinking about this to

- think safe.  Don't add features like the ability to execute
  arbitrary Perl expressions!  (Or at least, offer a version without
  unsafe features, and recommend that programmers use it in most
  cases.)

- think flexibility for the translator.  This can be hard if you don't
  have linguistic or localization experience, but you can use your
  imagination.  Desirable features might include locale-sensitive
  formatting of dates, currencies, etc; handling of plurals (gettext
  has a neat solution, though it requires multiple format strings); an
  "internationalized string" type wrapping up format string plus
  placeholders[1].  The Java and C# string formatting libraries are
  worth looking at (don't take this as high praise though).

- make sure that a translator doesn't have to change anything except
  the format string.

Andrew

[1] This is my pet idea.  Tell me if you see it somewhere!
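
PS. To make the named-placeholder idea concrete, here is a minimal sketch in plain Perl 5. Everything in it is an assumption for illustration: the names format_named and _expand are hypothetical, the %<type><width.precision>{key} syntax is just one reading of the example above, and no printx function actually exists. Only a small whitelist of conversions (s, d, f) is accepted, so a translated string cannot smuggle in %n or arbitrary code.

```perl
use strict;
use warnings;

# Hypothetical helper: expand placeholders such as %s{dog} or %f7.3{foo},
# looking each key up in a hash reference of values.
sub format_named {
    my ($format, $values) = @_;
    # Only the conversions s, d and f are recognised; anything else
    # (including %n) is left in the output as literal text.
    $format =~ s/%([sdf])(\d*(?:\.\d+)?)\{(\w+)\}/_expand($1, $2, $3, $values)/ge;
    return $format;
}

# Hypothetical helper: turn one placeholder into an ordinary sprintf call,
# e.g. type "f" and spec "7.3" become the sprintf format "%7.3f".
sub _expand {
    my ($type, $spec, $key, $values) = @_;
    return "<missing $key>" unless exists $values->{$key};
    return sprintf("%${spec}${type}", $values->{$key});
}

my %v = (dog => 'Toto', monster => 'witch');

# The original sentence...
print format_named('Your little dog %s{dog} attacks the evil %s{monster}.', \%v), "\n";

# ...and a "translation" that reorders the placeholders, with no code change.
print format_named('The evil %s{monster} is attacked by your little dog %s{dog}.', \%v), "\n";
```

Because the placeholders are looked up by name, the whole sentence stays in one format string and the translator is free to reorder, repeat, or omit them. A missing key produces a visible marker rather than undefined behaviour, which seems a reasonable default for a sketch like this.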