Martin Maechler wrote:
>>>>>> "vQ" == Wacek Kusnierczyk <waclaw.marcin.kusnierc...@idi.ntnu.no>
>>>>>>     on Thu, 23 Apr 2009 11:49:54 +0200 writes:
>
> vQ> maech...@stat.math.ethz.ch wrote:
> >>
> vQ> sprintf has a documented limit on strings included in the output using the
> vQ> format '%s'.  It appears that there is a limit on the length of strings included
> vQ> with, e.g., the format '%d' beyond which surprising things happen (output
> vQ> modified for conciseness):
> >>
>
> vQ> ... and this limit is *not* documented.
>
> well, it is basically (+ a few bytes ?)
> the same 8192 limit that *is* documented.
>

martin, ?sprintf says:

"There is a limit of 8192 bytes on elements of 'fmt' and also on strings
included by a '%s' conversion specification."

for me, it's clear that the *elements of fmt* cannot be longer than 8192
bytes, and that each single bit included in the output in place of a %s
cannot be longer than 8192.  nowhere does it say that strings included in
the output in place of a %d, for example, cannot be longer than 8192.  the
fact that %s is particularized makes me infer that there is something
specific to %s that does not apply to %d, for example, otherwise the help
would have been formulated differently.  (though given how r help pages
are written, nothing seems unlikely.)

and in fact, the limit does not seem to apply in an obvious way in cases
such as sprintf('%*d', 10000, 1), where the output is correct.  at the
very least, the documentation leaves the user ignorant as to what will
happen if the limit is exceeded.

> >> my version of 'man 3 sprintf' contains
> >>
> >>>> BUGS
> >>>>    Because sprintf() and vsprintf() assume an arbitrarily
> >>>>    long string, callers must be careful not to overflow the
> >>>>    actual space; this is often impossible to assure.  Note
> >>>>    that the length of the strings produced is
> >>>>    locale-dependent and difficult to predict.  Use
> >>>>    snprintf() and vsnprintf() instead (or asprintf() and vasprintf).
>
> vQ> yes, but this is c documentation, not r documentation.
>
> Of course! ... and I *do* apply it to R's C code [sprintf.c]
> and hence am even concurring with you ..
>
> vQ> while snprintf would help avoid buffer overflow, it may not be a
> vQ> solution to the issue of confused output.
>
> I think it would / will.  We would be able to give warnings and
> errors, by checking the snprintf() return codes.
>

maybe, i can't judge without carefully examining the code for sprintf.c
(which i am rather unwilling to do, having had a look at a sample).

> >> More precisely, I see that some windows-only code relies on
> >> snprintf() being available whereas in at least one non-Windows
> >> section, I read   /* we cannot assume snprintf here */
> >>
> >> Now such platform dependency issues and corresponding configure
> >> settings I do typically leave to other R-corers with a much
> >> wider overview about platforms and their compilers and C libraries.
> >>
>
> vQ> it looks like src/main/sprintf.c is just buggy, and it's plausible that
> vQ> the bug could be repaired in a platform-independent manner.
>
> definitely.
> In the mean time, I've actually found that what I first said on
> the usability of snprintf() in R's code base was only partly correct.
> There are other parts of R code where we use snprintf() for all
> platforms, hence we rely on its presence (and correct
> implementation!) and so we can and I think should use it in
> place of sprintf() in quite a few places inside R's sprintf.c
>

would be interesting to see how this improves sprintf.

> >> BTW,
> >> 1) sprintf("%n %g", 1,1)   also seg.faults
> >>
>
> vQ> as do
>
> vQ> sprintf('%n%g', 1, 1)
> vQ> sprintf('%n%')
>
> vQ> etc., while
>
> vQ> sprintf('%q%g', 1, 1)
> vQ> sprintf('%q%')
>
> vQ> work just fine.  strange, because per ?sprintf 'n' is not recognized as
> vQ> a format specifier, so the output from the first two above should be as
> vQ> from the last two above, respectively.  (and likewise in the %S case,
> vQ> discussed and bug-reported earlier.)
>
> I have now fixed these bugs at least;
>

great, i'm going to torture the fix soon ;)

> the more subtle  "%<too_large_n>d" ones are different, and
> as I said, I'm convinced that a nice & clean fix for those will
> start using snprintf().
>
> >> 2) Did you have a true use case where the 8192 limit was an
> >> undesirable limit?
>
> vQ> how does it matter?
>
> well, we could increase it, if it did matter.
> {{ you *could* have been more polite here, no?
>

i don't see how i could be more polite here, i had absolutely no
intention to be impolite and didn't think i were.  i gave a serious
answer by means of a serious question.  increasing an arbitrary, poorly
documented limit of obscure effect is hardly any solution.  suggesting
that a bug is not a bug because some limit is not likely to be exceeded
in practice is not a particularly good idea.

>    it *was* after all a serious question that I asked! }}
>
> vQ> if you set a limit, be sure to consistently enforce
> vQ> it and warn the user on attempts to exceed it.  or write clearly in the
> vQ> docs that such attempts will cause the output to be silently truncated.
>
> Sure, I'm not at all disagreeing on that, and if you read this into my
> posting, you misunderstand.
>

no, i didn't read that into your posting, i'm just referring to the
state of the 'art' in r.

cheers,
vQ

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel