[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982 Martin Sebor changed: What|Removed |Added CC||msebor at gcc dot gnu.org --- Comment #39 from Martin Sebor --- I don't know what happened to the patch but since it was posted a printf optimization and warning pass has been added to GCC that deals with a small subset of the issues raised here (e.g., it has its own format string parser and detects its own set of printf problems). I think merging only printf calls that are adjacent in GIMPLE with no intervening assignments to subsequent printf arguments from variables that may have escaped would obviate the problems raised in comment #18 and comment #29. I don't have a sense how much that would impact the optimization. FWIW, I think a more interesting and more widely applicable optimization opportunity than merging printf calls is in transforming sprintf and especially snprintf calls with format strings containing multiple %s directives (and no others) into sequences of strcpy/memcpy/memccpy calls (pr88813).
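The snprintf idea in comment #39 can be pictured with a small sketch. This is not GCC code and not the transformation proposed in pr88813; the function names are made up for illustration, and the second function is only a hand-written equivalent of what a compiler might emit for a format made solely of %s directives.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Baseline: concatenate two strings through the format-string parser. */
static int concat_snprintf(char *dst, size_t size, const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return snprintf(dst, size, "%s%s", a, b);
}

/* Hand-written equivalent using strlen/memcpy. It follows snprintf's
   truncation rule: at most size - 1 bytes are stored, the result is
   NUL-terminated when size > 0, and the untruncated length is returned. */
static int concat_memcpy(char *dst, size_t size, const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    size_t la = strlen(a), lb = strlen(b);
    if (size > 0) {
        size_t room = size - 1;
        size_t na = la < room ? la : room;
        size_t nb = lb < room - na ? lb : room - na;
        memcpy(dst, a, na);
        memcpy(dst + na, b, nb);
        dst[na + nb] = '\0';
    }
    return (int)(la + lb);
}
```

Both functions should fill the destination identically and return the same count, including when the output is truncated; the second simply never parses a format string.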
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982 Steven Bosscher changed: What|Removed |Added Status|ASSIGNED|WAITING Last reconfirmed|2006-02-05 21:14:21 |2019-3-4 --- Comment #38 from Steven Bosscher --- What happened with Diego's patch? (https://gcc.gnu.org/ml/gcc-patches/2005-06/msg00909.html)
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Comment #37 from ghazi at gcc dot gnu dot org 2006-10-26 00:59 --- A request for this optimization was made by Bruce in Sept 2000. :-) http://gcc.gnu.org/ml/gcc-patches/2000-09/msg00877.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-10 12:42 --- (In reply to comment #25) Subject: Re: GCC should combine adjacent stdio calls On Fri, 10 Jun 2005, ghazi at gcc dot gnu dot org wrote: POSIX defines how streams and file descriptors for the same file can be used together and for an unbuffered stream it appears to me that the results from writing to it with stdio (within parts of the file that already existed; not after its end) should be immediately available in the mapped region. If you're talking about *unbuffered* streams, then it shouldn't matter whether you do two printfs or one. The results of the %d should be written as soon as possible with _IONBF and readable to the %.5s. I think we're into severe pathology here, I haven't seen anything which leads me to believe that the governing standards made specific guarantees with regards to the cases you're bringing up. If you can quote specific passages to back up your claim, then fine. But so far your cases haven't held up IMHO. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-10 12:55 --- (In reply to comment #25) Subject: Re: GCC should combine adjacent stdio calls On Fri, 10 Jun 2005, ghazi at gcc dot gnu dot org wrote: Case (b) involves fmemopen, and I assume you refer to a case where you open memory for writing, printf to the resulting FILE*, and pass a pointer to the memory area back into printf. This can only lead to disaster as you clobber the same memory you are reading from. Since fmemopen is a gnu extension, it can do whatever it wants, but I suspect you're entering unspecified territory here for C programs. We support programs which use functions other than the standard C ones - naturally as the compiler for the GNU system we support extensions in the GNU libraries. The example I gave used %.5s specifically so that it would only look at bytes with known values (if the programmer knows something about the possible values of the integer written) and not at bytes being modified by the printf. My point about gnu extensions, is that for corner cases like this we don't have a reliable standards document to make judgements from. We're essentially relying on the code as a reference implementation. So it can do whatever it wants. With regards to %d followed by %.5s, I don't see any difference regardless of the buffering mode between two printfs and one in how soon the %.5s will see the results of the %d. In buffered mode, you probably won't see it regardless unless you happen to hit BUFSIZ just right. In unbuffered mode, a correctly flushed printf will make the %d available. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at redhat dot com 2005-06-10 13:15 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, Jun 09, 2005 at 07:52:42PM -, joseph at codesourcery dot com wrote: extern char *s; extern int i; printf("%d", i); printf("%.5s", s); you can't merge the printf calls because the first one could have changed what is pointed to by s. How can printing an integer to stdout affect 's'? Unless 's' has been somehow mapped to stdout's buffer? Is that what you have in mind? (a) It could be stdio's buffer (via setvbuf). (b) It could be a glibc memory stream opened with fmemopen (if the user assigned to stdout - which glibc allows - or you do this optimization on fprintf and not just printf). (c) It could point to a memory mapping of the file being written. Good lord. To me this is a pathological case. I'd wager that this happens approximately never. How about a switch disabling stdio merging? Diego. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
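Case (a) can be probed with a short sketch. It is only an illustration under assumptions: it uses a temporary file instead of stdout, the function name is invented, and ISO C leaves the contents of a buffer handed to setvbuf indeterminate, so neither outcome is guaranteed on any given library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the bytes formatted by fprintf became visible through the
   caller-supplied buffer before any flush, 0 if they did not, and -1 if
   the stream could not be set up. Either 0 or 1 is a conforming result. */
static int output_visible_in_user_buffer(void)
{
    static char buf[BUFSIZ];
    assert(sizeof buf >= 5);            /* the comparison below reads 5 bytes */

    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    memset(buf, 0, sizeof buf);
    if (setvbuf(f, buf, _IOFBF, sizeof buf) != 0) {
        fclose(f);
        return -1;
    }

    const char *s = buf;                /* plays the role of the extern s */
    fprintf(f, "%d", 12345);            /* the first call may store into buf */
    int seen = strncmp(s, "12345", 5) == 0;  /* what "%.5s" would then read */

    fclose(f);
    return seen;
}
```

Where the function returns 1, a second call printing s with "%.5s" reads bytes that the first call wrote, which is the dependency a merged call would have to preserve.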
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-10 13:49 --- Subject: Re: GCC should combine adjacent stdio calls On Fri, 10 Jun 2005, ghazi at gcc dot gnu dot org wrote: With regards to %d followed by %.5s, I don't see any difference regardless of the buffering mode between two printfs and one in how soon the %.5s will see the results of the %d. In buffered mode, you probably won't see it regardless unless you happen to hit BUFSIZ just right. In unbuffered mode, a correctly flushed printf will make the %d available. I don't think there is any requirement for each printf conversion to be a separate write() call as long as the whole is written by the time printf returns. Other cases are where the second printf is e.g. printf("%d", __internals_of_FILE(stdout)); where __internals_of_FILE is a macro expansion returning some internal structure changed by the first printf. feof and ferror are the relevant standard functions; they may only return changed values if the file is in error status and the second printf redundant (though it isn't clear to me that a past error implies a future error and so that the second printf is redundant rather than potentially succeeding where the first failed), but it would not surprise me if some system implements something like Solaris and glibc __fpending as a macro. In addition, an implementation can choose to document parts of its stdio structures and a user can write programs based on the implementation documentation. Not that I really see the benefit of printf merging in any case; without statistics showing its effects on real code it seems any size benefit could easily be wiped out by inhibiting the sharing of strings used in more than one printf because instead they get merged with the adjacent different strings. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at redhat dot com 2005-06-10 13:56 --- Subject: Re: GCC should combine adjacent stdio calls On Fri, Jun 10, 2005 at 01:49:54PM -, joseph at codesourcery dot com wrote: Not that I really see the benefit of printf merging in any case; without statistics showing its effects on real code it seems any size benefit could easily be wiped out by inhibiting the sharing of strings used in more than one printf because instead they get merged with the adjacent different strings. This is a good point. Kaveh, do you think you'd have time to do some timings on things like GCC bootstraps or other code bases that use stdio extensively? Diego. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-10 14:22 --- (In reply to comment #30) Subject: Re: GCC should combine adjacent stdio calls On Fri, Jun 10, 2005 at 01:49:54PM -, joseph at codesourcery dot com wrote: Not that I really see the benefit of printf merging in any case; without statistics showing its effects on real code it seems any size benefit could easily be wiped out by inhibiting the sharing of strings used in more than one printf because instead they get merged with the adjacent different strings. This is a good point. Kaveh, do you think you'd have time to do some timings on things like GCC bootstraps or other code bases that use stdio extensively? Diego. I have the cpu time, but it seems premature. Your patch as it stands only optimizes two adjacent printf calls. Not printf with putc or puts and none of the f* variants, right? And GCC uses mostly the f* variants. This is like asking for tree-ssa benchmarks when the framework was in but before any new passes were written. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at redhat dot com 2005-06-10 14:25 --- Subject: Re: GCC should combine adjacent stdio calls On Fri, Jun 10, 2005 at 02:22:05PM -, ghazi at gcc dot gnu dot org wrote: I have the cpu time, but it seems premature. Your patch as it stands only optimizes two adjacent printf calls. Not printf with putc or puts and none of the f* variants, right? And GCC uses mostly the f* variants. This is like asking for tree-ssa benchmarks when the framework was in but before any new passes were written. Hmm, right. OK, let me finish that part first then. Diego. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-10 14:28 --- Subject: Re: GCC should combine adjacent stdio calls On Fri, 10 Jun 2005, ghazi at gcc dot gnu dot org wrote: I have the cpu time, but it seems premature. Your patch as it stands only optimizes two adjacent printf calls. Not printf with putc or puts and none of the f* variants, right? And GCC uses mostly the f* variants. This is like asking for tree-ssa benchmarks when the framework was in but before any new passes were written. Since putc and puts are typically faster than printf (not needing to parse the input) and we optimize printf of constants into them, it's not clear that merging printf with such functions would be an improvement either. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at redhat dot com 2005-06-10 14:35 --- Subject: Re: GCC should combine adjacent stdio calls On Fri, Jun 10, 2005 at 02:28:36PM -, joseph at codesourcery dot com wrote: Since putc and puts are typically faster than printf (not needing to parse the input) and we optimize printf of constants into them, it's not clear that merging printf with such functions would be an improvement either. No, the patch does not merge different builtins. The whole idea is to collapse multiple I/O calls of the same kind into one. Whether this helps or not, I don't know. It certainly is a straightforward transformation. Diego. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-10 15:05 --- (In reply to comment #33) Subject: Re: GCC should combine adjacent stdio calls On Fri, 10 Jun 2005, ghazi at gcc dot gnu dot org wrote: Since putc and puts are typically faster than printf (not needing to parse the input) and we optimize printf of constants into them, it's not clear that merging printf with such functions would be an improvement either. You're probably right that it's not a speed win, but it may be a code size win for -Os. I'd like to benchmark that too if possible. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-10 19:07 --- Subject: Re: GCC should combine adjacent stdio calls On Fri, 10 Jun 2005, ghazi at gcc dot gnu dot org wrote: --- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-10 15:05 --- (In reply to comment #33) Subject: Re: GCC should combine adjacent stdio calls On Fri, 10 Jun 2005, ghazi at gcc dot gnu dot org wrote: Since putc and puts are typically faster than printf (not needing to parse the input) and we optimize printf of constants into them, it's not clear that merging printf with such functions would be an improvement either. You're probably right that it's not a speed win, but it may be a code size win for -Os. I'd like to benchmark that too if possible. If an actual gain is demonstrated (for a reasonably large and diverse body of code, given that reduced code is being balanced with increased data because of less string sharing), and all the cases not involving printing data modified by printf are resolved, then it might make sense to do the optimizations with -fno-builtin-printf available to disable them. It's still necessary to parse the format strings to make sure they are understood with no %n, $ operand numbers or excess arguments and to disallow merging where there is a store in between to memory which might either be involved in the stdio structures (the macro clearerr case) or which might be part of a string printed by the first printf (the example I gave of a local buffer printed in both the first and second printfs but with value changed in between). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
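The format-string precondition described above could look roughly like the following. This is a hypothetical helper, not GCC's format checker; the name and the accepted directive set are assumptions, and it does not count directives against arguments, so the excess-argument check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if fmt contains only directives this
   sketch understands, and none that would block concatenation: no %n,
   no "$" operand numbers, no truncated or unknown directive. */
static bool format_is_mergeable(const char *fmt)
{
    assert(fmt != NULL);
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;                   /* ordinary character */
        p++;
        if (*p == '%')
            continue;                   /* "%%" prints a literal percent */
        /* Skip flags, width, precision and length modifiers. */
        while (*p != '\0' && strchr("-+ #0123456789.*hlLjzt", *p) != NULL)
            p++;
        /* Anything other than a known conversion rejects the format;
           this covers 'n', '$' and a format that ends mid-directive. */
        if (*p == '\0' || strchr("diouxXfFeEgGaAcsp", *p) == NULL)
            return false;
    }
    return true;
}
```

A merging pass would presumably require this kind of check to succeed for both format strings before concatenating them.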
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-06-09 14:07 --- Confirmed, this might be hard, I don't know but would be nice as it should speed up GCC itself. -- What|Removed |Added Status|UNCONFIRMED |NEW Ever Confirmed||1 Last reconfirmed|-00-00 00:00:00 |2005-06-09 14:07:26 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-09 14:36 --- Subject: Re: GCC should combine adjacent stdio calls Another problem case is if the first format has excess arguments (which is permitted by ISO C) - those arguments must be evaluated but not included in the concatenated argument list. Even if arguments don't have side effects you have problems if arguments of the second printf might refer to anything modified by printf (the FILE structure, its buffers, the contents of the file written to, ...) - they should have the appropriate values for after the first call. It's OK if the arguments are e.g. local integer variables whose addresses have not escaped, but in general you need to prove that the arguments to the second printf, and anything they point to, cannot be changed by the first printf. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at gcc dot gnu dot org 2005-06-09 14:37 --- (In reply to comment #1) If side effects appear in the arguments, that also would be a problem, e.g.: printf("%d", i++); printf("%d", i++); should not be turned into: printf("%d%d", i++, i++); There should be little danger of this. Side-effects are explicitly exposed in GIMPLE. As long as the printf() calls are adjacent, we should be able to combine them. Diego. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From gcc-bugzilla at gcc dot gnu dot org 2005-06-09 16:11 --- Subject: New: GCC should combine adjacent stdio calls GCC should optimize adjacent stdio calls. For example: printf("foo %d %d\n", i, j); printf("bar %d %d\n", x, y); could instead be emitted as: printf("foo %d %d\nbar %d %d\n", i, j, x, y); More generally, you simply concatenate the format arguments and append all of the remaining first printf's arguments and then all of the second printf's arguments. You can also combine adjacent printf/puts and printf/putc: printf("format", args...); puts(s); -> printf("format%s\n", args..., s); printf("format", args...); putc(c); -> printf("format%c", args..., c); You can also combine adjacent f* variants of these stdio calls (fprintf, fputs, fputc) if the supplied streams are equivalent. One caveat, some format specifiers need special care. E.g. position specifiers must be adjusted. The %n specifier may preclude the optimization entirely. There might be other examples. -- Summary: GCC should combine adjacent stdio calls Product: gcc Version: 4.1.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P2 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: ghazi at gcc dot gnu dot org CC: gcc-bugs at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982 --- Additional Comments From gcc-bugzilla at gcc dot gnu dot org 2005-06-09 16:11 --- Subject: GCC should combine adjacent stdio calls --- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-09 13:01 --- If side effects appear in the arguments, that also would be a problem, e.g.: printf("%d", i++); printf("%d", i++); should not be turned into: printf("%d%d", i++, i++); because we can't guarantee order of evaluation. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
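The first example in the report can be checked by hand with a small sketch. It is an illustration only: snprintf stands in for printf so the two results can be compared in memory, and all function names are invented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Two adjacent calls, as in the report's "before" form. Returns the number
   of characters stored, or -1 if the buffer is too small. */
static int format_separately(char *dst, size_t size, int i, int j, int x, int y)
{
    assert(dst != NULL && size > 0);
    int a = snprintf(dst, size, "foo %d %d\n", i, j);
    if (a < 0 || (size_t)a >= size)
        return -1;
    int b = snprintf(dst + a, size - (size_t)a, "bar %d %d\n", x, y);
    if (b < 0 || (size_t)b >= size - (size_t)a)
        return -1;
    return a + b;
}

/* One call: the formats are concatenated, and the first call's arguments
   are followed by the second call's arguments. */
static int format_merged(char *dst, size_t size, int i, int j, int x, int y)
{
    assert(dst != NULL && size > 0);
    int n = snprintf(dst, size, "foo %d %d\nbar %d %d\n", i, j, x, y);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Returns 1 when both forms produce the same text and length. */
static int merging_preserves_output(int i, int j, int x, int y)
{
    char sep[64], mrg[64];
    int a = format_separately(sep, sizeof sep, i, j, x, y);
    int b = format_merged(mrg, sizeof mrg, i, j, x, y);
    return a >= 0 && a == b && strcmp(sep, mrg) == 0;
}
```

For plain integer arguments the two forms agree; the hard cases discussed elsewhere in this thread are the ones where an argument of the second call depends on something the first call changed.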
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at gcc dot gnu dot org 2005-06-09 16:18 --- Testing patch. -- What|Removed |Added AssignedTo|unassigned at gcc dot gnu dot org |dnovillo at gcc dot gnu dot org Status|NEW |ASSIGNED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-09 16:49 --- (In reply to comment #4) (In reply to comment #1) If side effects appear in the arguments, that also would be a problem, e.g.: printf("%d", i++); printf("%d", i++); should not be turned into: printf("%d%d", i++, i++); There should be little danger of this. Side-effects are explicitly exposed in GIMPLE. As long as the printf() calls are adjacent, we should be able to combine them. Diego. I'm not sure. In my specific example above, after the combination we don't know which i++ gets executed first because the order is not guaranteed within an argument list of a single function call (right?) So if we want to include combinations whose arguments have side-effects, we have to prove they don't interact with any other arguments. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-06-09 16:51 --- (In reply to comment #8) I'm not sure. In my specific example above, after the combination we don't know which i++ gets executed first because the order is not guaranteed within an argument list of a single function call (right?) So if we want to include combinations whose arguments have side-effects, we have to prove they don't interact with any other arguments. What Diego is saying is that there are never any arguments with side effects at the GIMPLE level. Everything is well defined at the point at which we can do the combining. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at redhat dot com 2005-06-09 16:55 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, Jun 09, 2005 at 04:49:40PM -, ghazi at gcc dot gnu dot org wrote: --- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-09 16:49 --- (In reply to comment #4) (In reply to comment #1) If side effects appear in the arguments, that also would be a problem, e.g.: printf("%d", i++); printf("%d", i++); should not be turned into: printf("%d%d", i++, i++); There should be little danger of this. Side-effects are explicitly exposed in GIMPLE. As long as the printf() calls are adjacent, we should be able to combine them. Diego. I'm not sure. In my specific example above, after the combination we don't know which i++ gets executed first because the order is not guaranteed within an argument list of a single function call (right?) So if we want to include combinations whose arguments have side-effects, we have to prove they don't interact with any other arguments. But remember that we are not optimizing C, we are optimizing GIMPLE. And in GIMPLE we don't have those problems. Here's what the tree optimizers see: i_3 = i_1 + 1; printf (&"%d"[0], i_1); printf (&"%d"[0], i_3); Those two calls to printf can be merged. The order of evaluation has been decided by the gimplifier. Whether that's right or wrong for C, I couldn't say, all I know is that for the optimizers, those two printf calls look mergeable. Diego. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
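The point about side effects being exposed before merging can be sketched in C. This is not an actual GIMPLE dump; the temporaries i_1 and i_2 and the function names are hypothetical, and snprintf replaces printf so the outputs can be compared.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Source form: two statements, each with its own i++. The end of each
   statement orders the two increments. */
static int source_form(char *dst, size_t size, int i)
{
    assert(dst != NULL && size > 0);
    int a = snprintf(dst, size, "%d", i++);
    if (a < 0 || (size_t)a >= size)
        return -1;
    int b = snprintf(dst + a, size - (size_t)a, "%d", i++);
    if (b < 0 || (size_t)b >= size - (size_t)a)
        return -1;
    return a + b;
}

/* Lowered form: the increments are hoisted into explicit temporaries, so
   the single merged call has side-effect-free arguments and evaluation
   order inside the argument list no longer matters. */
static int lowered_and_merged(char *dst, size_t size, int i)
{
    assert(dst != NULL && size > 0);
    int i_1 = i;            /* value seen by the first call */
    int i_2 = i_1 + 1;      /* value seen by the second call */
    int n = snprintf(dst, size, "%d%d", i_1, i_2);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Returns 1 when both forms produce the same text. */
static int forms_agree(int i)
{
    char a[32], b[32];
    return source_form(a, sizeof a, i) >= 0
        && lowered_and_merged(b, sizeof b, i) >= 0
        && strcmp(a, b) == 0;
}
```

Merging the source-level calls directly into one call with two i++ arguments would be the unsafe version that comment #1 warns about; the lowered form avoids it because the arguments are plain values by then.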
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-09 16:55 --- (In reply to comment #3) Subject: Re: GCC should combine adjacent stdio calls Another problem case is if the first format has excess arguments (which is permitted by ISO C) - those arguments must be evaluated but not included in the concatenated argument list. While it may be legal, our -Wformat option warns about excess arguments and I would suggest we don't attempt any optimization unless we pass -Wformat cleanly. So I think this one is surmountable. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-09 17:02 --- (In reply to comment #10) Subject: Re: GCC should combine adjacent stdio calls But remember that we are not optimizing C, we are optimizing GIMPLE. And in GIMPLE we don't have those problems. Here's what the tree optimizers see: i_3 = i_1 + 1; printf (&"%d"[0], i_1); printf (&"%d"[0], i_3); Those two calls to printf can be merged. The order of evaluation has been decided by the gimplifier. Whether that's right or wrong for C, I couldn't say, all I know is that for the optimizers, those two printf calls look mergeable. Diego. Ah okay thanks. By the way, you may recall you and I discussed doing this during the first GCC summit. One suggestion that IIRC Paul Brook had was to move printf statements around to create more opportunities for combination if the intervening statements didn't interact with the moving printf. E.g. int i=0, j=2; printf("%d", i); j++; printf("%d", j); Pushing the first printf further down, this could be reordered as: int i=0, j=2; j++; printf("%d", i); printf("%d", j); which would expose another combination possibility. Paul seemed to think this wasn't hard with the existing infrastructure, and that was two years ago. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-06-09 17:07 --- (In reply to comment #12) Pushing the first printf further down, this could be reordered as: int i=0, j=2; j++; printf("%d", i); printf("%d", j); In fact this is how SSA works, in that a variable is only ever assigned once, so j_1 and j_2 can be alive at the same time (well, except for variables across abnormal edges). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-09 17:11 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, 9 Jun 2005, ghazi at gcc dot gnu dot org wrote: --- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-09 16:55 --- (In reply to comment #3) Subject: Re: GCC should combine adjacent stdio calls Another problem case is if the first format has excess arguments (which is permitted by ISO C) - those arguments must be evaluated but not included in the concatenated argument list. While it may be legal, our -Wformat option warns about excess arguments and I would suggest we don't attempt any optimization unless we pass -Wformat cleanly. So I think this one is surmountable. We linked -Wformat into optimization before, then removed the link. Although we could resurrect the status_warning function which could set a status variable if it would warn rather than emitting the warning (and again save and restore all the variables controlling format warnings), it's not clear this is very desirable. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-09 17:21 --- (In reply to comment #14) Subject: Re: GCC should combine adjacent stdio calls On Thu, 9 Jun 2005, ghazi at gcc dot gnu dot org wrote: --- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-09 16:55 --- (In reply to comment #3) Subject: Re: GCC should combine adjacent stdio calls Another problem case is if the first format has excess arguments (which is permitted by ISO C) - those arguments must be evaluated but not included in the concatenated argument list. While it may be legal, our -Wformat option warns about excess arguments and I would suggest we don't attempt any optimization unless we pass -Wformat cleanly. So I think this one is surmountable. We linked -Wformat into optimization before, then removed the link. Although we could resurrect the status_warning function which could set a status variable if it would warn rather than emitting the warning (and again save and restore all the variables controlling format warnings), it's not clear this is very desirable. Why is that? IIRC, the reason it was removed was that we never did any builtin printf transformations with arguments beyond e.g. "%s\n", "%c". So it was easier to simply check these cases manually than invoking the whole format parsing routine. However if we now want to ensure there were no excess arguments I don't see a better way without mostly reimplementing a second format parser. What would you suggest? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at redhat dot com 2005-06-09 19:03 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, Jun 09, 2005 at 05:02:28PM -, ghazi at gcc dot gnu dot org wrote: int i=0, j=2; printf("%d", i); j++; printf("%d", j); Pushing the first printf further down, this could be reordered as: int i=0, j=2; j++; printf("%d", i); printf("%d", j); which would expose another combination possibility. Paul seemed to think this wasn't hard with the existing infrastructure, and that was two years ago. Oh, absolutely. The algorithm I'm using will naturally do this. This is a purely local transformation, we obviously cannot merge builtins in different control flow paths, so the transformation goes like this: when we get to a builtin, we try to merge it with a previously found builtin. The only time we reset the concept of previously found builtin is when we find a CALL_EXPR or an ASM_EXPR which are the only ones that may have side-effects affecting the output of the program. If the program manipulates the same FILE * that is being used by the stdio calls, then we'd lose. But I think that's fair game, right? Diego. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
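The scan Diego describes can be modelled on a toy statement list. This is a sketch of the description in this message, not the patch itself: the enum, its names, and the function are hypothetical stand-ins for GIMPLE statements, and it deliberately leaves out the aliasing checks that other comments in this thread argue are needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical statement kinds standing in for GIMPLE statements. */
enum stmt_kind { STMT_ASSIGN, STMT_PRINTF, STMT_OTHER_CALL, STMT_ASM };

/* Scans one basic block and counts how many printf statements could be
   folded into an earlier one. The "previously found builtin" is forgotten
   at any other call or asm; plain assignments do not reset it. */
static size_t count_mergeable(const enum stmt_kind *stmts, size_t n)
{
    assert(stmts != NULL || n == 0);
    int have_prev = 0;
    size_t merges = 0;
    for (size_t k = 0; k < n; k++) {
        switch (stmts[k]) {
        case STMT_PRINTF:
            if (have_prev)
                merges++;           /* fold into the earlier printf */
            have_prev = 1;
            break;
        case STMT_OTHER_CALL:
        case STMT_ASM:
            have_prev = 0;          /* may affect program output: reset */
            break;
        case STMT_ASSIGN:
            break;                  /* does not block merging here */
        }
    }
    return merges;
}
```

With the statements printf, assignment, printf the sketch reports one merge, which matches the j++ example quoted above; an intervening call or asm yields none.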
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-09 19:15 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, 9 Jun 2005, ghazi at gcc dot gnu dot org wrote: We linked -Wformat into optimization before, then removed the link. Although we could resurrect the status_warning function which could set a status variable if it would warn rather than emitting the warning (and again save and restore all the variables controlling format warnings), it's not clear this is very desirable. Why is that? IIRC, the reason it was removed was that we never did any builtin printf transformations with arguments beyond e.g. "%s\n", "%c". So it was easier to simply check these cases manually than invoking the whole format parsing routine. However if we now want to ensure there were no excess arguments I don't see a better way without mostly reimplementing a second format parser. What would you suggest? The requirements for -Wformat and for testing whether you can concatenate formats this way are rather different: there are large numbers of variables controlling the details of warning, but optimization shouldn't depend on these except for the standard version (so it needs to save various variables, turn on -pedantic and set other variables to known values - and whenever a format warning is split up this list needs updating); many format warnings are for constructs which have well-defined meaning but seem odd to write, and optimizing those is reasonable; the format warning code has no reason to be concerned with %n being a problem (although turning on -pedantic will deal with disabling this optimization for formats with $ operand numbers). At least, try to work out a better interface to the format checking than the one used last time. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-09 19:29 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, 9 Jun 2005, dnovillo at redhat dot com wrote: Oh, absolutely. The algorithm I'm using will naturally do this. This is a purely local transformation, we obviously cannot merge builtins in different control flow paths, so the transformation goes like this: when we get to a builtin, we try to merge it with a previously found builtin. The only time we reset the concept of previously found builtin is when we find a CALL_EXPR or an ASM_EXPR which are the only ones that may have side-effects affecting the output of the program. If the program manipulates the same FILE * that is being used by the stdio calls, then we'd lose. But I think that's fair game, right? Although it may not be valid to manipulate the FILE * directly, it seems quite possible that a program might call another stdio.h function between the printf calls, that function on the particular implementation having a macro expansion without a function call. It is also possible that values of arguments to the second built-in printf call may depend on the first one having been previously evaluated; for example, given extern char *s; extern int i; printf("%d", i); printf("%.5s", s); you can't merge the printf calls because the first one could have changed what is pointed to by s. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at redhat dot com 2005-06-09 19:38 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, Jun 09, 2005 at 07:29:42PM -, joseph at codesourcery dot com wrote: Although it may not be valid to manipulate the FILE * directly, it seems quite possible that a program might call another stdio.h function between the printf calls That is fine. Any call between the two builtins blocks the merging. that function on the particular implementation having a macro expansion without a function call. Sorry, you lost me here. It is also possible that values of arguments to the second built-in printf call may depend on the first one having been previously evaluated; for example, given extern char *s; extern int i; printf("%d", i); printf("%.5s", s); you can't merge the printf calls because the first one could have changed what is pointed to by s. How can printing an integer to stdout affect 's'? Unless 's' has been somehow mapped to stdout's buffer? Is that what you have in mind? Diego. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-09 19:48 --- (In reply to comment #19) Subject: Re: GCC should combine adjacent stdio calls On Thu, Jun 09, 2005 at 07:29:42PM -, joseph at codesourcery dot com wrote: that function on the particular implementation having a macro expansion without a function call. Sorry, you lost me here. I think he means if you call e.g. putc() in between two printf calls, putc may be a macro which does something like this: (--(p)->_cnt < 0 ? __flsbuf((x), (p)) : (int)(*(p)->_ptr++ = (unsigned char) (x))) But in this case, I believe this expansion will not allow printfs to move across it due to the side effects of __flsbuf(). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-09 19:52 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, 9 Jun 2005, dnovillo at redhat dot com wrote: Although it may not be valid to manipulate the FILE * directly, it seems quite possible that a program might call another stdio.h function between the printf calls That is fine. Any call between the two builtins blocks the merging. that function on the particular implementation having a macro expansion without a function call. Sorry, you lost me here. Suppose an implementation defines e.g. clearerr as a macro, and the expansion of that macro just clears bits in the stdio structure and doesn't call any functions. Then though the user's source code looks like it contains a function call, after preprocessing it contains manipulation of bits of the FILE structure for stdout instead. It is also possible that values of arguments to the second built-in printf call may depend on the first one having been previously evaluated; for example, given extern char *s; extern int i; printf("%d", i); printf("%.5s", s); you can't merge the printf calls because the first one could have changed what is pointed to by s. How can printing an integer to stdout affect 's'? Unless 's' has been somehow mapped to stdout's buffer? Is that what you have in mind? (a) It could be stdio's buffer (via setvbuf). (b) It could be a glibc memory stream opened with fmemopen (if the user assigned to stdout - which glibc allows - or you do this optimization on fprintf and not just printf). (c) It could point to a memory mapping of the file being written. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
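To make case (b) concrete, here is a minimal sketch of how the second call's string argument could alias the stream's backing store. It uses fprintf on a stream from fmemopen rather than reassigning stdout, and it assumes an implementation (glibc, for instance) where an unbuffered memory stream stores each fprintf's output before the call returns. The function name `two_calls_through_memory_stream` and the value 12345 are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* for fmemopen */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Issue two adjacent fprintf calls on a memory stream backed by `mem`.
   The second call prints bytes that the first call stored there.
   Returns 0 on success, -1 on failure. */
int two_calls_through_memory_stream(char *mem, size_t size)
{
    assert(mem != NULL && size >= 11);

    FILE *f = fmemopen(mem, size, "w");
    if (f == NULL)
        return -1;

    /* Assumption: with no buffering, each fprintf reaches `mem`
       before it returns. */
    if (setvbuf(f, NULL, _IONBF, 0) != 0) {
        fclose(f);
        return -1;
    }

    const char *s = mem;        /* aliases the stream's backing store */
    fprintf(f, "%d", 12345);    /* stores "12345" at mem[0..4] */
    fprintf(f, "%.5s", s);      /* reads those five bytes and appends them */

    return fclose(f) == 0 ? 0 : -1;
}
```

With separate calls on such an implementation, `mem` ends up holding "1234512345". A single merged call with format "%d%.5s" would read `s` at a point the library does not promise to order after the store of 12345, which is the dependency being described.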
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From dnovillo at redhat dot com 2005-06-09 19:57 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, Jun 09, 2005 at 07:52:42PM -, joseph at codesourcery dot com wrote: Suppose an implementation defines e.g. clearerr as a macro, and the expansion of that macro just clears bits in the stdio structure and doesn't call any functions. Then though the user's source code looks like it contains a function call, after preprocessing it contains manipulation of bits of the FILE structure for stdout instead. Ah, OK. In which case, we should reset the previous builtin if we find a store to global memory in between. That'd be easy. (a) It could be stdio's buffer (via setvbuf). (b) It could be a glibc memory stream opened with fmemopen (if the user assigned to stdout - which glibc allows - or you do this optimization on fprintf and not just printf). (c) It could point to a memory mapping of the file being written. Gah, so we'll need to parse the format string then. Oh, well. Diego. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-09 20:13 --- Subject: Re: GCC should combine adjacent stdio calls On Thu, 9 Jun 2005, dnovillo at redhat dot com wrote: Gah, so we'll need to parse the format string then. Oh, well. We'll need to parse the format string anyway to know if there are excess arguments to the first printf which should not be printed, and if %n or operand numbers appear. For the other issues, it's the arguments to the second printf and anything else evaluated between the two printfs that matter: If anything evaluated might change the contents of the FILE * or anything pointed to therein (e.g. local buffers passed to setvbuf) then you can't merge. Effectively, any change to any object whose address might have escaped should stop merging. If anything evaluated might change what is pointed to by an argument to the first printf then you can't merge. For example, char buf[] = "abc"; printf("%s", buf); buf[0] = 'd'; printf("%s", buf); cannot be merged. If an argument or anything pointed to by an argument which is dereferenced and printed (e.g. %s) might have been changed by the first printf - if its address or the address it points to could have escaped - then you can't merge. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
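The buf example above can be checked with a small sketch. It swaps snprintf into a caller-supplied buffer for printf, purely so that the result can be inspected. The function names `separate_calls` and `merged_after_store` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Original order: print buf, modify it, print it again. */
void separate_calls(char *out, size_t size)
{
    char buf[] = "abc";
    assert(out != NULL && size >= 7);

    int n = snprintf(out, size, "%s", buf);           /* sees "abc" */
    buf[0] = 'd';                                     /* intervening store */
    snprintf(out + n, size - (size_t)n, "%s", buf);   /* sees "dbc" */
}

/* Invalid merge: the combined call runs after the store, so both %s
   directives see the modified buffer. */
void merged_after_store(char *out, size_t size)
{
    char buf[] = "abc";
    assert(out != NULL && size >= 7);

    buf[0] = 'd';
    snprintf(out, size, "%s%s", buf, buf);
}
```

The separate calls produce "abcdbc" and the merged call produces "dbcdbc". Hoisting the merged call above the store would give "abcabc", so neither placement preserves the original output.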
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From ghazi at gcc dot gnu dot org 2005-06-10 01:20 --- (In reply to comment #22) Subject: Re: GCC should combine adjacent stdio calls On Thu, Jun 09, 2005 at 07:52:42PM -, joseph at codesourcery dot com wrote: (a) It could be stdio's buffer (via setvbuf). (b) It could be a glibc memory stream opened with fmemopen (if the user assigned to stdout - which glibc allows - or you do this optimization on fprintf and not just printf). (c) It could point to a memory mapping of the file being written. Gah, so we'll need to parse the format string then. Oh, well. Diego. While I agree that we need to parse the format string, that's to solve other issues like detecting %n. I'm not convinced the above a-b-c cases Joseph cites are so clear-cut against doing the transformation. To me it seems all three cases are pathological self-clobbering actions (read/write to the same place in one action) or rely on when the OS decides to sync the disk. In case (a), you're suggesting that the programmer would allocate a buffer to be used by stdio buffering, pass it to setvbuf, then effectively read and write to this buffer in the same printf statement by printing the contents of the buffer through the very stream it's being used by for buffering. First of all I'd like to know if the standard guarantees the contents of a stdio buffer (not just when it's flushed) at enough level of detail to reliably retrieve results from it in the midst of buffering through it. I checked and found different contents between glibc and solaris2. Case (b) involves fmemopen, and I assume you refer to a case where you open memory for writing, printf to the resulting FILE*, and pass a pointer to the memory area back into printf. This can only lead to disaster as you clobber the same memory you are reading from. Since fmemopen is a gnu extension, it can do whatever it wants, but I suspect you're entering unspecified territory here for C programs.
Case (c) with mmap again looks like you're reading and writing to the same place, but the results depend on how buffering and disk syncing interact. Again, what guarantees from the C standards do we have here on what the results should look like? Since IIRC mmap isn't part of C, there are no guarantees. I'm willing to be convinced otherwise, can you construct a valid code snippet whose behavior is completely specified which works with two separate printf statements but fails when they're combined? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982
[Bug tree-optimization/21982] GCC should combine adjacent stdio calls
--- Additional Comments From joseph at codesourcery dot com 2005-06-10 02:00 --- Subject: Re: GCC should combine adjacent stdio calls On Fri, 10 Jun 2005, ghazi at gcc dot gnu dot org wrote: Case (b) involves fmemopen, and I assume you refer to a case where you open memory for writing, printf to the resulting FILE*, and pass a pointer to the memory area back into printf. This can only lead to disaster as you clobber the same memory you are reading from. Since fmemopen is a gnu extension, it can do whatever it wants, but I suspect you're entering unspecified territory here for C programs. Case (c) with mmap again looks like you're reading and writing to the same place, but the results depend on how buffering and disk syncing interact. Again, what guarantees from the C standards do we have here on what the results should look like? Since IIRC mmap isn't part of C, there are no guarantees. We support programs which use functions other than the standard C ones - naturally as the compiler for the GNU system we support extensions in the GNU libraries. The example I gave used %.5s specifically so that it would only look at bytes with known values (if the programmer knows something about the possible values of the integer written) and not at bytes being modified by the printf. POSIX defines how streams and file descriptors for the same file can be used together and for an unbuffered stream it appears to me that the results from writing to it with stdio (within parts of the file that already existed; not after its end) should be immediately available in the mapped region. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21982