Karol Krizka posted <[EMAIL PROTECTED]>, excerpted below, on Fri, 10 Jun 2005 17:53:25 -0700:
> Hi guys,
> Today is the last day of school for me so I don't care if this computer
> gets broken. I've decided to try out gcc4 on it. I remember reading some
> threads on this list about how it broke some apps on KDE. How about GNOME,
> is there anything not working?
>
> Also what are the best flags to get the fastest code (size dosn't matter)?
> I've heard people talk about some '-visibility=hidden'. Is there any
> others that I am not aware of?

Short answer: on an up-to-date ~amd64 system, using gcc-config to switch back to an older gcc-3.whatever version where necessary, gcc-4.whatever has caused surprisingly few issues here. Running gcc-4.whatever as my system default compiler (I regularly update to the most current masked gcc-4.0.1-beta2005mmdd snapshots) has been much smoother than I expected.

The longer, more detailed answer follows.

gcc-4 is slotted and installs in parallel with your latest gcc-3.x. gcc-config is the tool used to switch between them. Thus, if you are going to use gcc-4, ensure you have the latest unstable (~arch) gcc-config as well. The latest unstable binutils is probably helpful, too.

KDE now compiles and runs fine if you are using the individual packages, with one or two possible exceptions that I'm not using here (one of the games from kdegames was one, IIRC, but I've not merged that game, so I don't know). AFAIK, the latest KDE releases have had the -fvisibility=hidden stuff disabled upstream, because it was used wrongly in the first place and caused various issues, so they now work just fine, including, I /think/, the individual problem packages from before. Like I said, tho, I hadn't run into any issues I could attribute to that anyway, and I wasn't running the one package /known/ to have serious problems (segfaults) from it, so I can't say for sure.

I have been using gcc-4.0.1-beta2005mmdd snapshots for some time (unmasking them as necessary) as my regular system-wide compiler.
There are a very few programs that won't compile/merge, altho most do just fine. For the ones that don't, I simply use gcc-config to switch back to the normal 3.4.4 profile, merge that specific package, then switch back to 4.0.1-beta-whatever.

Do note, however, that I'm running an entirely ~amd64 system, with some masked-for-testing additions as well. I would *NOT* recommend that anyone running stable use the gcc-4 series for their entire system just yet. Merging it for use with selected packages, using gcc-config to switch to it from a normal gcc-3.x profile (the reverse of the above, where gcc-4 is my normal profile), might be doable on a stable system. Even then, you'll likely need the latest unstable binutils and gcc-config, at minimum, to get it to work smoothly, and those in turn might pull in other unstable dependencies.

The two major packages that are KNOWN to still have issues with gcc-4 are glibc and xorg.

There's a still masked (AFAIK) glibc version that's supposed to compile with gcc-4. It is in the tree specifically so that those who want to try it with gcc-4 can do so, but it comes with some dire warnings. While it worked just fine for 64-bit here, the 32-bit parallel build had issues: with it installed, I couldn't compile anything else to 32-bit, including further gcc snapshots and the portage sandbox package itself, both of which have 32-bit components and failed during the 32-bit configure phase. So I unmerged it. However, continue to use your gcc-3.whatever compiled version of glibc, use gcc-config to switch to your gcc-3.whatever profile when compiling any new glibc packages, and that shouldn't be an issue.

Likewise with xorg. Simply use gcc-config to switch to your gcc-3.whatever profile, and xorg continues to compile just fine.

With pretty much everything else, I've had no problems using gcc-4.whatever at all.
In the unlikely event there /are/ problems, again, simply switching to the gcc-3.whatever profile using gcc-config, and remerging with that, should solve them.

The one other gcc-4 related issue I've seen is at runtime, not compile time, and relates to libstdc++. The gcc-4 version of that library is backward compatible with the gcc-3.3.x and 3.4.x versions, but the 3.x versions aren't forward compatible with the gcc-4 version. The libstdc++ version that gets loaded is ALSO affected by your gcc-config setting. With KDE, it's preferable to compile everything with one OR the other, and then ensure that when you load any KDE apps, you do so with gcc-config set to the version that matches what you compiled them with. *MOST* of KDE seems to run fine in any case, but anything requiring KHTML for rendering, including not only konqueror but also kcontrol and some misc. apps like kweather, can refuse to load under certain conditions, if the libstdc++ libraries used to compile them don't match up and don't match what gcc-config points to at the time they are launched. I doubt this minor incompatibility will show itself in much else beyond the KDE family, however, even where apps ARE C++, because very few have the complex dependency structure that KDE does. In any case, here again, remerging the dependency tree of the offending application, so that all C++ related libraries and the application itself are compiled with a matching gcc, should fix the problem, and has done so here.

CFLAGS:

Do NOT put -fvisibility=hidden in your CFLAGS!! While this /can/ speed things up where used appropriately, it does so by hiding specific "internal" functions so they don't have to be dealt with when linking and otherwise handling libraries and applications. Put it in your CFLAGS, and you are essentially telling gcc to hide *ALL* functions, including those that are intended to be linked to. This *WILL* hose your system!!

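To make the distinction concrete, here is a minimal sketch of selective hiding done in the source, rather than globally from CFLAGS, using gcc's visibility attribute. The library, the macro names (MYLIB_EXPORT, MYLIB_LOCAL) and both functions are hypothetical, invented purely for illustration; this is not how KDE or any particular package does it.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical convenience macros wrapping gcc's visibility attribute. */
#define MYLIB_EXPORT __attribute__((visibility("default")))
#define MYLIB_LOCAL  __attribute__((visibility("hidden")))

/* Internal helper: hidden, so it is left out of the shared library's
 * exported symbols and nothing outside the library can link to it. */
MYLIB_LOCAL int mylib_scale_internal(int x)
{
    /* Guard against signed overflow in the doubling below. */
    assert(x >= INT_MIN / 2 && x <= INT_MAX / 2);
    return x * 2;
}

/* Public entry point: explicitly marked "default", so it stays linkable
 * even when the library is built with -fvisibility=hidden. */
MYLIB_EXPORT int mylib_double(int x)
{
    return mylib_scale_internal(x);
}
```

A library written this way can safely be built with -fvisibility=hidden, because its author has marked what must stay visible. Source that carries no such markings has its public functions hidden too once the flag is forced on from CFLAGS, which is the breakage described above.
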
Other than that, the usual rules and CFLAGS continue to apply, with ONE known exception. The methods gcc uses for optimization have changed such that -fweb, which used to be a general optimization, is now often a /de/optimization instead. If you had it in your CFLAGS before, consider removing it. (At least, that's what I've read, and what I did. I've not done any benchmarks on it.)

As for speed vs. size optimization, the following should be interesting...

Be /very/ careful with optimizing for speed while saying size doesn't matter. Very often, theoretically faster code, say -O3, actually runs /slower/ than -O2 or -Os. The reason, when you think about it, is rather simple. Yes, -O3 optimizes for faster code, but it does so with hardly any regard for size. Real-life CPUs, however, have limited cache memory.

Running from the registers is fastest, with no performance penalty, but there are only a very few of them. L1 cache is next, but it too is very limited, typically 64k each for CPU instructions and data (128k total). L2 cache is slower but still makes a HUGE difference compared to regular memory. Take a look at the benchmarks of otherwise identical CPUs with different size L2 caches if you have any doubt. L2 cache is normally 1MB on the higher end AMD64 chips, 512KB on the low end "cheap" versions. Beyond that is regular memory, many times slower than L2 cache, but also pretty much as large as your purchasing budget allows. Beyond that is hard drive swap and/or any network accessible memory, both of which are typically EXTREMELY slow to reach in comparison to local RAM.

While -O3 generally yields theoretically faster code, it does so at the expense of LARGER code. Thus, in real life, what would otherwise fit in L1 often spills over to L2, and what would otherwise fit in L2 often spills over to main memory.
Because accessing this spillover area is MANY TIMES slower than accessing closer cache, the effect of -O3 is commonly, if unintuitively, to SLOW DOWN the program, by forcing the CPU to wait for data fetched from further away than it would have been with -O2 or -Os. Thus, for many programs, -O3 makes things slower, NOT faster.

The exceptions to this general rule are programs that tend to do a lot of cache thrashing, and therefore don't keep their instructions or data in the cache anyway. Anything playing or streaming media of any size generally fits this category, thus all your mplayer and media encoding/decoding applications. (Not coincidentally, such throughput intensive applications are also the strongest point of modern deeply pipelined, very high clock rate "Netburst" Pentium 4 style Intel CPUs. RDRAM was similarly optimized for high throughput at the expense of high latency. AMD's arch and DDR-SDRAM, OTOH, are far lower latency designs that don't tend to do quite as well in media type applications but tend to be far better in general purpose applications, where thread switching and latency are far more critical.)

As a consequence of the above, I've been using -Os for some time and continue to do so. There was an article discussed here which demonstrated that with -O3, gcc-4 produced larger executables, and that they in general benchmarked slightly worse than those from the latest gcc-3.x with the same -O3. However, I've contended for some time that, due to its effect on cache overruns, -O3 will often tend to deoptimize code rather than optimize it, thus my use of -Os, optimizing for size. Unfortunately, the article didn't compare -Os compiled code sizes or performance, and I've not seen comparisons elsewhere (altho I've not really been looking for them either), so I have no hard data on that.

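The cache argument above can be put into rough numbers with the textbook average memory access time formula. What follows is only a minimal sketch: the struct and function names are hypothetical, and the latencies and hit rates used with it are made-up illustrative values, not measurements of any real CPU or of any gcc output.

```c
#include <assert.h>

/* Hypothetical description of a two-level cache hierarchy, in CPU cycles. */
struct mem_model {
    double l1_cycles;   /* cost of an L1 hit */
    double l2_cycles;   /* extra cost when L1 misses and L2 is consulted */
    double ram_cycles;  /* extra cost when L2 misses as well */
};

/* Average memory access time:
 *   AMAT = L1 + miss(L1) * (L2 + miss(L2) * RAM)
 * Hit rates are fractions in [0, 1]; l2_hit_rate applies only to the
 * accesses that already missed L1. */
double avg_access_cycles(const struct mem_model *m,
                         double l1_hit_rate, double l2_hit_rate)
{
    assert(l1_hit_rate >= 0.0 && l1_hit_rate <= 1.0);
    assert(l2_hit_rate >= 0.0 && l2_hit_rate <= 1.0);
    return m->l1_cycles
         + (1.0 - l1_hit_rate)
           * (m->l2_cycles + (1.0 - l2_hit_rate) * m->ram_cycles);
}
```

With assumed costs of 3, 12 and 200 cycles and an L2 hit rate of 90%, dropping the L1 hit rate from 98% to 94% raises the average from about 3.6 to about 4.9 cycles per access, roughly a third more. If larger code costs even a few points of hit rate, that can outweigh a modest saving in instructions executed.
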
DO note, however, that the gcc-3.4 series is relatively mature at this point, and thus should be producing code about as optimized as it's going to. By contrast, the gcc-4.0 series is still new and probably producing far looser code than it will by 4.0.1 or 4.0.2. Thus, in the abstract, it's quite possible it will actually benchmark worse than 3.4.x, which is exactly what we saw in the article covered here, with -O3 optimized code.

I still think gcc-4 is producing faster code for me with -Os, tho I have no idea about the size of the executables. However, having not done any benchmarks, I'm absolutely willing to admit that it could easily be just my perception, and performance /may/ actually be worse, as we saw with -O3 in the benchmarks discussed above. (Do note that such a result could easily be explained as well... -O3 produces theoretically faster code with no concern for size, so if my cache arguments have any validity at all, it's actually quite likely that a "better" job at -O3 optimization would produce slower code in real life: theoretically faster at the expense of size, and thereby cache-busting more efficiently on a real-life finite-cached processor. Thus, the results above were actually /expected/ IMO, and could indeed mean gcc-4 is more efficient at (de)optimizing exactly how it is told to optimize.)

All that said, and again entirely by feel, I /think/ I see what could be worse memory leaks. I /think/ I see memory use growing farther and faster over time than I /remember/ happening before, particularly running my gcc-4 compiled KDE on xorg (still compiled with gcc-3.4.3-whatever, because it won't compile with 4.x yet). Quitting KDE/X to the CLI prompt and restarting them essentially eliminates the issue, which only shows up over several days of use. I'm not /sure/ it's worse than it was, but it just /seems/ so.
HOWEVER, note that I'm ALSO running currently masked xorg-x11-6.8.99.x testing ebuilds, and it's QUITE possible, EVEN LIKELY, that THAT's where the leaks are, again assuming it's really any worse than before in the first place. I simply don't know, and am only reporting what I observe.

So, what that all amounts to is this: -Os /may/ not be quite as efficient, and either it or gcc-4 in general /may/ trigger memory leaks I wasn't seeing before. However, the issue /may/ not exist at all, or /may/ be attributable to something else entirely. In any case, it shouldn't be a serious problem for normal use, unless you consider "normal use" to be running long-running applications that take a week to come up with an answer, in which case the (potential) memory leak may be a problem. However, in that case, I'd wonder at your sanity in trying to test an acknowledged not-yet-stable gcc-4 on such a required-stable system in the first place! <g>

That should about cover it... <g>

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html

-- 
[email protected] mailing list
