Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On Tuesday 16 April 2013 22:15:32 Michael Mol wrote: Or you might simply know what you're doing. http://en.gentoo-wiki.com/wiki/Hardware_CFLAGS#Determining_available_processor_features Out of curiosity I had a look at that guide, but `$ echo | gcc -march=native -v -E - 2>&1 | grep cc1` told me I have an i7 processor. In fact it's an i5, which I think is an i7 without hyperthreading. I decided to leave well alone. Was that too cautious? -- Peter
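A note on reading that output: gcc's -march names refer to microarchitecture families rather than marketing names, so an i5 being reported as corei7 is probably expected rather than a misdetection. The sketch below only pulls the -march value out of the cc1 line; the helper name march_from_cc1 is made up for illustration and is not part of gcc or the wiki guide.

```shell
# Hypothetical helper: read a cc1 command line (as printed by gcc -v)
# on stdin and print the value of its -march= option.
march_from_cc1() {
  sed -n 's/.*-march=\([^ ]*\).*/\1/p' | head -n 1
}

# Usage on a real machine (requires gcc):
#   echo | gcc -march=native -v -E - 2>&1 | grep cc1 | march_from_cc1
```

The rest of the cc1 line (the individual -m flags) is what matters if you want to pin CFLAGS explicitly instead of using -march=native.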
SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 2:02 PM, Michael Mol mike...@gmail.com wrote: Were this one of my systems (none of which is in a prod scenario, so take it with a grain of salt), I'd emerge -e --keep-going @system, and then emerge --resume a few times. You're stuck in something not unlike a bootstrap scenario. Ok, well, the DB was down, and I had the data backed up, so last resort, I switched back to the 32bit kernel, rebooted, and started the first emerge -e --keep-going @system, and left for home to continue working on it from there... It was done by the time I got home (about 25 minute drive), so didn't take nearly as long as I had feared - mostly because about 28 packages - most of them the ones that take a really long time (like glib, glibc and gcc) died almost immediately... After the first one completed, I did emerge --resume until everything was emerged. Then I started it all over again, and this time, *everything* recompiled successfully! But, apache still wouldn't start up. The error was PHP related, so, I rebuilt that with emerge -vu (with 5.4 masked so it would pull in the latest update to 5.3 since emerging -vuk (reinstalling the quickpkg'd masked version) didn't work - and this time PHP successfully updated, and presto, everything is now working as expected! I'm still planning on finishing up the new server (had already started on it) and migrating the DB to it, but now the pressure is off. So, massive thanks! to Michael for the suggestion (had heard of totally rebuilding the entire system using -e and --keep-going, but never done it)... and of course, gentoo is amazing. Charles
Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On 04/16/2013 11:23 AM, Tanstaafl wrote: On 2013-04-15 2:02 PM, Michael Mol mike...@gmail.com wrote: Were this one of my systems (none of which is in a prod scenario, so take it with a grain of salt), I'd emerge -e --keep-going @system, and then emerge --resume a few times. You're stuck in something not unlike a bootstrap scenario. Ok, well, the DB was down, and I had the data backed up, so last resort, I switched back to the 32bit kernel, rebooted, and started the first emerge -e --keep-going @system, and left for home to continue working on it from there... It was done by the time I got home (about 25 minute drive), so didn't take nearly as long as I had feared - mostly because about 28 packages - most of them the ones that take a really long time (like glib, glibc and gcc) died almost immediately... After the first one completed, I did emerge --resume until everything was emerged. Then I started it all over again, and this time, *everything* recompiled successfully! But, apache still wouldn't start up. The error was PHP related, so, I rebuilt that with emerge -vu (with 5.4 masked so it would pull in the latest update to 5.3 since emerging -vuk (reinstalling the quickpkg'd masked version) didn't work - and this time PHP successfully updated, and presto, everything is now working as expected! I'm still planning on finishing up the new server (had already started on it) and migrating the DB to it, but now the pressure is off. So, massive thanks! to Michael for the suggestion (had heard of totally rebuilding the entire system using -e and --keep-going, but never done it)... and of course, gentoo is amazing. To be clear, you didn't rebuild the entire system. You rebuilt core packages. To rebuild the entire system, it'd be: emerge -e @world # Plus whatever else there is. You're still at risk of non-@system packages having risky opcodes. Sounds like PHP turned out to be one of those. You will probably need to rebuild others. But I'm very glad I was able to help. 
:)
Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On 2013-04-16 11:28 AM, Michael Mol mike...@gmail.com wrote: To be clear, you didn't rebuild the entire system. You rebuilt core packages. To rebuild the entire system, it'd be: emerge -e @world Correct - which is why I said @system... ;) # Plus whatever else there is. Hmmm... are there really packages that belong to neither @system or @world? How would I go about finding/updating these? You're still at risk of non-@system packages having risky opcodes. Sounds like PHP turned out to be one of those. You will probably need to rebuild others. Correct again, and am planning on doing this this weekend, *after* I get Linode's backups enabled and get a good snapshot of the system in 'working' (if possibly partially borked) condition... But I'm very glad I was able to help. :) Me too... :)
Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On 04/16/2013 11:50 AM, Tanstaafl wrote: On 2013-04-16 11:28 AM, Michael Mol mike...@gmail.com wrote: To be clear, you didn't rebuild the entire system. You rebuilt core packages. To rebuild the entire system, it'd be: emerge -e @world Correct - which is why I said @system... ;) # Plus whatever else there is. Hmmm... are there really packages that belong to neither @system or @world? How would I go about finding/updating these? I must have missed where you ran emerge -e @world. Oops. :) But, yes, there are packages which don't belong to @system or @world. @world consists of things you've explicitly asked for. @system consists of things that are deemed inherently necessary for system operation. Neither necessarily contains things like x11-libs/libX11, which may be pulled in as a dependency of something in @world or @system (depending on USE flags, etc). Various python modules would also not be found in either @world or @system. @system and @world are explicit statements...there are still the dependencies to worry about. But if you ran emerge -e @world, the -e would have picked those up. [snip]
Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On 2013-04-16 12:12 PM, Michael Mol mike...@gmail.com wrote: I must have missed where you ran emerge -e @world. Oops. :) I didn't... I was replying to your comment that implied that I thought I had rebuilt my entire 'system', when in fact I specified '@system', meaning, only those packages in @system... :) That said... since this entire system was 32bit up until my little mistake, and I only updated/compiled a few packages, my thoughts are that the vast majority of the system should be 'ok' - or am I missing something (wouldn't surprise me if I am)... But, yes, there are packages which don't belong to @system or @world. @world consists of things you've explicitly asked for. @system consists of things that are deemed inherently necessary for system operation. Neither necessarily contains things like x11-libs/libX11, which may be pulled in as a dependency of something in @world or @system (depending on USE flags, etc). Various python modules would also not be found in either @world or @system. @system and @world are explicit statements...there are still the dependencies to worry about. But if you ran emerge -e @world, the -e would have picked those up. Gotcha... thanks for the clarification.
Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On 04/16/2013 02:03 PM, Tanstaafl wrote: On 2013-04-16 12:12 PM, Michael Mol mike...@gmail.com wrote: I must have missed where you ran emerge -e @world. Oops. :) I didn't... I was replying to your comment that implied that I thought I had rebuilt my entire 'system', when in fact I specified '@system', meaning, only those packages in @system... :) That said... since this entire system was 32bit up until my little mistake, and I only updated/compiled a few packages, my thoughts are that the vast majority of the system should be 'ok' - or am I missing something (wouldn't surprise me if I am)... Nuke the site from orbit...$FILL_IN_THE_BLANK :) It's unfortunate there's no tool to perform as revdep-rebuild, except checking that, e.g. a package was built with the current CHOST or CFLAGS set. The fact that I can run 'emerge --info $atomname' to get the build environment for a given $atomname tells me the system has enough information that this is possible. I simply don't know the finer details of where all this information lurks. But if I had such a tool, it would be of immense use to me while installing new systems; no need to emerge -e @world...
Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On Tue, Apr 16, 2013 at 1:09 PM, Michael Mol mike...@gmail.com wrote: It's unfortunate there's no tool to perform as revdep-rebuild, except checking that, e.g. a package was built with the current CHOST or CFLAGS set. The fact that I can run 'emerge --info $atomname' to get the build environment for a given $atomname tells me the system has enough information that this is possible. I simply don't know the finer details of where all this information lurks. But if I had such a tool, it would be of immense use to me while installing new systems; no need to emerge -e @world... Check out /var/db/pkg/$CATEGORY/$PKGNAME/ -- there are text files containing CFLAGS, CHOST and many others. You or someone like you should be able to hack together a simple script to look for differences. :)
Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On Tue, 16 Apr 2013 13:18:51 -0500, Paul Hartman wrote: It's unfortunate there's no tool to perform as revdep-rebuild, except checking that, e.g. a package was built with the current CHOST or CFLAGS set. The fact that I can run 'emerge --info $atomname' to get the build environment for a given $atomname tells me the system has enough information that this is possible. I simply don't know the finer details of where all this information lurks. But if I had such a tool, it would be of immense use to me while installing new systems; no need to emerge -e @world... Check out /var/db/pkg/$CATEGORY/$PKGNAME/ -- there are text files containing CFLAGS, CHOST and many others. You or someone like you should be able to hack together a simple script to look for differences. :) % source /etc/portage/make.conf % for f in /var/db/pkg/*/*/CFLAGS; do [[ $(cat $f) == $CFLAGS ]] || echo $f; done It does give quite a few hits though, because ebuilds can strip out flags. Of course, that in itself may be an indication that you have over-ricered your CFLAGS ;-) -- Neil Bothwick There are only two tragedies in life: one is not getting what one wants; and the other is getting it. - Oscar Wilde (1854-1900)
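The one-liner above can be stretched a little to cover CHOST as well and to print package names rather than file paths. This is only a sketch under the assumptions already made in the thread (one plain-text file per variable under /var/db/pkg/$CATEGORY/$PKGNAME/); the function name find_mismatched is invented here, and as noted it will still report packages whose ebuilds filter flags on purpose.

```shell
# Hypothetical helper: list installed packages whose recorded build
# variable differs from the expected value.
# Arguments: <package db dir> <file name, e.g. CFLAGS or CHOST> <expected value>
find_mismatched() {
  pkgdb=$1
  var=$2
  expected=$3
  for f in "$pkgdb"/*/*/"$var"; do
    [ -f "$f" ] || continue          # skip the unexpanded glob if nothing matches
    recorded=$(cat "$f")
    if [ "$recorded" != "$expected" ]; then
      dir=${f%/*}                    # drop the trailing /CFLAGS
      pkg=${dir#"$pkgdb"/}           # keep category/package-version
      printf '%s\n' "$pkg"
    fi
  done
}

# Usage on a Gentoo box:
#   . /etc/portage/make.conf
#   find_mismatched /var/db/pkg CFLAGS "$CFLAGS"
#   find_mismatched /var/db/pkg CHOST  "$CHOST"
```

The output could be fed to emerge -1 (each entry prefixed with =) to rebuild only the packages that differ, instead of running emerge -e @world.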
Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm
On 04/16/2013 04:53 PM, Neil Bothwick wrote: On Tue, 16 Apr 2013 13:18:51 -0500, Paul Hartman wrote: It's unfortunate there's no tool to perform as revdep-rebuild, except checking that, e.g. a package was built with the current CHOST or CFLAGS set. The fact that I can run 'emerge --info $atomname' to get the build environment for a given $atomname tells me the system has enough information that this is possible. I simply don't know the finer details of where all this information lurks. But if I had such a tool, it would be of immense use to me while installing new systems; no need to emerge -e @world... Check out /var/db/pkg/$CATEGORY/$PKGNAME/ -- there are text files containing CFLAGS, CHOST and many others. You or someone like you should be able to hack together a simple script to look for differences. :) % source /etc/portage/make.conf % for f in /var/db/pkg/*/*/CFLAGS; do [[ $(cat $f) == $CFLAGS ]] || echo $f; done It does give quite a few hits though, because ebuilds can strip out flags. ebuilds should not generally be stripping out flags. Certainly there are occasional and valid cases, but they're pretty rare. Heck, last time I reported a bug where a flag was causing a build failure (since the package was using a compiler different from the system compiler), I was told it wasn't the packaging system's job to deal with that kind of bug. Of course, that in itself may be an indication that you have over-ricered your CFLAGS ;-) Or you might simply know what you're doing. http://en.gentoo-wiki.com/wiki/Hardware_CFLAGS#Determining_available_processor_features Highly valuable if you're going to use distcc.
[gentoo-user] Serious problem with linode vm
Hi all, Help! :( I have a serious problem with our production DB machine hosted on linode, and I hope someone can help me. They recently updated their hardware, but taking advantage of it required 'migrating' our machines... I migrated our dev server first, and it failed to boot after the migration, but a question to their support suggested changing to the most recent 64bit kernel - and this worked, it came up fine, and so did the dev database. The production server appeared to migrate ok, and even booted, but the DB did not come up (postgresql)... I had to remote in through their Lish console because SSH wasn't working either. I attempted to change the kernel to the latest 64bit on this one too to see if that would work, but nothing... I've changed the kernel back to 32bit, but there is weirdness going on... Some things will compile ok (portage, gentoolkit, openssh, openssl), others won't (ncurses, libxml2). Right now I'd just really like to get SSH working again so I can scp in and grab all of my data. I've tried recompiling both (both compile/install ok), but when I try to start SSHD I get: # /etc/init.d/sshd start /etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t ${SSHD_OPTS} * ERROR: sshd failed to start Anyone?
Re: [gentoo-user] Serious problem with linode vm
On 04/15/2013 11:37 AM, Tanstaafl wrote: Hi all, Help! :( [snip] I've tried recompiling both (both compile/install ok), but when I try to start SSHD I get: # /etc/init.d/sshd start /etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t ${SSHD_OPTS} * ERROR: sshd failed to start ^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod server are the same (or close enough) to that of your dev server. Guessing the new host has different CPU capabilities exposed to the guest, either because of a differing hypervisor configuration, or because of the different underlying hardware.
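One way to check the "different CPU capabilities" guess is to compare the flags line of /proc/cpuinfo from before and after the migration (or between the dev and prod guests). The sketch below assumes you have both flag lists available as strings; the function name missing_flags is made up for this example.

```shell
# Hypothetical helper: print every CPU flag present in the first list
# but absent from the second. Both arguments are space-separated lists.
missing_flags() {
  for flag in $1; do
    case " $2 " in
      *" $flag "*) ;;                   # still available on the new host
      *) printf '%s\n' "$flag" ;;       # gone: code using it may raise SIGILL
    esac
  done
}

# Usage: capture the list on each host with
#   grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2
# then run: missing_flags "$old_flags" "$new_flags"
```

Any flag printed is a candidate for the instruction set that binaries built with -march=native on the old host may be relying on.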
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote: On 04/15/2013 11:37 AM, Tanstaafl wrote: Hi all, Help! :( [snip] I've tried recompiling both (both compile/install ok), but when I try to start SSHD I get: # /etc/init.d/sshd start /etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t ${SSHD_OPTS} * ERROR: sshd failed to start ^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod server are the same (or close enough) to that of your dev server. Guessing the new host has different CPU capabilities exposed to the guest, either because of a differing hypervisor configuration, or because of the different underlying hardware. Thanks Michael - will check this as soon as I can (in the middle of another compile attempt now)... So, if this is the case... would that mean I need to remerge @system (and eventually @world)?
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote: ^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod server are the same (or close enough) to that of your dev server. Hmmm, they are different... Dev (working) server has: CFLAGS="-O2 -march=i686 -pipe" Prod server has: CFLAGS="-march=native -O2 -pipe" But the Dev server is currently running a 64bit kernel... I'm confused about how this works in a hosted virtual environment. My Dev server failed to come up after the migration, until their tech support suggested switching to the 64bit kernel... did that and it is fine now (or appears to be)... But the Prod server is still on the 32bit kernel... Should I switch it to 64bit and change the CFLAGS to the same as the dev server?
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote: Guessing the new host has different CPU capabilities exposed to the guest, either because of a differing hypervisor configuration, or because of the different underlying hardware. Hmmm again... CHOST is the same on both: CHOST="i686-pc-linux-gnu"
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote: I'm confused about how this works in a hosted virtual environment. My Dev server failed to come up after the migration, until their tech support suggested switching to the 64bit kernel... did that and it is fine now (or appears to be)... But the Prod server is still on the 32bit kernel... Should I switch it to 64bit and change the CFLAGS to the same as the dev server? Can you run a 64bit kernel on a system that was originally running/compiled with 32bit?
Re: [gentoo-user] Serious problem with linode vm
On 15/04/13 16:53, Tanstaafl wrote: On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote: Guessing the new host has different CPU capabilities exposed to the guest, either because of a differing hypervisor configuration, or because of the different underlying hardware. Hmmm again... CHOST is the same on both: CHOST="i686-pc-linux-gnu" Hi Tanstaafl, Basically your issue is that your Gentoo system is compiled to use a specific instruction set. By doing this you get a very small performance increase on that exact CPU model, at the cost of incompatibility with other CPUs. In a virtual environment, especially one where you do not control the hardware, this is not a great idea, since your provider can swap out your CPU for a different model and there isn't much you can do about it. If I were you I'd boot off the rescue CD Linode provide, mount your root device, chroot in and set the following values in make.conf for both systems: CFLAGS="-O2 -mtune=generic -pipe" CHOST="i686-pc-linux-gnu" These settings will build packages that will work on almost every modern CPU out there. Once they are set you'll need to re-build @system and @world; hopefully the system will be able to cope with that, or you're looking at a full re-build. You also want to be using 32 bit kernels on both systems since they are 32bit systems. --- Peter
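As a quick guard against the situation described above, a check along these lines could be run on any guest before trusting its builds. It is only a sketch: the function name uses_native is invented, and it looks just for the two "native" options, not for other CPU-specific flags.

```shell
# Hypothetical helper: succeed if a flags string asks gcc to target the
# build host's own CPU, which is fragile when the provider can move the guest.
uses_native() {
  case " $1 " in
    *" -march=native "*|*" -mtune=native "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on a Gentoo box:
#   . /etc/portage/make.conf
#   uses_native "$CFLAGS" && echo "CFLAGS depend on the current host CPU"
```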
Re: [gentoo-user] Serious problem with linode vm
On 04/15/2013 11:53 AM, Tanstaafl wrote: On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote: Guessing the new host has different CPU capabilities exposed to the guest, either because of a differing hypervisor configuration, or because of the different underlying hardware. Hmmm again... CHOST is the same on both: CHOST="i686-pc-linux-gnu" Argh. Reply to your own posts if you need to append content. Otherwise, I can't easily address everything at once. Anyway, you can (I believe) run a 64-bit kernel with a 32-bit CHOST. Your system is a tad hobbled that way, but it should work. It'd be like running multilib without the 64-bit side of things. Set your CFLAGS on your prod server to that of your dev server, if your dev server is known to work. You're using -march=native on your prod server, which depends on gcc correctly detecting CPU features from the host. There was a thread on this list just a few days ago about how that can fail in virtualized environments. (You can enable/disable exposed features piecemeal, which could well confuse the heck out of gcc's detection heuristics...) I don't know which instruction is 'illegal' on your new host, so, yeah, the safest path is going to be emerging, well, everything. You don't want some --as-needed lib getting pulled in some time down the road, causing a real headscratcher of a crash. As the saying goes[1], "Nuke everything from orbit. It's the only way to be sure." You might be best served by setting up a new VM from scratch and copying over the bulk of your configuration (USE flags, daemon configurations, etc.). It's certainly something you should look into once you get this VM hobbling along again. [1] Where'd that come from, anyway?
Re: [gentoo-user] Serious problem with linode vm
On 04/15/2013 12:07 PM, Tanstaafl wrote: On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote: I'm confused about how this works in a hosted virtual environment. My Dev server failed to come up after the migration, until their tech support suggested switching to the 64bit kernel... did that and it is fine now (or appears to be)... But the Prod server is still on the 32bit kernel... Should I switch it to 64bit and change the CFLAGS to the same as the dev server? Can you run a 64bit kernel on a system that was originally running/compiled with 32bit? I don't see why not. The 64-bit kernel provides all the hooks necessary for a 32-bit userspace.
Re: [gentoo-user] Serious problem with linode vm
Michael Mol mike...@gmail.com wrote: On 04/15/2013 12:07 PM, Tanstaafl wrote: On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote: I'm confused about how this works in a hosted virtual environment. My Dev server failed to come up after the migration, until their tech support suggested switching to the 64bit kernel... did that and it is fine now (or appears to be)... But the Prod server is still on the 32bit kernel... Should I switch it to 64bit and change the CFLAGS to the same as the dev server? Can you run a 64bit kernel on a system that was originally running/compiled with 32bit? I don't see why not. The 64-bit kernel provides all the hooks necessary for a 32-bit userspace. If you do this, be sure to set the configs to emulate 32-bit otherwise your 32-bit apps will not work! -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 12:51 PM, cov...@ccs.covici.com cov...@ccs.covici.com wrote: Michael Mol mike...@gmail.com wrote: On 04/15/2013 12:07 PM, Tanstaafl wrote: Can you run a 64bit kernel on a system that was originally running/compiled with 32bit? I don't see why not. The 64-bit kernel provides all the hooks necessary for a 32-bit userspace. If you do this, be sure to set the configs to emulate 32-bit otherwise your 32-bit apps will not work! Well... I don't know what to say, but my dev server - the working one - wouldn't even boot with a 32bit kernel after the migration (that is what it was running before), and it is running fine right now (apache/php and postgresql db) on a 64bit kernel...
Re: [gentoo-user] Serious problem with linode vm
On 04/15/2013 12:51 PM, cov...@ccs.covici.com wrote: Michael Mol mike...@gmail.com wrote: On 04/15/2013 12:07 PM, Tanstaafl wrote: On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote: I'm confused about how this works in a hosted virtual environment. My Dev server failed to come up after the migration, until their tech support suggested switching to the 64bit kernel... did that and it is fine now (or appears to be)... But the Prod server is still on the 32bit kernel... Should I switch it to 64bit and change the CFLAGS to the same as the dev server? Can you run a 64bit kernel on a system that was originally running/compiled with 32bit? I don't see why not. The 64-bit kernel provides all the hooks necessary for a 32-bit userspace. If you do this, be sure to set the configs to emulate 32-bit otherwise your 32-bit apps will not work! Which configs? Be specific; to a 32-bit x86 process[1], a 64-bit kernel looks pretty much like a 32-bit kernel. [1] I just realized I can no longer say 32-bit and expect it to exactly mean x86. Going forward in general conversations, it could well mean x32...
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote: On 04/15/2013 11:37 AM, Tanstaafl wrote: Hi all, Help! :( [snip] I've tried recompiling both (both compile/install ok), but when I try to start SSHD I get: # /etc/init.d/sshd start /etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t ${SSHD_OPTS} * ERROR: sshd failed to start ^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod server are the same (or close enough) to that of your dev server. Guessing the new host has different CPU capabilities exposed to the guest, either because of a differing hypervisor configuration, or because of the different underlying hardware. Ok, as I said, I got SSH working now and am making progress, updating @system a little at a time... Before I started updating everything in @system though, I tried ncurses again (one of the first ones that failed on me), and it still dies with this: INFO: setup Package: sys-libs/ncurses-5.9-r2 Repository: gentoo Maintainer: base-sys...@gentoo.org USE: abi_x86_32 cxx elibc_glibc gpm kernel_linux unicode userland_GNU x86 FEATURES: sandbox INFO: unpack Applying ncurses-5.8-gfbsd.patch ... Applying ncurses-5.7-nongnu.patch ... Applying ncurses-5.9-rxvt-unicode-9.15.patch ... Applying ncurses-5.9-fix-clang-build.patch ... ERROR: compile ERROR: sys-libs/ncurses-5.9-r2 failed (compile phase): (no error message) Call stack: ebuild.sh, line 93: Called src_compile environment, line 2340: Called do_compile 'narrowc' environment, line 467: Called die The specific snippet of code: emake ${make_flags} || die If you need support, post the output of `emerge --info '=sys-libs/ncurses-5.9-r2'`, the complete build log and the output of `emerge -pqv '=sys-libs/ncurses-5.9-r2'`. The complete build log is located at '/var/tmp/portage/sys-libs/ncurses-5.9-r2/temp/build.log'. The ebuild environment file is located at '/var/tmp/portage/sys-libs/ncurses-5.9-r2/temp/environment'.
Working directory: '/var/tmp/portage/sys-libs/ncurses-5.9-r2/work/narrowc' S: '/var/tmp/portage/sys-libs/ncurses-5.9-r2/work/ncurses-5.9' Ideas?
Re: [gentoo-user] Serious problem with linode vm
On 04/15/2013 01:46 PM, Tanstaafl wrote: On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote: On 04/15/2013 11:37 AM, Tanstaafl wrote: Hi all, Help! :( [snip] I've tried recompiling both (both compile/install ok), but when I try to start SSHD I get: # /etc/init.d/sshd start /etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t ${SSHD_OPTS} * ERROR: sshd failed to start ^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod server are the same (or close enough) to that of your dev server. Guessing the new host has different CPU capabilities exposed to the guest, either because of a differing hypervisor configuration, or because of the different underlying hardware. Ok, as I said, I got SSH working now and am making progress, updating @system a little at a time... Before I started updating everything in @system though, I tried ncurses again (one of the first ones that failed on me), and it still dies with this: INFO: setup Package: sys-libs/ncurses-5.9-r2 Repository: gentoo Maintainer: base-sys...@gentoo.org USE: abi_x86_32 cxx elibc_glibc gpm kernel_linux unicode userland_GNU x86 FEATURES: sandbox INFO: unpack Applying ncurses-5.8-gfbsd.patch ... Applying ncurses-5.7-nongnu.patch ... Applying ncurses-5.9-rxvt-unicode-9.15.patch ... Applying ncurses-5.9-fix-clang-build.patch ... ERROR: compile ERROR: sys-libs/ncurses-5.9-r2 failed (compile phase): (no error message) Call stack: ebuild.sh, line 93: Called src_compile environment, line 2340: Called do_compile 'narrowc' environment, line 467: Called die The specific snippet of code: emake ${make_flags} || die If you need support, post the output of `emerge --info '=sys-libs/ncurses-5.9-r2'`, the complete build log and the output of `emerge -pqv '=sys-libs/ncurses-5.9-r2'`. The complete build log is located at '/var/tmp/portage/sys-libs/ncurses-5.9-r2/temp/build.log'. The ebuild environment file is located at '/var/tmp/portage/sys-libs/ncurses-5.9-r2/temp/environment'.
Working directory: '/var/tmp/portage/sys-libs/ncurses-5.9-r2/work/narrowc' S: '/var/tmp/portage/sys-libs/ncurses-5.9-r2/work/ncurses-5.9' Ideas? I'd guess that something used as part of ncurses's build process is failing. Were this one of my systems (none of which is in a prod scenario, so take it with a grain of salt), I'd emerge -e --keep-going @system, and then emerge --resume a few times. You're stuck in something not unlike a bootstrap scenario.
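The "emerge --resume a few times" step can be wrapped in a small retry loop so it does not need babysitting. This is a sketch only: the retry function is made up for illustration, and it assumes emerge exits non-zero whenever some packages still failed, which is worth confirming on your own system before relying on it.

```shell
# Hypothetical helper: run a command up to <max> times, stopping at the
# first success. Returns 0 on success, 1 if every attempt failed.
retry() {
  max=$1
  shift
  n=1
  while ! "$@"; do
    [ "$n" -ge "$max" ] && return 1
    n=$((n + 1))
  done
  return 0
}

# Usage, following the suggestion above:
#   emerge -e --keep-going @system || retry 5 emerge --resume
```

Capping the number of attempts matters here, because a package that genuinely cannot build would otherwise keep the loop going forever.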
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 1:46 PM, Tanstaafl tansta...@libertytrek.org wrote: On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote: On 04/15/2013 11:37 AM, Tanstaafl wrote: Hi all, Help! :( [snip] I've tried recompiling both (both compile/install ok), but when I try to start SSHD I get: # /etc/init.d/sshd start /etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t ${SSHD_OPTS} * ERROR: sshd failed to start ^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod server are the same (or close enough) to that of your dev server. Guessing the new host has different CPU capabilities exposed to the guest, either because of a differing hypervisor configuration, or because of the different underlying hardware. Ok, as I said, I got SSH working now and am making progress, updating @system a little at a time... Ok, I think all I need to get our db back up is to remerge php, but it is failing. The last error appears to be the zlib check. I did already try emerge -1 sys-libs/zlib and retrying to emerge php, but got the same error: checking for ZLIB support... yes checking if the location of ZLIB install directory is defined... no checking for zlib version >= 1.2.0.4... configure: error: libz version greater or equal to 1.2.0.4 required !!! Please attach the following file when seeking support: !!!
/var/tmp/portage/dev-lang/php-5.4.13/work/sapis-build/cli/config.log * ERROR: dev-lang/php-5.4.13 failed (configure phase): * econf failed * * Call stack: * ebuild.sh, line 93: Called src_configure *environment, line 4080: Called econf '--prefix=/usr/lib/php5.4' '--mandir=/usr/lib/php5.4/man' '--infodir=/usr/lib/php5.4/info' '--libdir=/usr/lib/php5.4/lib' '--with-libdir=lib' '--without-pear' '--disable-maintainer-zts' '--disable-bcmath' '--with-bz2=/usr' '--disable-calendar' '--enable-ctype' '--without-curl' '--without-curlwrappers' '--enable-dom' '--without-enchant' '--disable-exif' '--enable-fileinfo' '--enable-filter' '--disable-ftp' '--with-gettext=/usr' '--without-gmp' '--enable-hash' '--without-mhash' '--with-iconv' '--disable-intl' '--disable-ipv6' '--enable-json' '--without-kerberos' '--enable-libxml' '--with-libxml-dir=/usr' '--enable-mbstring' '--with-mcrypt=/usr' '--without-mssql' '--with-onig=/usr' '--with-openssl=/usr' '--with-openssl-dir=/usr' '--disable-pcntl' '--enable-phar' '--disable-pdo' '--with-pgsql=/usr' '--enable-posix' '--with-pspell=/usr' '--without-recode' '--enable-simplexml' '--disable-shmop' '--without-snmp' '--disable-soap' '--enable-sockets' '--without-sqlite3' '--without-sybase-ct' '--disable-sysvmsg' '--disable-sysvsem' '--disable-sysvshm' '--without-tidy' '--enable-tokenizer' '--disable-wddx' '--enable-xml' '--disable-xmlreader' '--disable-xmlwriter' '--with-xmlrpc' '--without-xsl' '--disable-zip' '--with-zlib=/usr' '--disable-debug' '--enable-dba' '--without-cdb' '--with-db4=/usr' '--disable-flatfile' '--with-gdbm=/usr' '--disable-inifile' '--without-qdbm' '--without-freetype-dir' '--without-t1lib' '--disable-gd-jis-conv' '--without-jpeg-dir' '--without-png-dir' '--without-xpm-dir' '--without-gd' '--with-imap=/usr' '--with-imap-ssl=/usr' '--without-mysqli' '--with-readline=/usr' '--without-libedit' '--without-mm' '--with-pic' '--with-pcre-regex=/usr' '--with-pcre-dir=/usr' '--with-config-file-path=/etc/php/cli-php5.4' 
'--with-config-file-scan-dir=/etc/php/cli-php5.4/ext-active' '--disable-embed' '--enable-cli' '--disable-cgi' '--disable-fpm' '--without-apxs2' * phase-helpers.sh, line 521: Called die * The specific snippet of code: * die econf failed * * If you need support, post the output of `emerge --info '=dev-lang/php-5.4.13'`, * the complete build log and the output of `emerge -pqv '=dev-lang/php-5.4.13'`. * The complete build log is located at '/var/tmp/portage/dev-lang/php-5.4.13/temp/build.log'. * The ebuild environment file is located at '/var/tmp/portage/dev-lang/php-5.4.13/temp/environment'. * Working directory: '/var/tmp/portage/dev-lang/php-5.4.13/work/sapis-build/cli' * S: '/var/tmp/portage/dev-lang/php-5.4.13/work/php-5.4.13' I hope this gives someone a hint...
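The Portage output above points at config.log, which records the test program and compiler output behind each configure check. A minimal sketch for pulling out the lines around one check follows; the helper name show_configure_check is made up for illustration, the 15-line context window is arbitrary, and -A assumes GNU grep. Whether the zlib check here failed because of zlib itself or because the test binary crashed is something only the log can tell.

```shell
#!/bin/sh
# Hypothetical helper: print a configure check and the lines that follow it.
# $1 = path to config.log
# $2 = literal text of the check to look for (matched as a fixed string)
show_configure_check() {
    # -n prefixes line numbers, -A 15 shows 15 lines of trailing context,
    # -F treats the pattern literally so dots and brackets need no escaping.
    grep -n -A 15 -F -- "$2" "$1"
}

# Usage, with the path taken from the error message above:
#   show_configure_check \
#     /var/tmp/portage/dev-lang/php-5.4.13/work/sapis-build/cli/config.log \
#     'zlib version'
```

If the context shows the test program dying with a signal rather than reporting an old version, the problem is more likely in how zlib or the toolchain was built than in the zlib version.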
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 2:03 PM, Tanstaafl tansta...@libertytrek.org wrote: Ok, I think all I need to get our db back up is to remerge php, but it is failing. The last error appears to be the zlib check. I did already try emerge -1 sys-libs/zlib and retrying to emerge php, but got the same error: Ok, added -zlib to package.mask and it is compiling now... I just don't know if I need zlib support for our DB app... sigh If this doesn't work I'll try your suggestion of: Were this one of my systems (none of which is in a prod scenario, so take it with a grain of salt), I'd emerge -e --keep-going @system, and then emerge --resume a few times. You're stuck in something not unlike a bootstrap scenario. Thanks a lot Michael... first time anything like this has happened to me in a long time. I forgot what it is like to have users (and bosses) breathing down my neck like this...
Re: [gentoo-user] Serious problem with linode vm
Michael Mol mike...@gmail.com wrote: On 04/15/2013 12:51 PM, cov...@ccs.covici.com wrote: Michael Mol mike...@gmail.com wrote: On 04/15/2013 12:07 PM, Tanstaafl wrote: On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote: I'm confused about how this works in a hosted virtual environment. My Dev server failed to come up after the migration, until their tech support suggested switching to the 64bit kernel... did that and it is fine now (or appears to be)... But the Prod server is still on the 32bit kernel... Should I switch it to 64bit and change the CFLAGS to the same as the dev server? Can you run a 64bit kernel on a system that was originally running/compiled with 32bit? I don't see why not. The 64-bit kernel provides all the hooks necessary for a 32-bit userspace. If you do this, be sure to set the configs to emulate 32-bit otherwise your 32-bit apps will not work! Which configs? Be specific; to a 32-bit x86 process[1], a 64-bit kernel looks pretty much like a 32-bit kernel. [1] I just realized I can no longer say 32-bit and expect it to exactly mean x86. Going forward in general conversations, it could well mean x32... I was thinking primarily of ia32 emulation -- I made a kernel and got burned by not having this on by mistake and then my 64-bit kernel would not execute any 32-bit program. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
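The ia32 emulation setting John mentions corresponds to the CONFIG_IA32_EMULATION option in an x86_64 kernel config. A minimal sketch for checking it is below; the function name ia32_emulation_enabled is made up for illustration, and it expects a plain-text config file, so a compressed /proc/config.gz has to be unpacked first (that file only exists when the kernel was built with CONFIG_IKCONFIG_PROC).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel config enables
# execution of 32-bit x86 binaries on a 64-bit kernel.
# $1 = path to a plain-text kernel config
ia32_emulation_enabled() {
    grep -q '^CONFIG_IA32_EMULATION=y$' "$1"
}

# Usage (paths are assumptions; adjust to where your config lives):
#   zcat /proc/config.gz > /tmp/running.config
#   ia32_emulation_enabled /tmp/running.config && echo "32-bit binaries supported"
#   ia32_emulation_enabled /usr/src/linux/.config || echo "enable CONFIG_IA32_EMULATION"
```

On a hosted VM where the provider supplies the kernel, this only reports the setting; it cannot be changed from inside the guest.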
Re: [gentoo-user] Serious problem with linode vm
On 04/15/2013 02:08 PM, Tanstaafl wrote: On 2013-04-15 2:03 PM, Tanstaafl tansta...@libertytrek.org wrote: Ok, I think all I need to get our db back up is to remerge php, but it is failing. The last error appears to be the zlib check. I did already try emerge -1 sys-libs/zlib and retrying to emerge php, but got the same error: Ok, added -zlib to package.mask and it is compiling now... I just don't know if I need zlib support for our DB app... sigh If this doesn't work I'll try your suggestion of: Were this one of my systems (none of which is in a prod scenario, so take it with a grain of salt), I'd emerge -e --keep-going @system, and then emerge --resume a few times. You're stuck in something not unlike a bootstrap scenario. Thanks a lot Michael... first time anything like this has happened to me in a long time. I forgot what it is like to have users (and bosses) breathing down my neck like this... That system is going to require a great deal of cleanup and maintenance to get fully reliable again. Once everything's been rebuilt, you should be able to have zlib back, etc. It'll just take a while to clean up. I repeat my suggestion that you set up an alternate server and aim to migrate to that. It's amazing what you can do with failover, replication, etc.
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 2:08 PM, Tanstaafl tansta...@libertytrek.org wrote: On 2013-04-15 2:03 PM, Tanstaafl tansta...@libertytrek.org wrote: Ok, I think all I need to get our db back up is to remerge php, but it is failing. The last error appears to be the zlib check. I did already try emerge -1 sys-libs/zlib and retrying to emerge php, but got the same error: Ok, added -zlib to package.mask and it is compiling now... I just don't know if I need zlib support for our DB app... sigh Ok, apparently it requires zlib... apache now starts, but I just get a totally blank web page instead of the login page. Oh well, onward...
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 2:02 PM, Michael Mol mike...@gmail.com wrote: Were this one of my systems (none of which is in a prod scenario, so take it with a grain of salt), I'd emerge -e --keep-going @system, and then emerge --resume a few times. You're stuck in something not unlike a bootstrap scenario. Ok, before I start... Michael, if this were you, would you use the 32bit or 64bit kernel when doing the emerge -e --keep-going system? Again, the system was initially rolled out and was always 32 bit...
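The rebuild Michael describes (one emerge -e --keep-going @system pass, then emerge --resume a few times) can be sketched as a small retry loop. This is only an illustration: the function name rebuild_system, the EMERGE override variable and the default of five resumes are made up here, and the sketch assumes emerge exits non-zero while packages are still failing.

```shell
#!/bin/sh
# Hypothetical wrapper around the rebuild suggested in this thread.
# $1     = maximum number of "emerge --resume" attempts (default 5)
# EMERGE = command to run instead of emerge (useful for a dry run)
rebuild_system() {
    max_resumes=${1:-5}
    emerge_cmd=${EMERGE:-emerge}

    # First pass: rebuild every package in @system, skipping over failures.
    "$emerge_cmd" -e --keep-going @system && return 0

    # Later passes: pick up what is left until it succeeds or we give up.
    i=0
    while [ "$i" -lt "$max_resumes" ]; do
        "$emerge_cmd" --resume && return 0
        i=$((i + 1))
    done
    return 1
}

# Usage (as root, ideally inside screen or tmux on a remote box):
#   rebuild_system 5 || echo "still failing, check the build logs"
```

Packages that fail for the same reason on every pass will not be fixed by retrying; the loop only helps when an early failure was caused by a dependency that has since been rebuilt.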
Re: [gentoo-user] Serious problem with linode vm
On 04/15/2013 02:54 PM, Tanstaafl wrote: On 2013-04-15 2:02 PM, Michael Mol mike...@gmail.com wrote: Were this one of my systems (none of which is in a prod scenario, so take it with a grain of salt), I'd emerge -e --keep-going @system, and then emerge --resume a few times. You're stuck in something not unlike a bootstrap scenario. Ok, before I start... Michael, if this were you, would you use the 32bit or 64bit kernel when doing the emerge -e --keep-going system? Again, the system was initially rolled out and was always 32 bit... If this were me, I would set up a clean install from scratch. No, I wouldn't use an x86 userspace with an x64 kernel, but that's because of the benefits I see with the 64-bit arch, not with any issues I'd be aware of from using an x64 kernel with a 32-bit x86 userspace. To me, that's the fastest way to get a system I'd deem reliable. But it's a lot faster to do with distros other than Gentoo, and rather requires having an up-to-date install script if you intend to do it with Gentoo... You're in an ugly scenario, though, because you don't have the benefit of a spare environment to produce a prod setup within. You've mentioned you couldn't get the system to run at all with a 32-bit kernel on the new hardware. Fair enough. I wouldn't dare try changing the system from a 32-bit CHOST to a 64-bit CHOST, though; I've never walked that path before, even if there are those who have. It's certainly not something I'd do on a should-be-live prod system. In your position, if I had to use the existing system without a from-scratch build/install, I would continue with the 32-bit userland and 64-bit kernel. To me, that's the least risky of the alternatives, given the constraints involved.
To be clear, that's also a last resort...I would lobby *hard* to do a clean from-scratch setup in a different VM before treading that path (even when I do major upgrades of rosettacode.org, I go through a brief period where I have two VMs as I migrate services from one to the other), and my keyboard might well not survive the impacts of my hands while I typed out commands to do it any other way; I'm very hard on keyboards when very angry.
PostgreSQL guy in the house? - WAS: Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 3:11 PM, Michael Mol mike...@gmail.com wrote: If this were me, I would set up a clean install from scratch. No, I wouldn't use an x86 userspace with an x64 kernel, but that's because of the benefits I see with the 64-bit arch, not with any issues I'd be aware of from using an x64 kernel with a 32-bit x86 userspace. I understand and agree, and am doing that as we speak. I was just trying to get it back up and running quickly, but that didn't happen. Ok - now... is there a postgresql guy in the house? Can someone confirm that the command I need to use to dump the entire pg database for a full restore on a new/clean machine would be: pg_dumpall --username=username -o -f /home/myuser/mydb_backup.sql.gz ? Will that get everything? I'm also planning on stopping pg (it is running ok, it is PHP that is the problem), then just tar -pvczf /home/myuser/pg91_data.tar.gz /var/lib/postgresql/9.1/data will that suffice as another backup of all of the data that could be used for restoration? Thanks guys... this was not a fun day...
Re: [gentoo-user] Serious problem with linode vm
On 2013-04-15 12:10 PM, Michael Mol mike...@gmail.com wrote: Argh. Reply to your own posts if you need to append content. Otherwise, I can't easily address everything at once. Sorry, I usually do, but I'm kind of flustered right now... Anyway, you can (I believe) run a 64-bit kernel with a 32-bit CHOST. Your system is a tad hobbled that way, but it should work. It'd be like running multilib without the 64-bit side of things. I went ahead and switched back to the 32bit kernel, updated gcc, recompiled openssl/openssh and finally got ssh working again (whew, their Lish Web console sucks)... Set your CFLAGS on your prod server to that of your dev server, if your dev server is known to work. You're using -march=native on your prod server, which depends on gcc correctly detecting CPU features from the host. There was a thread on this list just a few days ago about how that can fail in virtualized environments. (You can enable/disable exposed features piecemeal, which could well confuse the heck out of gcc's detection heuristics...) I think I know now - it was glib that got compiled, but not while running a 64bit kernel... it must have been the -march=native that screwed it up. I don't know which instruction is 'illegal' on your new host, so, yeah, the safest path is going to be emerging, well, everything. You don't want some --as-needed lib getting pulled in some time down the road, causing a real headscratcher of a crash. As the saying goes[1], Nuke everything from orbit. It's the only way to be sure. snip [1] Where'd that come from, anyway? Aliens? Thanks again Michael
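To compare what -march=native resolves to on the dev and prod guests, a commonly used gcc invocation prints the cc1 command line, which can then be filtered down to its -march/-mtune options. A minimal sketch follows; the helper name extract_march_flags is made up for illustration, and the result depends entirely on which CPU features the hypervisor exposes to the guest.

```shell
#!/bin/sh
# Hypothetical helper: read a cc1 command line on stdin and print only the
# -march= and -mtune= options, one per line.
extract_march_flags() {
    tr ' ' '\n' | grep -E '^-m(arch|tune)='
}

# Usage (run on both guests and compare the results):
#   gcc -march=native -E -v - </dev/null 2>&1 | grep cc1 | extract_march_flags
```

If the two guests disagree, replacing -march=native in make.conf with an explicit value that both hosts support is the conservative choice, which is in line with Michael's advice to copy the known-good CFLAGS from the dev server.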
Re: PostgreSQL guy in the house? - WAS: Re: [gentoo-user] Serious problem with linode vm
Tanstaafl tansta...@libertytrek.org wrote: On 2013-04-15 3:11 PM, Michael Mol mike...@gmail.com wrote: If this were me, I would set up a clean install from scratch. No, I wouldn't use an x86 userspace with an x64 kernel, but that's because of the benefits I see with the 64-bit arch, not with any issues I'd be aware of from using an x64 kernel with a 32-bit x86 userspace. I understand and agree, and am doing that as we speak. I was just trying to get it back up and running quickly, but that didn't happen. Ok - now... is there a postgresql guy in the house? Can someone confirm that the command I need to use to dump the entire pg database for a full restore on a new/clean machine would be: pg_dumpall --username=username -o -f /home/myuser/mydb_backup.sql.gz ? Will that get everything? I'm also planning on stopping pg (it is running ok, it is PHP that is the problem), then just tar -pvczf /home/myuser/pg91_data.tar.gz /var/lib/postgresql/9.1/data will that suffice as another backup of all of the data that could be used for restoration? Thanks guys... this was not a fun day... Tanstaafl, the pg_dumpall command will generate SQL scripts to restore the entire data structure and data needed to rebuild the entire database server. The SQL will not be compressed, so I would leave the .gz off the filename. You will also need the configuration files pg_hba.conf and postgresql.conf. (Doing this from memory on my mobile.) Best also have a quick check on the postgresql website and mailing list. The last migration to a new server was done by backing up every database separately using the pg_dump command. This made restoring simpler because the template databases already exist when the database is running. The tar command will also get nearly everything if you kept the default locations. Restoring that should also suffice if you restore it to a 9.1 postgresql. Don't forget the files in /etc/postgresql*/ Any questions, put them on here. I'm off to my customer soon.
Should be back on email in about 1.5 hours... -- Joost Roeleveld
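Joost's point that pg_dumpall writes plain, uncompressed SQL suggests dumping first and compressing as a separate step. A minimal sketch is below; the function name dump_cluster_gz, the PG_DUMPALL and PG_SUPERUSER override variables and the default role name postgres are assumptions made for illustration, not something taken from this thread.

```shell
#!/bin/sh
# Hypothetical helper: dump the whole cluster to a plain SQL file, then
# compress it. The result is "$1.gz".
# $1           = path of the SQL file to write (without .gz)
# PG_DUMPALL   = command to run instead of pg_dumpall (for dry runs)
# PG_SUPERUSER = database role to connect as (default: postgres)
dump_cluster_gz() {
    sql_file=$1
    # Stop here if the dump fails, so a broken file is never compressed.
    "${PG_DUMPALL:-pg_dumpall}" --username="${PG_SUPERUSER:-postgres}" \
        -f "$sql_file" || return 1
    gzip -f "$sql_file"
}

# Usage:
#   dump_cluster_gz /home/myuser/mydb_backup.sql
# Restore on the new machine (connects to the default "postgres" database):
#   gunzip -c /home/myuser/mydb_backup.sql.gz | psql --username=postgres -f - postgres
```

As noted in the reply, pg_hba.conf, postgresql.conf and anything under /etc/postgresql*/ are not part of the SQL dump and need to be copied separately.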