Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-17 Thread Peter Humphrey
On Tuesday 16 April 2013 22:15:32 Michael Mol wrote:

 Or you might simply know what you're doing.
 
 http://en.gentoo-wiki.com/wiki/Hardware_CFLAGS#Determining_available_processor_features

Out of curiosity I had a look at that guide, but
`$ echo | gcc -march=native -v -E - 2>&1 | grep cc1` told me I have an i7
processor. In fact it's an i5, which I think is an i7 without
hyperthreading.
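
For anyone who only wants the architecture gcc settled on rather than the whole cc1 line, here is a minimal sketch. The helper name `extract_march` is made up for illustration and is not part of gcc or Portage; it simply filters whatever text it is given.

```shell
# extract_march: hypothetical helper, not part of gcc or Portage.
# Reads compiler output on stdin and prints the first -march=... token found.
extract_march() {
    grep -o -e '-march=[^ ]*' | head -n 1
}
```

It would be used as `echo | gcc -march=native -v -E - 2>&1 | grep cc1 | extract_march`. Note that the name gcc prints is the closest architecture family it recognises, not the marketing name of the chip, which may be why an i5 gets reported under an i7-era label.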

I decided to leave well alone. Was that too cautious?

-- 
Peter


SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-16 Thread Tanstaafl

On 2013-04-15 2:02 PM, Michael Mol mike...@gmail.com wrote:

Were this one of my systems (none of which is in a prod scenario, so
take it with a grain of salt), I'd emerge -e --keep-going @system, and
then emerge --resume a few times. You're stuck in something not unlike a
bootstrap scenario.


Ok, well, the DB was down, and I had the data backed up, so last resort, 
I switched back to the 32bit kernel, rebooted, and started the first 
emerge -e --keep-going @system, and left for home to continue working on 
it from there...


It was done by the time I got home (about 25 minute drive), so didn't 
take nearly as long as I had feared - mostly because about 28 packages - 
most of them the ones that take a really long time (like glib, glibc and 
gcc) died almost immediately...


After the first one completed, I did emerge --resume until everything 
was emerged.


Then I started it all over again, and this time, *everything* recompiled 
successfully!


But, apache still wouldn't start up. The error was PHP related, so, I 
rebuilt that with emerge -vu (with 5.4 masked so it would pull in the 
latest update to 5.3 since emerging -vuk (reinstalling the quickpkg'd 
masked version) didn't work) - and this time PHP successfully updated, 
and presto, everything is now working as expected!


I'm still planning on finishing up the new server (had already started 
on it) and migrating the DB to it, but now the pressure is off.


So, massive thanks! to Michael for the suggestion (had heard of totally 
rebuilding the entire system using -e and --keep-going, but never done 
it)... and of course, gentoo is amazing.


Charles



Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-16 Thread Michael Mol
On 04/16/2013 11:23 AM, Tanstaafl wrote:
 On 2013-04-15 2:02 PM, Michael Mol mike...@gmail.com wrote:
 Were this one of my systems (none of which is in a prod scenario, so
 take it with a grain of salt), I'd emerge -e --keep-going @system, and
 then emerge --resume a few times. You're stuck in something not unlike a
 bootstrap scenario.
 
 Ok, well, the DB was down, and I had the data backed up, so last resort,
 I switched back to the 32bit kernel, rebooted, and started the first
 emerge -e --keep-going @system, and left for home to continue working on
 it from there...
 
 It was done by the time I got home (about 25 minute drive), so didn't
 take nearly as long as I had feared - mostly because about 28 packages -
 most of them the ones that take a really long time (like glib, glibc and
 gcc) died almost immediately...
 
 After the first one completed, I did emerge --resume until everything
 was emerged.
 
 Then I started it all over again, and this time, *everything* recompiled
 successfully!
 
 But, apache still wouldn't start up. The error was PHP related, so, I
 rebuilt that with emerge -vu (with 5.4 masked so it would pull in the
 latest update to 5.3 since emerging -vuk (reinstalling the quickpkg'd
 masked version) didn't work) - and this time PHP successfully updated,
 and presto, everything is now working as expected!
 
 I'm still planning on finishing up the new server (had already started
 on it) and migrating the DB to it, but now the pressure is off.
 
 So, massive thanks! to Michael for the suggestion (had heard of totally
 rebuilding the entire system using -e and --keep-going, but never done
 it)... and of course, gentoo is amazing.

To be clear, you didn't rebuild the entire system. You rebuilt core
packages. To rebuild the entire system, it'd be:

emerge -e @world # Plus whatever else there is.

You're still at risk of non-@system packages having risky opcodes.
Sounds like PHP turned out to be one of those. You will probably need to
rebuild others.

But I'm very glad I was able to help. :)






Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-16 Thread Tanstaafl

On 2013-04-16 11:28 AM, Michael Mol mike...@gmail.com wrote:

To be clear, you didn't rebuild the entire system. You rebuilt core
packages. To rebuild the entire system, it'd be:

emerge -e @world


Correct - which is why I said @system... ;)


# Plus whatever else there is.


Hmmm... are there really packages that belong to neither @system nor 
@world? How would I go about finding/updating these?



You're still at risk of non-@system packages having risky opcodes.
Sounds like PHP turned out to be one of those. You will probably need to
rebuild others.


Correct again, and am planning on doing this this weekend, *after* I get 
Linode's backups enabled and get a good snapshot of the system in 
'working' (if possibly partially borked) condition...



But I'm very glad I was able to help. :)


Me too... :)



Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-16 Thread Michael Mol
On 04/16/2013 11:50 AM, Tanstaafl wrote:
 On 2013-04-16 11:28 AM, Michael Mol mike...@gmail.com wrote:
 To be clear, you didn't rebuild the entire system. You rebuilt core
 packages. To rebuild the entire system, it'd be:

 emerge -e @world
 
 Correct - which is why I said @system... ;)
 
 # Plus whatever else there is.
 
 Hmmm... are there really packages that belong to neither @system nor
 @world? How would I go about finding/updating these?

I must have missed where you ran emerge -e @world. Oops. :)

But, yes, there are packages which don't belong to @system or @world.
@world consists of things you've explicitly asked for. @system consists
of things that are deemed inherently necessary for system operation.

Neither necessarily contains things like x11-libs/libX11, which may be
pulled in as a dependency of something in @world or @system (depending
on USE flags, etc). Various python modules would also not be found in
either @world or @system. @system and @world are explicit
statements...there are still the dependencies to worry about.

But if you ran emerge -e @world, the -e would have picked those up.
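
To see which installed packages are there only as dependencies, one rough approach is to compare the installed-package database against the world file. The sketch below assumes the usual Portage locations (/var/db/pkg and /var/lib/portage/world) but takes both as arguments; the function name is hypothetical and the version-stripping pattern is an approximation, not Portage's own parser. Packages that belong to @system will also show up in the output, since they are not listed in the world file either.

```shell
# installed_not_in_world: hypothetical helper, not a Portage tool.
# $1 = installed-package database (normally /var/db/pkg)
# $2 = world file (normally /var/lib/portage/world)
# Prints category/name for every installed package not named in the world file.
installed_not_in_world() {
    vdb=$1
    world=$2
    for dir in "$vdb"/*/*/; do
        [ -d "$dir" ] || continue          # glob matched nothing
        dir=${dir%/}
        pf=${dir##*/}                      # e.g. libX11-1.5.0
        catdir=${dir%/*}
        category=${catdir##*/}             # e.g. x11-libs
        # Strip the trailing version (and optional -rN revision); approximate.
        name=$(printf '%s\n' "$pf" | sed -E 's/-[0-9][^-]*(-r[0-9]+)?$//')
        # World entries may carry a slot (dev-lang/php:5.3); compare without it.
        if ! cut -d: -f1 "$world" | grep -qxF "$category/$name"; then
            printf '%s/%s\n' "$category" "$name"
        fi
    done
}
```

A typical call would be `installed_not_in_world /var/db/pkg /var/lib/portage/world`.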

[snip]






Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-16 Thread Tanstaafl

On 2013-04-16 12:12 PM, Michael Mol mike...@gmail.com wrote:

I must have missed where you ran emerge -e @world. Oops. :)


I didn't... I was replying to your comment that implied that I thought I 
had rebuilt my entire 'system', when in fact I specified '@system', 
meaning, only those packages in @system... :)


That said... since this entire system was 32bit up until my little 
mistake, and I only updated/compiled a few packages, my thoughts are 
that the vast majority of the system should be 'ok' - or am I missing 
something (wouldn't surprise me if I am)...



But, yes, there are packages which don't belong to @system or @world.
@world consists of things you've explicitly asked for. @system consists
of things that are deemed inherently necessary for system operation.

Neither necessarily contains things like x11-libs/libX11, which may be
pulled in as a dependency of something in @world or @system (depending
on USE flags, etc). Various python modules would also not be found in
either @world or @system. @system and @world are explicit
statements...there are still the dependencies to worry about.

But if you ran emerge -e @world, the -e would have picked those up.


Gotcha... thanks for the clarification.



Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-16 Thread Michael Mol
On 04/16/2013 02:03 PM, Tanstaafl wrote:
 On 2013-04-16 12:12 PM, Michael Mol mike...@gmail.com wrote:
 I must have missed where you ran emerge -e @world. Oops. :)
 
 I didn't... I was replying to your comment that implied that I thought I
 had rebuilt my entire 'system', when in fact I specified '@system',
 meaning, only those packages in @system... :)
 
 That said... since this entire system was 32bit up until my little
 mistake, and I only updated/compiled a few packages, my thoughts are
 that the vast majority of the system should be 'ok' - or am I missing
 something (wouldn't surprise me if I am)...

Nuke the site from orbit...$FILL_IN_THE_BLANK :)

It's unfortunate there's no tool to perform as revdep-rebuild, except
checking that, e.g. a package was built with the current CHOST or CFLAGS
set. The fact that I can run 'emerge --info $atomname' to get the build
environment for a given $atomname tells me the system has enough
information that this is possible. I simply don't know the finer details
of where all this information lurks. But if I had such a tool, it would
be of immense use to me while installing new systems; no need to emerge
-e @world...





Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-16 Thread Paul Hartman
On Tue, Apr 16, 2013 at 1:09 PM, Michael Mol mike...@gmail.com wrote:

 It's unfortunate there's no tool to perform as revdep-rebuild, except
 checking that, e.g. a package was built with the current CHOST or CFLAGS
 set. The fact that I can run 'emerge --info $atomname' to get the build
 environment for a given $atomname tells me the system has enough
 information that this is possible. I simply don't know the finer details
 of where all this information lurks. But if I had such a tool, it would
 be of immense use to me while installing new systems; no need to emerge
 -e @world...

Check out /var/db/pkg/$CATEGORY/$PKGNAME/ -- there are text files
containing CFLAGS, CHOST and many others. You or someone like you
should be able to hack together a simple script to look for
differences. :)
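
As a starting point, a minimal sketch of such a script for CHOST might look like this. The function name is made up, and the only thing assumed about the database is what is described above: one directory per installed package containing a text file named CHOST.

```shell
# pkgs_with_other_chost: hypothetical helper, not a Portage tool.
# $1 = installed-package database (normally /var/db/pkg)
# $2 = the CHOST you expect (e.g. the value from make.conf)
# Prints the directory and recorded CHOST of each package that differs.
pkgs_with_other_chost() {
    vdb=$1
    expected=$2
    for f in "$vdb"/*/*/CHOST; do
        [ -f "$f" ] || continue            # glob matched nothing
        recorded=$(cat "$f")
        [ "$recorded" = "$expected" ] || printf '%s\t%s\n' "${f%/CHOST}" "$recorded"
    done
}
```

For example, `pkgs_with_other_chost /var/db/pkg i686-pc-linux-gnu` should print nothing on a system where every package was built for that CHOST.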



Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-16 Thread Neil Bothwick
On Tue, 16 Apr 2013 13:18:51 -0500, Paul Hartman wrote:

  It's unfortunate there's no tool to perform as revdep-rebuild, except
  checking that, e.g. a package was built with the current CHOST or
  CFLAGS set. The fact that I can run 'emerge --info $atomname' to get
  the build environment for a given $atomname tells me the system has
  enough information that this is possible. I simply don't know the
  finer details of where all this information lurks. But if I had such
  a tool, it would be of immense use to me while installing new
  systems; no need to emerge -e @world...  
 
 Check out /var/db/pkg/$CATEGORY/$PKGNAME/ -- there are text files
 containing CFLAGS, CHOST and many others. You or someone like you
 should be able to hack together a simple script to look for
 differences. :)

# list packages whose recorded CFLAGS differ from make.conf
% source /etc/portage/make.conf
% for f in /var/db/pkg/*/*/CFLAGS; do
    [[ "$(cat "$f")" == "$CFLAGS" ]] || echo "$f"
  done

It does give quite a few hits though, because ebuilds can strip out
flags. Of course, that in itself may be an indication that you have
over-ricered your CFLAGS ;-)
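
Since an exact comparison trips over stripped flags, a variation is to ask only whether one particular flag was recorded for a package, for instance -march=native. This is a sketch under that assumption; the function name is hypothetical, and it matches the flag as a whole whitespace-separated word.

```shell
# pkgs_built_with_flag: hypothetical helper, not a Portage tool.
# $1 = installed-package database (normally /var/db/pkg)
# $2 = a single compiler flag to look for (e.g. -march=native)
# Prints an =category/package-version atom for each package whose
# recorded CFLAGS contain that flag.
pkgs_built_with_flag() {
    vdb=$1
    flag=$2
    for f in "$vdb"/*/*/CFLAGS; do
        [ -f "$f" ] || continue            # glob matched nothing
        # Pad with spaces so the flag can be matched as a whole word.
        recorded=" $(tr '\n' ' ' < "$f") "
        case $recorded in
            *" $flag "*)
                pkgdir=${f%/CFLAGS}
                catdir=${pkgdir%/*}
                printf '=%s/%s\n' "${catdir##*/}" "${pkgdir##*/}"
                ;;
        esac
    done
}
```

The output of `pkgs_built_with_flag /var/db/pkg -march=native` is a list of atoms that could be reviewed and then handed to `emerge -1` to rebuild just those packages.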


-- 
Neil Bothwick

There are only two tragedies in life: one is not getting what one wants;
and the other is getting it. - Oscar Wilde (1854-1900)




Re: SOLVED - was Re: [gentoo-user] Serious problem with linode vm

2013-04-16 Thread Michael Mol
On 04/16/2013 04:53 PM, Neil Bothwick wrote:
 On Tue, 16 Apr 2013 13:18:51 -0500, Paul Hartman wrote:
 
 It's unfortunate there's no tool to perform as revdep-rebuild, 
 except checking that, e.g. a package was built with the current 
 CHOST or CFLAGS set. The fact that I can run 'emerge --info 
 $atomname' to get the build environment for a given $atomname 
 tells me the system has enough information that this is
 possible. I simply don't know the finer details of where all
 this information lurks. But if I had such a tool, it would be of 
 immense use to me while installing new systems; no need to
 emerge -e @world...
 
 Check out /var/db/pkg/$CATEGORY/$PKGNAME/ -- there are text files 
 containing CFLAGS, CHOST and many others. You or someone like you 
 should be able to hack together a simple script to look for 
 differences. :)
 
 % source /etc/portage/make.conf
 % for f in /var/db/pkg/*/*/CFLAGS; do
     [[ "$(cat "$f")" == "$CFLAGS" ]] || echo "$f"
   done
 
 It does give quite a few hits though, because ebuilds can strip out 
 flags.

ebuilds should not generally be stripping out flags. Certainly there are
occasional and valid cases, but they're pretty rare. Heck, last time I
reported a bug where a flag was causing a build failure (since the
package was using a compiler different from the system compiler), I was
told it wasn't the packaging system's job to deal with that kind of bug.

 Of course, that in itself may be an indication that you have 
 over-ricered your CFLAGS ;-)

Or you might simply know what you're doing.

http://en.gentoo-wiki.com/wiki/Hardware_CFLAGS#Determining_available_processor_features

Highly valuable if you're going to use distcc.





[gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

Hi all,

Help! :(

I have a serious problem with our production DB machine hosted on 
linode, and I hope someone can help me.


They recently updated their hardware, but taking advantage of it 
required 'migrating' our machines...


I migrated our dev server first, and it failed to boot after the 
migration, but a question to their support suggested changing to the 
most recent 64bit kernel - and this worked, it came up fine, and so did 
the dev database.


The production server appeared to migrate ok, and even booted, but the 
DB did not come up (postgresql)...


I had to remote in through their Lish console because SSH wasn't working 
either.


I attempted to change the kernel to the latest 64bit on this one too to 
see if that would work, but nothing...


I've changed the kernel back to 32bit, but there is weirdness going on...

Some things will compile ok (portage, gentoolkit, openssh, openssl), 
others won't (ncurses, libxml2).


Right now I'd just really like to get SSH working again so I can scp in 
and grab all of my data.


I've tried recompiling both (both compile/install ok), but when I try to 
start SSHD I get:


 # /etc/init.d/sshd start
/etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t ${SSHD_OPTS}
* ERROR: sshd failed to start

Anyone?



Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Michael Mol
On 04/15/2013 11:37 AM, Tanstaafl wrote:
 Hi all,
 
 Help! :(

[snip]

 
 I've tried recompiling both (both compile/install ok), but when I try to
 start SSHD I get:
 
  # /etc/init.d/sshd start
 /etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t
 ${SSHD_OPTS}
 * ERROR: sshd failed to start

^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod
server are the same (or close enough) to that of your dev server.

Guessing the new host has different CPU capabilities exposed to the
guest, either because of a differing hypervisor configuration, or
because of the different underlying hardware.
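
One way to check this guess, assuming a copy of /proc/cpuinfo from before the migration is available (or from a guest still on the old hardware), is to compare the feature flags the two hosts expose. The function names below are made up for illustration.

```shell
# cpu_flags: hypothetical helper. Prints, one per line and sorted, the
# feature flags from the first "flags" line of a cpuinfo-style file.
cpu_flags() {
    grep -m 1 '^flags' "$1" | cut -d: -f2 | tr ' ' '\n' | grep -v '^$' | sort -u
}

# missing_cpu_flags OLD NEW: flags present in OLD but absent from NEW,
# i.e. features old binaries may rely on that the new host does not offer.
missing_cpu_flags() {
    old=$(mktemp) && new=$(mktemp) || return 1
    cpu_flags "$1" > "$old"
    cpu_flags "$2" > "$new"
    comm -23 "$old" "$new"
    rm -f "$old" "$new"
}
```

Running `missing_cpu_flags cpuinfo.old /proc/cpuinfo` and getting any output at all would support the theory that binaries built with -march=native on the old host can hit an illegal instruction on the new one.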





Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote:

On 04/15/2013 11:37 AM, Tanstaafl wrote:

Hi all,

Help! :(


[snip]



I've tried recompiling both (both compile/install ok), but when I try to
start SSHD I get:

  # /etc/init.d/sshd start
/etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t
${SSHD_OPTS}
* ERROR: sshd failed to start


^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod
server are the same (or close enough) to that of your dev server.

Guessing the new host has different CPU capabilities exposed to the
guest, either because of a differing hypervisor configuration, or
because of the different underlying hardware.


Thanks Michael - will check this as soon as I can (in the middle of 
another compile attempt now)...


So, if this is the case... would that mean I need to remerge @system 
(and eventually @world)?




Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote:

^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod
server are the same (or close enough) to that of your dev server.


Hmmm, they are different...

Dev (working) server has:

CFLAGS="-O2 -march=i686 -pipe"

Prod server has:

CFLAGS="-march=native -O2 -pipe"

But the Dev server is currently running a 64bit kernel...

I'm confused about how this works in a hosted virtual environment.

My Dev server failed to come up after the migration, until their tech 
support suggested switching to the 64bit kernel... did that and it is 
fine now (or appears to be)...


But the Prod server is still on the 32bit kernel...

Should I switch it to 64bit and change the CFLAGS to the same as the dev 
server?




Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote:

Guessing the new host has different CPU capabilities exposed to the
guest, either because of a differing hypervisor configuration, or
because of the different underlying hardware.


Hmmm again...

CHOST is the same on both:

CHOST=i686-pc-linux-gnu



Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote:

I'm confused about how this works in a hosted virtual environment.

My Dev server failed to come up after the migration, until their tech
support suggested switching to the 64bit kernel... did that and it is
fine now (or appears to be)...

But the Prod server is still on the 32bit kernel...

Should I switch it to 64bit and change the CFLAGS to the same as the dev
server?


Can you run a 64bit kernel on a system that was originally 
running/compiled with 32bit?




Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Peter Wilmott

On 15/04/13 16:53, Tanstaafl wrote:

On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote:

Guessing the new host has different CPU capabilities exposed to the
guest, either because of a differing hypervisor configuration, or
because of the different underlying hardware.


Hmmm again...

CHOST is the same on both:

CHOST=i686-pc-linux-gnu


Hi Tanstaafl,

Basically, your issue is that your Gentoo system is compiled to use a 
specific instruction set. Doing this gets you a very small performance 
increase on that exact CPU model, at the cost of incompatibility with 
other CPUs. In a virtual environment, especially one where you do not 
control the hardware, this is not a great idea, since your provider can 
swap out your CPU for a different model and there isn't much you can do 
about it.


If I were you, I'd boot off the rescue CD Linode provides, mount your 
root device, chroot in and set the following values in make.conf for 
both systems:


CFLAGS="-O2 -mtune=generic -pipe"
CHOST="i686-pc-linux-gnu"

These settings will build packages that will work on almost every modern 
CPU out there. Once they are set, you'll need to re-build @system and 
@world. Hopefully the system will be able to cope with that; otherwise 
you're looking at a full re-build. You also want to be using 32-bit 
kernels on both systems, since they are 32-bit systems.


---
Peter



Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Michael Mol
On 04/15/2013 11:53 AM, Tanstaafl wrote:
 On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote:
 Guessing the new host has different CPU capabilities exposed to the
 guest, either because of a differing hypervisor configuration, or
 because of the different underlying hardware.
 
 Hmmm again...
 
 CHOST is the same on both:
 
 CHOST=i686-pc-linux-gnu
 

Argh. Reply to your own posts if you need to append content. Otherwise,
I can't easily address everything at once.

Anyway, you can (I believe) run a 64-bit kernel with a 32-bit CHOST.
Your system is a tad hobbled that way, but it should work. It'd be
like running multilib without the 64-bit side of things.

Set your CFLAGS on your prod server to that of your dev server, if your
dev server is known to work. You're using -march=native on your prod
server, which depends on gcc correctly detecting CPU features from the
host. There was a thread on this list just a few days ago about how that
can fail in virtualized environments. (You can enable/disable exposed
features piecemeal, which could well confuse the heck out of gcc's
detection heuristics...)

I don't know which instruction is 'illegal' on your new host, so, yeah,
the safest path is going to be emerging, well, everything. You don't
want some --as-needed lib getting pulled in some time down the road,
causing a real headscratcher of a crash. As the saying goes[1], "Nuke
everything from orbit. It's the only way to be sure."

You might be best served by setting up a new VM from scratch and copying
over the bulk of your configuration (USE flags, daemon configurations,
etc.). It's certainly something you should look into once you get this
VM hobbling along again.

[1] Where'd that come from, anyway?





Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Michael Mol
On 04/15/2013 12:07 PM, Tanstaafl wrote:
 On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote:
 I'm confused about how this works in a hosted virtual environment.

 My Dev server failed to come up after the migration, until their tech
 support suggested switching to the 64bit kernel... did that and it is
 fine now (or appears to be)...

 But the Prod server is still on the 32bit kernel...

 Should I switch it to 64bit and change the CFLAGS to the same as the dev
 server?
 
 Can you run a 64bit kernel on a system that was originally
 running/compiled with 32bit?
 

I don't see why not. The 64-bit kernel provides all the hooks necessary
for a 32-bit userspace.





Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread covici
Michael Mol mike...@gmail.com wrote:

 On 04/15/2013 12:07 PM, Tanstaafl wrote:
  On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote:
  I'm confused about how this works in a hosted virtual environment.
 
  My Dev server failed to come up after the migration, until their tech
  support suggested switching to the 64bit kernel... did that and it is
  fine now (or appears to be)...
 
  But the Prod server is still on the 32bit kernel...
 
  Should I switch it to 64bit and change the CFLAGS to the same as the dev
  server?
  
  Can you run a 64bit kernel on a system that was originally
  running/compiled with 32bit?
  
 
 I don't see why not. The 64-bit kernel provides all the hooks necessary
 for a 32-bit userspace.
 
If you do this, be sure to set the configs to emulate 32-bit otherwise
your 32-bit apps will not work!  
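
The option presumably meant here is CONFIG_IA32_EMULATION, which is the x86_64 kernel setting that lets 32-bit binaries run; that reading is an assumption, since the message does not name it. A minimal sketch for checking a kernel config file follows. The function name is made up, and on a hosted VM with a provider-supplied kernel the config may only be readable, not changeable.

```shell
# has_ia32_emulation: hypothetical helper.
# $1 = a plain-text kernel config (e.g. /usr/src/linux/.config, or the
#      output of `zcat /proc/config.gz` saved to a file, if the kernel
#      exposes it).
# Succeeds if 32-bit x86 binary support is built into that kernel.
has_ia32_emulation() {
    grep -q '^CONFIG_IA32_EMULATION=y' "$1"
}
```

For example: `has_ia32_emulation /usr/src/linux/.config && echo "32-bit userspace supported"`.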


-- 
Your life is like a penny.  You're going to lose it.  The question is:
How do you spend it?

 John Covici
 cov...@ccs.covici.com



Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 12:51 PM, cov...@ccs.covici.com cov...@ccs.covici.com wrote:

Michael Mol mike...@gmail.com wrote:

On 04/15/2013 12:07 PM, Tanstaafl wrote:

Can you run a 64bit kernel on a system that was originally
running/compiled with 32bit?



I don't see why not. The 64-bit kernel provides all the hooks necessary
for a 32-bit userspace.



If you do this, be sure to set the configs to emulate 32-bit otherwise
your 32-bit apps will not work!


Well... I don't know what to say, but my dev server - the working one - 
wouldn't even boot with a 32bit kernel after the migration (that is what 
it was running before), and it is running fine right now (apache/php and 
postgresql db) on a 64bit kernel...




Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Michael Mol
On 04/15/2013 12:51 PM, cov...@ccs.covici.com wrote:
 Michael Mol mike...@gmail.com wrote:
 
 On 04/15/2013 12:07 PM, Tanstaafl wrote:
 On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote:
 I'm confused about how this works in a hosted virtual environment.

 My Dev server failed to come up after the migration, until their tech
 support suggested switching to the 64bit kernel... did that and it is
 fine now (or appears to be)...

 But the Prod server is still on the 32bit kernel...

 Should I switch it to 64bit and change the CFLAGS to the same as the dev
 server?

 Can you run a 64bit kernel on a system that was originally
 running/compiled with 32bit?


 I don't see why not. The 64-bit kernel provides all the hooks necessary
 for a 32-bit userspace.

 If you do this, be sure to set the configs to emulate 32-bit otherwise
 your 32-bit apps will not work!  

Which configs? Be specific; to a 32-bit x86 process[1], a 64-bit kernel
looks pretty much like a 32-bit kernel.


[1] I just realized I can no longer say 32-bit and expect it to
exactly mean x86. Going forward in general conversations, it could well
mean x32...





Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote:

On 04/15/2013 11:37 AM, Tanstaafl wrote:

Hi all,

Help! :(


[snip]



I've tried recompiling both (both compile/install ok), but when I try to
start SSHD I get:

  # /etc/init.d/sshd start
/etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t
${SSHD_OPTS}
* ERROR: sshd failed to start


^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod
server are the same (or close enough) to that of your dev server.

Guessing the new host has different CPU capabilities exposed to the
guest, either because of a differing hypervisor configuration, or
because of the different underlying hardware.


Ok, as I said, I got SSH working now and am making progress, updating 
@system a little at a time...


Before I started updating everything in @system though, I tried ncurses 
again (one of the first ones that failed on me), and it still dies with 
this:


INFO: setup
Package:    sys-libs/ncurses-5.9-r2
Repository: gentoo
Maintainer: base-sys...@gentoo.org
USE:        abi_x86_32 cxx elibc_glibc gpm kernel_linux unicode userland_GNU x86
FEATURES:   sandbox
INFO: unpack
Applying ncurses-5.8-gfbsd.patch ...
Applying ncurses-5.7-nongnu.patch ...
Applying ncurses-5.9-rxvt-unicode-9.15.patch ...
Applying ncurses-5.9-fix-clang-build.patch ...
ERROR: compile
ERROR: sys-libs/ncurses-5.9-r2 failed (compile phase):
  (no error message)

Call stack:
    ebuild.sh, line   93:  Called src_compile
  environment, line 2340:  Called do_compile 'narrowc'
  environment, line  467:  Called die
The specific snippet of code:
  emake ${make_flags} || die

If you need support, post the output of `emerge --info '=sys-libs/ncurses-5.9-r2'`,
the complete build log and the output of `emerge -pqv '=sys-libs/ncurses-5.9-r2'`.
The complete build log is located at '/var/tmp/portage/sys-libs/ncurses-5.9-r2/temp/build.log'.
The ebuild environment file is located at '/var/tmp/portage/sys-libs/ncurses-5.9-r2/temp/environment'.

Working directory: '/var/tmp/portage/sys-libs/ncurses-5.9-r2/work/narrowc'
S: '/var/tmp/portage/sys-libs/ncurses-5.9-r2/work/ncurses-5.9'

Ideas?



Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Michael Mol
On 04/15/2013 01:46 PM, Tanstaafl wrote:
 On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote:
 On 04/15/2013 11:37 AM, Tanstaafl wrote:
 Hi all,

 Help! :(

 [snip]


 I've tried recompiling both (both compile/install ok), but when I try to
 start SSHD I get:

   # /etc/init.d/sshd start
 /etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t
 ${SSHD_OPTS}
 * ERROR: sshd failed to start

 ^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod
 server are the same (or close enough) to that of your dev server.

 Guessing the new host has different CPU capabilities exposed to the
 guest, either because of a differing hypervisor configuration, or
 because of the different underlying hardware.
 
 Ok, as I said, I got SSH working now and am making progress, updating
 @system a little at a time...
 
 Before I started updating everything in @system though, I tried ncurses
 again (one of the first ones that failed on me), and it still dies with
 this:
 
 INFO: setup
 Package:sys-libs/ncurses-5.9-r2
 Repository: gentoo
 Maintainer: base-sys...@gentoo.org
 USE:abi_x86_32 cxx elibc_glibc gpm kernel_linux unicode
 userland_GNU x86
 FEATURES:   sandbox
 INFO: unpack
 Applying ncurses-5.8-gfbsd.patch ...
 Applying ncurses-5.7-nongnu.patch ...
 Applying ncurses-5.9-rxvt-unicode-9.15.patch ...
 Applying ncurses-5.9-fix-clang-build.patch ...
 ERROR: compile
 ERROR: sys-libs/ncurses-5.9-r2 failed (compile phase):
   (no error message)
 
 Call stack:
 ebuild.sh, line   93:  Called src_compile
   environment, line 2340:  Called do_compile 'narrowc'
   environment, line  467:  Called die
 The specific snippet of code:
   emake ${make_flags} || die
 
 If you need support, post the output of `emerge --info
 '=sys-libs/ncurses-5.9-r2'`,
 the complete build log and the output of `emerge -pqv
 '=sys-libs/ncurses-5.9-r2'`.
 The complete build log is located at
 '/var/tmp/portage/sys-libs/ncurses-5.9-r2/temp/build.log'.
 The ebuild environment file is located at
 '/var/tmp/portage/sys-libs/ncurses-5.9-r2/temp/environment'.
 Working directory: '/var/tmp/portage/sys-libs/ncurses-5.9-r2/work/narrowc'
 S: '/var/tmp/portage/sys-libs/ncurses-5.9-r2/work/ncurses-5.9'
 
 Ideas?

I'd guess that something used as part of ncurses's build process is failing.

Were this one of my systems (none of which is in a prod scenario, so
take it with a grain of salt), I'd emerge -e --keep-going @system, and
then emerge --resume a few times. You're stuck in something not unlike a
bootstrap scenario.
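
Spelled out as a script, the sequence above might look like the sketch below. The wrapper name `rebuild_set` and the `EMERGE` override are made up for illustration; the only emerge options used are the ones already mentioned (-e, --keep-going and --resume), and the cap on resume passes is an arbitrary choice.

```shell
# rebuild_set: hypothetical wrapper around the sequence described above.
# $1 = set to rebuild (e.g. @system), $2 = maximum number of --resume passes.
# EMERGE can be overridden for a dry run; it defaults to the real emerge.
EMERGE=${EMERGE:-emerge}

rebuild_set() {
    target=$1
    max_resumes=${2:-5}
    # First pass: rebuild everything in the set, carrying on past failures.
    "$EMERGE" -e --keep-going "$target" && return 0
    # Later passes: resume repeatedly, as suggested, until one succeeds.
    n=0
    while [ "$n" -lt "$max_resumes" ]; do
        "$EMERGE" --resume && return 0
        n=$((n + 1))
    done
    return 1
}
```

It would be called as `rebuild_set @system 5`. The reason repeated passes can help is that each pass may repair a tool or library that an earlier failure depended on, which is what makes this resemble a bootstrap.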

 






Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 1:46 PM, Tanstaafl tansta...@libertytrek.org wrote:

On 2013-04-15 11:42 AM, Michael Mol mike...@gmail.com wrote:

On 04/15/2013 11:37 AM, Tanstaafl wrote:

Hi all,

Help! :(


[snip]



I've tried recompiling both (both compile/install ok), but when I try to
start SSHD I get:

  # /etc/init.d/sshd start
/etc/init.d/sshd: line 18: 2079 Illegal instruction ${SSHD_BINARY} -t
${SSHD_OPTS}
* ERROR: sshd failed to start


^^ That screams 'CFLAGS' issue. Verify that the CFLAGS for your prod
server are the same (or close enough) to that of your dev server.

Guessing the new host has different CPU capabilities exposed to the
guest, either because of a differing hypervisor configuration, or
because of the different underlying hardware.


Ok, as I said, I got SSH working now and am making progress, updating
@system a little at a time...


Ok, I think all I need to get our db back up is to remerge php, but it 
is failing.


The last error appears to be the zlib check.

I did already try

emerge -1 sys-libs/zlib

and retrying to emerge php, but got the same error:

checking for ZLIB support... yes
checking if the location of ZLIB install directory is defined... no
checking for zlib version >= 1.2.0.4...
configure: error: libz version greater or equal to 1.2.0.4 required

!!! Please attach the following file when seeking support:
!!! /var/tmp/portage/dev-lang/php-5.4.13/work/sapis-build/cli/config.log
 * ERROR: dev-lang/php-5.4.13 failed (configure phase):
 *   econf failed
 *
 * Call stack:
 *  ebuild.sh, line   93:  Called src_configure
 *environment, line 4080:  Called econf 
'--prefix=/usr/lib/php5.4' '--mandir=/usr/lib/php5.4/man' 
'--infodir=/usr/lib/php5.4/info' '--libdir=/usr/lib/php5.4/lib' 
'--with-libdir=lib' '--without-pear' '--disable-maintainer-zts' 
'--disable-bcmath' '--with-bz2=/usr' '--disable-calendar' 
'--enable-ctype' '--without-curl' '--without-curlwrappers' 
'--enable-dom' '--without-enchant' '--disable-exif' '--enable-fileinfo' 
'--enable-filter' '--disable-ftp' '--with-gettext=/usr' '--without-gmp' 
'--enable-hash' '--without-mhash' '--with-iconv' '--disable-intl' 
'--disable-ipv6' '--enable-json' '--without-kerberos' '--enable-libxml' 
'--with-libxml-dir=/usr' '--enable-mbstring' '--with-mcrypt=/usr' 
'--without-mssql' '--with-onig=/usr' '--with-openssl=/usr' 
'--with-openssl-dir=/usr' '--disable-pcntl' '--enable-phar' 
'--disable-pdo' '--with-pgsql=/usr' '--enable-posix' 
'--with-pspell=/usr' '--without-recode' '--enable-simplexml' 
'--disable-shmop' '--without-snmp' '--disable-soap' '--enable-sockets' 
'--without-sqlite3' '--without-sybase-ct' '--disable-sysvmsg' 
'--disable-sysvsem' '--disable-sysvshm' '--without-tidy' 
'--enable-tokenizer' '--disable-wddx' '--enable-xml' 
'--disable-xmlreader' '--disable-xmlwriter' '--with-xmlrpc' 
'--without-xsl' '--disable-zip' '--with-zlib=/usr' '--disable-debug' 
'--enable-dba' '--without-cdb' '--with-db4=/usr' '--disable-flatfile' 
'--with-gdbm=/usr' '--disable-inifile' '--without-qdbm' 
'--without-freetype-dir' '--without-t1lib' '--disable-gd-jis-conv' 
'--without-jpeg-dir' '--without-png-dir' '--without-xpm-dir' 
'--without-gd' '--with-imap=/usr' '--with-imap-ssl=/usr' 
'--without-mysqli' '--with-readline=/usr' '--without-libedit' 
'--without-mm' '--with-pic' '--with-pcre-regex=/usr' 
'--with-pcre-dir=/usr' '--with-config-file-path=/etc/php/cli-php5.4' 
'--with-config-file-scan-dir=/etc/php/cli-php5.4/ext-active' 
'--disable-embed' '--enable-cli' '--disable-cgi' '--disable-fpm' 
'--without-apxs2'

 *   phase-helpers.sh, line  521:  Called die
 * The specific snippet of code:
 *  die econf failed
 *
 * If you need support, post the output of `emerge --info 
'=dev-lang/php-5.4.13'`,
 * the complete build log and the output of `emerge -pqv 
'=dev-lang/php-5.4.13'`.
 * The complete build log is located at 
'/var/tmp/portage/dev-lang/php-5.4.13/temp/build.log'.
 * The ebuild environment file is located at 
'/var/tmp/portage/dev-lang/php-5.4.13/temp/environment'.
 * Working directory: 
'/var/tmp/portage/dev-lang/php-5.4.13/work/sapis-build/cli'

 * S: '/var/tmp/portage/dev-lang/php-5.4.13/work/php-5.4.13'

I hope this gives someone a hint...



Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 2:03 PM, Tanstaafl tansta...@libertytrek.org wrote:


Ok, I think all I need to get our db back up is to remerge php, but it
is failing.

The last error appears to be the zlib check.

I did already try

emerge -1 sys-libs/zlib

and retrying to emerge php, but got the same error:


Ok, added -zlib to package.use and it is compiling now... I just don't 
know if I need zlib support for our DB app... sigh


If this doesn't work I'll try your suggestion of:


Were this one of my systems (none of which is in a prod scenario, so
take it with a grain of salt), I'd emerge -e --keep-going @system, and
then emerge --resume a few times. You're stuck in something not unlike a
bootstrap scenario.


Thanks a lot Michael... first time anything like this has happened to me 
in a long time. I forgot what it is like to have users (and bosses) 
breathing down my neck like this...




Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread covici
Michael Mol mike...@gmail.com wrote:

 On 04/15/2013 12:51 PM, cov...@ccs.covici.com wrote:
  Michael Mol mike...@gmail.com wrote:
  
  On 04/15/2013 12:07 PM, Tanstaafl wrote:
  On 2013-04-15 11:51 AM, Tanstaafl tansta...@libertytrek.org wrote:
  I'm confused about how this works in a hosted virtual environment.
 
  My Dev server failed to come up after the migration, until their tech
  support suggested switching to the 64bit kernel... did that and it is
  fine now (or appears to be)...
 
  But the Prod server is still on the 32bit kernel...
 
  Should I switch it to 64bit and change the CFLAGS to the same as the dev
  server?
 
  Can you run a 64bit kernel on a system that was originally
  running/compiled with 32bit?
 
 
  I don't see why not. The 64-bit kernel provides all the hooks necessary
  for a 32-bit userspace.
 
  If you do this, be sure to set the configs to emulate 32-bit otherwise
  your 32-bit apps will not work!  
 
 Which configs? Be specific; to a 32-bit x86 process[1], a 64-bit kernel
 looks pretty much like a 32-bit kernel.
 
 
 [1] I just realized I can no longer say 32-bit and expect it to
 exactly mean x86. Going forward in general conversations, it could well
 mean x32...
 

I was thinking primarily of ia32 emulation -- I made a kernel and got
burned by not having this on by mistake and then my 64-bit kernel would
not execute any 32-bit program.
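
A quick check for that option, as a sketch. CONFIG_IA32_EMULATION is the real x86_64 kernel option; the function name and the idea of pointing it at a saved config file are just for illustration.

```shell
# Succeed if a kernel .config-style file enables running 32-bit x86
# binaries on a 64-bit kernel.
has_ia32_emulation() {
    grep -q '^CONFIG_IA32_EMULATION=y' "$1"
}
```

For a self-built kernel this would be run as has_ia32_emulation /usr/src/linux/.config. On a hosted VM where the provider supplies the kernel, the config may only be readable through /proc/config.gz, and only if the provider built the kernel with that feature enabled.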

-- 
Your life is like a penny.  You're going to lose it.  The question is:
How do you spend it?

 John Covici
 cov...@ccs.covici.com



Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Michael Mol
On 04/15/2013 02:08 PM, Tanstaafl wrote:
 On 2013-04-15 2:03 PM, Tanstaafl tansta...@libertytrek.org wrote:

 Ok, I think all I need to get our db back up is to remerge php, but it
 is failing.

 The last error appears to be the zlib check.

 I did already try

 emerge -1 sys-libs/zlib

 and retrying to emerge php, but got the same error:
 
 Ok, added -zlib to package.use and it is compiling now... I just don't
 know if I need zlib support for our DB app... sigh
 
 If this doesn't work I'll try your suggestion of:
 
 Were this one of my systems (none of which is in a prod scenario, so
 take it with a grain of salt), I'd emerge -e --keep-going @system, and
 then emerge --resume a few times. You're stuck in something not unlike a
 bootstrap scenario.
 
 Thanks a lot Michael... first time anything like this has happened to me
 in a long time. I forgot what it is like to have users (and bosses)
 breathing down my neck like this...
 

That system is going to require a great deal of cleanup and maintenance
to get fully reliable again. Once everything's been rebuilt, you should
be able to have zlib back, etc. It'll just take a while to clean up.

I repeat my suggestion that you set up an alternate server and aim to
migrate to that. It's amazing what you can do with failover,
replication, etc.





Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 2:08 PM, Tanstaafl tansta...@libertytrek.org wrote:

On 2013-04-15 2:03 PM, Tanstaafl tansta...@libertytrek.org wrote:


Ok, I think all I need to get our db back up is to remerge php, but it
is failing.

The last error appears to be the zlib check.

I did already try

emerge -1 sys-libs/zlib

and retrying to emerge php, but got the same error:


Ok, added -zlib to package.use and it is compiling now... I just don't
know if I need zlib support for our DB app... sigh


Ok, apparently it requires zlib... apache now starts, but I just get a 
totally blank web page instead of the login page.


Oh well, onward...



Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 2:02 PM, Michael Mol mike...@gmail.com wrote:

Were this one of my systems (none of which is in a prod scenario, so
take it with a grain of salt), I'd emerge -e --keep-going @system, and
then emerge --resume a few times. You're stuck in something not unlike a
bootstrap scenario.


Ok, before I start...

Michael, if this were you, would you use the 32bit or 64bit kernel when 
doing the emerge -e --keep-going system?


Again, the system was initially rolled out and was always 32 bit...




Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Michael Mol
On 04/15/2013 02:54 PM, Tanstaafl wrote:
 On 2013-04-15 2:02 PM, Michael Mol mike...@gmail.com wrote:
 Were this one of my systems (none of which is in a prod scenario, so
 take it with a grain of salt), I'd emerge -e --keep-going @system, and
 then emerge --resume a few times. You're stuck in something not unlike a
 bootstrap scenario.
 
 Ok, before I start...
 
 Michael, if this were you, would you use the 32bit or 64bit kernel when
 doing the emerge -e --keep-going system?
 
 Again, the system was initially rolled out and was always 32 bit...
 

If this were me, I would set up a clean install from scratch. No, I
wouldn't use an x86 userspace with an x64 kernel, but that's because of
the benefits I see with the 64-bit arch, not with any issues I'd be
aware of from using an x64 kernel with an x86 userspace.

To me, that's the fastest way to get a system I'd deem reliable. But
it's a lot faster to do with distros other than Gentoo, and rather
requires having an up-to-date install script if you intend to do it with
Gentoo...

You're in an ugly scenario, though, because you don't have the benefit
of a spare environment to produce a prod setup within.

You've mentioned you couldn't get the system to run at all with a 32-bit
kernel on the new hardware. Fair enough. I wouldn't dare try changing
the system from a 32-bit CHOST to a 64-bit CHOST, though; I've never
walked that path before, even if there are those who have. It's
certainly not something I'd do on a should-be-live prod system.

In your position, if I had to use the existing system without a
from-scratch build/install, I would continue with the 32-bit userland
and 64-bit kernel. To me, that's the least risky of the alternatives,
given the constraints involved.

To be clear, that's also a last resort...I would lobby *hard* to do a
clean from-scratch setup in a different VM before treading that path
(even when I do major upgrades of rosettacode.org, I go through a brief
period where I have two VMs as I migrate services from one to the
other), and my keyboard might well not survive the impacts of my hands
while I typed out commands to do it any other way; I'm very hard on
keyboards when very angry.






PostgreSQL guy in the house? - WAS: Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 3:11 PM, Michael Mol mike...@gmail.com wrote:

If this were me, I would set up a clean install from scratch. No, I
wouldn't use an x86 userspace with an x64 kernel, but that's because of
the benefits I see with the 64-bit arch, not with any issues I'd be
aware of from using an x64 kernel with an x86 userspace.


I understand and agree, and am doing that as we speak.

I was just trying to get it back up and running quickly, but that didn't 
happen.


Ok - now... is there a postgresql guy in the house?

Can someone confirm that the command I need to use to dump the entire pg 
database for a full restore on a new/clean machine would be:


pg_dumpall --username=username -o -f /home/myuser/mydb_backup.sql.gz

?

Will that get everything?

I'm also planning on stopping pg (it is running ok, it is PHP that is 
the problem), then just


tar -pvczf /home/myuser/pg91_data.tar.gz /var/lib/postgresql/9.1/data

will that suffice as another backup of all of the data that could be 
used for restoration?


Thanks guys... this was not a fun day...



Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread Tanstaafl

On 2013-04-15 12:10 PM, Michael Mol mike...@gmail.com wrote:

Argh. Reply to your own posts if you need to append content. Otherwise,
I can't easily address everything at once.


Sorry, I usually do, but I'm kind of flustered right now...


Anyway, you can (I believe) run a 64-bit kernel with a 32-bit CHOST.
Your system is a tad hobbled that way, but it should work. It'd be
like running multilib without the 64-bit side of things.


I went ahead and switched back to the 32bit kernel, updated gcc, 
recompiled openssl/openssh and finally got ssh working again (whew, 
their Lish Web console sucks)...



Set your CFLAGS on your prod server to that of your dev server, if your
dev server is known to work. You're using -march=native on your prod
server, which depends on gcc correctly detecting CPU features from the
host. There was a thread on this list just a few days ago about how that
can fail in virtualized environments. (You can enable/disable exposed
features piecemeal, which could well confuse the heck out of gcc's
detection heuristics...)
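
To compare the two servers, something along these lines could pull CFLAGS out of each make.conf. It is a rough sketch that assumes CFLAGS is set on a single double-quoted line; the function names are made up.

```shell
# Print the value of CFLAGS from a make.conf-style file.
# Assumes a single-line, double-quoted assignment; the last one wins.
conf_cflags() {
    sed -n 's/^CFLAGS="\(.*\)"[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Succeed if the file's CFLAGS rely on gcc detecting the host CPU.
uses_march_native() {
    conf_cflags "$1" | grep -q -e '-march=native'
}
```

Running conf_cflags /etc/portage/make.conf (or /etc/make.conf on older installs) on both machines gives two strings to compare by eye. If uses_march_native succeeds on the prod server, the dev server's explicit -march value would be the one to copy across, provided the dev server really is known to work.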


I think I know now - it was glib that got compiled, but not while 
running a 64bit kernel... it must have been the -march=native that 
screwed it up.



I don't know which instruction is 'illegal' on your new host, so, yeah,
the safest path is going to be emerging, well, everything. You don't
want some --as-needed lib getting pulled in some time down the road,
causing a real headscratcher of a crash. As the saying goes[1], Nuke
everything from orbit. It's the only way to be sure.


[snip]


[1] Where'd that come from, anyway?


Aliens?

Thanks again Michael



Re: PostgreSQL guy in the house? - WAS: Re: [gentoo-user] Serious problem with linode vm

2013-04-15 Thread J. Roeleveld
Tanstaafl tansta...@libertytrek.org wrote:

On 2013-04-15 3:11 PM, Michael Mol mike...@gmail.com wrote:
 If this were me, I would set up a clean install from scratch. No, I
 wouldn't use an x86 userspace with an x64 kernel, but that's because of
 the benefits I see with the 64-bit arch, not with any issues I'd be
 aware of from using an x64 kernel with an x86 userspace.

I understand and agree, and am doing that as we speak.

I was just trying to get it back up and running quickly, but that
didn't 
happen.

Ok - now... is there a postgresql guy in the house?

Can someone confirm that the command I need to use to dump the entire
pg 
database for a full restore on a new/clean machine would be:

pg_dumpall --username=username -o -f /home/myuser/mydb_backup.sql.gz

?

Will that get everything?

I'm also planning on stopping pg (it is running ok, it is PHP that is 
the problem), then just

tar -pvczf /home/myuser/pg91_data.tar.gz /var/lib/postgresql/9.1/data

will that suffice as another backup of all of the data that could be 
used for restoration?

Thanks guys... this was not a fun day...

Tanstaafl.

The pg_dumpall command will generate SQL scripts to restore the entire 
data structure and data needed to rebuild the entire database server.
The SQL will not be compressed. So I would leave the .gz off the filename.

You will also need the configuration files pg_hba.conf and postgresql.conf. 
(Doing this from memory on my mobile.)

Best also have a quick check on the postgresql website and mailing list. 

The last migration to a new server I did was done by backing up every 
database separately using the pg_dump command.
This made restoring simpler because the template databases already exist when 
the database is running.

The tar-command will also get nearly everything if you kept the default 
locations. Restoring that should also suffice if you restore it to a 9.1 
postgresql.
Don't forget the files in /etc/postgresql*/
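
A sketch of the dump step that keeps the .gz name honest by compressing after the dump rather than relying on pg_dumpall to do it. The function name and the PG_DUMPALL override are made up for illustration; --username and -f are the same pg_dumpall options used in the question.

```shell
# PG_DUMPALL can point at a stub for testing; defaults to the real tool.
PG_DUMPALL="${PG_DUMPALL:-pg_dumpall}"

# Dump the whole cluster as plain SQL, then compress the result.
backup_cluster() {
    db_user="$1"
    sql_file="$2"
    "$PG_DUMPALL" --username="$db_user" -f "$sql_file" || return 1
    gzip -f "$sql_file"    # replaces sql_file with sql_file.gz
}
```

For example, backup_cluster username /home/myuser/mydb_backup.sql would leave /home/myuser/mydb_backup.sql.gz behind. On the new machine the file would be decompressed and fed to psql, along the lines of psql -f mydb_backup.sql postgres, but check that against the PostgreSQL documentation for the version in use.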

Any questions, put them on here. I'm off to my customer soon. Should be back on 
email in about 1.5 hours...

--
Joost Roeleveld
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.