Just a final word on this...

The problem is effectively resolved... I was able to rebuild the system,
then world, with zero issues.  I then ran revdep-rebuild: no issues and no
broken links found.  After that I recompiled the packages that depend on
glibc and ran revdep-rebuild again.  The whole thing ran at full capacity
and with zero errors.
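
For the archives, the whole dance was roughly the following (the
revdep-rebuild --library trick is just one way to hit everything linked
against glibc; adjust to taste):

  emerge -e system
  emerge -e world
  revdep-rebuild                        # from gentoolkit; nothing broken
  revdep-rebuild --library libc.so.6    # rebuild the glibc consumers
  revdep-rebuild                        # final check, still clean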

I don't know if finding the "root cause" felt as good as this does...  I
just know that having "root" again feels great!  ;)

Okay...  and now let's upgrade the kernel...  ;P
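
(The plan for that is the usual routine, more or less; exact paths and the
boot setup depend on how this Xen domU loads its kernel, so treat this as a
sketch only:

  emerge gentoo-sources
  eselect kernel list && eselect kernel set 1
  cd /usr/src/linux
  make oldconfig && make && make modules_install
  cp arch/x86/boot/bzImage /boot/kernel-new    # name is just a placeholder

...and then point the bootloader, or the domU config, at the new image.)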

Thanks again,
  Simon



On Sat, Jan 8, 2011 at 3:16 PM, Mark Knecht <[email protected]> wrote:

> Glad you have a root cause/solution.
>
> On Sat, Jan 8, 2011 at 10:49 AM, Simon <[email protected]> wrote:
> <SNIP>
> > The virtual HD is physically on a RAID (unknown config).  Mark, the
> > sector size issue you mention, does it have to do with aligning real
> > HD sectors with filesystem sectors (so that stuff like read-ahead will
> > get no-more-no-less than what the kernel wants)?  I've read about this
> > kind of setup when I was interested in RAID long ago...  Now that I
> > know my HD is actually on a RAID, maybe I could gain some I/O
> > performance by tuning this a bit!
> >
>
> As it's RAID underneath, it's likely set up correctly. The issue I had
> in mind was the disk being a 4K/sector disk but the person who built
> the partition not knowing to align it to a 4K boundary. That can cause
> a _huge_ slowdown.
>
> I doubt that's the case here. As this is a hosting service, they likely
> know what they are doing in that area, and if it hadn't been done
> correctly I think you would have noticed it before now.
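>
> If anyone ever wants to double-check a partition's alignment, a quick
> test is to look at the start sectors (the device name here is just an
> example; a Xen guest usually shows up as /dev/xvda or similar):
>
>   fdisk -lu /dev/xvda
>
> If every partition's start sector is divisible by 8 (8 x 512-byte sectors
> = 4KiB), the partitions are 4K-aligned.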
>
> > Anyway, I was told by the support team that another user on the same
> > physical machine (remember it's a Xen VPS) was doing I/O-intensive
> > stuff which could have "I/O starved" my system.  I don't understand how
> > starving, or even some kind of DoS attack, could lead to a complete
> > freeze on the console, but eh...
>
> Makes sense, actually. The other guy took all the disk I/O, leaving you
> with none. If you can't get to the disk, you can't read ebuilds or write
> out compiled code, or at least not quickly.
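>
> If it happens again, it's usually easy to confirm from inside the guest
> with the standard tools (iostat comes from sys-apps/sysstat):
>
>   vmstat 2        # watch the 'wa' (I/O wait) column
>   iostat -x 2     # per-device utilization and wait times
>
> A nearly idle CPU with high I/O wait while emerge crawls is the giveaway.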
>
> > They offered to migrate my system to another physical machine, and
> > after that...  I was able to perform a complete 'emerge -e system' in
> > one shot without a scratch; I even did it with --jobs=2 and
> > MAKEOPTS="-j4".  After that, I started a complete "emerge --keep-going
> > --jobs=2 world" with MAKEOPTS="-j8"...  (I've got 4 cores: dual Xeon
> > 2GHz)
> >
>
> So now you're in good shape... until some user on the new system starts
> hogging all the disk I/O and holds you up again.
>
> > This last emerge is still going on as I write this and is emerging
> > package 522 of 620!!  And there have been no build errors so far...
> >
> > It's emerging glibc at the moment, so once the big emerge is finished,
> > I'll probably recompile all the packages that depend on glibc.  I
> > believe glibc was actually updated during my very initial update on
> > Monday and I haven't gotten around to that yet...  but I guess
> > everything will go smoothly from here.
> >
> > Thanks again for all your help guys!
> >   Simon
>
> Good that you got to the root of the problem.
>
> Good luck,
> Mark
>
>
