[Bug 1036366] Re: software RAID arrays fail to start on boot

2012-08-23 Thread Doug Jones
Please ignore this bug report.

I now believe this problem was caused by a configuration error in my
RAID setup.  In particular, two different arrays had the same 'name' (I
think this is the 'name' recorded in the array superblocks  --  IMHO,
'name' has become an over-conflated term in the context of linux
software-RAID).  Apparently having duplicate names is not a good idea.

I have no recollection of ever explicitly assigning these names;  I
think this was done automatically by mdadm, several versions ago
(probably over a year ago).  Since many bugs in mdadm have been fixed
since then, we should probably assume that this issue has been fixed
unless somebody reports similar symptoms again.

I have fixed my system by re-creating one of the arrays without that
duplicate name.  I have now rebooted several times, and the symptoms
have not recurred.
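
(For the record, the re-create with an explicit, unique name looked
roughly like this;  the device names, RAID level, and array name below
are placeholders rather than my actual layout, and note that --create
writes new superblocks, so the data on the members has to be backed up
first or the original layout reproduced exactly:)

   sudo mdadm --stop /dev/md5
   sudo mdadm --create /dev/md/bigarray --metadata=1.2 --level=6 \
        --raid-devices=4 --name=bigarray /dev/sd[abcd]7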

If there is a bug here, it is that it is (or was?) possible to create
two arrays with the same name, with no obvious warning given at the
time.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1036366

Title:
  software RAID arrays fail to start on boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1036366/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1036366] Re: software RAID arrays fail to start on boot

2012-08-23 Thread Doug Jones
You're welcome, Dmitrijs.

Now that this system is finally behaving itself (for the first time in
the better part of a year!), I can look at this properly functioning
configuration and compare it with the previously broken one.  It is
becoming clearer what happened.

(Note:  all of the following are things I have deduced by reading a vast
amount of material from different sources [and giving myself many
headaches in the process], and some of it may not be an entirely
accurate description of what's really happening.)

The superblock contains a field called 'name'.  It's not like /dev/md1 or 
/dev/md/1 or /dev/md1p1 or anything like that.  On my system, it's more like 
'5'.  As it happened I had two very different arrays (different RAID levels, 
sizes, etc.) that both had that name.  When you run Disk Utility and select a 
RAID array, this is the Name displayed in the right pane;  if it's empty, the 
pane shows "Name:  -", but on my system two arrays showed "Name:  5".  I 
didn't choose this name;  I think mdadm assigned it because each array 
happened to be assembled as /dev/md5 at the time it was created, and the two 
arrays were created at different times (of course these device names change 
arbitrarily whenever you boot).

But of course it's more complicated than that.  That's just part of the
name;  the superblock actually contains a 'fully qualified name' that is
of the form hostname:name and Disk Utility only displays the last part
of it.  The hostname part is just the hostname at the time the array is
created.
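
(You can see that field directly with mdadm;  the member device below
is just an example, and the exact formatting of the output may vary:)

   sudo mdadm --examine /dev/sdb7 | grep -i name
   # for a 1.2-metadata member this prints something like:
   #   Name : precisetest:5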

My system has a long history.  A year ago, its hostname was different,
and one of the arrays was created then.  After the system became
unstable (when I upgraded to Oneiric and gained a particularly buggy
version of mdadm) I stopped using it and backed all the data off.

When Precise became available, I did a fresh install onto a non-RAID
partition and left all the existing RAID partitions in place for testing
purposes.  Because I was no longer going to use this system as a file
server, but as a test machine, I gave it a different hostname
(precisetest).  A bit later I added another array and mdadm assigned it
the name 5, presumably because it was sitting at /dev/md5 at the time.
I did not even notice this at first.  Of course, the fully qualified
names stored in the superblocks were actually different, having
different hostnames on the front, so even though Disk Utility showed the
name 5 on both arrays, they really had different full names.

Although RAID was really messed up on this system, that was only a
problem at boot time.  After booting, I could go into Disk Utility and
manually start all affected arrays.  Once this was done, the system
worked great, until the next reboot.  RAID was working;  I could access
files on any array.  I came to the (perhaps incorrect?) conclusion that
these two arrays having the same name was not a problem.  After all, I
could look at mdadm.conf and see that the arrays really had different
names (the fully qualified names are shown there).
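
(For illustration, the relevant mdadm.conf lines looked roughly like
this;  the old hostname, device paths, and UUIDs here are placeholders:)

ARRAY /dev/md/5   metadata=1.2 name=oldfileserver:5  UUID=...
ARRAY /dev/md/5_0 metadata=1.2 name=precisetest:5    UUID=...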

Now I am thinking that it is not sufficient that the fully qualified
names be unique.  I think the part of the name after the : has to be
unique too, otherwise problems happen at boot, at least on Ubuntu
Precise.  But I don't think mdadm upstream intended it to be that way.

So:  some part of the boot process is getting hung up on these
(apparently) duplicate names, because it is looking at just the short
names instead of the fully qualified names.  (In the udev scripts
perhaps?)  If that code looked at hostname:name instead of just name,
perhaps this problem would disappear.
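
(For anyone who wants to check, something like this should show where
udev triggers incremental assembly on Precise;  I haven't traced it all
the way through myself:)

   grep -rn 'mdadm --incremental' /lib/udev/rules.d/ /etc/udev/rules.d/ 2>/dev/null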

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1036366

Title:
  software RAID arrays fail to start on boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1036366/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1036366] [NEW] software RAID arrays fail to start on boot

2012-08-13 Thread Doug Jones
Public bug reported:

Some software RAID arrays fail to start on boot.  Exactly two of my
arrays (but not always the same two!) do not start, on every single
boot, and I have done 24 boots since I started taking detailed notes.

Have been running Ubuntu 12.04 with latest updates.  Two days ago I
selectively upgraded mdadm to 3.2.5 from -proposed, as suggested in bug
#942106;  that upgrade helped some other people, but not me.  Over the
last few months, various updates in kernel and mdadm have resulted in
great improvement of symptoms, but no complete cure so far.

Note that the following symptoms once regularly occurred on this system, but 
have NOT occurred in the past few weeks:
  - Having to wait for a degraded array to resync
  - Having to manually re-attach a component (usually a spare) that had become detached
  - Having to drop to the command line to zero a superblock before reattaching a component
  - Having an array containing swap fail to start
  - Having to use anything other than Disk Utility to get arrays running properly again


This system has six SATA drives on two controllers.  It contains seven RAID 
arrays, including RAID 1, RAID 10, and RAID 6;  all are listed in fstab.  Some 
use 0.90.0 metadata and some use 1.2 metadata.  The root filesystem is not on a 
RAID array (at least not any more;  I got tired of that REAL fast) but 
everything else (including /boot and all swap) is on RAID.  One array is used 
for /boot, two for swap, and the other four are just there for testing purposes.

BOOT_DEGRADED is set.  All partitions are GPT.  Not using LUKS or LVM.
All drives are 2TB and by various manufacturers, and I suspect some have
512B physical sectors and some have 2KB sectors.  This is an AMD64
system with 8GB RAM.


This system has had about four different versions of Ubuntu on it over the last 
few years, and has had multiple RAID arrays on it from the beginning.  (This is 
why some of the arrays are still using 0.90.0 metadata, and why there are so 
many arrays;  some arrays are old partitions containing root and home and such 
from earlier incarnations.)  RAID worked fine until the system was upgraded to 
Oneiric early in 2012 (no, the problem did not start with Precise).

I have carefully tested the system every time an updated kernel or mdadm
has appeared, ever since the problem started.  The behavior has
gradually improved over the last several months.  This latest proposed
version of mdadm (3.2.5), thankfully, did not result in regressions, but
also did not result in significant improvement on this system;  have
rebooted five times since then and the behavior is consistent.


When the problem first started, on Oneiric, I had the root file system on RAID. 
 This was unpleasant.  I stopped using the system for a while, as I had another 
one running Maverick, which was reliable.

When I noticed some discussion of possibly related bugs on the Linux
RAID list (I've been lurking there for years) I decided to test the
system some more.  By then Precise was out, so I upgraded.  That did not
help.  Eventually I backed up all data onto another system and did a
clean install of Precise on a non-RAID partition, which made the system
tolerable.  I left /boot on a RAID1 array (on all six drives), but that
does not prevent the system from booting even if /boot does not start
during Ubuntu startup (I assume because GRUB can find /boot even if
Ubuntu later can't).

I started taking detailed notes in May (seven cramped pages so far).
Have rebooted 24 times since then.  On every boot, exactly two arrays
did not start.  Which arrays they were varied from boot to boot;  it could
be any of the arrays (but recently, swap arrays are not affected).  No
apparent correlation with metadata type or RAID level.

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: mdadm 3.2.5-1ubuntu0.2
ProcVersionSignature: Ubuntu 3.2.0-29.46-generic 3.2.24
Uname: Linux 3.2.0-29-generic x86_64
ApportVersion: 2.0.1-0ubuntu12
Architecture: amd64
Date: Mon Aug 13 12:10:36 2012
InstallationMedia: Ubuntu 12.04 LTS Precise Pangolin - Release amd64 (20120425)
MDadmExamine.dev.sda:
 /dev/sda:
MBR Magic : aa55
 Partition[0] :   3907029167 sectors at 1 (type ee)
MDadmExamine.dev.sda1: Error: command ['/sbin/mdadm', '-E', '/dev/sda1'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda1.
MDadmExamine.dev.sda11: Error: command ['/sbin/mdadm', '-E', '/dev/sda11'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda11.
MDadmExamine.dev.sda4: Error: command ['/sbin/mdadm', '-E', '/dev/sda4'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda4.
MDadmExamine.dev.sda5: Error: command ['/sbin/mdadm', '-E', '/dev/sda5'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda5.
MDadmExamine.dev.sda6: Error: command ['/sbin/mdadm', '-E', '/dev/sda6'] failed with exit code 1: mdadm: No md superblock detected on /dev/sda6.
MDadmExamine.dev.sda7: Error: command

[Bug 1036366] Re: software RAID arrays fail to start on boot

2012-08-13 Thread Doug Jones
-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1036366

Title:
  software RAID arrays fail to start on boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1036366/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 942106] Re: software raid doesn't assemble before mount on boot

2012-08-13 Thread Doug Jones
@Brian, Dmitrijs:

Thanks.

I have filed Bug # 1036366 to report the symptoms not resolved by this
recent fix.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/942106

Title:
  software raid doesn't assemble before mount on boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/942106/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 995445] Re: package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess installed post-installation script returned error exit status 127

2012-08-13 Thread Doug Jones
@Atheg:

(Note that I am totally guessing about this, and all of the following
may be less than helpful)

AFAIK, installing tcl8.4 doesn't roll back anything.  I just checked my
system, and I now have both 8.4 and 8.5 installed where I only had 8.5
before.  I don't know if this presents any problems as I never
explicitly use TCL myself and know little about it.  However:

I just tried this in a terminal:


me@precise:~$ tclsh
% 
% 
% exit
me@precise:~$ tclsh8.4
% 
% 
% 
% exit
me@precise:~$ tclsh8.5
% 
% 
% 
% 
% exit
me@precise:~$ 

So I can explicitly call up either version of tclsh.  They are both
installed and working.  I don't know how to tell which one comes up when
I just type tclsh.
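
(Presumably something like this would show which one the bare name
resolves to, though I haven't dug into how the symlinks are managed:)

me@precise:~$ ls -l /usr/bin/tclsh
me@precise:~$ readlink -f /usr/bin/tclsh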


The idea of doing 
   sudo apt-get install tcl8.4
to resolve the problem was just an attempt to make a single error message go 
away, and see what happens after that.  As it turns out, it appears to have 
completely fixed the problem.  But it is just a workaround, and I don't know if 
it will negatively impact anything else you are doing.  Somebody more expert on 
the use of TCL should chime in on this.

The error message was coming from a script called gpsmanshp.postinst
that explicitly refers to tclsh8.4, which comes from the tcl8.4 package
and is not installed on a default Precise installation (and apparently
no such dependency is listed for gpsmanshp, so it is not automatically
installed when gpsmanshp is).  Perhaps manually editing that script
(replacing 8.4 with 8.5 in one line) would also fix the problem, but I
haven't tested that and don't know what ripple effects that would have
on anything else, if any.
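
(The maintainer script itself is easy to inspect if anyone wants to try
that edit;  this just shows where to look:)

   grep -n tclsh8.4 /var/lib/dpkg/info/gpsmanshp.postinst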

It has been over a month since I tried that fix, and it doesn't seem to
have hurt anything.  It did make an extremely annoying recurring error
message go away.  And it did complete the installation (I think!) of
gpsmanshp, a package that I know virtually nothing about and haven't
even had time to look at since then.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/995445

Title:
  package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess
  installed post-installation script returned error exit status 127

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apt/+bug/995445/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 995445] Re: package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess installed post-installation script returned error exit status 127

2012-08-13 Thread Doug Jones
@Atheg:  (continuing previous comment)

I just had a look at the Tcl docs.  I learned about the info patchlevel
command.  This is what it does on my system:

me@precise:~$ tclsh
% info patchlevel
8.5.11
% exit
me@precise:~$ tclsh8.4
% info patchlevel
8.4.19
% exit
me@precise:~$ tclsh8.5
% info patchlevel
8.5.11
% exit
me@precise:~$ 


So if I ask for tclsh without specifying a specific version, it is giving me 
8.5, the latest.  But AFAIK the most recent version that was installed is 8.4.  
So apparently tclsh will take you to the most recent version number, not the 
last one installed.  I imagine this is the behavior you desire.

The gpsmanshp.postinst script will still use the 8.4 version it thinks
it needs because it explicitly calls up that version.

Hope this helps.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/995445

Title:
  package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess
  installed post-installation script returned error exit status 127

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apt/+bug/995445/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 942106] Re: software raid doesn't assemble before mount on boot

2012-08-11 Thread Doug Jones
This does NOT fix this issue for me.

My system still boots up with some RAID arrays not running.  Every
single time.

This system has six SATA drives on two controllers.  It contains seven
RAID arrays, a mix of RAID 1, RAID 10, and RAID 6;  all are listed in
fstab.  Some use 0.90.0 metadata and some use 1.2 metadata.  The root
filesystem is not on a RAID array (at least not any more, I got tired of
that REAL fast) but everything else (including /boot and all swap) is on
RAID.  BOOT_DEGRADED is set.  All partitions are GPT.  Not using LUKS or
LVM.  All drives are 2TB and by various manufacturers, and I suspect
some have 512B physical sectors and some have 2KB sectors.  This is an
AMD64 system with 8GB RAM.


This system has had about four different versions of Ubuntu on it, and has had 
multiple RAID arrays on it from the beginning.  (This is why some of the arrays 
are still using 0.90.0 metadata.)  RAID worked fine until the system was 
upgraded to Oneiric early in 2012 (no, it did not start with Precise).

I have carefully tested the system every time an updated kernel or mdadm
has appeared, since the problem started with Oneiric.  The behavior has
gradually improved over the last several months.  This latest version of
mdadm (3.2.5) did not result in significant improvement;  have rebooted
four times since then and the behavior is consistent.


When the problem first started, on Oneiric, I had the root file system on RAID. 
 This was unpleasant.  I stopped using the system for a while, as I had another 
one running Maverick.

When I noticed some discussion of possibly related bugs on the Linux
RAID list (I've been lurking there for years) I decided to test the
system some more.  By then Precise was out, so I upgraded.  That did not
help.  Eventually I backed up all data onto another system and did a
clean install of Precise on a non-RAID partition, which made the system
tolerable.  I left /boot on a RAID1 array (on all six drives), but that
does not prevent the system from booting even if /boot does not start
during Ubuntu startup (I assume because GRUB can find /boot even if
Ubuntu later can't).

I started taking detailed notes in May (seven cramped pages so far).
Have rebooted 23 times since then.  On every boot, exactly two arrays
did not start.  Which arrays they were varied from boot to boot;  it could
be any of the arrays.  No apparent correlation with metadata type or
RAID level.

This mdadm 3.2.5 is the first time I have resorted to doing a forced
upgrade from -proposed;  before, I always just waited for a regular
update.  The most significant improvements happened with earlier regular
updates.  It has been a while since I had to wait for a degraded array
to resync, or manually re-attach a component (usually a spare) that had
become detached, or drop to the command line to zero a superblock before
reattaching a component.  It has been a while since an array containing
swap has failed to start.

This issue has now become little more than an annoyance.  I can now
boot, wait for first array to not start, hit S, wait for the second, hit
S, wait for the login screen, log in, wait for Unity desktop, start Disk
Utility, manually start the two arrays that didn't start, then check all
the other arrays to see if anything else has happened.  Takes about five
minutes.  But I am still annoyed.
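
(For what it's worth, the command-line equivalent of all that Disk
Utility clicking seems to be roughly the following;  the md device name
is just an example, since they move around on every boot here:)

   cat /proc/mdstat                # see which arrays came up inactive
   sudo mdadm --assemble --scan    # try to assemble everything listed in mdadm.conf
   sudo mdadm --run /dev/md5       # or force-start one specific partially-assembled array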

If you want to replicate this behavior consistently, get yourself seven
arrays.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/942106

Title:
  software raid doesn't assemble before mount on boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/942106/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 942106] Re: software raid doesn't assemble before mount on boot

2012-08-11 Thread Doug Jones
@Dmitrijs:

I must agree with your assessment.  I cannot really tell which (of many)
existing bug reports are most relevant to my system, so I've been
chiming in on a bunch of them, just in case they provide clues helpful
to others  :-)

This latest mdadm update does not have any negative impact on me, and
clearly helps others, so +1 from me.

Will do sudo apport mdadm.  And thanks for the link to that spec.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/942106

Title:
  software raid doesn't assemble before mount on boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/942106/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 942106] Re: software raid doesn't assemble before mount on boot

2012-08-11 Thread Doug Jones
@Dmitrijs:

sudo apport mdadm does nothing.  I know that apport is installed and the
service is running.  Am I doing this wrong?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/942106

Title:
  software raid doesn't assemble before mount on boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/942106/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 995445] Re: package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess installed post-installation script returned error exit status 127

2012-07-02 Thread Doug Jones
On my system, tcl8.5 is installed, but tcl8.4 is not.  So:

   sudo apt-get install tcl8.4

This resulted in:

Reading package lists... Done
Building dependency tree   
Reading state information... Done
Suggested packages:
  tclreadline
The following NEW packages will be installed:
  tcl8.4
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
Need to get 870 kB of archives.
After this operation, 3,375 kB of additional disk space will be used.
Get:1 http://us.archive.ubuntu.com/ubuntu/ precise/main tcl8.4 amd64 8.4.19-4ubuntu3 [870 kB]
Fetched 870 kB in 6s (124 kB/s)
Selecting previously unselected package tcl8.4.
(Reading database ... 253204 files and directories currently installed.)
Unpacking tcl8.4 (from .../tcl8.4_8.4.19-4ubuntu3_amd64.deb) ...
Processing triggers for man-db ...
Setting up gpsmanshp (1.2.1-1) ...
warning: error while loading gpsmanshp.so: couldn't load file ./gpsmanshp.so: ./gpsmanshp.so: undefined symbol: DBFClose
Setting up tcl8.4 (8.4.19-4ubuntu3) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place


...which leads me to suspect that gpsmanshp is now installed.

And, now I don't get error messages whenever I try to install packages.

So, either this is a dependency problem, or perhaps gpsmanshp.postinst
just needs to be edited to refer to tclsh8.5 instead of tclsh8.4.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/995445

Title:
  package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess
  installed post-installation script returned error exit status 127

To manage notifications about this bug go to:
https://bugs.launchpad.net/apt/+bug/995445/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 975839] Re: package gpsmanshp 1.2.1-1 failed to install/upgrade: le sous-processus script post-installation installé a retourné une erreur de sortie d'état 127

2012-06-21 Thread Doug Jones
*** This bug is a duplicate of bug 995445 ***
https://bugs.launchpad.net/bugs/995445

** This bug has been marked a duplicate of bug 995445
   package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess installed post-installation script returned error exit status 127

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/975839

Title:
  package gpsmanshp 1.2.1-1 failed to install/upgrade: le sous-processus
  script post-installation installé a retourné une erreur de sortie
  d'état 127

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/gpsmanshp/+bug/975839/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 963536] Re: package gpsmanshp 1.2.1-1 failed to install/upgrade: Unterprozess installiertes post-installation-Skript gab den Fehlerwert 127 zurück

2012-06-21 Thread Doug Jones
*** This bug is a duplicate of bug 995445 ***
https://bugs.launchpad.net/bugs/995445

** This bug has been marked a duplicate of bug 995445
   package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess installed post-installation script returned error exit status 127

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/963536

Title:
  package gpsmanshp 1.2.1-1 failed to install/upgrade: Unterprozess
  installiertes post-installation-Skript gab den Fehlerwert 127 zurück

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/gpsmanshp/+bug/963536/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 995445] Re: package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess installed post-installation script returned error exit status 127

2012-06-21 Thread Doug Jones
Following this installation failure, an error message appears at the end
of every upgrade of any software package.  So affected users see error
messages almost every day, even if they never even try to run gpsmanshp.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/995445

Title:
  package gpsmanshp 1.2.1-1 failed to install/upgrade: subprocess
  installed post-installation script returned error exit status 127

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/gpsmanshp/+bug/995445/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 990913] Re: RAID goes into degrade mode on every boot 12.04 LTS server

2012-05-30 Thread Doug Jones
Precise is using a 3.2.0 kernel.  There is a known MD bug that affects
some 3.2.x and 3.3.x kernels, that seems like it might be relevant to
this problem.  See:

http://www.spinics.net/lists/raid/msg39004.html

and the rest of that thread.  Note the mention of possible racing in
scripts.

Unfortunately for us, the lead MD developer does not test with Ubuntu,
or with any other Debian-based distro.  (He only uses SUSE.)  So if
there are any complex race conditions or other problems created by
Ubuntu's udev scripts or configs or whatever, he might not uncover them
in his testing, and the level of assistance he can provide is limited.
(He and the others on the linux-raid list are indeed helpful, but I'm
not sure that very many of them use Ubuntu, and the level of the
discussion there is fairly technical and probably well beyond what most
Ubuntu users could follow.)

Now that Canonical has announced the plan to eliminate the Alternate
installer and merge all installer functionality (presumably including
RAID) into the regular Desktop installer, it seems likely that the
number of users setting up RAID arrays will increase.  (I am using
Desktop myself, not Server).

For some time now, it has been possible to set up and (to a limited
degree) manage software RAID arrays on Ubuntu without any knowledge of
the command line.  So there are Desktop users who are using RAID arrays,
thinking they are safeguarding their data.  But when the complex
creature known as linux software RAID breaks down, as it has with this
bug, they are quickly in over their heads.  Given that RAID bugs can
destroy the user's data, just about the worst thing that can happen, it
would seem prudent to either (1) actively discourage non-expert users
from using RAID, or (2) make Ubuntu's implementation of RAID far more
reliable.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/990913

Title:
  RAID goes into degrade mode on every boot 12.04 LTS server

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/990913/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 990913] Re: RAID goes into degrade mode on every boot 12.04 LTS server

2012-05-22 Thread Doug Jones
Since my last comment, an updated kernel arrived via Update Manager.
Its changelog included the following:

   * md: fix possible corruption of array metadata on shutdown.
- LP: #992038

This seems possibly relevant.  I updated, and have now rebooted several
times.  The RAID degradation is still happening, on every reboot.  As
before, the system runs just fine after I finish fixing up RAID.

I am now keeping detailed notes on which partitions are being degraded.
Since it takes me anywhere from fifteen minutes to several hours to
accomplish each reboot and ensuing repair, and I have other things to do
as well, it will be a while before meaningful statistics are
accumulated.

Further details I forgot to mention earlier:  This is an AMD64 system
with 8GB of ECC RAM.  Have attached most recent dmesg.


** Attachment added: "dmesg.txt"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/990913/+attachment/3157872/+files/dmesg.txt

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/990913

Title:
  RAID goes into degrade mode on every boot 12.04 LTS server

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/990913/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 990913] Re: RAID goes into degrade mode on every boot 12.04 LTS server

2012-05-17 Thread Doug Jones
I have now installed Precise on my system.  (I had intended to install
as a multiboot, along with the existing Oneiric, but apparently the
alternate installer could not recognize my existing /boot RAID1
partition, so now I can't boot Oneiric.  But that's another story...)

Note that the title of the original bug report refers to 12.04 Server,
but I have a Desktop system, installed with the Alternate disk.

This time I installed / on a non-RAID partition.  My pre-existing RAID
partitions are now mounted as directories in /media, except for /boot,
which is still on the same MD partition as before.

I have now rebooted several times since installing 12.04.  The previous
behavior of hanging during shutdown has not recurred.  Also, pleasantly,
the previous behavior of hanging during boot (between the GRUB splash
and the Ubuntu splash) has also not recurred.

I am getting error messages on the Ubuntu splash screen (under the
crawling dots) about file systems not being found.  I have seen these
occasionally for many years, and have become quite accustomed to them.
It says I can wait, or hit S to skip, or do something manually;  I wait
for a while, but soon give up on that and hit S because waiting NEVER
accomplishes anything.  I'm not sure why that option is even mentioned.

Fortunately, this has not been happening with my /, so I can
successfully log into Ubuntu.

Once there, I start up palimpsest (Disk Utility) and look at the RAID
partitions.  Generally, about half of them are degraded or simply not
started.

The ones that are not started are the ones mentioned in the error
messages on the splash screen.  I can start them from palimpsest;
sometimes they start degraded, sometimes not.

After about an hour of work, all of the degraded partitions are fully
synchronized.  I usually have to re-attach some components as well.
Haven't lost any data yet.

Sometimes I cannot re-attach a component using palimpsest and have to
drop to the command line, zero the superblock, and then add the
component.  This has always worked so far.  I only noticed this
particular behavior since installing Precise.
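
(The command-line sequence, for the record;  the device and array names
here are just examples from memory:)

   sudo mdadm --zero-superblock /dev/sdc5   # wipe the stale superblock on the detached component
   sudo mdadm /dev/md2 --add /dev/sdc5      # then add it back to the array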

In short:  On this system, RAID usually degrades upon reboot.  It did
this with Oneiric (but only starting a few weeks ago) and it does this
with a freshly installed Precise.

Around the time this behavior started with Oneiric, I did a lot of
maintenance work on this hardware, including:

1) swapping out one hard drive

2) putting some 1.2 metadata RAID partitions on, where previously all
were 0.90 metadata

I have not noticed any correlation between metadata version and
degradation.  Any of them can get degraded, in an apparently random
fashion.

Between reboots, the system runs just fine.  Hard drive SMART health
appears stable.  The newest hard drive is reported as healthy.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/990913

Title:
  RAID goes into degrade mode on every boot 12.04 LTS server

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/990913/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 990913] Re: RAID goes into degrade mode on every boot 12.04 LTS server

2012-05-12 Thread Doug Jones
I am having similar problems.  I am running Oneiric.  I am NOT using
LUKS or LVM.

Symptoms vary in severity a lot.  Sometimes it simply drops a spare, and
it's listed in palimpsest as "not attached".  One click of the button
and it's reattached, and shown as a spare.

But then sometimes it gets really hairy.  These nightmares usually start
when I shut down the system, and it appears to hang during shutdown.

Now that this has been going on for a while, I *always* check the status
of my drives and arrays immediately before shutting down.  First I shut
down all apps, then I start palimpsest.  I check the SMART health of all
drives (all are healthy, except for one that has one bad block, and that
never changes).  Then I check the arrays;  all are running and idle.  I
also drill down and check the array components, to make sure they are
all attached.  If I find one that isn't attached, I attach it.  I don't
shut down until everything looks good.

Then I shut down, and cross all my fingers and toes.

About half the time, shutdown never completes.  It hangs on the purple
screen, with Ubuntu and five dots that don't crawl.  I watch the drive
activity light; nothing.  No drive activity at all.

Then I wait and wait and wait, wasting my valuable time (well, valuable
to me anyway) until I get fed up.

Then I do what my mommy always told me, and shut down with Alt-SysRq

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/990913

Title:
  RAID goes into degrade mode on every boot 12.04 LTS server

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/990913/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 990913] Re: RAID goes into degrade mode on every boot 12.04 LTS server

2012-05-12 Thread Doug Jones
(Ooops, apparently hit the wrong key...   continuing the previous
comment)

...shut down with Alt-SysRq REISUB.  This has no effect whatsoever.  The
screen doesn't change;  the drive activity light does nothing.

Finally, after stewing for a while longer, I hold down the power switch
until I hear all the fans powering down.


Then I boot up.  I see no error messages.  Everything seems to be working fine, 
except the part about having to boot it three or four times before it actually 
gets past the GRUB splash screen and arrives at the Ubuntu splash screen.  
After that, everything looks great...  I log in, and get to Unity, and I never 
saw any error message going by.

Then, the first thing I do is start up palimpsest and check the drives
and arrays.  The drives are always fine, but generally about half of the
arrays are degraded.  Sometimes it will start re-syncing one of the
arrays all by itself;  usually it starts with an array that I don't care
so much about, and I can't do anything about the ones with more
important data until later, because apparently palimpsest can only
change one RAID-related thing at a time.   Which means that sometimes I
have to wait many hours to start working on the next array.

The worst I've seen was the time it detached two drives from my RAID6
array.  Very scary.

I have one RAID6 array, one RAID10 array, and several RAID1 arrays.  I
think all of them have degraded at one time or another.  This bug seems
to be an equal opportunity degrader.  Usually I find two or three of the
larger arrays are degraded, plus several detached spares on other
arrays.

This system has six 2TB drives.  I think some of them have 512 byte
sectors, and some have 2048 byte sectors;  how the heck do you tell,
anyway?  All use GPT partitions, and care has been taken to align all
partitions on 1MB boundaries (palimpsest actually reports if it finds
alignment issues).
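
(Apparently the kernel reports the sector sizes under /sys, though I
haven't gone through all six drives yet to confirm my suspicion:)

   cat /sys/block/sda/queue/logical_block_size
   cat /sys/block/sda/queue/physical_block_size
   sudo hdparm -I /dev/sda | grep -i 'sector size'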

The system has two SATA controllers.  I put four drives on one
controller, and two on the other, and for the RAID1 and RAID10 arrays I
make sure there are no mirrors where both parts are on the same
controller, or both parts on drives made by the same company.  Except,
that isn't really true any more;  whenever something gets degraded and I
have to re-attach and re-sync, the array members often get rearranged.
I think most of my spares are now concentrated on a couple of drives,
which isn't really what I had planned.  I've given up on rearranging the
drives to my liking, for the duration.

In fact, for the duration, I've given up on this system.  I've been
gradually moving data off it, onto another system, which is running
Maverick, and it will continue to run Maverick because it doesn't try to
rearrange my data storage every time I look at it sideways.  (Very
gradually, since NFS has been broken for the better part of a year...)

This nice expensive Oneiric system will be dedicated to the task of
rebooting, re-attaching, and re-syncing, until Oneiric starts to behave
itself.  I am planning to also install Precise (multiboot) so I can test
that too.  Attempting an OS install while partitions are borking
themselves on every other reboot sounds like fun.

BTW, I watched the UDS Software RAID reliability session video from
last Tuesday:

https://www.youtube.com/watch?v=RpC-dkgN37M&list=UUWUDCz-Q0m4qK7lkK4CevQA&index=2&feature=plcp

I was quite pleased to see that people are working on these problems.

(But I was particularly surprised to learn how many people there were
completely unaware that Ubuntu rearranges device names (i.e. /dev/sda
etc.) at each reboot.   I noticed that a really long time ago.)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/990913

Title:
  RAID goes into degrade mode on every boot 12.04 LTS server

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/990913/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 848823] Re: nfs-kernel-server requires a real interface to be up

2012-04-03 Thread Doug Jones
This is affecting me as well.  It started when I upgraded from 11.04 to
11.10.

I am identifying machines with IPs, not DNS.  I set it up as per this
tutorial:

https://mostlylinux.wordpress.com/network/nfshowto/

These instructions work fine for 10.04, 10.10, and 11.04.  (However, the
commands it describes for restarting services no longer apply in 10.10
and later;  my workaround has been to simply reboot the machine instead,
and that works fine).

I still have machines running those older versions, and they still talk
to each other,  but the one running 11.10 is incommunicado.

I did get the error message Tim mentioned, "/etc/exports.d: No such
file or directory", but I simply created an empty directory at
/etc/exports.d and that message no longer appears.  But NFS still
doesn't work.

I have tried the workaround Joseph Brown describes but that doesn't seem
to help.

I would like to try the workaround where one uses
/etc/network/interfaces instead of network-manager, but I have no idea
how to do that.  If someone could spell that out, at the user-
friendliness level of the tutorial I mentioned above, that would be
great.
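
(From what I've read, the minimal static setup in /etc/network/interfaces
looks something like the following;  the interface name and addresses are
placeholders, and I haven't actually tried this, which is why I'm asking:)

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 192.168.1.10
    netmask 255.255.255.0
    gateway 192.168.1.1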

BTW, I know at least two other people who are using NFS on 10.04 LTS,
using basically the same setup I have (from that same tutorial), and who
probably would be upgrading to the upcoming LTS if they hadn't already
been warned about NFS being borked.  I wonder how many other LTS users
are about to get a nasty surprise.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/848823

Title:
  nfs-kernel-server requires a real interface to be up

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ifupdown/+bug/848823/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 666038] Re: error while creating logical partition

2010-10-25 Thread Doug Jones
I just tested this on Maverick final release (AMD64 alternate
installer).  Same result.

-- 
error while creating logical partition
https://bugs.launchpad.net/bugs/666038
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 666038] [NEW] error while creating logical partition

2010-10-24 Thread Doug Jones
Public bug reported:

Binary package hint: gnome-disk-utility

Running Maverick release candidate, with all the updates applied since
final release (which happened two weeks ago).

Ran palimpsest to add partitions to an empty SATA drive, in preparation
for from-scratch installation of Maverick final.


To reproduce:

Start with empty 500GB drive.

Add three empty primary partitions:  1GB, 189GB, 20GB.  Then add an
extended partition, using all remaining space.

Add one empty 20GB logical partition within the extended.
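
(For reference, roughly the same layout from the command line with
parted would be the following;  sizes are approximate, and the device
name matches the one in the error details below:)

   sudo parted -s /dev/sde mklabel msdos
   sudo parted -s /dev/sde mkpart primary 1MiB 1GiB
   sudo parted -s /dev/sde mkpart primary 1GiB 190GiB
   sudo parted -s /dev/sde mkpart primary 190GiB 210GiB
   sudo parted -s /dev/sde mkpart extended 210GiB 100%
   sudo parted -s /dev/sde mkpart logical 211GiB 231GiB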


Error reported:

Error creating partition

An error occurred while performing an operation on 500 GB Hard Disk
(ATA Hitachi HTS545050B9A300):  The operation failed

Details:

Error creating partition: helper exited with exit code 1: In part_add_partition: device_file=/dev/sde, start=210007848960, size=200, type=0x83
Entering MS-DOS parser (offset=0, size=500107862016)
MSDOS_MAGIC found
looking at part 0 (offset 32256, size 1003451904, type 0x83)
new part entry
looking at part 1 (offset 1003484160, size 189000483840, type 0x83)
new part entry
looking at part 2 (offset 190003968000, size 20003880960, type 0x83)
new part entry
looking at part 3 (offset 210007848960, size 290097400320, type 0x05)
Entering MS-DOS extended parser (offset=210007848960, size=290097400320)
readfrom = 210007848960
MSDOS_MAGIC found
Exiting MS-DOS extended parser
Exiting MS-DOS parser
MSDOS partition table detected
containing partition table scheme = 1
got it
got disk
new partition
added partition start=210007881216 size=20003848704
committed to disk
Error doing BLKPG ioctl with BLKPG_ADD_PARTITION for partition 5 of size 210007881216 at offset 20003848704 on /dev/sde: Device or resource busy


Note that I also tried this with a different 500GB drive, made by a
different company, and got the same result.

I see that there are other bugs filed against Lucid that seem to afflict
500GB drives but not other commonly used sizes.  I was under the
impression that these bugs had been fixed by the time Maverick release
candidate appeared, but perhaps all of them weren't, or perhaps this is
unrelated.

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: gnome-disk-utility 2.30.1-2
ProcVersionSignature: Ubuntu 2.6.35-22.35-generic-pae 2.6.35.4
Uname: Linux 2.6.35-22-generic-pae i686
Architecture: i386
Date: Sun Oct 24 13:19:47 2010
ExecutablePath: /usr/bin/palimpsest
InstallationMedia: Ubuntu 10.10 Maverick Meerkat - Release Candidate i386 (20100928)
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: gnome-disk-utility
XsessionErrors:
 (polkit-gnome-authentication-agent-1:1756): GLib-CRITICAL **: g_once_init_leave: assertion `initialization_value != 0' failed
 (nautilus:1751): GConf-CRITICAL **: gconf_value_free: assertion `value != NULL' failed

** Affects: gnome-disk-utility (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: apport-bug i386 maverick

-- 
error while creating logical partition
https://bugs.launchpad.net/bugs/666038
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 666038] Re: error while creating logical partition

2010-10-24 Thread Doug Jones


-- 
error while creating logical partition
https://bugs.launchpad.net/bugs/666038
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 569900] Re: mount: mounting /dev/md0 on /root/ failed: Invalid argument

2010-09-02 Thread Doug Jones
This bit me too.  Two 500GB drives, RAID1, using 10.04.1 alternate 386
installer.

Reading through all these comments, and those on similar (possibly
related) bugs, it seems like this is caused by an arithmetic error in
some code that figures out where things ought to be on the disk.

Suddenly I am reminded of another arithmetic error that cropped up in
gparted recently, relating to the switchover from align-to-cylinder to
align-to-megabyte.  Didn't the default partition alignment method just
change in Lucid?

Very suspicious...

-- 
mount: mounting /dev/md0 on /root/ failed: Invalid argument
https://bugs.launchpad.net/bugs/569900
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 199393] Re: servicemenu for amarok has an invalid menu entry addAsPodcast

2008-04-29 Thread Doug Jones
I looked at the amarok_addaspodcast.desktop file and compared it with a
number of other amarok*.desktop files in the same folder.  This one
differed in that it had no Exec line in it.  I added one after the Icon
line:

Exec=amarok -a %u

I have no idea if this is actually the correct line to use, as the
amarok docs I found were not very instructive on this issue.  (I'm sure
it helps that I know little about amarok.)

But, adding this line to the file eliminated the bad behavior.

In any case, it would be a mistake to consider this merely a bug in a
.desktop file.  The biggest bug is in dolphin itself.  An app shouldn't
start behaving like a popup-mad web browser from the bad old days just
because a config file is wrong.  If it finds a bad .desktop file, it
should skip it (and maybe log the error somewhere).

I tried dragging a selection box around a list of files in a dolphin
window and it went crazy with endless identical dialog boxes that I just
couldn't close fast enough.  And the pane to the right with context-
sensitive options started replicating itself down the window.  I
couldn't close dolphin, and eventually had to restart the xserver.

So this appears to be two bugs:  One in dolphin itself, and one in the
file amarok_addaspodcast.desktop .

-- 
servicemenu for amarok has an invalid menu entry addAsPodcast
https://bugs.launchpad.net/bugs/199393
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs