Re: [gentoo-user] Diagnosing file corruption

2015-08-16 Thread Bryan Gardiner
On Thu, Aug 06, 2015 at 12:00:30PM +1000, wraeth wrote:
> On 06/08/15 10:34, Bryan Gardiner wrote:
> > After I make a fresh backup of my files, how would you recommend
> > troubleshooting this?  Run memtest or a hard drive tester?  Since
> > the files seemingly corrupted themselves after install without
> > being touched, I'm highly suspicious of the hard drive, but would
> > like to rule other things out (if say for example that
> > CONFIG_X86_INTEL_PSTATE CPU clock booster is dangerous, or
> > nvidia-drivers, or ...).  Haven't checked for corruption on /home
> > yet.
>
> One key question that doesn't seem to have been asked yet: have you
> performed an fsck on the partition? You could try booting to a livecd
> environment and running
>
>   fsck -fc /dev/sdXY
>
> (adjusting for your device schema accordingly) on your apparently
> failing partition(s) to see if there is a filesystem corruption...

Thanks very much for the suggestions, everyone.  I ended up using fsck
-fc and -fcc, which resulted in no bad blocks being detected.  I also
wanted to make sure no other files in that range of disk were
corrupted, so I extracted the extents used by the bad files:

  # For each corrupted file, record its name followed by its extent map.
  while IFS= read -r file; do
    echo "${file}"
    debugfs -R "dump_extents ${file}" /dev/mikasa-vg/gentoo
  done < bad-files > bad-extents

found the files in the regions between the bad files:

  # Walk the blocks lying between the bad extents and map each one back
  # to its owning inode (icheck) and then to a path (ncheck).
  for block in $(seq 5302485 5302486) $(seq 5302489 5302498) $(seq 5302504 5302508); do
    inode=$(debugfs -R "icheck ${block}" /dev/mikasa-vg/gentoo 2>/dev/null |
      perl -ne 'if (/^\d+\s+(\d+)$/) {print $1, "\n"}')
    if [[ -n $inode ]]; then
      echo ${block} ${inode} $(debugfs -R "ncheck ${inode}" /dev/mikasa-vg/gentoo 2>/dev/null |
        awk 'NR==2 {print $2}')
    else
      echo ${block}
    fi
  done

and file'd those to make sure that they were okay.  This is only a
personal computer, so I'm going to call this a one-off issue and move
on, and leave the stronger approaches for another day.
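
For anyone repeating this, the gap ranges passed to seq above do not
have to be worked out by hand.  A minimal sketch, assuming the physical
start and end block of each bad extent have already been pulled out of
the dump_extents output into "start end" lines (that extraction step
and the function name gaps are illustrative, not something from this
thread):

```shell
# gaps: read "start end" block ranges on stdin (one per line, any order)
# and print the ranges that lie strictly between consecutive ranges.
gaps() {
  sort -n -k1,1 | awk '
    # A gap exists when this range starts more than one block past
    # the furthest end seen so far.
    NR > 1 && $1 > prev_end + 1 { print prev_end + 1, $1 - 1 }
    # Track the furthest end so overlapping ranges are handled.
    NR == 1 || $2 > prev_end { prev_end = $2 }
  '
}
```

Each output line is a start/end pair that can be handed to seq, for
example: gaps < bad-ranges | while read -r s e; do seq "$s" "$e"; done
(where bad-ranges is a hypothetical file of such pairs).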

Thanks again!
Bryan

-- 
If people do not believe that mathematics is simple, it is only
because they do not realize how complicated life is - von Neumann




Re: [gentoo-user] Diagnosing file corruption

2015-08-06 Thread Bob Wya
On 6 August 2015 at 01:34, Bryan Gardiner <b...@khumba.net> wrote:

> Hello list,
>
> This is the disk:
>
>   *-disk
>        description: ATA Disk
>        product: ST1000LM024 HN-M
>        vendor: Seagate
>        physical id: 0.0.0
>        bus info: scsi@4:0.0.0
>        logical name: /dev/sda
>        version: 0001
>        size: 931GiB (1TB)
>        capabilities: gpt-1.00 partitioned partitioned:gpt
>        configuration: ansiversion=5 guid=---- sectorsize=4096
>
> Thanks for any help you can provide,
> Bryan


Complex question. Simple answer... Spinrite :-)

-- 

All the best,
Robert


Re: [gentoo-user] Diagnosing file corruption

2015-08-05 Thread Fernando Rodriguez
On Wednesday, August 05, 2015 5:34:43 PM Bryan Gardiner wrote:
> Hello list,
>
> On my most recent update, I had some build failures that led me to
> find that some files on my root partition have been corrupted.  This
> is a new Asus N550JK laptop, a mostly-stable amd64 install with
> gentoo-sources-4.0.5 and ext4-root-in-LVM-in-LUKS-on-HDD, and Debian
> lives in there too (no problems showed up verifying Debian's packages;
> I installed Debian on Jul 1 and used it for a week before getting time
> to set up Gentoo).
>
> These are the package merge times, package names, and files that I
> found to be corrupted via qcheck (there were also a couple Python
> headers that I fixed by rebuilding).  They appear to be filled with
> random data.  The binpkg contents in /usr/portage/packages are okay,
> so I don't know when the files were corrupted; their mtimes haven't
> been updated since the packages were installed.
>
> Thu-Jul-30-22:40:23-2015 app-arch/p7zip-9.20.1-r5 /usr/lib64/p7zip/Lang/va.txt
> Thu-Jul-30-22:40:23-2015 app-arch/p7zip-9.20.1-r5 /usr/lib64/p7zip/help/cmdline/switches/large_pages.htm
> Sun-Jul-19-22:34:30-2015 dev-libs/libzip-1.0.1 /usr/share/man/man3/zip_error_get_sys_type.3.bz2
> Sun-Jul-26-22:35:28-2015 dev-python/pygments-2.0.1-r1 /usr/lib64/python2.7/site-packages/pygments/styles/pastie.pyc
> Wed-Jul-08-23:34:56-2015 media-libs/tiff-4.0.3-r6 /usr/share/man/man3/TIFFGetField.3tiff.bz2
> Thu-Jul-30-10:05:31-2015 sci-mathematics/scilab-5.5.2 /usr/share/scilab/modules/compatibility_functions/macros/%b_l_s.bin
> -(from-stage3-on-Jul-8)- sys-apps/acl-2.2.52-r1 /usr/share/man/man3/acl_set_file.3.bz2
>
> I haven't had any unclean shutdowns, it looks like OpenRC is
> unmounting things cleanly on shutdown, and suspend appears to work
> fine.
>
> After I make a fresh backup of my files, how would you recommend
> troubleshooting this?  Run memtest or a hard drive tester?  Since the
> files seemingly corrupted themselves after install without being
> touched, I'm highly suspicious of the hard drive, but would like to
> rule other things out (if say for example that CONFIG_X86_INTEL_PSTATE
> CPU clock booster is dangerous, or nvidia-drivers, or ...).  Haven't
> checked for corruption on /home yet.
>
> This is the disk:
>
>   *-disk
>        description: ATA Disk
>        product: ST1000LM024 HN-M
>        vendor: Seagate
>        physical id: 0.0.0
>        bus info: scsi@4:0.0.0
>        logical name: /dev/sda
>        version: 0001
>        size: 931GiB (1TB)
>        capabilities: gpt-1.00 partitioned partitioned:gpt
>        configuration: ansiversion=5 guid=---- sectorsize=4096
>
> Thanks for any help you can provide,
> Bryan

You can use badblocks to rule out a bad drive (be sure to read the 
documentation first if you haven't). But I would guess that something LUKS 
related is more likely. There may be clues in your log files (probably around 
the time when you installed these packages).
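
A rough sketch of what looking for clues might involve: filter the
kernel messages for lines that usually accompany disk, filesystem, or
device-mapper trouble.  The patterns below are assumptions about typical
kernel wording rather than a complete list, and scan_log is a
hypothetical helper name:

```shell
# scan_log [FILE...]: print log lines that look like storage-related
# errors.  Reads stdin when no files are given.
scan_log() {
  grep -E -i \
    'I/O error|EXT4-fs (error|warning)|ata[0-9]+(\.[0-9]+)?: .*(error|failed)|device-mapper: .*error' \
    "$@"
}
```

It can be fed from dmesg (dmesg | scan_log) or pointed at a log file,
the location of which depends on the syslog setup.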

-- 
Fernando Rodriguez



Re: [gentoo-user] Diagnosing file corruption

2015-08-05 Thread wraeth
On 06/08/15 10:34, Bryan Gardiner wrote:
> After I make a fresh backup of my files, how would you recommend
> troubleshooting this?  Run memtest or a hard drive tester?  Since
> the files seemingly corrupted themselves after install without
> being touched, I'm highly suspicious of the hard drive, but would
> like to rule other things out (if say for example that
> CONFIG_X86_INTEL_PSTATE CPU clock booster is dangerous, or
> nvidia-drivers, or ...).  Haven't checked for corruption on /home
> yet.

One key question that doesn't seem to have been asked yet: have you
performed an fsck on the partition? You could try booting to a livecd
environment and running

  fsck -fc /dev/sdXY

(adjusting for your device schema accordingly) on your apparently
failing partition(s) to see if there is a filesystem corruption...

-- 
wraeth <wra...@wraeth.id.au>
GnuPG Key: B2D9F759



[gentoo-user] Diagnosing file corruption

2015-08-05 Thread Bryan Gardiner
Hello list,

On my most recent update, I had some build failures that led me to
find that some files on my root partition have been corrupted.  This
is a new Asus N550JK laptop, a mostly-stable amd64 install with
gentoo-sources-4.0.5 and ext4-root-in-LVM-in-LUKS-on-HDD, and Debian
lives in there too (no problems showed up verifying Debian's packages;
I installed Debian on Jul 1 and used it for a week before getting time
to set up Gentoo).

These are the package merge times, package names, and files that I
found to be corrupted via qcheck (there were also a couple Python
headers that I fixed by rebuilding).  They appear to be filled with
random data.  The binpkg contents in /usr/portage/packages are okay,
so I don't know when the files were corrupted; their mtimes haven't
been updated since the packages were installed.

Thu-Jul-30-22:40:23-2015 app-arch/p7zip-9.20.1-r5 /usr/lib64/p7zip/Lang/va.txt
Thu-Jul-30-22:40:23-2015 app-arch/p7zip-9.20.1-r5 /usr/lib64/p7zip/help/cmdline/switches/large_pages.htm
Sun-Jul-19-22:34:30-2015 dev-libs/libzip-1.0.1 /usr/share/man/man3/zip_error_get_sys_type.3.bz2
Sun-Jul-26-22:35:28-2015 dev-python/pygments-2.0.1-r1 /usr/lib64/python2.7/site-packages/pygments/styles/pastie.pyc
Wed-Jul-08-23:34:56-2015 media-libs/tiff-4.0.3-r6 /usr/share/man/man3/TIFFGetField.3tiff.bz2
Thu-Jul-30-10:05:31-2015 sci-mathematics/scilab-5.5.2 /usr/share/scilab/modules/compatibility_functions/macros/%b_l_s.bin
-(from-stage3-on-Jul-8)- sys-apps/acl-2.2.52-r1 /usr/share/man/man3/acl_set_file.3.bz2
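
As a cross-check that does not depend on qcheck, the same comparison can
be sketched directly against Portage's installed-package database.  This
assumes the usual CONTENTS layout under /var/db/pkg, where regular files
are recorded as "obj <path> <md5> <mtime>"; the function name
verify_contents is made up for illustration, and this is not how qcheck
itself is implemented:

```shell
# verify_contents CONTENTS_FILE [ROOT]
# Print "MISMATCH <path>" for every recorded regular file whose current
# MD5 differs from the recorded one (missing files also count).
verify_contents() {
  contents=$1
  root=${2:-}
  while IFS= read -r line; do
    case $line in
      "obj "*) ;;          # only regular files carry a checksum
      *) continue ;;
    esac
    rest=${line#obj }      # strip the entry type
    rest=${rest% *}        # strip the trailing mtime
    sum=${rest##* }        # the checksum is now the last field
    path=${rest% *}        # the remainder is the path (may contain spaces)
    actual=$(md5sum -- "${root}${path}" 2>/dev/null | cut -d' ' -f1)
    [ "$actual" = "$sum" ] || printf 'MISMATCH %s\n' "$path"
  done < "$contents"
}
```

For example, verify_contents /var/db/pkg/app-arch/p7zip-9.20.1-r5/CONTENTS
would list any p7zip files whose contents no longer match what was
merged.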

I haven't had any unclean shutdowns, it looks like OpenRC is
unmounting things cleanly on shutdown, and suspend appears to work
fine.

After I make a fresh backup of my files, how would you recommend
troubleshooting this?  Run memtest or a hard drive tester?  Since the
files seemingly corrupted themselves after install without being
touched, I'm highly suspicious of the hard drive, but would like to
rule other things out (if say for example that CONFIG_X86_INTEL_PSTATE
CPU clock booster is dangerous, or nvidia-drivers, or ...).  Haven't
checked for corruption on /home yet.

This is the disk:

  *-disk
       description: ATA Disk
       product: ST1000LM024 HN-M
       vendor: Seagate
       physical id: 0.0.0
       bus info: scsi@4:0.0.0
       logical name: /dev/sda
       version: 0001
       size: 931GiB (1TB)
       capabilities: gpt-1.00 partitioned partitioned:gpt
       configuration: ansiversion=5 guid=---- sectorsize=4096

Thanks for any help you can provide,
Bryan

