Re: Is there way to get filename for specific LBA?

2011-09-02 Thread Marcin Wisnicki
On Wed, 31 Aug 2011 20:50:18 -0700, Carl Johnson wrote:

 
 It looks like the best bet would be fsdb, assuming that it is a UFS file
 system.  That does have a 'findblk' command to find a file containing a
 block, but you would need to calculate the block offset in the
 filesystem first.  It doesn't look like it would be easy, as was said
 earlier.

I have a ruby script for this that wraps various commands.

You pipe an error log to it and it finds files:

  blocks2file.rb < /var/log/messages

Currently, it looks only for geom errors (with byte offsets) but that can 
be easily adjusted.
It helped me find the source of my problems in the past but I haven't 
worked on it since.

Here it is: https://github.com/mwisnicki/freebsd-block2file
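
As a rough illustration of the kind of parsing such a wrapper needs (this is not taken from the script itself), the sketch below pulls the LBA= values out of kernel messages like the ones quoted in this thread. The function name extract_lbas is made up for the example, and the sample lines stand in for /var/log/messages.

```shell
#!/bin/sh
# Sketch only: extract the numbers that follow "LBA=" in kernel log lines.
# extract_lbas is a hypothetical helper, not part of blocks2file.rb.
extract_lbas() {
    # Print the digits after "LBA=" for each matching line, sorted and deduplicated.
    sed -n 's/.*LBA=\([0-9][0-9]*\).*/\1/p' | sort -n -u
}

# Sample lines copied from this thread; a real run would read the log file instead.
lbas=$(printf '%s\n' \
    'Aug 31 05:13:24 da kernel: ad6: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=107491647' \
    'Aug 31 10:41:04 da smartd[886]: Device: /dev/ad6, Failed SMART usage Attribute: 184 End-to-End_Error.' \
    'unknown: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=419149408' \
    | extract_lbas)
echo "$lbas"
```

Against a real log this would be run as `extract_lbas < /var/log/messages`; lines without an LBA= field are simply ignored.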

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Re: Is there way to get filename for specific LBA?

2011-09-01 Thread Doug Hardie

On 31 August 2011, at 20:50, Carl Johnson wrote:

 per...@pluto.rain.com writes:
 
 Robert Bonomi bon...@mail.r-bonomi.com wrote:
 
 Aug 31 05:13:24 da kernel: ad6: WARNING - READ_DMA UDMA ICRC
 error (retrying request) LBA=107491647
 ... I looked at bsdlabel - it's partition f, /home. But what
 is the file name?
 
 There's *no* easy way to find out.  You'll have to grovel through
 all the filesystem metadata, and the layers of index blocks for
 every file until you find the 'right' one.
 
 This is what icheck -B was for, but icheck(8) no longer exists and
 that particular bit of functionality does not seem to be provided in
 fsck(8).
 
 One current userland utility (other than fsck) which does know
 how to grovel through the metadata and index blocks is dump(8),
 but you'd have to hack on it to report which inode was using a
 particular block.
 
 It looks like the best bet would be fsdb, assuming that it is a UFS
 file system.  That does have a 'findblk' command to find a file
 containing a block, but you would need to calculate the block offset in
 the filesystem first.  It doesn't look like it would be easy, as was
 said earlier.

I created a utility some years ago that did that for UFS.  I believe it works 
for UFS2 but haven't verified it.  If you want to try it, send me a note and 
I'll ship you the code direct.

-- Doug


Re: Is there way to get filename for specific LBA?

2011-09-01 Thread Polytropon
On Wed, 31 Aug 2011 20:50:18 -0700, Carl Johnson wrote:
 per...@pluto.rain.com writes:
  One current userland utility (other than fsck) which does know
  how to grovel through the metadata and index blocks is dump(8),
  but you'd have to hack on it to report which inode was using a
  particular block.
 
 It looks like the best bet would be fsdb, assuming that it is a UFS
 file system.  That does have a 'findblk' command to find a file
 containing a block, but you would need to calculate the block offset in
 the filesystem first.  It doesn't look like it would be easy, as was
 said earlier.

Recently I had a similar problem with a disk (500GB SATA,
/dev/ad6 attached to controller ata3, one UFS data partition)
that had errors when accessing certain files or directories
like

TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=419149408

or

WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=9488927

Then g_vfs_done errors were also displayed, and the disk then
magically disappeared.

On system startup, fsck failed:

unknown: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=419149408
ata3: timeout waiting to issue command
ata3: error issuing READ_DMA48 command

There were

CANNOT READ BLK: 419149408
UNEXPECTED SOFT UPDATE INCONSISTENCY

and several more of such errors. The summary states:

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 419149408, 419149409, 419149410, 4
19149411, 419149412, 419149413, 419149414, 419149415, 419149416, 419149417, 4191
49418, 419149419, 419149420, 419149421, 419149422, 419149423, 419149424, 4191494
25, 419149426, 419149427, 419149428, 419149429, 419149430, 419149431, 419149432,
 419149433, 419149434, 419149435, 419149436, 419149437, 419149438, 419149439, 41
9149440, 419149441, 419149442, 419149443, 419149444, 419149445, 419149446, 41914
9447, 419149448, 419149449, 419149450, 419149451, 419149452, 419149453, 41914945
4, 419149455, 419149456, 419149457, 419149458, 419149459, 419149460, 419149461,
419149462, 419149463, 419149464, 419149465, 419149466, 419149467, 419149468, 419
149469, 419149470, 419149471, 419149472, 419149473, 419149474, 419149475, 419149
476, 419149477, 419149478, 419149479, 419149480, 419149481, 419149482, 419149483
, 419149484, 419149485, 419149486, 419149487, 419149488, 419149489, 419149490, 4
19149491, 419149492, 419149493, 419149494, 419149495, 419149496, 419149497, 4191
49498, 419149499, 419149500, 419149501, 419149502, 419149503, 419149504, 4191495
05, 419149506, 419149507, 419149508, 419149509, 419149510, 419149511, 419149512,
 419149513, 419149514, 419149515, 419149516, 419149517, 419149518, 419149519, 41
9149520, 419149521, 419149522, 419149523, 419149524, 419149525, 419149526, 41914
9527, 419149528, 419149529, 419149530, 419149531, 419149532, 419149533, 41914953
4, 419149535,

CANNOT READ BLK: 419525632
UNEXPECTED SOFT UPDATE INCONSISTENCY

THE FOLLOWING DISK SECTORS COULD NOT BE READ: 419525632, 419525633, 419525634, 4
19525635, 419525636, 419525637, 419525638, 419525639, 419525640, 419525641, 4195
25642, 419525643, 419525644, 419525645, 419525646, 419525647, 419525648, 4195256
49, 419525650, 419525651, 419525652, 419525653, 419525654, 419525655, 419525656,
 419525657, 419525658, 419525659, 419525660, 419525661, 419525662, 419525663,
CYLINDER GROUP 1115: BAD MAGIC NUMBER
UNEXPECTED SOFT UPDATE INCONSISTENCY

After that, fsck suggested re-running the procedure because
the file system couldn't be marked clean. I just had a look
at the device files and...

# ll /dev/ad*
crw-r-----  1 root  operator    0, 108 Aug 23 01:40 /dev/ad4
crw-r-----  1 root  operator    0, 113 Aug 23 01:40 /dev/ad4s1
crw-r-----  1 root  operator    0, 116 Aug 23 03:40 /dev/ad4s1a
crw-r-----  1 root  operator    0, 117 Aug 23 01:40 /dev/ad4s1b
crw-r-----  1 root  operator    0, 118 Aug 23 03:40 /dev/ad4s1d
crw-r-----  1 root  operator    0, 119 Aug 23 03:40 /dev/ad4s1e
crw-r-----  1 root  operator    0, 120 Aug 23 03:40 /dev/ad4s1f
crw-r-----  1 root  operator    0, 121 Aug 23 03:40 /dev/ad4s1g
crw-r-----  1 root  operator    0, 122 Aug 23 03:40 /dev/ad4s1h

Tadaa! You see: /dev/ad6 isn't _there_ anymore!

I rebooted the system, skipped _any_ checks of /dev/ad6 and
carefully mounted it (being given the correct warning):

# mount -t ufs -o ro /dev/ad6 /mnt

I could then browse the disk, but when entering some directories,
error messages (as above) did show up again. Then the disk appeared
empty, and finally it was gone.

As I had the original data on another disk, it wasn't a data
loss for me, but a very interesting behaviour of a disk!

When browsing the disk with the Midnight Commander, many files
were size 0, prefixed with !, and colored red. This indicates
that they couldn't be stat()ed. This traditionally shows that
the file system is damaged. A dying disk could be the reason.
But malfunctioning controllers and software errors also could
be, even though it's often the _disk_ that needs replacing.
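
The long fsck sector lists earlier in this message are easier to interpret once collapsed into ranges. The sketch below is one way to do that; the function name collapse_sectors is invented for the example, and it assumes the list is comma-separated and may have numbers split across wrapped terminal lines, as in the paste above. Only a short excerpt of the second list is used as sample input.

```shell
#!/bin/sh
# Sketch only: collapse a comma-separated sector list into contiguous ranges.
# collapse_sectors is a hypothetical helper name.
collapse_sectors() {
    # Remove spaces and line breaks (rejoining numbers split by terminal
    # wrapping), put one number per line, then merge consecutive values.
    tr -d ' \n' | tr ',' '\n' | grep -v '^$' | sort -n | awk '
        NR == 1        { start = prev = $1; next }
        $1 == prev + 1 { prev = $1; next }
        { printf "%d-%d (%d sectors)\n", start, prev, prev - start + 1
          start = prev = $1 }
        END { if (NR > 0) printf "%d-%d (%d sectors)\n", start, prev, prev - start + 1 }'
}

# Excerpt of the fsck output, including a number wrapped across two lines.
ranges=$(printf '%s\n' \
    '419525632, 419525633, 419525634, 4' \
    '19525635, 419525636,' \
    | collapse_sectors)
echo "$ranges"
```

A single contiguous run, as in both lists above, points at one damaged region rather than errors scattered across the disk.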



With this 

Re: Is there way to get filename for specific LBA?

2011-08-31 Thread perryh
Ross basarev...@gmail.com wrote:

 Aug 31 05:13:24 da kernel: ad6: WARNING - READ_DMA UDMA ICRC error
 (retrying request) LBA=107491647

That message is reporting a problem in communication between the
drive and the controller (or, perhaps, between the controller and
main memory), not a problem reading the media, so the LBA is likely
not all that useful (esp. since, if you got no other messages, the
retry succeeded so no data was lost).

What does

  egrep 'ad[0-9]|ata' /var/run/dmesg.boot

report?

 #  dd if=/dev/ad6 of=/dev/null bs=1m seek=107491647 count=1
 dd: /dev/null: Inappropriate ioctl for device

 Another question: why does it fail?

seek= applies to the output file, so it tried to do a seek on
/dev/null :)  You probably wanted skip= (or iseek=).
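
To make the seek=/skip= distinction concrete, here is a small sketch that uses a scratch file in place of /dev/ad6, so it is safe to run anywhere. It assumes 512-byte sectors. Note that skip= counts input blocks of size bs, so with bs=1m a skip of 107491647 would mean that many mebibytes rather than that many sectors.

```shell
#!/bin/sh
# Sketch only: skip= positions the INPUT, seek= positions the OUTPUT.
# A temporary file stands in for the disk device.
img=$(mktemp)

# Build a 4-sector image of zeros, then write "MARK" at the start of sector 3.
# seek= is correct here because we are positioning within the output file.
dd if=/dev/zero of="$img" bs=512 count=4 2>/dev/null
printf 'MARK' | dd of="$img" bs=512 seek=3 conv=notrunc 2>/dev/null

# Read one 512-byte sector at "LBA" 3: skip= moves the input offset.
sector=$(dd if="$img" bs=512 skip=3 count=1 2>/dev/null | head -c 4)
echo "$sector"

rm -f "$img"
```

On the real disk the equivalent would presumably be `dd if=/dev/ad6 of=/dev/null bs=512 skip=107491647 count=1`.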


Re: Is there way to get filename for specific LBA?

2011-08-31 Thread Ross
On Wed, Aug 31, 2011 at 4:11 PM,  per...@pluto.rain.com wrote:
 Ross basarev...@gmail.com wrote:

 Aug 31 05:13:24 da kernel: ad6: WARNING - READ_DMA UDMA ICRC error
 (retrying request) LBA=107491647

 That message is reporting a problem in communication between the
 drive and the controller (or, perhaps, between the controller and
 main memory), not a problem reading the media, so the LBA is likely
 not all that useful (esp. since, if you got no other messages, the
 retry succeeded so no data was lost).

 What does

  egrep 'ad[0-9]|ata' /var/run/dmesg.boot

 report?


atapci0: Intel ICH7 SATA300 controller port
0x20b8-0x20bf,0x20cc-0x20cf,0x20b0-0x20b7,0x20c8-0x20cb,0x20a0-0x20af
mem 0xe0284000-0xe02843ff irq 19 at device 31.2 on pci0
atapci0: [ITHREAD]
ata2: ATA channel 0 on atapci0
ata2: [ITHREAD]
ata3: ATA channel 1 on atapci0
ata3: [ITHREAD]
ad4: 238475MB Seagate ST9250315AS 0001SDM1 at ata2-master UDMA100 SATA
ad6: 476940MB Seagate ST9500325AS 0001SDM1 at ata3-master UDMA100 SATA
Trying to mount root from ufs:/dev/ad6s1a


 #  dd if=/dev/ad6 of=/dev/null bs=1m seek=107491647 count=1
 dd: /dev/null: Inappropriate ioctl for device

 Another question: why does it fail?

 seek= applies to the output file, so it tried to do a seek on
 /dev/null :)  You probably wanted skip= (or iseek=).


Thank you :) I have read the man page now.

smartd also reports this:

Aug 31 10:41:04 da smartd[886]: Device: /dev/ad6, Failed SMART usage
Attribute: 184 End-to-End_Error.

I found this explanation: http://kb.acronis.com/content/9119

So is the disk dying? Or is it the cable? I have no physical access to the
server at the moment.
But still, is there a way to get the filename for an LBA?


Re: Is there way to get filename for specific LBA?

2011-08-31 Thread Robert Bonomi
 From owner-freebsd-questi...@freebsd.org  Tue Aug 30 22:14:58 2011
 Date: Wed, 31 Aug 2011 06:11:24 +0300
 From: Ross basarev...@gmail.com
 To: freebsd-questions@freebsd.org
 Subject: Is there way to get filename for specific LBA?

 Aug 31 05:13:24 da kernel: ad6: WARNING - READ_DMA UDMA ICRC error
 (retrying request) LBA=107491647
 #  dd if=/dev/ad6 of=/dev/null bs=1m seek=107491647 count=1
 dd: /dev/null: Inappropriate ioctl for device

 Another question: why does it fail?

*I* would call that a 'bug'.  <wry grin>


 # dd if=/dev/ad6 of=/var/tmp/ bs=1m seek=107491647 count=1
 1+0 records in
 1+0 records out
 1048576 bytes transferred in 0.026658 secs (39334650 bytes/sec)

 So no errors. I looked at bsdlabel - it's partition f, /home. But what
 is the file name?

There's *no* easy way to find out.  You'll have to grovel through all
the filesystem metadata, and the layers of index blocks for every file
until you find the 'right' one.



Re: Is there way to get filename for specific LBA?

2011-08-31 Thread Robert Bonomi
 From owner-freebsd-questi...@freebsd.org  Wed Aug 31 13:27:07 2011
 Date: Wed, 31 Aug 2011 13:26:27 -0500 (CDT)
 From: Robert Bonomi bon...@mail.r-bonomi.com
 To: basarev...@gmail.com, freebsd-questions@freebsd.org
 Cc: 
 Subject: Re: Is there way to get filename for specific LBA?

  From owner-freebsd-questi...@freebsd.org  Tue Aug 30 22:14:58 2011
  Date: Wed, 31 Aug 2011 06:11:24 +0300
  From: Ross basarev...@gmail.com
  To: freebsd-questions@freebsd.org
  Subject: Is there way to get filename for specific LBA?
 
  Aug 31 05:13:24 da kernel: ad6: WARNING - READ_DMA UDMA ICRC error
  (retrying request) LBA=107491647
  #  dd if=/dev/ad6 of=/dev/null bs=1m seek=107491647 count=1
  dd: /dev/null: Inappropriate ioctl for device
 
  Another question: why does it fail?

 *I* would call that a 'bug'.  <wry grin>

And I would be WRONG. *sigh*

can't seek on /dev/null, 

needs to be 'skip', or 'iseek' for the 'if'.

  # dd if=/dev/ad6 of=/var/tmp/ bs=1m seek=107491647 count=1
  1+0 records in
  1+0 records out
  1048576 bytes transferred in 0.026658 secs (39334650 bytes/sec)
 
  So no errors. I looked at bsdlabel - it's partition f, /home. But what
  is the file name?

 There's *no* easy way to find out.  You'll have to grovel through all
 the filesystem metadata, and the layers of index blocks for every file
 until you find the 'right' one.



Re: Is there way to get filename for specific LBA?

2011-08-31 Thread perryh
Ross basarev...@gmail.com wrote:

  Aug 31 05:13:24 da kernel: ad6: WARNING - READ_DMA UDMA ICRC
  error (retrying request) LBA=107491647
  ...
  What does
 
    egrep 'ad[0-9]|ata' /var/run/dmesg.boot
 
  report?

 atapci0: Intel ICH7 SATA300 controller port
 0x20b8-0x20bf,0x20cc-0x20cf,0x20b0-0x20b7,0x20c8-0x20cb,0x20a0-0x20af
 mem 0xe0284000-0xe02843ff irq 19 at device 31.2 on pci0
 atapci0: [ITHREAD]
 ata2: ATA channel 0 on atapci0
 ata2: [ITHREAD]
 ata3: ATA channel 1 on atapci0
 ata3: [ITHREAD]
 ad4: 238475MB Seagate ST9250315AS 0001SDM1 at ata2-master UDMA100 SATA
 ad6: 476940MB Seagate ST9500325AS 0001SDM1 at ata3-master UDMA100 SATA
 Trying to mount root from ufs:/dev/ad6s1a

Different hardware than mine, so my workaround may not help.

If it's only happened the one time, you may want to just write it
off as a glitch.  If it happens frequently, or you start getting
unrecovered failures, you could _try_

  atacontrol mode ad6 UDMA66

Slowing down the transfer rate may make it more tolerant of
electrical noise, bad cabling, etc.  This approach worked for
me, but on a PATA (not SATA) port and using a different type
of controller (a VIA 6421).

 smartd also reports this:

 Aug 31 10:41:04 da smartd[886]: Device: /dev/ad6, Failed SMART
 usage Attribute: 184 End-to-End_Error.

 I found this explanation: http://kb.acronis.com/content/9119

 So is the disk dying? Or is it the cable? I have no physical access to
 the server at the moment.

I'll leave the SMART analysis to those who are familiar with it :)


Re: Is there way to get filename for specific LBA?

2011-08-31 Thread perryh
Robert Bonomi bon...@mail.r-bonomi.com wrote:

  Aug 31 05:13:24 da kernel: ad6: WARNING - READ_DMA UDMA ICRC
  error (retrying request) LBA=107491647
  ... I looked at bsdlabel - it's partition f, /home. But what
  is the file name?

 There's *no* easy way to find out.  You'll have to grovel through
 all the filesystem metadata, and the layers of index blocks for
 every file until you find the 'right' one.

This is what icheck -B was for, but icheck(8) no longer exists and
that particular bit of functionality does not seem to be provided in
fsck(8).

One current userland utility (other than fsck) which does know
how to grovel through the metadata and index blocks is dump(8),
but you'd have to hack on it to report which inode was using a
particular block.


Re: Is there way to get filename for specific LBA?

2011-08-31 Thread Carl Johnson
per...@pluto.rain.com writes:

 Robert Bonomi bon...@mail.r-bonomi.com wrote:

  Aug 31 05:13:24 da kernel: ad6: WARNING - READ_DMA UDMA ICRC
  error (retrying request) LBA=107491647
  ... I looked at bsdlabel - it's partition f, /home. But what
  is the file name?

 There's *no* easy way to find out.  You'll have to grovel through
 all the filesystem metadata, and the layers of index blocks for
 every file until you find the 'right' one.

 This is what icheck -B was for, but icheck(8) no longer exists and
 that particular bit of functionality does not seem to be provided in
 fsck(8).

 One current userland utility (other than fsck) which does know
 how to grovel through the metadata and index blocks is dump(8),
 but you'd have to hack on it to report which inode was using a
 particular block.

It looks like the best bet would be fsdb, assuming that it is a UFS
file system.  That does have a 'findblk' command to find a file
containing a block, but you would need to calculate the block offset in
the filesystem first.  It doesn't look like it would be easy, as was
said earlier.
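
The offset calculation mentioned above can be sketched as plain arithmetic. This is an illustration under stated assumptions, not a tested recovery procedure: it assumes 512-byte sectors, that findblk expects a sector number counted from the start of the partition (which is how fsdb(8) appears to describe it), and that you already know the partition's first absolute sector. The function name and the partition start value 41943103 are made up for the example; the real value has to come from the slice table and bsdlabel.

```shell
#!/bin/sh
# Sketch only: turn an absolute LBA into a sector offset within a partition.
# lba_to_partition_sector is a hypothetical helper name.
lba_to_partition_sector() {
    lba=$1
    part_start=$2   # first absolute sector of the partition (assumed known)
    if [ "$lba" -lt "$part_start" ]; then
        echo "LBA $lba lies before partition start $part_start" >&2
        return 1
    fi
    echo $((lba - part_start))
}

# LBA from the kernel message, with a made-up partition start for illustration.
rel=$(lba_to_partition_sector 107491647 41943103)
echo "$rel"
```

The resulting number is what one would then try with findblk inside a read-only session such as `fsdb -r /dev/ad6s1f`. If the label's offsets are relative to the slice rather than the disk, the slice's starting sector has to be added to get part_start.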

-- 
Carl Johnson    ca...@peak.org
