Re: bad sectors on disk

2016-02-16 Thread Thomas Schmitt
Hi,

Ritesh Raj Sarraf wrote:
> Thomas: In the last responses, you asked if there was any Sense key.
> This time, I do have them.
> 
> [498601.069250] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x00 
> driverbyte=0x08
> [498601.077151] sd 0:0:0:0: [sda] Sense Key : 0x3 [current] 
> [498601.082621] sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x0 
> [498601.087452] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 d8 7d 5a 8f 00 00 08 
> 00
> [498601.094711] blk_update_request: critical medium error, dev sda, sector 
> 3632093839
> ...
> [498604.440543] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 d8 7d 5a 8f 00 00 08 
> 00
> ...
> [498607.733232] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 d8 7d 5a 8f 00 00 08 
> 00

This one looks much more like a conventional bad spot.

- Unlike before, "driverbyte" is not 0 but 8.
  This means that the disk firmware delivered SCSI Sense Data
  (i.e. error codes) [1]:
"0x08 | DRIVER_SENSE   | had sense information available"

- "Sense Key : 0x3" categorizes the problem as a medium error.
  "ASC=0x11 ASCQ=0x0" means according to [2]:
"UNRECOVERED READ ERROR"

- This time, the error happens reproducibly at the same block address
  0xd87d5a8f = 3632093839.
  (The first line of your log snippet shows decimal address 3632093759.
   Is this the end of another group of failed "CDB: opcode=0x28" ?)
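
The decoding in the list above can be sketched in a few lines of Python.
This is only an illustration, assuming the usual READ(10) layout (opcode in
byte 0, big-endian block address in bytes 2-5, big-endian transfer length in
bytes 7-8). The function name decode_read10 and the three small tables are
made up for this sketch and hold only the values quoted in this message.

```python
# Decode the 10-byte CDB quoted in the kernel log (READ(10), opcode 0x28).
def decode_read10(cdb_hex: str) -> dict:
    cdb = bytes.fromhex(cdb_hex)          # spaces between hex pairs are allowed
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    return {
        "opcode": cdb[0],
        "lba": int.from_bytes(cdb[2:6], "big"),     # logical block address
        "blocks": int.from_bytes(cdb[7:9], "big"),  # transfer length in blocks
    }

# Status values quoted in this message only; these are not complete tables.
DRIVER_BYTE = {0x08: "DRIVER_SENSE"}
SENSE_KEY = {0x3: "MEDIUM ERROR"}
ASC_ASCQ = {(0x11, 0x00): "UNRECOVERED READ ERROR"}

info = decode_read10("28 00 d8 7d 5a 8f 00 00 08 00")
print(info["lba"], info["blocks"])        # 3632093839 8
print(info["lba"] - 3632093759)           # 80, i.e. ten requests of 8 blocks
print(DRIVER_BYTE[0x08], SENSE_KEY[0x3], ASC_ASCQ[(0x11, 0x00)])
```

The difference of 80 blocks to the first logged sector is a multiple of the
8-block request size, which would fit the idea of an earlier group of failed
reads, but the log snippet alone does not prove it.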


> Does it suggest bad sectors ?

I'd say that this is something different than your previous problem.
If the following command reliably reports i/o error and causes messages
in the system log, then badblock scan and treatment would be indicated:

  dd if=/dev/sda of=/dev/null bs=512 count=1 skip=3632093839

Especially since the volatile errors did not show up any more.
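
For repeated checks, the same single-sector read can be scripted. The
following is a minimal sketch, not a replacement for dd: the function name
sector_readable is invented here, it assumes 512-byte logical sectors as in
the dd command above, and like dd without iflag=direct it may be served from
the page cache if the sector was read successfully before.

```python
import os

SECTOR_SIZE = 512  # logical sector size, as in bs=512 above

def sector_readable(device: str, sector: int, count: int = 1) -> bool:
    """Return True if `count` sectors starting at `sector` can be read in full."""
    fd = os.open(device, os.O_RDONLY)
    try:
        data = os.pread(fd, count * SECTOR_SIZE, sector * SECTOR_SIZE)
        # A short read means the range lies (partly) beyond the end of the device.
        return len(data) == count * SECTOR_SIZE
    except OSError:
        # An I/O error from the kernel is what dd reports as "Input/output error".
        return False
    finally:
        os.close(fd)

# Example call (needs read permission on the device, typically root):
#   sector_readable("/dev/sda", 3632093839)
```

If this returns False several times in a row for the same sector while
neighbouring sectors return True, that would match a reproducible bad spot.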


[1] http://www.tldp.org/HOWTO/archived/SCSI-Programming-HOWTO/SCSI-Programming-HOWTO-21.html#ss21.5
[2] http://www.tldp.org/HOWTO/archived/SCSI-Programming-HOWTO/SCSI-Programming-HOWTO-22.html#ss22.1


Have a nice day :)

Thomas



Re: bad sectors on disk

2016-02-16 Thread Ritesh Raj Sarraf
On Tue, 2016-02-16 at 19:00 +0530, Ritesh Raj Sarraf wrote:
> A couple days ago, I removed the smartmontools package. I suspected
> that that may be causing trouble. For the last couple days, my klog
> was clean and I was tempted to report.

I installed smartmontools again and ran some commands manually, and
I got a concerning report. I think the report may be correct,
because the temperature where my HDD is located may be high.


-- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."

smartctl 6.4 2014-10-07 r4002 [armv7l-linux-4.1.16-v7+] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda Green (AF)
Device Model:     ST2000DL001-9VT156
Serial Number:    5YD0S8LQ
LU WWN Device Id: 5 000c50 02ec00d59
Firmware Version: CC96
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5900 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Tue Feb 16 19:12:26 2016 IST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status command failed: scsi error medium or hardware error (serious)
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  633) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off
                                        support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 348) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103b) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       552224292
  3 Spin_Up_Time            0x0003   093   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   096   096   020    Old_age   Always       -       4253
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   078   060   030    Pre-fail  Always       -       62050510
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       10634
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       767
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   096   096   000    Old_age   Always       -       4
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       25770197000
189 High_Fly_Writes         0x003a   094   094   000    Old_age   Always       -
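
A note on the large Command_Timeout raw value: SMART raw values are 48 bits
wide, and some vendors are reported to pack several 16-bit counters into
one raw value. Whether that applies to attribute 188 on this drive is an
assumption, not something the smartctl output states. The arithmetic
itself is easy to check with a small sketch (split_raw48 is a name made up
here):

```python
def split_raw48(raw: int) -> tuple:
    """Split a 48-bit SMART raw value into three 16-bit words, high to low."""
    if not 0 <= raw < 1 << 48:
        raise ValueError("raw value does not fit in 48 bits")
    return ((raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF)

# Command_Timeout raw value from the attribute list
print(split_raw48(25770197000))   # (6, 6, 8)
```

Read that way, the value would stand for three small counters rather than
for 25 billion timeouts, but the meaning of each word is vendor specific.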

Re: bad sectors on disk

2016-02-16 Thread Ritesh Raj Sarraf
On Fri, 2016-02-12 at 08:42 +0100, Thomas Schmitt wrote:
> Hi,
> 
> Ritesh Raj Sarraf wrote:
> > > > From the report, it says that there are 0 bad blocks. So is
> > > > this a bug in e2fsprogs ?
> 
> David Wright wrote:
> > Does one I/O error mean that you have a bad block necessarily?
> 
> I personally do not believe in a bad spot here, but in a bad bus.
> Throwing out blocks which fall victim to a bus glitch would then
> unnecessarily kill files and reduce disk capacity.

Taking the thread further.

A couple days ago, I removed the smartmontools package. I suspected
that that may be causing trouble. For the last couple days, my klog was
clean and I was tempted to report.

But just now I got a more detailed SCSI error. Does it
suggest bad sectors ?

Thomas: In the last responses, you asked if there was any Sense key.
This time, I do have them.

[498597.450874] blk_update_request: critical medium error, dev sda, sector 3632093759
[498601.069250] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[498601.077151] sd 0:0:0:0: [sda] Sense Key : 0x3 [current] 
[498601.082621] sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x0 
[498601.087452] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 d8 7d 5a 8f 00 00 08 00
[498601.094711] blk_update_request: critical medium error, dev sda, sector 3632093839
[498604.422352] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[498604.430263] sd 0:0:0:0: [sda] Sense Key : 0x3 [current] 
[498604.435766] sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x0 
[498604.440543] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 d8 7d 5a 8f 00 00 08 00
[498604.447824] blk_update_request: critical medium error, dev sda, sector 3632093839
[498607.715027] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[498607.722919] sd 0:0:0:0: [sda] Sense Key : 0x3 [current] 
[498607.728444] sd 0:0:0:0: [sda] ASC=0x11 ASCQ=0x0 
[498607.733232] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 d8 7d 5a 8f 00 00 08 00
[498607.740485] blk_update_request: critical medium error, dev sda, sector 3632093839


-- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."



Re: bad sectors on disk

2016-02-12 Thread Gene Heskett
On Friday 12 February 2016 02:42:19 Thomas Schmitt wrote:

> Hi,
>
> Ritesh Raj Sarraf wrote:
> > > > From the report, it says that there are 0 bad blocks. So is
> > > > this a bug in e2fsprogs ?
>
> David Wright wrote:
> > Does one I/O error mean that you have a bad block necessarily?
>
> I personally do not believe in a bad spot here, but in a bad bus.
> Throwing out blocks which fall victim to a bus glitch would then
> unnecessarily kill files and reduce disk capacity.
>
Is the OP talking about a SATA-interfaced device?  If so, is the SATA 
cable a semi-reddish color, tending toward hints of magenta?  And more 
than 1 year old? 

That particular color of SATA cable has a very poor service record at 
my location, and in my travels I tend to collect spare SATA cables that 
are NOT that "hot red" color. When a cable gets flaky, it gets replaced 
with a tan one or a black one, and that seems to be the best long-term 
fix.

As a CET*, I have a long-term relationship with cables that used that 
particular plastic dye to color one conductor of a multiconductor cable, 
such as the microphone cable of a CB radio. I first encountered it in 
the later '60s, as radios from the J.A.Pan company started flooding the 
US market: it was this "hot red" colored wire in a mic cable that always 
broke. Normally you could cut the cable back 2 cm, figuring on cutting 
off the part that had failed, and resolder the connector to fresh 
wire.  Except when I saw that color. After a year or 2, there was no 
wire remaining in that tubing, just a reddish copper dust!  Whatever 
gave it that color literally dissolved or oxidized the copper in 3 
years' time at the maximum.

SATA cables have even smaller gauge conductors, so it stands to reason 
that they would fail even faster, and that has been my experience.

A quick test: pull the side off the box, put a tail on the messages 
logfile, then take a stick about the size of a lead pencil and prod 
each such cable, moving it an inch or so.  If the log explodes with SATA 
reset messages, you have one of those time bombs.  Replace it with one 
that is not that "hot red" color.

*CET=Certified Electronics Technician.

> Have a nice day :)
>
> Thomas

Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 



Re: bad sectors on disk

2016-02-11 Thread Thomas Schmitt
Hi,

Ritesh Raj Sarraf wrote:
> > > From the report, it says that there are 0 bad blocks. So is this a
> > > bug in e2fsprogs ?

David Wright wrote:
> Does one I/O error mean that you have a bad block necessarily?

I personally do not believe in a bad spot here, but in a bad bus.
Throwing out blocks which fall victim to a bus glitch would then
unnecessarily kill files and reduce disk capacity.


Have a nice day :)

Thomas



Re: bad sectors on disk

2016-02-10 Thread Thomas Schmitt
Hi,

Ritesh Raj Sarraf wrote:
> I've seen the same message again, today. but at different locations.
> [...] there are no other message about any sense code.

One point for the theory of volatile bus problems.
The disk itself seems happy. Maybe its bus controller is not.

Did i already mention that USB 3 drop-outs are fashionable currently ?
(But too rare to be a general disease.)


> > I cannot find "-c" in man fsck of Sid. 
> Run the command `man fsck.ext4`

Aha.
"-c [...] read-only  scan  of  the device in order to find any bad blocks.
If any bad blocks are found, they are added  to  the  bad  block inode"

If the problem is not reproducible on the particular blocks, it would
be a waste of good file content if 0x2003 incidents caused blocks
to be marked bad.


Have a nice day :)

Thomas



Re: bad sectors on disk

2016-02-10 Thread Ritesh Raj Sarraf
Hello Thomas,

On Tue, 2016-02-09 at 12:53 +0100, Thomas Schmitt wrote:
> 
> > Linux pi 4.1.16-v7+ #833 SMP Wed Jan 27 14:32:22 GMT 2016 armv7l
> > GNU/Linux
> 
> The source code where i find the message text in my Sid kernel
> is not depending on the CPU architecture. So it is supposed to be
> in effect on your system.
> But i riddle why it does not convert 0x2003 to "FAILED".
> 
> 

I'm now updating my kernel to see if there are any improvements, which
I doubt because there's hardly any change in the repo.

But I've filed a bug to keep the RPi guy informed.

https://github.com/Hexxeh/rpi-firmware/issues/103


> 
> > I just hope it is not another HDD failure.
> 
> Looks like a controller and/or driver problem.
> The web echo on "UNKNOWN(0x2003)" is suspiciously unhelpful.
> 
> Lets try google
>   sd "FAILED Result" DID_OK DRIVER_OK
> Aha. There are kernels which can translate 0x2003 and the commenters
> are somewhat more qualified. But still no hands-on proposals.
> 
> 

I've seen the same message again today, but at different locations.

[62711.477903] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
[62711.485701] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 9d 40 01 47 00 00 08 00
[62711.492910] blk_update_request: I/O error, dev sda, sector 2638217543
[84370.313684] sd 0:0:0:0: [sda] UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
[84370.321532] sd 0:0:0:0: [sda] CDB: opcode=0x28 28 00 a1 40 01 47 00 00 08 00
[84370.328721] blk_update_request: I/O error, dev sda, sector 2705326407

As for your last question: no, apart from these there are no other
messages about any sense code.


> > I am hoping the fsck results are reliable. I only tried the "-c"
> > read-only option. The other was with "-cc" which would also perform a
> > read/write test.
> 
> I cannot find "-c" in man fsck of Sid. 
> 

Run the command `man fsck.ext4`

Plain `man fsck` takes you to the generic util-linux manpage, which does
not describe the ext4-specific options such as "-c".


> If it really does read the metadata and the content of data files,
> then at least your filesystem should be ok for making a backup.
> (I would not use it for heavy writing before such a backup was made.)
> 

Yes. I guess I'll do the same. The other spare one has enough space. So
I'm going to backup everything and try your verification example of dd
below.

Thanks again.


> If you want to know whether there is a reproducible bad spot, then
> try whether your disk produces any i/o errors when read flatly.
> Like
> 
>   dd if=/dev/sda of=/dev/null
>   
> If you get errors, try whether they occur again if you start reading
> a few hundred blocks before that address
> 
>   dd if=/dev/sda of=/dev/null skip=...block.number...
> 
> 
> But i do not really expect a reproducible pattern here.
-- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."



Re: bad sectors on disk

2016-02-09 Thread David Christensen

On 02/09/2016 12:57 AM, Ritesh Raj Sarraf wrote:
> On my RPi2, I saw the following reported by my kernel.
> ...
> This got me worried so I ran an fsck on my drive. Following is the report.
> ...
> SEAGATE: * FILE SYSTEM WAS MODIFIED *
> ...

I suggest that you download Seagate's Seatools for DOS (bootable ISO 
image), burn it to disc, and run it:


http://www.seagate.com/support/downloads/seatools/


For machines that don't have an optical drive, this page explains how to 
convert a bootable CD ISO image into a bootable hybrid ISO/USB image 
that can be burned to a USB flash drive and booted on machines that 
support such:


https://www.turnkeylinux.org/blog/iso2usb


As an aside, I have Seagate, Western Digital, Intel, Samsung, Maxtor, 
and possibly other brands of HDD/SSD.  While these manufacturers seem 
to provide utilities that run on Windows, Seagate is the only vendor I 
know of that still provides their tool as a stand-alone bootable ISO. 
In years past, some vendors offered bootable floppy disk images, 
bootable floppy disk creators, and/or files that could be copied to a 
bootable floppy made with Windows.  Such stand-alone utilities are 
invaluable for troubleshooting and/or repair.  Does anyone have links to 
such tools?  I'm especially interested in Western Digital DLG 5.04f. 
I'm also looking for a HOWTO for converting a bootable floppy image to a 
bootable USB flash drive image.



David



Re: bad sectors on disk

2016-02-09 Thread Thomas Schmitt
Hi,

Ritesh Raj Sarraf's kernel wrote:
> [156278.815976] sd 1:0:0:0: [sdb] UNKNOWN(0x2003) Result: hostbyte=0x00 
> driverbyte=0x00
> [156278.823864] sd 1:0:0:0: [sdb] CDB: opcode=0x28 28 00 9a 40 04 47 00 00 08 
> 00
> [156278.831152] blk_update_request: I/O error, dev sdb, sector 2587886663

This was an attempt to read data from address 0x9a400447,
which would mean your storage medium has a capacity of 1.2 TiB
(about 1.3 TB) at least.
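
A quick check of the address arithmetic, assuming 512-byte logical blocks
(the block size is an assumption at this point of the thread): the byte
offset comes out near 1.2 when counted in binary units (TiB) and near 1.3
in decimal TB.

```python
SECTOR_SIZE = 512                 # bytes per logical block (assumed)
lba = 0x9a400447                  # block address from the CDB above

offset = lba * SECTOR_SIZE        # byte offset of the failed read
print(lba)                        # 2587886663, same as "sector 2587886663"
print(round(offset / 10**12, 3))  # 1.325 (decimal TB)
print(round(offset / 2**40, 3))   # 1.205 (binary TiB)
```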

Is this log snippet preceded by lines with "Sense Key" or "Add. Sense" ?


The text "UNKNOWN(0x2003)" is quite popular on the web. But i did not find
further enlightenment yet. (The problem is near enough to my own sports
to make me curious.)

The message part seems to be quite new in the kernel:
  /usr/src/linux-4.1.6/drivers/scsi/scsi_logging.c  line 458
I cannot find it in Jessie's 3.16.
Reason for "UNKNOWN" is that scsi_mlreturn_string() in
drivers/scsi/constants.c did not find 0x2003 in scsi_mlreturn_arr.
This is riddling, because in include/scsi/scsi.h i see
  #define FAILED  0x2003
which i see mentioned in drivers/scsi/constants.c as member of
scsi_mlreturn_arr[]:
  #define scsi_mlreturn_name(result)  { result, #result }
  ...
  scsi_mlreturn_name(FAILED),

Looks like something with the macro magic of scsi_mlreturn_name()
does not work as expected.
  https://gcc.gnu.org/onlinedocs/cpp/Stringification.html
Or scsi_mlreturn_string() does not search far enough. But the
many definitions of ARRAY_SIZE in kernel headers look ok to my
userlander eyes.
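
To make the expectation concrete, here is a hypothetical Python model of the
lookup described above. The real code is C in drivers/scsi/constants.c; the
Python names, the one-entry table and the fallback formatting are made up
for this sketch, and only the value 0x2003 for FAILED is taken from the
header quoted above. It shows what the table lookup would be expected to
return, not why the kernel in question prints "UNKNOWN".

```python
FAILED = 0x2003                       # value quoted from include/scsi/scsi.h

def mlreturn_name(name: str, value: int) -> tuple:
    # Stands in for the C macro, where #result turns the identifier into a string.
    return (value, name)

# The real scsi_mlreturn_arr[] has more entries; one is enough for the sketch.
SCSI_MLRETURN_ARR = [mlreturn_name("FAILED", FAILED)]

def mlreturn_string(result: int):
    """Linear search over the table, None if the code is not listed."""
    for value, name in SCSI_MLRETURN_ARR:
        if value == result:
            return name
    return None

def describe(result: int) -> str:
    name = mlreturn_string(result)
    return name if name is not None else "UNKNOWN(0x%x)" % result

print(describe(0x2003))   # FAILED
print(describe(0x1234))   # UNKNOWN(0x1234)
```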

What kernel version exactly are you using ?

---

Whatever, "FAILED" instead of "UNKNOWN(0x2003)" will not help much
to find the reason for the problem.

If it happened with an optical drive, i'd say it is a volatile hardware
error in one of the participating bus controllers or in their cabling.
(The current main suspect for controller problems is USB 3.)


Have a nice day :)

Thomas



Re: bad sectors on disk

2016-02-09 Thread Hans
> From the report, it says that there are 0 bad blocks. So is this a bug in
> e2fsprogs ?

Please correct me if I am wrong. As far as I know, bad sectors are masked 
out by the hardware itself. If a bad sector is detected, the software tries 
to read the data and move it to a good sector, and afterwards it marks the 
bad sector as dead. So this sector does not appear again. 

Maybe this is the reason why e2fsck only sees and shows good sectors: it 
never checks sectors already marked as bad by the hardware itself.

However, I may be wrong. You should keep an eye on this. If more errors 
appear, think about replacing the drive before it is brain dead.

Best

Hans



Re: bad sectors on disk

2016-02-09 Thread Ritesh Raj Sarraf
Hello Thomas,

Thank you for your response.

On Tue, 2016-02-09 at 11:17 +0100, Thomas Schmitt wrote:
> Hi,
> 
> Ritesh Raj Sarraf's kernel wrote:
> > [156278.815976] sd 1:0:0:0: [sdb] UNKNOWN(0x2003) Result:
> > hostbyte=0x00 driverbyte=0x00
> > [156278.823864] sd 1:0:0:0: [sdb] CDB: opcode=0x28 28 00 9a 40 04
> > 47 00 00 08 00
> > [156278.831152] blk_update_request: I/O error, dev sdb, sector
> > 2587886663
> 
> This was an attempt to read data from address 0x9a400447,
> which would mean your storage medium has a capacity of 1.2 TB at
> least.

Yes. This is a 2 TB Seagate drive.

[4.247833] scsi 0:0:0:0: Direct-Access Seagate  FA GoFlex Desk   0155 PQ: 0 ANSI: 4
[4.253080] sd 0:0:0:0: [sda] 3907029167 512-byte logical blocks: (2.00 TB/1.81 TiB)
[4.253598] sd 0:0:0:0: [sda] Write Protect is off
[4.253613] sd 0:0:0:0: [sda] Mode Sense: 1c 00 00 00
[4.254114] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[4.273477] bcm2708_rng_init=b3a1c000
[4.287442]  sda: sda1
[4.298911] sd 0:0:0:0: [sda] Attached SCSI disk


> 
> Is this log snippet preceded by lines with "Sense Key" or "Add.
> Sense" ?
> 

No. Those were the only lines reported. I recently lost another drive
that was attached to the same Raspberry Pi. But that one had a FAT file
system.


> 
> The text "UNKNOWN(0x2003)" is quite popular in the web. But i did not
> find
> further enlightenment yet. (The problem is near enough to my own
> sports
> to make me curious.)
> 
> The message part seems to be quite new in the kernel:
>   /usr/src/linux-4.1.6/drivers/scsi/scsi_logging.c  line 458
> I cannot find it in Jessie's 3.16.
> Reason for "UNKNOWN" is that scsi_mlreturn_string() in
> drivers/scsi/constants.c did not find 0x2003 in scsi_mlreturn_arr.
> This is riddling, because in include/scsi/scsi.h i see
>   #define FAILED  0x2003
> which i see mentioned in drivers/scsi/constants.c as member of
> scsi_mlreturn_arr[]:
>   #define scsi_mlreturn_name(result)  { result, #result }
>   ...
> scsi_mlreturn_name(FAILED),
> 
> Looks like something with the macro magic of scsi_mlreturn_name()
> does not work as expected.
>   https://gcc.gnu.org/onlinedocs/cpp/Stringification.html
> Or scsi_mlreturn_string() does not search far enough. But the
> many definitions of ARRAY_SIZE in kernel headers look ok to my
> userlander eyes.
> 
> What kernel version exactly are you using ?
> 

I am on a fairly recent kernel.

pi@pi:~$ uname -a
Linux pi 4.1.16-v7+ #833 SMP Wed Jan 27 14:32:22 GMT 2016 armv7l
GNU/Linux


> ---
> 
> Whatever, "FAILED" instead of "UNKNOWN(0x2003)" will not help much
> to find the reason for the problem.
> 
> If it happened with an optical drive, i'd say it is a volatile
> hardware
> error in one of the participating bus controllers or in their
> cabling.
> (The current main suspect for controller problems is USB 3.)
> 
> 
> Have a nice day :)
> 
> Thomas
> 

The RPi2 is a USB 2.0 only device. But yes, I think the drive is 3.0
capable.

Bus 001 Device 004: ID 0bc2:5071 Seagate RSS LLC 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0 
  bDeviceProtocol         0 
  bMaxPacketSize0        64
  idVendor           0x0bc2 Seagate RSS LLC
  idProduct          0x5071 
  bcdDevice            1.55
  iManufacturer           1 Seagate
  iProduct                2 FA GoFlex Desk
  iSerial                 3 NA0K1JZA
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xc0
      Self Powered
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk-Only
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes

Re: bad sectors on disk

2016-02-09 Thread Jonathan Dowland
It would be interesting to check your drive's SMART data to see if that has
reported any errors. You might be able to do this with smartctl from the
smartmontools package, depending on which USB bridge you have. I recently
uploaded a new version of the package to unstable, but I do not know if/when
that might appear in Raspbian (which I presume you are using)

-- 
Jonathan Dowland
Please do not CC me, I am subscribed to the list.



Re: bad sectors on disk

2016-02-09 Thread Thomas Schmitt
Hi,

> Thank you for your response.

Purely selfish. :)
I want to know about cabling problems.


> Linux pi 4.1.16-v7+ #833 SMP Wed Jan 27 14:32:22 GMT 2016 armv7l GNU/Linux

The source code where i find the message text in my Sid kernel
is not depending on the CPU architecture. So it is supposed to be
in effect on your system.
But i riddle why it does not convert 0x2003 to "FAILED".


> The RPi2 is a USB 2.0 only device. But yes, I think the drive is 3.0
> capable.

The reports about optical drive problems which i have seen during
the last year were about USB 2 boxes plugged into USB 3 computer
sockets. They are far too few to indicate a general problem.
I suspect it is about certain kernels and certain pairings of the
two participating USB controllers, maybe even the cables.

On the other hand, there are criminal USB power supply contraptions
around which in most cases even seem to work. See
  https://media-cdn.ubuntu-de.org/forum/attachments/00/03/8036213-IMG_0222.JPG
  https://media-cdn.ubuntu-de.org/forum/attachments/41/03/8037143-51zTbNM27vL._SL1000_.jpg
from a German discussion about spurious "host_status 7" errors.
I meanwhile suspect something like a dead fruit fly was sticking
to the plug. A modern version of the 1947 classic
  https://en.wikipedia.org/wiki/File:H96566k.jpg


> I just hope it is not another HDD failure.

Looks like a controller and/or driver problem.
The web echo on "UNKNOWN(0x2003)" is suspiciously unhelpful.

Let's try Google
  sd "FAILED Result" DID_OK DRIVER_OK
Aha. There are kernels which can translate 0x2003 and the commenters
are somewhat more qualified. But still no hands-on proposals.


> I am hoping the fsck results are reliable. I only tried the "-c" read-
> only option. The other was with "-cc" which would also perform a
> read/write test.

I cannot find "-c" in man fsck of Sid. 

If it really does read the metadata and the content of data files,
then at least your filesystem should be ok for making a backup.
(I would not use it for heavy writing before such a backup was made.)

If you want to know whether there is a reproducible bad spot, then
try whether your disk produces any i/o errors when read flatly.
Like

  dd if=/dev/sda of=/dev/null
  
If you get errors, try whether they occur again if you start reading
a few hundred blocks before that address

  dd if=/dev/sda of=/dev/null skip=...block.number...


But i do not really expect a reproducible pattern here.
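
The "few hundred blocks before" step can be written down as a small helper
that only builds the dd command line and does not run it. This is a sketch
under assumptions: the function name, the margin of 300 blocks and the
count= limit are illustrative choices, and bs=512 is spelled out although it
is dd's default. The sector number is the one from the first kernel report.

```python
import shlex

def dd_reread_command(device: str, failed_sector: int,
                      margin: int = 300, count: int = 1000) -> str:
    """Build a dd command that starts `margin` blocks before `failed_sector`."""
    skip = max(failed_sector - margin, 0)      # never skip to a negative block
    args = ["dd", f"if={device}", "of=/dev/null",
            "bs=512", f"skip={skip}", f"count={count}"]
    return " ".join(shlex.quote(a) for a in args)

print(dd_reread_command("/dev/sda", 2587886663))
# dd if=/dev/sda of=/dev/null bs=512 skip=2587886363 count=1000
```

Limiting the run with count= keeps the re-read short, so it can be repeated
a few times to see whether the error shows up at the same address again.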


Have a nice day :)

Thomas



Re: bad sectors on disk

2016-02-09 Thread Eike Lantzsch
On Tuesday 09 February 2016 09:25:40 Eike Lantzsch wrote:
> Hi Ritesh Raj Sarraf
> 
> Just a thought - my two cents:
> 
> On Tuesday 09 February 2016 12:53:04 Thomas Schmitt wrote:
> > Hi,
> > 
> > > Thank you for your response.
> > 
> > Purely selfish. :)
> > I want to know about cabling problems.
> > 
> > > Linux pi 4.1.16-v7+ #833 SMP Wed Jan 27 14:32:22 GMT 2016 armv7l
> > > GNU/Linux
> > 
> > The source code where i find the message text in my Sid kernel
> > is not depending on the CPU architecture. So it is supposed to be
> > in effect on your system.
> > But i riddle why it does not convert 0x2003 to "FAILED".
> > 
> > > The RPi2 is a USB 2.0 only device. But yes, I think the drive is 3.0
> > > capable.
> 
> Are you copying or moving files from or to your USB-HDD via the network?
> RPis share the same USB controller for the network chip and for the
> USB-ports. My experience with RPis is that heavy network traffic makes the
> I/O over the USB-ports to and from HDDs very shaky. I even lost a HDD that
> way. If you want some sort of a NAS then ditch the RPi. You will be better
> off with a Cubietruck if you want to stick to ARM-architecture or a
> PC-Engines ALIX (i386) or a PC-Engines APU if you prefer amd64 and want
> more memory.

I should have mentioned:
Cubietruck and APU: connect HDD w/ SATA
ALIX: only parallel ATA (IDE) port for HDD
> 
> All the best
> Eike
[snip]



Re: bad sectors on disk

2016-02-09 Thread Eike Lantzsch
Hi Ritesh Raj Sarraf

Just a thought - my two cents:

On Tuesday 09 February 2016 12:53:04 Thomas Schmitt wrote:
> Hi,
> 
> > Thank you for your response.
> 
> Purely selfish. :)
> I want to know about cabling problems.
> 
> > Linux pi 4.1.16-v7+ #833 SMP Wed Jan 27 14:32:22 GMT 2016 armv7l GNU/Linux
> 
> The source code where i find the message text in my Sid kernel
> is not depending on the CPU architecture. So it is supposed to be
> in effect on your system.
> But i riddle why it does not convert 0x2003 to "FAILED".
> 
> > The RPi2 is a USB 2.0 only device. But yes, I think the drive is 3.0
> > capable.

Are you copying or moving files from or to your USB-HDD via the network?
RPis share the same USB controller for the network chip and for the USB-ports.
My experience with RPis is that heavy network traffic makes the I/O over the 
USB-ports to and from HDDs very shaky. I even lost a HDD that way.
If you want some sort of a NAS then ditch the RPi. You will be better off with 
a Cubietruck if you want to stick to ARM-architecture or a PC-Engines ALIX 
(i386) or a PC-Engines APU if you prefer amd64 and want more memory.

All the best
Eike

> 
> The reports about optical drive problems which i have seen during
> the last year were about USB 2 boxes plugged into USB 3 computer
> sockets. They are far too few to indicate a general problem.
> I suspect it is about certain kernels and certain pairings of the
> two participating USB controllers, maybe even the cables.
> 
> On the other hand, there are criminal USB power supply contraptions
> around which in most cases even seem to work. See
>   https://media-cdn.ubuntu-de.org/forum/attachments/00/03/8036213-IMG_0222.JPG
>   https://media-cdn.ubuntu-de.org/forum/attachments/41/03/8037143-51zTbNM27vL._SL1000_.jpg
> from a german discussion about spurious "host_status 7" errors.
> I meanwhile suspect something like a dead fruit fly was sticking
> to the plug. A modern version of the 1947 classic
>   https://en.wikipedia.org/wiki/File:H96566k.jpg
> 
> > I just hope it is not another HDD failure.
> 
> Looks like a controller and/or driver problem.
> The web echo on "UNKNOWN(0x2003)" is suspiciously unhelpful.
I second that - see above.
> 
> Lets try google
>   sd "FAILED Result" DID_OK DRIVER_OK
> Aha. There are kernels which can translate 0x2003 and the commenters
> are somewhat more qualified. But still no hands-on proposals.
> 
> > I am hoping the fsck results are reliable. I only tried the "-c" read-
> > only option. The other was with "-cc" which would also perform a
> > read/write test.
> 
> I cannot find "-c" in man fsck of Sid.
> 
> If it really does read the metadata and the content of data files,
> then at least your filesystem should be ok for making a backup.
> (I would not use it for heavy writing before such a backup was made.)
> 
> If you want to know whether there is a reproducible bad spot, then
> try whether your disk produces any i/o errors when read flatly.
> Like
> 
>   dd if=/dev/sda of=/dev/null
> 
> If you get errors, try whether they occur again if you start reading
> a few hundred blocks before that address
> 
>   dd if=/dev/sda of=/dev/null skip=...block.number...
> 
> 
> But i do not really expect a reproducible pattern here.
> 
> 
> Have a nice day :)
> 
> Thomas

-- 
Eike Lantzsch ZP6CGE
Agencia Shopping del Sol
Casilla de Correo 13005
1749 Asuncion / Paraguay
Land-line: +595-21-553984
Cell-phone: +595-971-696909
Skype: eikelan



Re: bad sectors on disk

2016-02-09 Thread jdd

On 09/02/2016 11:54, Ritesh Raj Sarraf wrote:
> I am hoping the fsck results are reliable. I only tried the "-c"
> read-only option. The other was with "-cc" which would also perform a
> read/write test.

Try

http://dodin.info/wiki/pmwiki.php?n=Doc.TesterUnDisqueDur

It's in French, but the command lines are pretty clear.

jdd



Re: bad sectors on disk

2016-02-09 Thread Thomas Schmitt
Hi,

David Christensen wrote:
> http://www.seagate.com/support/downloads/seatools/
> convert a bootable CD ISO image into a bootable hybrid ISO/USB image
> https://www.turnkeylinux.org/blog/iso2usb

This will not work, i fear, because the ISO is not bootable via an
isohybrid capable ISOLINUX boot image.

From the start of the El Torito boot image in SeaToolsDOS223ALL.ISO:
  Bootable CD Wizard v1.50Z
  Copyright (c)2004 by reanimatolog. http://bootcd.narod.ru


(This statement from blog/iso2usb is plain wrong:
 "The isohybrid tool in the syslinux package will convert ISO images
  into a USB flash drive compatible format"
isohybrid writes an MBR and partition tables into the System Area
of the ISO, so that boot firmware can find the boot loader if the ISO
is presented on HDD. The ISO 9660 filesystem does not get converted.)
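
The layout described in the parenthesis can be inspected with a short
sketch. It assumes the standard ISO 9660 layout (a System Area of 16
sectors of 2048 bytes, followed by a volume descriptor that carries the
identifier "CD001") and the usual MBR signature bytes 0x55 0xAA at offset
510. The function name is made up, and an MBR signature alone does not
prove that the image was prepared by isohybrid.

```python
def iso_boot_markers(path: str) -> dict:
    """Report whether an image has an MBR signature and an ISO 9660 descriptor."""
    with open(path, "rb") as f:
        system_area = f.read(32768)   # first 16 sectors of 2048 bytes each
        descriptor = f.read(6)        # descriptor type byte followed by "CD001"
    return {
        "mbr_signature": system_area[510:512] == b"\x55\xaa",
        "iso9660": descriptor[1:6] == b"CD001",
    }

# Example call on a downloaded image (file name taken from this message):
#   iso_boot_markers("SeaToolsDOS223ALL.ISO")
```

An image can report both markers at once, since the MBR sits in the System
Area while the filesystem descriptors start behind it.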


Have a nice day :)

Thomas