Re: Shutting down my PC

2021-03-20 Thread Georges


On Fri, 19 Mar 2021 11:24:33 +0100,

F wrote:

> Hi,
> 
> 
> >   On the same PC, there is one partition with Buster where the script
> > works and another partition with Buster where it does not.

 Thanks anyway for trying. ;-)

> oops, forget my BIOS nonsense then ;) kernel?
> 
> f.
> 



Re: Copying stuck

2021-03-20 Thread Nicholas Geovanis
On Sat, Mar 20, 2021, 8:48 PM komodo wrote:

> Hi,
>
> For the last two months, when I copy multiple files from USB or SD to
> HDD, copying gets stuck after the first file.
>
> I don't know if this is a known bug, but it's really annoying.
>

Just a thought. I've had reliability problems with USB keys in the past.
But it sounds more like you could be hitting a bad area on the HDD or the
filesystem. You might want to check for hardware or filesystem errors in
the logs.
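
One way to act on that suggestion is to filter kernel log text for lines that usually point at disk or filesystem trouble. The sketch below is only an illustration: the function name `find_storage_errors` and the pattern list are assumptions made for this example, the list is far from exhaustive, and the sample lines are invented to show the filter at work (they are not from komodo's machine).

```python
import re

# Substrings that commonly show up in Linux kernel messages about failing
# reads/writes or a damaged filesystem. The selection is an assumption for
# this example and is not exhaustive.
SUSPICIOUS = re.compile(r"I/O error|EXT4-fs error|medium error", re.IGNORECASE)

def find_storage_errors(log_text):
    """Return the log lines that match any suspicious pattern."""
    return [line for line in log_text.splitlines() if SUSPICIOUS.search(line)]

# Invented sample lines, only to show how the filter behaves.
sample = """\
Mar 20 20:40:01 host kernel: usb 2-1: new high-speed USB device number 5
Mar 20 20:41:13 host kernel: blk_update_request: I/O error, dev sdb, sector 2048
Mar 20 20:41:13 host kernel: EXT4-fs error (device sda2): example message
Mar 20 20:42:00 host systemd[1]: Started Session 3 of user example.
"""

hits = find_storage_errors(sample)
for line in hits:
    print(line)
```

On a real system the text could come from `dmesg` or `journalctl -k`; an empty result does not prove the hardware is healthy, it only means none of these particular patterns appeared.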

> I have already reported this problem upstream, but got no response.
>
> https://bugs.kde.org/show_bug.cgi?id=434145
>
> So please, if someone has some info, let me know.
>
> This is really annoying.
>
> Thanks
>
> Martin
>
>
>
>


Re: [OT] Re: Social-media antipathy (was Re: How i can optimize my operating system?)

2021-03-20 Thread deloptes
Stefan Monnier wrote:

> Reminds me of the saying that the difference between USA and USSR was
> that in USSR the population knew that it was propaganda.

Exactly - this part is absolutely the same.
The difference is in the methods. There it was well known; here it was not,
until the internet came along ... or are Assange and Snowden free people?

This BS is a joke - I would never have imagined I would watch RT to get a
different point of view :/ or that I would stop reading the magazines I
grew up with (it became unbearable around 2007-2010). But maybe it is part
of the transformation we are experiencing with these new technologies, incl.
the social media we are discussing.




Re: Copying stuck

2021-03-20 Thread Weaver
On 21-03-2021 11:32, komodo wrote:
> Hi,
> 
> For the last two months, when I copy multiple files from USB or SD to
> HDD, copying gets stuck after the first file.
> 
> I don't know if this is a known bug, but it's really annoying.
> 
> I have already reported this problem upstream, but got no response.
> 
> https://bugs.kde.org/show_bug.cgi?id=434145
> 
> So please, if someone has some info, let me know.
> 
> This is really annoying.

You don't mention what version you are running, but I've just finished
transferring multiple GBs of movies from one machine to another, via
sneaker net (flash drive) without incident.
I'm running SID on an old Dell Optiplex 980.
Cheers!

Harry.
-- 
`The World is not dangerous because of those who do harm but
 because of those who look on without doing anything'.
 -- Albert Einstein



Copying stuck

2021-03-20 Thread komodo
Hi,

For the last two months, when I copy multiple files from USB or SD to HDD,
copying gets stuck after the first file.

I don't know if this is a known bug, but it's really annoying.

I have already reported this problem upstream, but got no response.

https://bugs.kde.org/show_bug.cgi?id=434145

So please, if someone has some info, let me know.

This is really annoying.

Thanks

Martin





Re: Hardware failure?: Now what?

2021-03-20 Thread Dan Ritter
Charles Curley wrote: 
> 
> The board is an ASUS H97M-E, bios date 05/15/2015. Processor is
> Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz, with eight processors.
> 
> Now what?

4 cores, 8 threads. 

As others are pointing out, this could be thermal. Clean the
fan, consider replacing the power supply, consider removing the
heatsink and cleaning it then re-applying thermal paste.

If the problem recurs and it isn't thermal, you can replace the
CPU.

i7-5775C or another i7-4790S will go for about $200; a used
i7-4770K will be nearly unnoticeably faster for about $180.


-dsr-



Re: Hardware failure?: Now what?

2021-03-20 Thread Andy Smith
Hi,

On Sat, Mar 20, 2021 at 02:29:25PM -0600, Charles Curley wrote:
> MCE events:
> 1 2021-03-20 13:58:30 -0600 error: Internal parity error, mcg mcgstatus=0, 
> mci Corrected_error Error_enabled, mcgcap=0x0c09, 
> status=0x904f0005, tsc=0xf442c87fda, walltime=0x605653e5, 
> cpu=0x0003, cpuid=0x000306c3, apicid=0x0006

This could be a RAM error, but it could also be a memory error for
the cache inside the CPU, so a CPU error. But it could also be a
spurious CPU bug:

https://trick77.com/qemu-on-haswell-causes-spurious-mce-events/

Are you running qemu or KVM or some other kind of virtualisation? If
yes and if there doesn't appear to be any actual instability then it
may be spurious.

Cheers,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting



Re: Hardware failure?: Now what?

2021-03-20 Thread Sven Hartge
Charles Curley  wrote:

> Mar 20 13:58:29 hawk rasdaemon[892]: Calling ras_mc_event_opendb()
> Mar 20 13:58:29 hawk rasdaemon[892]: cpu 03:rasdaemon: mce_record store: 
> 0x55c124c9b148
> Mar 20 13:58:29 hawk kernel: [  300.407406] mce: [Hardware Error]: Machine 
> check events logged
> Mar 20 13:58:29 hawk kernel: [  300.407410] mce: [Hardware Error]: CPU 3: 
> Machine Check: 0 Bank 0: 904f0005
> Mar 20 13:58:29 hawk kernel: [  300.407411] mce: [Hardware Error]: TSC 
> f442c87fda 
> Mar 20 13:58:29 hawk kernel: [  300.407413] mce: [Hardware Error]: PROCESSOR 
> 0:306c3 TIME 1616270309 SOCKET 0 APIC 6 microcode 19
> Mar 20 13:58:29 hawk rasdaemon[892]: rasdaemon: register inserted at db

> 1 2021-03-20 13:58:30 -0600 error: Internal parity error, mcg mcgstatus=0, 
> mci Corrected_error Error_enabled, mcgcap=0x0c09, 
> status=0x904f0005, tsc=0xf442c87fda, walltime=0x605653e5, 
> cpu=0x0003, cpuid=0x000306c3, apicid=0x0006
> 2 2021-03-20 14:07:07 -0600 error: Internal parity error, mcg mcgstatus=0, 
> mci Corrected_error Error_enabled, mcgcap=0x0c09, 
> status=0x904f0005, tsc=0x274d9e61020, walltime=0x605655ea, 
> cpu=0x0003, cpuid=0x000306c3, apicid=0x0006
> 3 2021-03-20 14:07:07 -0600 error: Internal parity error, mcg mcgstatus=0, 
> mci Corrected_error Error_enabled, mcgcap=0x0c09, 
> status=0x904f0005, tsc=0x27517a5dacb, walltime=0x605655eb, 
> cpu=0x0003, cpuid=0x000306c3, apicid=0x0006
> 4 2021-03-20 14:10:34 -0600 error: Internal parity error, mcg mcgstatus=0, 
> mci Corrected_error Error_enabled, mcgcap=0x0c09, 
> status=0x904f0005, tsc=0x30ea8517bee, walltime=0x605656b9, 
> cpuid=0x000306c3

> If I read that correctly, CPU 3 is seeing and correcting internal parity
> errors.

Correct.
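
The other fields in those records can be cross-checked by hand. The sketch below assumes that `walltime` is seconds since the Unix epoch (which agrees with the kernel's `TIME 1616270309` line) and that `cpuid` uses the usual x86 CPUID signature layout (stepping in bits 0-3, model in bits 4-7, family in bits 8-11, extended model in bits 16-19). The function names are made up for this example.

```python
from datetime import datetime, timezone

def decode_walltime(hex_value):
    """Interpret an MCE 'walltime' field as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(int(hex_value, 16), tz=timezone.utc)

def decode_cpuid(hex_value):
    """Split an x86 CPUID signature into (family, model, stepping).

    Assumes the usual Intel layout. The extended model bits are folded in
    only for family 6 and 15; the extended family field is ignored here.
    """
    sig = int(hex_value, 16)
    stepping = sig & 0xF
    model = (sig >> 4) & 0xF
    family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    if family in (6, 15):
        model |= ext_model << 4
    return family, model, stepping

when = decode_walltime("0x605653e5")
print(when.isoformat())            # 2021-03-20T19:58:29+00:00
print(decode_cpuid("0x000306c3"))  # (6, 60, 3)
```

19:58:29 UTC is 13:58:29 at -0600, matching the syslog timestamp, and family 6 model 60 is consistent with a Haswell-generation part such as the i7-4790S.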

> The board is an ASUS H97M-E, bios date 05/15/2015. Processor is
> Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz, with eight processors.

> Now what?

Nothing really.

Check if there is a BIOS/Firmware update available.

Check if the voltages are set correctly in the BIOS/Firmware. (Usually
by loading the defaults and setting everything to "auto".)

Check temperature of the CPU.

Check if the latest intel-microcode package from Debian is installed
(3.20201118.1~deb10u1 at the moment) or grab the newest one from testing
(3.20210216.1).

Try running mprime (the Linux build of Prime95) in its torture-test mode
for some time to see if it complains and if errors occur more often under
load.
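
To judge whether the errors really become more frequent under load, it helps to pull the timestamps out of the `ras-mc-ctl --errors` report and compare runs. A minimal sketch, assuming each event record begins with an index, a date, a time and a UTC offset followed by `error:` as in the output quoted in this thread, and that the text is fed in unquoted (no leading `> `). The name `mce_events` is hypothetical.

```python
import re
from datetime import datetime

# An event record starts with "<index> <date> <time> <utc-offset> error:",
# matching the ras-mc-ctl output quoted in this thread.
EVENT = re.compile(
    r"^\d+ (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) error: ([^,]+),",
    re.MULTILINE,
)

def mce_events(text):
    """Return a list of (timestamp, description) for each MCE event found."""
    return [(datetime.strptime(ts, "%Y-%m-%d %H:%M:%S %z"), what)
            for ts, what in EVENT.findall(text)]

# First lines of the four events reported in the thread.
report = """\
1 2021-03-20 13:58:30 -0600 error: Internal parity error, mcg mcgstatus=0,
2 2021-03-20 14:07:07 -0600 error: Internal parity error, mcg mcgstatus=0,
3 2021-03-20 14:07:07 -0600 error: Internal parity error, mcg mcgstatus=0,
4 2021-03-20 14:10:34 -0600 error: Internal parity error, mcg mcgstatus=0,
"""

events = mce_events(report)
span = events[-1][0] - events[0][0]
print(len(events), "events in", span)  # 4 events in 0:12:04
```

Running the same tally over an idle period and over a stress-test period gives two numbers to compare, which is more reliable than eyeballing the log.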

Also run memtest86+ for some time to verify the correctness of your RAM.

In the end, if the error is something in one of the caches inside the
CPU, there is nothing really you can do.

Grüße,
Sven.

-- 
Sigmentation fault. Core dumped.



Hardware failure?: Now what?

2021-03-20 Thread Charles Curley
My syslog is reporting things like:

Mar 20 13:58:29 hawk rasdaemon[892]: Calling ras_mc_event_opendb()
Mar 20 13:58:29 hawk rasdaemon[892]: cpu 03:rasdaemon: mce_record store: 
0x55c124c9b148
Mar 20 13:58:29 hawk kernel: [  300.407406] mce: [Hardware Error]: Machine 
check events logged
Mar 20 13:58:29 hawk kernel: [  300.407410] mce: [Hardware Error]: CPU 3: 
Machine Check: 0 Bank 0: 904f0005
Mar 20 13:58:29 hawk kernel: [  300.407411] mce: [Hardware Error]: TSC 
f442c87fda 
Mar 20 13:58:29 hawk kernel: [  300.407413] mce: [Hardware Error]: PROCESSOR 
0:306c3 TIME 1616270309 SOCKET 0 APIC 6 microcode 19
Mar 20 13:58:29 hawk rasdaemon[892]: rasdaemon: register inserted at db

root@hawk:/crc/back# ras-mc-ctl --errors
No Memory errors.

No PCIe AER errors.

No Extlog errors.

MCE events:
1 2021-03-20 13:58:30 -0600 error: Internal parity error, mcg mcgstatus=0, mci 
Corrected_error Error_enabled, mcgcap=0x0c09, status=0x904f0005, 
tsc=0xf442c87fda, walltime=0x605653e5, cpu=0x0003, cpuid=0x000306c3, 
apicid=0x0006
2 2021-03-20 14:07:07 -0600 error: Internal parity error, mcg mcgstatus=0, mci 
Corrected_error Error_enabled, mcgcap=0x0c09, status=0x904f0005, 
tsc=0x274d9e61020, walltime=0x605655ea, cpu=0x0003, cpuid=0x000306c3, 
apicid=0x0006
3 2021-03-20 14:07:07 -0600 error: Internal parity error, mcg mcgstatus=0, mci 
Corrected_error Error_enabled, mcgcap=0x0c09, status=0x904f0005, 
tsc=0x27517a5dacb, walltime=0x605655eb, cpu=0x0003, cpuid=0x000306c3, 
apicid=0x0006
4 2021-03-20 14:10:34 -0600 error: Internal parity error, mcg mcgstatus=0, mci 
Corrected_error Error_enabled, mcgcap=0x0c09, status=0x904f0005, 
tsc=0x30ea8517bee, walltime=0x605656b9, cpuid=0x000306c3

root@hawk:/crc/back# 


If I read that correctly, CPU 3 is seeing and correcting internal parity
errors.

The board is an ASUS H97M-E, bios date 05/15/2015. Processor is
Intel(R) Core(TM) i7-4790S CPU @ 3.20GHz, with eight processors.

Now what?


-- 
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/



Re: su - kees and DISPLAY for kees

2021-03-20 Thread Geert Stappers
On Sat, Mar 20, 2021 at 09:42:28PM +0100, Geert Stappers wrote:
> On Sat, Mar 20, 2021 at 09:22:24PM +0100, Geert Stappers wrote:
> > Hi,
> > 
> > On a computer running Xfce I can do `su - kees`.
> > As user kees I am not allowed to start graphical applications
> > because DISPLAY is not set correctly.
> > 
> > How do I set it correctly?
> > 
> 
> Before the "switch user"
> 
>   xhost +si:localuser:kees

To grant permission to put something on the graphical display.


>   echo $DISPLAY

To look up the value; it was  :0.0
 
> Switch user
> 
>   su - kees
 
You are asked for the password of kees.

 
> And as `kees`
> 
>   export DISPLAY=:0.0

Supply the value looked up earlier.


>   xterm

Start a graphical application,
which was the whole point.



Regards
Geert Stappers

P.S.

Tips on other ways to do this are welcome.
-- 
Silence is hard to parse



Re: su - kees and DISPLAY for kees

2021-03-20 Thread Geert Stappers
On Sat, Mar 20, 2021 at 09:22:24PM +0100, Geert Stappers wrote:
> Hi,
> 
> On a computer running Xfce I can do `su - kees`.
> As user kees I am not allowed to start graphical applications
> because DISPLAY is not set correctly.
> 
> How do I set it correctly?
> 

Before the "switch user"

  xhost +si:localuser:kees
  echo $DISPLAY

Switch user

  su - kees


And as `kees`

  export DISPLAY=:0.0
  xterm



Regards
Geert Stappers
-- 
Silence is hard to parse



su - kees and DISPLAY for kees

2021-03-20 Thread Geert Stappers
Hi,

On a computer running Xfce I can do `su - kees`.
As user kees I am not allowed to start graphical applications
because DISPLAY is not set correctly.

How do I set it correctly?



Regards
Geert Stappers
-- 
Silence is hard to parse



Re: [OT] Re: Social-media antipathy (was Re: How i can optimize my operating system?)

2021-03-20 Thread Stefan Monnier
> In my (not so humble) opinion, this level of security could make sense 
> for a dissident in a totalitarian state, less so for regular users in 
> a democratic country.

Reminds me of the saying that the difference between USA and USSR was
that in USSR the population knew that it was propaganda.


Stefan



Re: fsck error on boot: /dev/sda1: UNEXPECTED INCONSISTENCY and Partition 1 does not start on physical sector boundary

2021-03-20 Thread Alexander V. Makartsev

On 20.03.2021 22:05, Andy Smith wrote:

> Anyway in OP's position, they have lost data which they need to
> restore and while they could wait and see if the errors are
> increasing in number they probably just want to get it replaced
> ASAP.

Thanks for your input, Andy.
Now OP should be able to make the right decision: keep the drive and 
take the risk, or replace it.
Eh, they could even put this drive into a cheap portable USB3-to-SATA 
HDD enclosure and use it to carry around non-critical data or additional 
backup copies.


I just want to add my rationale behind the recommendation to test the drive.
When I looked at the SMART information OP provided, I noticed high values 
of attributes #191 (G-Sense_Error_Rate) and #254 (Free_Fall_Sensor).
This suggests OP is using a laptop and often moves it around, bumps it, etc., 
which obviously can't be good for the HDD inside.
And since attribute #5 (Reallocated_Sector_Ct) is still at 0, I think those 
pending sectors might just be sectors with a bad CRC caused by bumps and 
vibrations.
So, if OP reports back with clean test results, without any bad blocks 
being found, I'd consider this HDD healthy.
However, if bad blocks are found, then I'd recommend replacing it, or at 
least warn about the consequences of keeping it.
There are still many things to consider, even if this drive's tests pass 
without errors, because there are many reasons why bad sectors could 
appear.


Now that I think about it, I hope it didn't look like I was suggesting 
that OP keep the drive no matter what, like "pshh, few bad sectors, big 
deal", because of my lackluster ability to express myself in a foreign 
language.

--
With kindest regards, Alexander.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄



Re: smartctl on SSD question

2021-03-20 Thread David Christensen

On 3/20/21 3:50 AM, Gene Heskett wrote:

> Greetings all;
>
> I have been using amanda on this machine to back up the rest of my
> machines for over 2 decades. So I'm not a new user.
>
> But I installed a 240 gig SSD about 6 months back, because its use as a
> holding disk tripled amanda's speed at backing up this 6 machine system.
>
> But amanda has been increasingly plagued with bad crc checksums of the
> data in this holding disk, which usually leaves the files on that disk
> as it redoes that disklist entry bypassing the holding disk, which
> obviously takes longer as it is then doing the backup directly to the
> archive medium.
>
> I just initiated a "-t long" test on it.  So since it's an SSD, what is
> the best command to extract the most complete test report possible when
> the test has finished? It's now done, and the -a option gives it a clean
> report, along with a list of this and that not supported. Cheap ADATA
> drive.



I typically use the smartctl(8) '-x' option to get the most information:

# smartctl -x /dev/sda


I pipe the output to a file, and put the file into a version control 
system.  This allows me to look for trends.
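
Looking for trends in saved reports can also be automated. The sketch below compares the raw attribute values of two saved reports; it assumes the classic ten-column ATA attribute table that `smartctl` prints (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw value). The function names and the two snapshots are hypothetical, made up for this example.

```python
def parse_attributes(smartctl_text):
    """Map attribute ID -> (name, raw value) from smartctl attribute rows.

    Assumes rows that start with a numeric ID, carry a hex flag in the third
    column and the raw value in the tenth; all other lines are skipped.
    """
    attrs = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            attrs[int(fields[0])] = (fields[1], fields[9])
    return attrs

def changed(old_text, new_text):
    """Return {id: (name, old_raw, new_raw)} for attributes whose raw value changed."""
    old, new = parse_attributes(old_text), parse_attributes(new_text)
    return {i: (new[i][0], old[i][1], new[i][1])
            for i in new if i in old and old[i][1] != new[i][1]}

# Two hypothetical snapshots of the same drive, taken some time apart.
before = """\
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       2588
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       11
"""
after = """\
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       3
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       2612
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       11
"""

diff = changed(before, after)
for attr_id, (name, old_raw, new_raw) in sorted(diff.items()):
    print(attr_id, name, old_raw, "->", new_raw)
```

Power_On_Hours always grows, so the interesting entries are the ones that should stay at zero, such as Reallocated_Sector_Ct.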



It is good that Amanda does checksums on the backup repository, but bad 
that Amanda is detecting errors.  Get your backups onto reliable storage 
immediately.  I do ZFS and mirrored HDDs with an SSD cache.  The SSD 
cache can improve replication performance by an order of magnitude.  If 
you're not using ZFS, there are SSD cache solutions for traditional 
filesystems and for LVM (I won't comment because I don't use them).



David



Re: smartctl on SSD question

2021-03-20 Thread Gene Heskett
On Saturday 20 March 2021 08:29:04 Alexander V. Makartsev wrote:

> On 20.03.2021 15:50, Gene Heskett wrote:
> > Greetings all;
> >
> > I have been using amanda on this machine to back up the rest of my
> > machines for over 2 decades. So I'm not a new user.
> >
> > But I installed a 240 gig SSD about 6 months back, because its use
> > as a holding disk tripled amanda's speed at backing up this 6
> > machine system.
> >
> > But amanda has been increasingly plagued with bad crc checksums of
> > the data in this holding disk, which usually leaves the files on
> > that disk as it redoes that disklist entry bypassing the holding
> > disk, which obviously takes longer as it is then doing the backup
> > directly to the archive medium.
> >
> > I just initiated a "-t long" test on it.  So since it's an SSD, what
> > is the best command to extract the most complete test report
> > possible when the test has finished? It's now done, and the -a option
> > gives it a clean report, along with a list of this and that not
> > supported. Cheap ADATA drive.
> >
> > Thank you.
> >
> > Cheers, Gene Heskett
>
> You can see SMART test results with "--log=selftest" parameter.
>      # smartctl --log=selftest /dev/sdZ

root@coyote:/$ smartctl --log=selftest /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.19.0-0.bpo.9-rt-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      2580         -

> Can you post back the output from "smartctl" with "--all" parameter
> for this device?
root@coyote:/$ smartctl --all /dev/sdb
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.19.0-0.bpo.9-rt-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     ADATA SU650
Serial Number:    AA010685
Firmware Version: R0831B0
User Capacity:    240,057,409,536 bytes [240 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Mar 20 14:09:15 2021 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  10) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age   Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       2588
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       11
160 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       0
161 Unknown_Attribute   0x0033   100   

Re: [OT] Re: Social-media antipathy (was Re: How i can optimize my operating system?)

2021-03-20 Thread deloptes
Andrei POPESCU wrote:

> In my (not so humble) opinion, this level of security could make sense
> for a dissident in a totalitarian state, less so for regular users in
> a democratic country.
> 

And you disappoint me here too - you believe in the illusion of democracy,
which is not as obvious as, e.g., in China, but I found out it is even much
worse.
No need to argue and explain - I will not - Assange, Snowden are enough!
This is like in The Matrix. 
Enjoy the feeling of democracy as long as you can, but keep in mind the
reality is not so far from a "totalitarian state" and don't take the
colorful glasses off - it may lead to depression.




Re: [OT] Re: Social-media antipathy (was Re: How i can optimize my operating system?)

2021-03-20 Thread deloptes
Andrei POPESCU wrote:

> Good luck in doing public key cryptography without publishing the public
> key :)

Andrei - you disappoint me here!



Re: fsck error on boot: /dev/sda1: UNEXPECTED INCONSISTENCY and Partition 1 does not start on physical sector boundary

2021-03-20 Thread Andy Smith
Hello,

On Fri, Mar 19, 2021 at 10:36:37PM +0500, Alexander V. Makartsev wrote:
> Personally, I don't think it is wise to throw away any HDD as soon as it
> gets a few pending bad blocks for whatever reason.

It really depends upon your risk stance.

At home, on my home fileserver, it has RAID, it has backups, so if a
HDD sees a few remapped sectors I'm not going to throw the HDD out.
When it starts seeing many many increasing numbers of remapped
sectors then yes it's being replaced. But indeed it can be many
years between picking up a few remapped sectors and complete
meltdown.

https://gist.github.com/grifferz/64808f61079fe610c6f21f03ac7fd1aa

$ sudo ./blkleaderboard.sh 
 sdd 100418 hours (11.45 years) 0.29TiB ST3320620AS
 sdb  95783 hours (10.92 years) 0.29TiB ST3320620AS
 sda  94252 hours (10.75 years) 0.29TiB ST3320620AS
 sdi  66276 hours ( 7.56 years) 0.45TiB ST500DM002-1BD14
 sdk  55418 hours ( 6.32 years) 2.73TiB WDC WD30EZRX-00D
 sdh  44511 hours ( 5.07 years) 0.91TiB Hitachi HUA72201
 sde  24239 hours ( 2.76 years) 0.91TiB SanDisk SDSSDH31
 sdc  17672 hours ( 2.01 years) 0.29TiB ST3320418AS
 sdf   7252 hours ( 0.82 years) 1.82TiB Samsung SSD 860
 sdj   7130 hours ( 0.81 years) 1.75TiB KINGSTON SUV5001
 sdg   1560 hours ( 0.17 years) 1.75TiB KINGSTON SUV5001

I've replaced some drives in the last 2 years; once those started gaining
reallocated sectors they didn't survive long, even though I gave them
the chance. Hence the three replacements in the last 2 years. sdc and
sdd are hanging on:

$ for d in /dev/sd?; do echo -n "$d: "; sudo smartctl -A $d | grep '^  5'; done
/dev/sda:   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
/dev/sdb:   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
/dev/sdc:   5 Reallocated_Sector_Ct   0x0033   097   097   036    Pre-fail  Always       -       151
/dev/sdd:   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       5
/dev/sde:   5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       0
/dev/sdf:   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
/dev/sdg:   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
/dev/sdh:   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
/dev/sdi:   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
/dev/sdj:   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
/dev/sdk:   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
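
With more than a handful of drives it is easier to let a script pick out the ones with a nonzero count. A small sketch, assuming each drive's record sits on a single line that starts with the device path and ends with the raw value, as the shell loop above prints it. The function name is hypothetical; the sample is three of the records shown above.

```python
def reallocated_counts(loop_output):
    """Map device path -> raw Reallocated_Sector_Ct from the loop's output."""
    counts = {}
    for line in loop_output.splitlines():
        fields = line.split()
        if (len(fields) >= 3 and fields[0].startswith("/dev/")
                and fields[2] == "Reallocated_Sector_Ct"):
            # The device path is printed with a trailing colon; the raw
            # value is the last column of the record.
            counts[fields[0].rstrip(":")] = int(fields[-1])
    return counts

sample = """\
/dev/sda:   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
/dev/sdc:   5 Reallocated_Sector_Ct   0x0033   097   097   036    Pre-fail  Always       -       151
/dev/sdd:   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       5
"""

counts = reallocated_counts(sample)
suspects = {dev: n for dev, n in counts.items() if n > 0}
print(suspects)  # {'/dev/sdc': 151, '/dev/sdd': 5}
```

Whether a nonzero count means "replace now" or "keep watching" is the risk-stance question discussed in this message; the script only surfaces the candidates.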

At work, where it's other people's data on the line, drives get
replaced as soon as they show any defect like that, because when it
does escalate it tends to do so very quickly.

My own risk stance doesn't even permit running without redundancy
(unless inherently impossible due to the machine in question not
supporting that), because once you encounter Offline_Uncorrectable
in normal daily use it means that without redundancy, data loss has
occurred.

The drive couldn't read one or more of its sectors. If it's just
file data you can get it from backup but if, like OP here, it's
filesystem metadata then your actual filesystem is damaged and needs
fsck. And if you're unluckier still, the whole filesystem can be broken. I'd
really rather not have to spend time on fixing that sort of thing.

> Even brand new drives are shipped with information about factory remapped
> sectors in special section inside their firmware, to cover up platter
> imperfections.

That's true, and to some extent with the densities in use today all
reading from a drive is probabilistic and corrected by checksums. But
when they arrive like that they are supposed to be in a stable
state, without such errors increasing, so when they do start to
appear it is a cause for serious concern.

> This is why performing regular backups and validating them is better, I mean
> you do it all anyway, than replacing drives as soon as they get a few bad
> sectors.

I would say the two strategies are orthogonal because backups and
self-tests are advisable for everyone. Once a drive gets some
Offline_Uncorrectable the data is gone from it; backups and
self-tests didn't stop that from happening, they just helped you
recover from it (backups) or spot it early by testing even unused
areas of the drive (self-tests).

Anyway in OP's position, they have lost data which they need to
restore and while they could wait and see if the errors are
increasing in number they probably just want to get it replaced
ASAP.

Cheers,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting



Re: Package release number.

2021-03-20 Thread Klaus Singvogel
pe...@easthope.ca wrote:
> How is the "version number" interpreted?
> 
> "1:" ?

epoch

Epochs can help when the upstream version numbering scheme changes, but
they must be used with care. You should not change the epoch, even in
experimental, without getting consensus on debian-devel first.

> "3.1" qemu release number?

upstream version. Indeed the qemu release number.

> "dfsg-8" ?

Quote:
“+dfsg.N” is a conventional way of extending a version string, when the
Debian package's upstream source tarball is actually different from the
source released upstream. This is typically because upstream's source
release contains elements that do not satisfy the Debian Free Software
Guidelines (DFSG) and hence may not be distributed as source in the
Debian system.

> "deb10u8" Debian 10, update 8?

Yes, debian version: Debian 10, update 8

> Additional ideas?

https://www.debian.org/doc/debian-policy/ch-controlfields.html#version
https://readme.phys.ethz.ch/documentation/debian_version_numbers/
https://www.reddit.com/r/debian/comments/66094l/what_is_dfsg_in_package_version_numbers/

Regards,
Klaus.
-- 
Klaus Singvogel
GnuPG-Key-ID: 1024R/5068792D  1994-06-27



Re: Package release number.

2021-03-20 Thread Andy Smith
Hi Peter,

On Sat, Mar 20, 2021 at 08:16:22AM -0700, pe...@easthope.ca wrote:
> peter@joule:/home/peter$ dpkg -l | grep qemu-system-x86
> ii  qemu-system-x86   1:3.1+dfsg-8+deb10u8
>   i386 QEMU full system emulation binaries (x86)
> 
> How is the "version number" interpreted?

It's only important within Debian, so aside from some conventions it
doesn't *have* to correspond to anything.

> "1:" ?

This is referred to within Debian and derivatives as "an epoch". It
is typically used when the format of the rest of the version needs
to change in ways that would not otherwise guarantee that subsequent
versions are considered newer than previous versions.

Something with an epoch of 1: will be considered newer than
something without an epoch, and if an epoch happened again then it
would be 2: and that would be newer than anything that starts with
no epoch or with 1: as an epoch.

> "3.1" qemu release number?

Yes, it is desirable to match the Debian package version with the
upstream version that it's based upon. Sometimes this is done
incorrectly and it has to be fixed and then a typical convention is
to use another suffix of "-reallyx.y.z".

> "dfsg-8" ?

A convention indicating that the package includes some number of
Debian-specific changes to make the package comply with Debian Free
Software Guidelines. For example, some documentation contains
invariant sections that no one has permission to change and those
don't fit what Debian considers to be "free", so they can get
stripped out.

https://wiki.debian.org/GFDLPositionStatement

> "deb10u8" Debian 10, update 8?

Yes, a convention saying that the basis for this package is the
version that first appeared in release 10 and this is the 8th update
to it since then.
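
Putting the pieces together: Debian Policy describes the format as `[epoch:]upstream_version[-debian_revision]`, where the epoch ends at the first colon and the Debian revision starts after the last hyphen. A minimal sketch of that split follows; the function name is made up for this example, and it does not attempt the full version-comparison rules.

```python
def split_debian_version(version):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision).

    The epoch ends at the first colon and defaults to 0; the Debian
    revision starts after the last hyphen and is empty if there is none.
    """
    epoch, sep, rest = version.partition(":")
    if not sep:                     # no colon: epoch defaults to 0
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:                     # no hyphen: no Debian revision
        upstream, revision = rest, ""
    return int(epoch), upstream, revision

parts = split_debian_version("1:3.1+dfsg-8+deb10u8")
print(parts)  # (1, '3.1+dfsg', '8+deb10u8')
```

Read this way, `+dfsg` is a suffix on the upstream version `3.1`, and `8+deb10u8` as a whole is the Debian revision, so the natural grouping is "3.1+dfsg" and "8+deb10u8" rather than "dfsg-8".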

> Additional ideas?

There are a lot of other conventions in use (the "-really" one being
one example) and I'm not sure if they are all listed out somewhere.

> I checked https://www.debian.org/distrib/packages for an explanation.
> Nothing relevant.

If anywhere, I would expect it to be in documentation aimed at
Debian developers and contributors.

Cheers,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting



Package release number.

2021-03-20 Thread peter
Hi,

peter@joule:/home/peter$ dpkg -l | grep qemu-system-x86
ii  qemu-system-x86   1:3.1+dfsg-8+deb10u8
  i386 QEMU full system emulation binaries (x86)

How is the "version number" interpreted?

"1:" ?
"3.1" qemu release number?
"dfsg-8" ?
"deb10u8" Debian 10, update 8?

Additional ideas?

I checked https://www.debian.org/distrib/packages for an explanation.
Nothing relevant.

Thanks,... P.

-- 
cell: +1 236 464 1479    Bcc: peter at easthope. ca
VoIP: +1 604 670 0140



Re: [OT] Re: Social-media antipathy (was Re: How i can optimize my operating system?)

2021-03-20 Thread Andrei POPESCU
On Vi, 19 mar 21, 00:54:08, deloptes wrote:
> Stefan Monnier wrote:
> 
> > I hear there's a lot of interesting discussions there about how to
> > communicate safely, but sadly so far I haven't managed to configure my
> > safe not-internet-connected machine to participate.
> 
> do you think it is possible to have a public & encrypted discussion when we
> do not know each other? It is pointless.
> My point is that even if you use GPG on a networked computer, there is a risk
> that you get compromised. 
> I don't remember if it was Stuxnet that was taking screenshots of your
> mobile display and sending them home for further analysis, and this was
> maybe 10 years ago. Today with the one and only iPhone and Android ... even
> with encrypted whatever part.
> 
> The best way is 
> 1. download the encrypted message (usb/SD or uSD)
> 2. upload to isolated machine
> 3. decrypt, read, answer, encrypt 

The message itself could be used to compromise the offline machine.

> 4. upload encrypted message to the networked machine (usb/SD or uSD)
> 
> Note: all keys on the isolated machine (especially the private keys)

Good luck in doing public key cryptography without publishing the public 
key :)

> This worked 30-40 years ago and still works now (well, back then it was a
> floppy drive). 
> I am writing this because people get lazy but at the same time whine about
> privacy. On the battlefield (or in the jungle), if you are lazy, you die.
> It should be clear that even a network with the best security may still get
> compromised. And if you are stupid, nothing can help you anyway :)

In my (not so humble) opinion, this level of security could make sense 
for a dissident in a totalitarian state, less so for regular users in 
a democratic country.

Kind regards,
Andrei
-- 
http://wiki.debian.org/FAQsFromDebianUser




Re: smartctl on SSD question

2021-03-20 Thread Alexander V. Makartsev

On 20.03.2021 15:50, Gene Heskett wrote:

> Greetings all;
>
> I have been using amanda on this machine to back up the rest of my
> machines for over 2 decades. So I'm not a new user.
>
> But I installed a 240 gig SSD about 6 months back, because its use as a
> holding disk tripled amanda's speed at backing up this 6 machine system.
>
> But amanda has been increasingly plagued with bad crc checksums of the
> data in this holding disk, which usually leaves the files on that disk
> as it redoes that disklist entry bypassing the holding disk, which
> obviously takes longer as it is then doing the backup directly to the
> archive medium.
>
> I just initiated a "-t long" test on it.  So since it's an SSD, what is
> the best command to extract the most complete test report possible when
> the test has finished? It's now done, and the -a option gives it a clean
> report, along with a list of this and that not supported. Cheap ADATA
> drive.
>
> Thank you.
>
> Cheers, Gene Heskett

You can see the SMART self-test results with the "--log=selftest" parameter:
    # smartctl --log=selftest /dev/sdZ

Can you post the output of "smartctl" with the "--all" parameter for 
this device?
Cheap and/or old SSDs usually expose only limited SMART information, but 
it is still worth a look.
For reference, here is the output for one of my consumer-grade SSDs 
(2-bit 2D MLC NAND), used as the system disk for two OSes:

https://paste.debian.net/1190136

It was used daily for 23932 hours (attribute #9) and still reports 99% 
health left (attribute #231).
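
To pick single values such as those out of the attribute table, a small
filter can help. This is only a sketch: the helper name "smart_attr" is
made up for illustration, it assumes the usual ten-column layout printed
by "smartctl -A" (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE,
UPDATED, WHEN_FAILED, RAW_VALUE), and attribute IDs and their meaning
differ between vendors, so check the names on your own drive first.

```shell
#!/bin/sh
# Hypothetical helper: print one column of one SMART attribute row.
# Reads `smartctl -A` style output on stdin.
#   $1 = attribute ID (e.g. 9)
#   $2 = column number, default 10 (RAW_VALUE); 4 is the normalized VALUE
# Only the first whitespace-separated token of the column is printed, so
# raw values that contain spaces are truncated.
smart_attr() {
    awk -v id="$1" -v col="${2:-10}" '$1 == id { print $col; exit }'
}

# Usage (needs root and a real device; /dev/sdX is a placeholder):
#   smartctl -A /dev/sdX | smart_attr 9       # raw value of attribute 9
#   smartctl -A /dev/sdX | smart_attr 231 4   # normalized value of attribute 231
```

If an attribute ID is not reported by the drive, the function prints
nothing, which is what you would expect from a drive with limited SMART
support.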


--
With kindest regards, Alexander.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄



smartctl on SSD question

2021-03-20 Thread Gene Heskett
Greetings all;

I have been using amanda on this machine to back up the rest of my 
machines for over 2 decades. So I'm not a new user.

But I installed a 240 gig SSD about 6 months back, because its use as a 
holding disk tripled amanda's speed at backing up this 6 machine system.

But amanda has been increasingly plagued with bad crc checksums of the 
data in this holding disk, which usually leaves the files on that disk 
as it redoes that disklist entry bypassing the holding disk, which 
obviously takes longer as it is then doing the backup directly to the 
archive medium.

I just initiated a "-t long" test on it.  So since it's an SSD, what is 
the best command to extract the most complete test report possible when 
the test has finished? It's now done, and the -a option gives it a clean 
report, along with a list of this and that which is not supported. Cheap 
ADATA drive.

Thank you.

Cheers, Gene Heskett
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis
Genes Web page 



Re: fsck error on boot: /dev/sda1: UNEXPECTED INCONSISTENCY and Partition 1 does not start on physical sector boundary

2021-03-20 Thread Alexander V. Makartsev

On 20.03.2021 13:50, David wrote:

> On Fri, 19 Mar 2021 at 19:53, Alexander V. Makartsev wrote:
>
> > To perform surface scans you can use SMART short and long scans, and also a
> > program called "badblocks" from the package "e2fsprogs".
> > Be sure to unmount "/dev/sda1" before performing the scans.
>
> Hi, I wonder why you give this advice to unmount.
> Are you aware of a smartmontools reference document
> that gives this advice?
>
> If you are not using captive mode (the --captive option),
> then I feel it is unnecessary.
>
> If you are using captive mode, then I would suggest unmounting
> the entire drive, not just one partition.
>
> This is my casual understanding, corrections with
> authoritative sources are welcome.

Normally, SMART scans are non-destructive and finish with an error as 
soon as the first bad block is found.
But if the HDD encounters a real bad block, or a long sequence of them, 
during a test scan, there is always a chance that it won't recover right 
away. As a consequence, the kernel may remount the partitions read-only, 
making the OS unresponsive, even if the HDD's controller later returns 
to a normal state after processing the bad blocks.
For me it is simply a proactive measure to keep the tests from being 
interrupted mid-scan and to rule out external interference. This can 
prevent a huge waste of time when you scan multi-terabyte drives.
Additionally, the manpage for the "badblocks" program strongly recommends 
unmounting partitions before performing non-destructive read-write or 
destructive write tests on the device.
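
A quick pre-flight check along those lines could look like the sketch
below. The function name "disk_in_use" is made up for illustration, and
the check is deliberately simple: it only looks for mount sources whose
name starts with the given device path in /proc/mounts-style input, so
it will not notice swap, LVM or device-mapper users of the disk, and a
prefix such as /dev/sda would also match a device named /dev/sdaa.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the disk, or any partition whose name
# starts with the disk's path, is listed as a mount source in the
# /proc/mounts-style text read from stdin; exit 1 otherwise.
#   $1 = whole-disk device path, e.g. /dev/sdX
disk_in_use() {
    awk -v disk="$1" '
        index($1, disk) == 1 { found = 1 }   # mount source starts with disk path
        END { exit !found }
    '
}

# Usage (/dev/sdX is a placeholder):
#   if disk_in_use /dev/sdX < /proc/mounts; then
#       echo "unmount the partitions of /dev/sdX before scanning" >&2
#   fi
```

Reading the mount table from stdin keeps the function easy to try out on
a saved copy of /proc/mounts before pointing it at a live system.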


If you look through the information the OP gathered for us, you will 
notice that "/dev/sda" has only one partition, with an NTFS filesystem.


--
With kindest regards, Alexander.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄



Re: fsck error on boot: /dev/sda1: UNEXPECTED INCONSISTENCY and Partition 1 does not start on physical sector boundary

2021-03-20 Thread David
On Fri, 19 Mar 2021 at 19:53, Alexander V. Makartsev  wrote:

> To perform surface scans you can use SMART short and long scans, and also a 
> program called "badblocks" from the package "e2fsprogs".
> Be sure to unmount "/dev/sda1" before performing the scans.

Hi, I wonder why you give this advice to unmount.
Are you aware of a smartmontools reference document
that gives this advice?

If you are not using captive mode (the --captive option),
then I feel it is unnecessary.

If you are using captive mode, then I would suggest unmounting
the entire drive, not just one partition.

This is my casual understanding, corrections with
authoritative sources are welcome.