Re: Fedora37: Fails to boot with NISDOMAIN=... in /etc/sysconfig/network

2022-12-19 Thread Terry Barnaby

Well, the /etc/nsswitch.conf is as set by authselect:

# Generated by authselect on Sun Dec 11 04:58:36 2022
# Do not modify this file manually, use authselect instead. Any user changes
# will be overwritten.
# You can stop authselect from managing your configuration by calling
# 'authselect opt-out'.

# See authselect(8) for more details.

# In order of likelihood of use to accelerate lookup.
passwd: files nis systemd
shadow: files nis
group:  files nis systemd
hosts:  files myhostname resolve [!UNAVAIL=return] nis dns
services:   files nis
netgroup:   files nis
automount:  files nis

aliases:    files nis
ethers: files nis
gshadow:    files nis
networks:   files nis dns
protocols:  files nis
publickey:  files nis
rpc:    files nis

All the "users" for the system daemons should be in the files.

Really, on this host NIS should be removed from nsswitch.conf, as it is the 
NIS master. The config came from a general config used for both clients and 
servers, and I will remove the NIS auth from the server. However, this 
worked fine on Fedora35, so it is a change in Fedora37.


Although my config is really wrong, I wouldn't have thought it should 
cause a systemd boot failure.



On 18/12/2022 18:23, Roger Heflin wrote:
If the nisdomain is not responding, I would claim the system should 
still boot, so I would think that is a bug.  But if systemd/pam is not 
timing out on the non-responding nisdomain, or the timeout is too high, 
then I would think that might break a significant part of the system, 
because lookups of passwd/hosts/group access may not work, depending on 
where else NIS pieces are set up.  I would think that if "files" is 
first and there is a valid entry in the file, it should not go to the 
nisdomain, but that may depend on the order in /etc/nsswitch.conf.



On Sun, Dec 18, 2022 at 10:03 AM Terry Barnaby  wrote:

A strange one this. I was just updating a Fedora35 server to
Fedora37,
using a full reinstall and then copying configuration files from
the old
system.

The system failed to boot with lots of strange issues with
systemd. It
started with console messages like:

[ TIME ] Timeout waiting for device dev-zram0.device - /dev/zram0.

Some further issues with zram followed, then some other services
started fine before the system got into a loop:

[ FAILED] Failed to start systemd-udevd.service - ...

[ FAILED] Failed to start systemd-oomd.service - ...

None of this was logged in /var/log/messages.

After some tracking down on a VM, I found the issue was somehow
caused by having "NISDOMAIN=kingnet" in the file /etc/sysconfig/network.
This came from an old client configuration, setting the NIS domain.
It was not actually needed on the server, and in fact is no longer
needed on the clients either, as DHCP sets this on our systems; it
was a holdover from 20 years or so ago.

I have no idea why this caused the boot failure, though; I thought
I'd mention it in case anyone else sees it. I will report it as a
"bug" against systemd, although I'm not sure it is really a systemd
issue, or really a bug at all, but it is a bit nasty as the system
fails to boot.

Terry
___
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct:
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines:
https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives:
https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
Do not reply to spam, report it:
https://pagure.io/fedora-infrastructure/new_issue




Fedora37: Fails to boot with NISDOMAIN=... in /etc/sysconfig/network

2022-12-18 Thread Terry Barnaby
A strange one this. I was just updating a Fedora35 server to Fedora37, 
using a full reinstall and then copying configuration files from the old 
system.


The system failed to boot with lots of strange issues with systemd. It 
started with console messages like:


[ TIME ] Timeout waiting for device dev-zram0.device - /dev/zram0.

Some further issues with zram followed, then some other services started 
fine before the system got into a loop:


[ FAILED] Failed to start systemd-udevd.service - ...

[ FAILED] Failed to start systemd-oomd.service - ...

None of this was logged in /var/log/messages.

After some tracking down on a VM, I found the issue was somehow caused 
by having "NISDOMAIN=kingnet" in the file /etc/sysconfig/network. This 
came from an old client configuration, setting the NIS domain. It was 
not actually needed on the server, and in fact is no longer needed on 
the clients either, as DHCP sets this on our systems; it was a holdover 
from 20 years or so ago.
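For anyone checking their own machines for the same leftover setting, a small sketch along these lines can flag it (the helper function and the idea of passing an alternate path for testing are mine, not from the thread):

```shell
#!/bin/bash
# Report any NISDOMAIN= line in a sysconfig-style network file.
# Pass an alternate path to test against a copy of the file.
check_nisdomain() {
    local f="${1:-/etc/sysconfig/network}"
    if [ -f "$f" ] && grep -q '^NISDOMAIN=' "$f"; then
        echo "Found in $f: $(grep '^NISDOMAIN=' "$f")"
        return 0
    fi
    echo "No NISDOMAIN setting in $f"
    return 1
}
```

Run it with no argument to check the real /etc/sysconfig/network.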


I have no idea why this caused the boot failure, though; I thought I'd 
mention it in case anyone else sees it. I will report it as a "bug" 
against systemd, although I'm not sure it is really a systemd issue, or 
really a bug at all, but it is a bit nasty as the system fails to boot.


Terry


Re: Fedora37 NIS logins no audio in KDE/Plasma/X11

2022-12-12 Thread Terry Barnaby

On 11/12/2022 10:16, Terry Barnaby wrote:
I've just updated a test machine from Fedora35 to Fedora37. Most things 
are working, but users logged in using NIS authentication no longer have 
access to audio.


The system is using the KDE/Plasma desktop and the sddm login manager 
all using X11.


It looks like /run/user/ does not exist and XDG_RUNTIME_DIR is 
not set etc.


The system has 
/usr/lib/systemd/system/systemd-logind.service.d/nss_nis.conf with the 
default contents:


[Service]
IPAddressAllow=any
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6

The first obvious error message in /var/log/messages is:

dbus-daemon[8116]: [session uid=1002 pid=8114] Activated service 
'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 
exited with status 1


The start of .xsession-errors has:

Failed to import environment: Process org.freedesktop.systemd1 exited 
with status 1


Any ideas?



For info, this looks like a bug. Restarting the systemd-userdbd service 
after ypbind has started is a workaround for the issue.


See: https://bugzilla.redhat.com/show_bug.cgi?id=2152376
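One declarative way to express that workaround (a sketch only; the drop-in path and the ypbind.service unit name are assumptions based on a standard NIS client setup, so verify them on your system) is an ordering drop-in so systemd-userdbd starts only after ypbind:

```
# /etc/systemd/system/systemd-userdbd.service.d/after-ypbind.conf (hypothetical path)
[Unit]
After=ypbind.service
Wants=ypbind.service
```

Run "systemctl daemon-reload" after creating it so the drop-in takes effect.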


Fedora37 NIS logins no audio in KDE/Plasma/X11

2022-12-11 Thread Terry Barnaby
I've just updated a test machine from Fedora35 to Fedora37. Most things 
are working, but users logged in using NIS authentication no longer have 
access to audio.


The system is using the KDE/Plasma desktop and the sddm login manager 
all using X11.


It looks like /run/user/ does not exist and XDG_RUNTIME_DIR is 
not set etc.


The system has 
/usr/lib/systemd/system/systemd-logind.service.d/nss_nis.conf with the 
default contents:


[Service]
IPAddressAllow=any
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6

The first obvious error message in /var/log/messages is:

dbus-daemon[8116]: [session uid=1002 pid=8114] Activated service 
'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 
exited with status 1


The start of .xsession-errors has:

Failed to import environment: Process org.freedesktop.systemd1 exited 
with status 1


Any ideas?


Re: Time to update the hardware

2022-03-11 Thread Terry Barnaby
Note with NVMe drives, well any NAND flash, you have to know what 
technology is in use when writing data to the drive.
In particular we have noticed that some of the latest large-capacity 
devices use TLC (three bits per cell) technology. Writing to TLC 
cells is relatively slow.
So most NVMe "drives" actually write PCIe -> RAM -> SLC first, then in 
the background move SLC -> TLC. An area of the NAND flash is configured 
as SLC (one bit per cell), which can be written at high speed. Then later 
(or when this SLC area is full) the "drive" starts moving the data to 
TLC (probably the same cells, now used in TLC mode).


The result of this is that you can see a fast burst for a few hundred 
MBytes, and then the drive slows dramatically depending on the "drive" 
type, its size, how full it is, and how the manufacturer's firmware 
handles this. This is fine for typical use, but for streaming large 
amounts of data, or in benchmarks, it can rear its head. Typical modern 
MLC-based drives don't see this drop-off in write speed.
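A crude way to observe this SLC-cache behaviour is to time repeated synced writes and watch the reported rate fall once the cache fills (a sketch only; the sizes here are deliberately tiny so it is harmless to run anywhere, whereas a real test would write multi-GB chunks to a file on the NVMe filesystem itself):

```shell
#!/bin/bash
# Time several synced writes in a row; on a TLC drive with an SLC cache,
# later iterations report a lower rate once the cache is exhausted.
target=$(mktemp)
for i in 1 2 3; do
    # conv=fdatasync forces the data to the device before dd reports its rate
    dd if=/dev/zero of="$target" bs=1M count=16 conv=fdatasync 2>&1 | tail -n 1
done
rm -f "$target"
```

On a tmpfs-backed temp file this will show no drop-off; the point is to run it against the filesystem on the drive under test.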


Terry
On 11/03/2022 12:26, Patrick O'Callaghan wrote:

On Thu, 2022-03-10 at 19:47 -0400, George N. White III wrote:

Oops. It actually has "compress=zstd:1" in the fstab line.

Apologies. That completely invalidates the numbers.


Not completely invalid, they still say something about a real-world use
case (I work with optical remote sensing where many images have big
blocks of "missing" data codes, e.g., clouds), but the interpretation
changes.  We have been using netcdf4 files with internal compression,
but now I'm motivated to compare without compression on btrfs for
"scratch" files that don't move on a network.

I'm calling it invalid because the data is a stream of zeroes, i.e.
it's pretty much maximally compressible.

This might be more realistic, using /dev/urandom:

$ time dd if=/dev/urandom bs=1M count=23000 of=Big
23000+0 records in
23000+0 records out
24117248000 bytes (24 GB, 22 GiB) copied, 81.9688 s, 294 MB/s

real    1m22.106s
user    0m0.040s
sys     1m21.753s

poc


Re: NFS mount lockups since about a month ago

2021-10-30 Thread Terry Barnaby
Since some Fedora33 update in the last couple of weeks, the problem has 
gone away. I haven't changed anything as far as I am aware.


One change is that the kernel moved from 5.13.x to 5.14.x ...

Terry
On 21/10/2021 23:36, Reon Beon via users wrote:

https://release-monitoring.org/project/2081/
Well it is a pre-release version. 2.5.5.rc3


Re: NFS mount lockups since about a month ago

2021-10-06 Thread Terry Barnaby

Hi Roger,

Thanks for looking.
I will try NFS v3 with my latency tests running. I did try NFS v3 before, 
and I "think" there were still desktop lockups, but for a much shorter 
time. But this is just a feeling.

Current kernel on both systems is: 5.13.19-100.fc33.x86_64.
If I find the time, I will try to add some kernel NFS RPC call timers 
with some printk's, and maybe try Fedora35 on another system.
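For reference, pinning a test mount to NFSv3 is just a mount option; an /etc/fstab line along these lines does it (the server name, export path, and extra options are placeholders for illustration, not a recommendation):

```
# /etc/fstab entry forcing the mount to use NFSv3
server:/home  /home  nfs  vers=3,hard  0 0
```

The same can be done one-off with "mount -o vers=3" to compare behaviour against the v4 mount without editing fstab.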


Terry
On 05/10/2021 19:53, Roger Heflin wrote:

That network looks fine to me

I would try v3.  I have had bad luck many times with v4 on a variety
of different kernels.  If the code is recovering from something
related to a bug 45 seconds might be right to decide something that
was working is no longer working.

I am not sure any amount of debugging would help (without having
really verbose kernel debugging).

What is the current kernel you are running?  Trying a new one might be
worth it, though I don't see NFS changes/fixes listed in the 5.14.* or
5.13.* kernel changelogs in the rpm file (rpm -q --changelog), and
there are only a few listed at kernel.org for those kernels.




Re: NFS mount lockups since about a month ago

2021-10-05 Thread Terry Barnaby
sar -n EDEV reports all 0's all around then. There are some rxdrop/s of 
0.02 occasionally on eno1 through the day (about 20 of these with 
minute-based sampling). Today ifconfig lists 39 dropped RX packets out 
of 2357593. Not sure why there are some dropped packets. "ethtool -S 
eno1" doesn't seem to list any particular issues. sar -n DEV does not 
appear to show anything at 10:51:30:
             IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s   %ifutil
10:44:04      eno1     18.29     19.54      5.81      5.25      0.00      0.00      0.00      0.00
10:45:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:45:04      eno1     20.45     22.52      5.96      5.79      0.00      0.00      0.00      0.00
10:46:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:46:04      eno1     22.50     24.26      7.52      7.88      0.00      0.00      0.00      0.01
10:47:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:47:04      eno1     21.53     22.75      7.27      5.71      0.00      0.00      0.00      0.01
10:48:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:48:04      eno1    222.03    284.24    173.49    367.55      0.00      0.00      0.00      0.30
10:49:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:49:04      eno1     11.83     12.28      2.74      3.98      0.00      0.00      0.00      0.00
10:50:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:50:04      eno1     15.72     14.13      4.33      3.80      0.00      0.00      0.00      0.00
10:51:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:51:04      eno1     11.00     10.53      3.48      2.63      0.00      0.00      0.00      0.00
10:52:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:52:04      eno1     13.48     13.45      4.21      4.56      0.00      0.00      0.00      0.00
10:53:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:53:04      eno1     21.76     23.98      6.99     10.26      0.00      0.00      0.00      0.01
10:54:04        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Also NFSv4 uses TCP by default, I think, and TCP retries would be much 
quicker than 45 seconds. I do feel there is an issue in the NFS code 
somewhere, but I am biased about the speed of NFS directory access these 
days!
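The kernel's cumulative drop counters can also be read straight from sysfs, which makes it easy to poll more often than sar's sampling interval (a small sketch; interface names will differ per host):

```shell
#!/bin/bash
# Print the kernel's cumulative RX/TX drop counters for each interface.
for dev in /sys/class/net/*; do
    name=$(basename "$dev")
    rx=$(cat "$dev/statistics/rx_dropped")
    tx=$(cat "$dev/statistics/tx_dropped")
    echo "$name rx_dropped=$rx tx_dropped=$tx"
done
```

Running this in a loop with a short sleep narrows down when the drops actually occur, rather than averaging them over a sar interval.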


On 04/10/2021 17:06, Roger Heflin wrote:

Since it is recovering from it, maybe it is losing packets inside the
network. What does "sar -n DEV" and "sar -n EDEV" look like during
that time, on both the client seeing the pause and the server?

EDEV is typically all zeros unless something is lost.  If something is
being lost and it matches the times of the hangs, that could be it.



Re: NFS mount lockups since about a month ago

2021-10-04 Thread Terry Barnaby

and iostats:


04/10/21 10:51:14
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.09    0.00    1.56    0.02    0.00   96.33

Device    r/s  rkB/s  rrqm/s  %rrqm  r_await  rareq-sz   w/s  wkB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dkB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz  %util
nvme0n1  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.00
nvme1n1  0.00   0.00    0.00   0.00     0.00      0.00  7.10  39.20    1.40  16.47     1.68      5.52  0.00   0.00    0.00   0.00     0.00      0.00  0.10     1.00    0.01   0.19
sda      0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.00
zram0    0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.00


04/10/21 10:51:40
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.31    0.00    1.60   10.86    0.00   85.24

Device    r/s  rkB/s  rrqm/s  %rrqm  r_await  rareq-sz   w/s  wkB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dkB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz  %util
nvme0n1  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.00
nvme1n1  0.00   0.00    0.00   0.00     0.00      0.00  0.71   3.43    0.15  17.39    12.16      4.84  0.00   0.00    0.00   0.00     0.00      0.00  0.04     1.00    0.01   0.19
sda      0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.00
zram0    0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.00


04/10/21 10:51:50
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.59    0.00    1.69    0.17    0.00   95.55

Device    r/s  rkB/s  rrqm/s  %rrqm  r_await  rareq-sz   w/s  wkB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dkB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz  %util
nvme0n1  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.00
nvme1n1  0.00   0.00    0.00   0.00     0.00      0.00  1.40  11.20    1.40  50.00    12.36      8.00  0.00   0.00    0.00   0.00     0.00      0.00  0.20     1.00    0.02   0.50
sda      0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.00
zram0    0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.00


So nothing obvious on the disks of the server. I am pretty sure this is 
an NFS issue ...
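To separate NFS behaviour from disk behaviour, the per-mount RPC statistics in /proc/self/mountstats can help; each NFS mount listed there is followed by detailed per-operation RPC latency counters (a sketch; the grep just lists the mounted devices, and on a machine with NFS mounts you would read the sections after the matching "device" lines):

```shell
#!/bin/bash
# List the mounts the kernel tracks; NFS entries in this file are
# followed by per-operation RPC counters (calls, queue time, RTT).
grep '^device' /proc/self/mountstats
```

Comparing those RTT counters before and after a stall would show whether the latency is in the RPC layer or below it.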


Re: NFS mount lockups since about a month ago

2021-10-04 Thread Terry Barnaby
My disklatencytest showed a longish (14 secs) NFS file system 
directory/stat lookup again today on a desktop:

2021-10-04T05:26:19 0.069486 0.069486 0.000570 /home/...
2021-10-04T05:28:19 0.269743 0.538000 0.001019 /home/...
2021-10-04T09:48:00 1.492158 0.003314 0.000907 /home/...
2021-10-04T09:49:02 2.581025 0.159358 0.000836 /home/...
2021-10-04T09:50:44 2.657260 0.076560 0.027128 /home/...
2021-10-04T09:51:30    14.889837    14.889837 0.022132 /home/...

A disklatencytest running on the server shows no long latency today so far.

The sar -d on the server around this time is:

10:49:00     dev8-0    20.14    152.57    246.05      0.00     19.80      0.51     21.48      9.81
10:49:00    dev8-32     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:49:00    dev8-16    39.36   1277.08    246.51      0.00     38.71      0.52     11.03     13.06
10:49:00    dev8-48     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:49:00   dev252-0     2.35      0.00      9.39      0.00      4.00      0.00      0.00      0.02
10:50:00     dev8-0     8.38    134.51     80.09      0.00     25.60      0.14      8.90      6.89
10:50:00    dev8-32     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:50:00    dev8-16    14.08    286.15     80.09      0.00     26.01      0.16      6.53      7.35
10:50:00    dev8-48     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:50:00   dev252-0     0.80      0.00      3.20      0.00      4.00      0.00      0.00      0.01
10:51:00     dev8-0     7.99     75.65    110.65      0.00     23.31      0.26     23.22      7.33
10:51:00    dev8-32     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:51:00    dev8-16    44.98   1704.41    110.65      0.00     40.35      0.30      5.00     12.43
10:51:00    dev8-48     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:51:00   dev252-0     3.86      0.00     15.45      0.00      4.00      0.00      0.01      0.03
10:52:00     dev8-0    21.20    265.73    415.69      0.00     32.15      0.42     16.08      8.64
10:52:00    dev8-32     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:52:00    dev8-16    56.56   1603.80    415.69      0.00     35.71      0.45      6.62     12.77
10:52:00    dev8-48     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:52:00   dev252-0     5.23      0.00     20.91      0.00      4.00      0.00      0.02      0.02
10:53:00     dev8-0    12.94    265.40     94.51      0.00     27.82      0.25     14.24      7.29
10:53:00    dev8-32     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:53:00    dev8-16    46.35   1747.85     94.51      0.00     39.75      0.32      5.27     11.99
10:53:00    dev8-48     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:53:00   dev252-0     3.60      0.00     14.39      0.00      4.00      0.00      0.01      0.02


and iostats in next email

So nothing obvious on the disks of the server. I am pretty sure this is 
an NFS issue ...


Re: NFS mount lockups since about a month ago

2021-10-03 Thread Terry Barnaby

On 04/10/2021 00:51, Roger Heflin wrote:
With 10 minute samples anything that happened gets averaged enough 
that even the worst event is almost impossible to see.


Sar will report the same as date, i.e. local time. A 12:51 event would 
be in the 13:00 sample (which started at about 12:50 and ended at 13:00).


What I do see is that during that window your io rate was about 2x the 
prior 10-minute windows. With the 1-minute data we would be able to 
see if the disk was excessively busy. Your average iops were about 10% 
of the disk capacity.


I have debugged issues where the badly behaving IO was maxing out 
everything for 10 sec on / 10 sec off; in the 1-minute data there 
appeared to be nothing interesting to see (50% capacity), but it was 
playing hell with the interactive apps, since during the 10-sec-on 
window, operations the user was doing that normally took 0.5 sec were 
taking 1-2 seconds, and so were clearly slow for the users.  
With the sample size (60 sec) close to the event size (45 sec), it 
should be visible in 1-minute data, but less than clear in 10-minute 
data (9.25 minutes of averaging to hide/mask it).


do "systemctl edit sysstat-collect.timer"
And add this to the file:

[Timer]
OnCalendar=*:00/1

That will change it to 1 minute.

if you do this:
#!/bin/bash
# Collect 10-second iostat samples, writing one file per hour.
while true; do
    hour=$(date +%H)
    iostat -t -x 10 360 > filename.${hour}
done

That will give you 10-second iostat data, start a new file each hour, 
and overwrite the hour files the next day.

Thanks Roger, I will set those up/running.


Re: NFS mount lockups since about a month ago

2021-10-03 Thread Terry Barnaby
The 45 second event happened at 2021-10-02T11:51:02 UTC. Not sure what 
sar time is based on (maybe local time, BST, rather than UTC, so it 
would be 2021-10-02T12:51:02 BST).


Continuing info ...

sar -n NFSD on the server

11:00:01     24.16      0.00     24.16      0.00     24.16      0.00      0.00      0.35      1.48      2.07     21.08
11:10:01     21.13      0.00     21.13      0.00     21.13      0.00      0.00      0.28      0.89      1.72     19.58
11:20:02     17.85      0.00     17.85      0.00     17.85      0.00      0.00      0.27      0.69      0.82     16.65
11:30:02     20.66      0.00     20.66      0.00     20.66      0.00      0.00      0.29      0.83      1.42     19.15
11:40:02     39.80      0.00     39.80      0.00     39.80      0.00      0.00      0.89      2.05      3.67     25.51
11:50:02     35.40      0.00     35.40      0.00     35.40      0.00      0.00      0.39      0.65      1.22     18.21
12:00:02     41.85      0.00     41.85      0.00     41.85      0.00      0.00      0.84      1.14      2.08     20.50
12:10:02     38.54      0.00     38.54      0.00     38.54      0.00      0.00      0.48      0.82      1.48     19.62
12:20:02     39.85      0.00     39.85      0.00     39.85      0.00      0.00      0.37      1.50      1.29     19.44
12:30:02     39.84      0.00     39.84      0.00     39.84      0.00      0.00      0.70      1.03      2.28     19.78
12:40:02     38.29      0.00     38.29      0.00     38.29      0.00      0.00      0.46      0.81      1.26     19.37
12:50:02     71.38      0.00     71.38      0.00     71.37      0.00      0.00      1.12      2.41      8.19     34.87
13:00:00     77.46      0.00     77.46      0.00     77.45      0.00      0.00      1.43      3.30      7.36     38.31
13:10:00     67.62      0.00     67.63      0.00     67.62      0.00      0.00      3.85      2.84      4.68     29.66


sar -n SOCK on the server
11:20:02       480        41        32         1         0         0
11:30:02       482        41        32         1         0         4
11:40:02       480        41        32         1         0         0
11:50:02       480        41        32         1         0         1
12:00:02       480        41        32         1         0         1
12:10:02       480        41        32         1         0         1
12:20:02       480        41        32         1         0         1
12:30:02       480        41        32         1         0         1
12:40:02       480        41        32         1         0         1

12:40:02    totsck    tcpsck    udpsck    rawsck   ip-frag    tcp-tw
12:50:02       480        41        32         1         0         1
13:00:00       480        41        32         1         0         1
13:10:00       490        43        32         1         0         1

sar -n NFS on the client
11:10:02  19.82 0.00 0.28  0.34  0.71 15.13
11:20:03  16.53 0.00 0.27  0.15  0.34 13.80
11:30:04  17.20 0.00 0.13  0.08  0.30 15.17
11:40:04  37.46 0.00 0.89  1.47  2.07 14.42
11:50:05  32.97 0.00 0.28  0.11  0.34 15.00
12:00:05  36.31 0.00 0.59  0.47  0.75 14.17
12:10:06  34.77 0.00 0.36  0.26  0.65 14.95
12:20:07  37.55 0.00 0.27  0.97  0.35 15.36
12:30:07  33.90 0.00 0.46  0.37  0.62 13.47
12:40:07  35.67 0.00 0.36  0.28  0.64 15.44
12:50:07  68.97 0.00 1.01  1.89  6.64 13.28
13:00:07  71.08 0.00 1.17  2.58  4.32 17.87
13:10:07  49.28 0.00 0.84  1.55  1.81 15.23
13:20:00  41.87 0.00 0.50  1.24  0.87 14.98

sar -n SOCK client
11:20:03      1166        39        14         0         0         0
11:30:04      1164        35        14         0         0         2
11:40:04      1191        50        15         0         0         1
11:50:05      1182        40        14         0         0         0
12:00:05      1182        39        14         0         0         2
12:10:06      1182        39        14         0         0         3
12:20:07      1171        39        14         0         0         1
12:30:07      1179        40        15         0         0         0
12:40:07      1179        39        15         0         0         0
12:50:07      1200        45        17         0         0         1
13:00:07      1188        40        14         0         0         2

Nothing obvious I can see there ...

Terry

Re: NFS mount lockups since about a month ago

2021-10-03 Thread Terry Barnaby
The 45 second event happened at 2021-10-02T11:51:02 UTC. Not sure what 
sar time is based on (maybe local time, BST, rather than UTC, so it 
would be 2021-10-02T12:51:02 BST).


"sar -d" on the server:

11:50:02     dev8-0     4.67      0.01     46.62      0.00      9.99      0.12     14.03      5.75
11:50:02    dev8-32     0.01      0.00      0.00      0.00      0.25      0.00      1.62      0.00
11:50:02    dev8-16     4.85      5.46     46.62      0.00     10.74      0.13     14.25      5.92
11:50:02    dev8-48     0.01      0.00      0.00      0.00      0.25      0.00      1.75      0.00
11:50:02   dev252-0     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:00:02     dev8-0     5.61      0.06     99.39      0.00     17.74      0.15     15.41      6.17
12:00:02    dev8-32     0.01      0.00      0.00      0.00      0.30      0.00      1.60      0.00
12:00:02    dev8-16     5.76      4.48     99.39      0.00     18.03      0.14     14.77      6.26
12:00:02    dev8-48     0.01      0.00      0.00      0.00      0.30      0.00      1.60      0.00
12:00:02   dev252-0     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:10:02     dev8-0    11.41      0.07    139.77      0.00     12.25      0.20     12.15      6.26
12:10:02    dev8-32     0.04      0.02      0.00      0.00      0.37      0.00      1.12      0.01
12:10:02    dev8-16    11.69      3.72    139.77      0.00     12.28      0.19     10.74      6.50
12:10:02    dev8-48     0.04      0.02      0.00      0.00      0.37      0.00      1.12      0.01
12:10:02   dev252-0     0.12      0.00      0.47      0.00      4.00      0.00      0.00      0.00
12:20:02     dev8-0     8.69      0.51     84.42      0.00      9.77      0.18     14.21      6.15
12:20:02    dev8-32     0.01      0.00      0.00      0.00      0.25      0.00      1.62      0.00
12:20:02    dev8-16     8.94      5.27     84.42      0.00     10.04      0.15     10.27      6.33
12:20:02    dev8-48     0.01      0.00      0.00      0.00      0.25      0.00      1.62      0.00
12:20:02   dev252-0     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:30:02     dev8-0     5.44      0.08     95.99      0.00     17.65      0.13     13.68      5.91
12:30:02    dev8-32     0.01      0.00      0.00      0.00      0.30      0.00      1.80      0.00
12:30:02    dev8-16     5.60      5.17     95.99      0.00     18.05      0.14     13.75      6.17
12:30:02    dev8-48     0.01      0.00      0.00      0.00      0.30      0.00      1.80      0.00
12:30:02   dev252-0     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:40:02     dev8-0     4.62      0.04     88.43      0.00     19.14      0.12     13.70      5.48
12:40:02    dev8-32     0.01      0.00      0.00      0.00      0.30      0.00      1.60      0.00
12:40:02    dev8-16     4.73      4.15     88.43      0.00     19.57      0.12     14.01      5.70
12:40:02    dev8-48     0.01      0.00      0.00      0.00      0.30      0.00      1.60      0.00
12:40:02   dev252-0     0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:50:02     dev8-0     8.25      3.26    213.70      0.00     26.29      0.22     17.96      7.47
12:50:02    dev8-32     0.01      0.00      0.00      0.00      0.25      0.00      1.62      0.00
12:50:02    dev8-16     8.50      8.12    213.70      0.00     26.09      0.24     19.29      7.63
12:50:02    dev8-48     0.01      0.00      0.00      0.00      0.25      0.00      1.75      0.00
12:50:02   dev252-0     0.44      0.00      1.78      0.00      4.00      0.00      0.01      0.00
13:00:00     dev8-0    10.36      0.09    200.16      0.00     19.33      0.23     15.19      7.59
13:00:00    dev8-32     0.01      0.00      0.00      0.00      0.30      0.00      1.80      0.00
13:00:00    dev8-16    10.72      3.70    200.16      0.00     19.02      0.23     14.65      7.72
13:00:00    dev8-48     0.01      0.00      0.00      0.00      0.30      0.00      1.60      0.00
13:00:00   dev252-0     0.09      0.00      0.36      0.00      4.00      0.00      0.02      0.00


Other bits in next email ...

Terry
___
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure


Re: NFS mount lockups since about a month ago

2021-10-02 Thread Terry Barnaby
I am getting more sure this is an NFS/networking issue rather than an 
issue with disks in the server.


I created a small test program that, given a directory, finds a random 
file in a random directory three levels below, opens it, and reads up to 
a block (512 bytes) of data from it, timing how long it took to find the 
file (opendir/readdir) and read the block, and printing the results if 
the times are greater than previous ones (so recording the peak times). 
This is repeated every 10 seconds. The first figure is the average time 
to find a file (there may not be a file 3 levels down, so it repeats the 
search until it finds one that the user can access), the second is the 
time it took to find a file that existed (3 x opendir/readdir), and the 
last is how long it took to open, read and close the file.
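For what it is worth, the probe described above can be approximated in a few 
lines of shell. This is only a sketch of the idea, not the actual test 
program (which also tracks averages and peaks):

```shell
# Sketch of the latency probe described above (an approximation of the
# idea, not the original test program): pick a random readable file up
# to three directory levels below DIR, time the directory walk and the
# read of the first 512-byte block, and print both times plus the path.
probe_once() {
    dir="$1"
    t0=$(date +%s%N)
    # walk the tree (opendir/readdir under the hood) and pick a random file
    f=$(find "$dir" -maxdepth 3 -type f -readable 2>/dev/null | shuf -n 1)
    t1=$(date +%s%N)
    [ -n "$f" ] || return 1
    dd if="$f" of=/dev/null bs=512 count=1 2>/dev/null
    t2=$(date +%s%N)
    # output: find-time(s) read-time(s) path
    awk -v a="$t0" -v b="$t1" -v c="$t2" -v p="$f" \
        'BEGIN { printf "%.6f %.6f %s\n", (b - a) / 1e9, (c - b) / 1e9, p }'
}
```

Running it in a loop every 10 seconds and keeping the maxima gives peak 
figures roughly comparable to those printed by the real program.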


I set one of these processes running on the server starting at the /home 
dir and did the same on one of my clients that has /home NFS V4 mounted 
with defaults + async.


The server after 12 hours had peak timings of (file paths hidden):

2021-10-02T09:26:38 0.008858 0.043513 0.031735 /home/...
2021-10-02T09:26:58 0.005384 0.050870 0.039186 /home/...
2021-10-02T09:38:09 0.006684 0.081707 0.014616 /home/...
2021-10-02T10:18:42 0.037394 0.144025 0.012603 /home/...

The client had timings of:

2021-10-02T08:48:45   0.056195   0.110149  0.019353  /home/...
2021-10-02T09:06:31   0.098647   0.098647  0.015171  /home/...
2021-10-02T09:28:38   1.060605   0.001996  0.000422  /home/...
2021-10-02T09:31:28   4.896196   2.037488  0.000836  /home/...
2021-10-02T11:48:44   4.423502   7.087917  1.111684  /home/...
2021-10-02T11:51:02  27.711746  45.646627  0.021321  /home/...

So at one point the NFS-mounted client took 45 seconds to find a file 
(opendir/readdir 3 times), and earlier 7.08 seconds, with 1.1 seconds to 
read a block. The actual file it accessed is 46819 bytes long and can 
normally be accessed/copied quickly.

"sar -d" reported no issues.

"mountstats /home" reported no issues

"/var/log/messages" in both systems reported no issues.

Generally the desktop system has been responsive all day (no other users 
and nothing obvious going on on either server or client) and I have not 
noticed a "lockup" on the GUI I have been using (intermittently). No 
noticeable network errors, no noticeable hard disk read issues, but 
occasional very long NFS opendir/readdir times, which would match up with 
when I see the desktop lock up for around 30 secs or more.


Re: NFS mount lockups since about a month ago

2021-10-01 Thread Terry Barnaby

On 01/10/2021 19:05, Roger Heflin wrote:

it will show latency.  await is the average I/O time in ms, and %util is
calculated based on await and iops/sec.  As long as you turn sar down to
1-minute samples it should tell you which of the 2 disks had the higher
await/util%.  With a 10-minute sample the 40-sec pause may get spread
out across enough iops that you cannot see it.

If one disk pauses, that disk's utilization will be significantly higher
than the other disk, and if utilization is much higher for the same or
fewer IOPS that is generally a bad sign.  2 similar disks with similar
iops will generally have similar util.  The math is close to (iops *
await / 10) (returns percent).

Are you using MDraid or hardware raid?  Doing a "grep mddevice
/var/log/messages" will show if md forced a rewrite and/or had a slow read.

you can do this on those disks:
  smartctl -l scterc,20,20 /dev/

I believe 20 (2.0 seconds) is as low as a WD Red lets you go according
to my tests.  If the disk hangs it will hang for 2 seconds versus the
current default (it should be 7 seconds, and really depends on how
many bad blocks there are together that it tries to read).  Setting it
to 2 will make the overall timeout 3.5x smaller, so if that reduces the
hang time by about that much, that is a confirmation that it is a disk
issue.

and do this on the disks:
  smartctl --all /dev/sdb | grep -E '(Reallocated|Current_Pen|Offline Uncor)'


If any of those 3 is nonzero in the last column, that may be the
issue.  The SMART firmware will fail disks that are perfectly fine,
and it will fail to fail horribly bad disks.  The PASS/FAIL
absolutely cannot be trusted no matter what it says.  FAIL is more
often right, but PASS is often unreliable.

So if nonzero note the number, and next pause look again and see if
the numbers changed.


Thanks for the info, I am using MDraid. There are no "mddevice" messages 
in /var/log/messages and smartctl -a lists no errors on any of the 
disks. The disks are about 3 years old; I replace disks in servers when 
they are between 3 and 4 years old.


I will create a program to measure the effective sar output and detect 
any discrepancies, as this problem only occurs now and then, along with 
measuring I/O latency on NFS accesses on the clients, to see if I can 
track down whether it is a server disk issue or an NFS issue. Thanks 
again for the info.



Re: NFS mount lockups since about a month ago

2021-10-01 Thread Terry Barnaby

On 01/10/2021 13:31, D. Hugh Redelmeier wrote:

Trivial thoughts from reading this thread.  Please don't take the
triviality as an insult.

Perhaps the best way to determine if the problem is from a software update
is to downgrade likely packages.  In the case of the kernel, you can just
boot an older one (assuming that an old enough one is still installed --
fedora sure has a lot of package churn).

In case the HDDs are the problem, consider running S.M.A.R.T. drive
self-tests on them.  I know you said that smartctl reports no errors but
you didn't say whether you've run the drive self-tests.

Is the pause long enough for you to figure out what is hanging?  On either
side?  (I haven't used NFS for a couple of decades so I'm pretty rusty on
the tooling.)


It's probably getting too hard to downgrade the server and clients now; 
it's been more than a month and, as you say, Fedora updates are frequent!


I think I will write some programs to perform live tests and logging of 
things.



Re: NFS mount lockups since about a month ago

2021-10-01 Thread Terry Barnaby

On 30/09/2021 19:27, Roger Heflin wrote:

Raid0, so there is no redundancy on the data?

And what kind of underlying hard disks?   The desktop drives will try
for a long time (ie a minute or more) to read any bad blocks.  Those
disks will not report an error unless it gets to the default os
timeout, or it hits the disk firmware timeout.

The sar data will show if one of the disks is being slow on the server end.

On the client end you are unlikely to get anything useful from any
samples as it seems pretty likely the server is not responding to nfs
and/or the disks are not responding.

It could be as simple as on login it tries to read a badish/slow block
and that block takes a while to finally get it to read.   If that is
happening it will probably eventually stop being able to read it, and
if you really are using raid0 then some data will be lost.

All of the nfsv4 issues I have ran into involve it just breaking and
staying broke (usually when the server reboots).  I never had it have
big sudden pauses, but using v3 won't hurt and I try to avoid v4
still.

Sorry, I meant Raid1. They are WD Red WD30EFRX-68N32N0 disks; I have 
found them pretty good for 24/7 RAID usage on a few different systems 
and have had no issues like this until about a month ago. Unfortunately 
I don't think sar will show latency, only the amount of disk usage. Yes, 
NFS v4 does have issues with things like directory access performance 
over slow connections etc.


Re: NFS mount lockups since about a month ago

2021-09-30 Thread Terry Barnaby

On 30/09/2021 11:42, Roger Heflin wrote:
On mine when I first access the NFS volume it takes 5-10 seconds for 
the disks to spin up.  Mine will spin down later in the day if little 
or nothing is going on and I will get another delay.


I have also seen delays if a disk gets bad blocks and corrects them.  
About half of the time that does produce a message, but some of the time 
there are no messages at all about it, and I have had to resort to using 
sar to figure out which disk is causing the issue.


So on my machine I see this (sar -d):
05:29:01 AM DEV tps rkB/s wkB/s dkB/s areq-sz aqu-sz await %util
05:29:01 AM dev8-0 36.16 94.01 683.65 0.00 21.51 0.03 0.67 1.11
05:29:01 AM dev8-16 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00
05:29:01 AM dev8-32 0.02 0.00 0.00 0.00 0.00 0.00 1.00 0.00
05:29:01 AM dev8-48 423.65 71239.92 198.64 0.00 168.63 12.73 29.72 86.07
05:29:01 AM dev8-64 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00
05:29:01 AM dev8-80 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00
05:29:01 AM dev8-144 2071.22 71311.58 212.22 0.00 34.53 11.37 5.47 54.81
05:29:01 AM dev8-96 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00
05:29:01 AM dev8-128 1630.99 71389.49 198.18 0.00 43.89 15.72 9.62 57.05
05:29:01 AM dev8-112 2081.05 71426.01 182.48 0.00 34.41 11.32 5.42 55.68

There is a 4 disk raid6 check going on.

You will notice that dev8-48 is busier than the other 3 disks, in this 
case that is because it is a 3TB disk vs the other 3 being all newer 
6tb disks with higher data/revolution.


If you have sar setup with 60 second samples the one disk that pauses 
should stand out more obvious than this since the 3tb seems to be only 
marginally faster than the 6tbs.





In my case the server's /home is on a partition of the two main Raid0 
disks that is shared with the OS, so they are active most of the time. 
No errors reported.


I will try setting up sar with a 60 second sample time on the client, 
thanks for the idea.




Re: NFS mount lockups since about a month ago

2021-09-30 Thread Terry Barnaby

On 30/09/2021 11:32, Ed Greshko wrote:

On 30/09/2021 16:35, Terry Barnaby wrote:
This is a very lightly loaded system with just 3 users ATM and very 
little going on across the network (just editing code files etc). The 
problem occurred again yesterday. For about 10 minutes my KDE desktop 
locked up in 20 second bursts and then the problem went away for the 
rest of the day. During that time the desktop and server were idle 
for 98.5% and pings continued fine. A kconsole window doing an "ls 
/home" every 5 seconds was locked up doing the ls. I had kconsole 
windows open doing the pings, top's and ls'es and although I couldn't 
operate the desktop (move virtual desktops etc) the ping and top 
windows were updating fine. No error messages in /var/log/messages on 
both systems and the sar stats showed nothing out of the ordinary.


I am pretty sure the Ethernet network is fine including cables, 
switches Ethernet adapters etc. Pings are fine etc. It just appears 
that the client programs get a huge (> 20 secs) delayed response to 
accesses to /home every now and then which points to NFS issues. Most 
of the system stats counters just give the amount of access, not the 
latency of an access which is what I need to track down the problem 
as there are few disk and network accesses going on.


As I said, all has been fine on this system until about a month ago 
and the only obvious changes are the Fedora updates, so I wondered if 
anyone knew whether there have been changes to the NFS stack recently 
and/or how to log peak NFS latencies?


First of all, pings are at the hardware level and pretty much useless 
for doing anything other than confirming connectivity.

How are the mounts achieved.  Hard mounts, soft mounts, what version 
are you using for mounts?


I use systemd automounts for home directories and have:

Options=rw,soft,fg,x-systemd.mount-timeout=30,v4.2
Type=nfs4

I have not seen any issues, but all the systems are VMs. When faced with 
this type of problem, even though I swear there is nothing wrong with my 
physical setup, I do tend to reseat cables and switch things around to 
see if something changes.

--

Yes, the pings are to determine that the network interface chips, cables 
and switches are basically working, which they are with no obvious issues.


Mounts are normal fstab with "king.kingnet:/home /home nfs 
defaults,async 0 0", so defaults apart from async, and with Ethernet 
interfaces set to the default 1500 MTU etc. So it is using the default 
NFSv4; I think I might try forcing NFSv3 to see if that changes anything.
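For reference, forcing v3 would just be a mount-option change to the same 
fstab entry, something like the following (same server and export names 
assumed):

```
king.kingnet:/home  /home  nfs  defaults,async,vers=3  0 0
```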


Yes, problems often occur due to you having done something, but I am 
pretty sure nothing has changed apart from Fedora updates.




Re: NFS mount lockups since about a month ago

2021-09-30 Thread Terry Barnaby

Thanks for the feedback everyone.

This is a very lightly loaded system with just 3 users ATM and very 
little going on across the network (just editing code files etc). The 
problem occurred again yesterday. For about 10 minutes my KDE desktop 
locked up in 20 second bursts and then the problem went away for the 
rest of the day. During that time the desktop and server were idle for 
98.5% and pings continued fine. A kconsole window doing an "ls /home" 
every 5 seconds was locked up doing the ls. I had kconsole windows open 
doing the pings, top's and ls'es and although I couldn't operate the 
desktop (move virtual desktops etc) the ping and top windows were 
updating fine. No error messages in /var/log/messages on both systems 
and the sar stats showed nothing out of the ordinary.


I am pretty sure the Ethernet network is fine including cables, switches 
Ethernet adapters etc. Pings are fine etc. It just appears that the 
client programs get a huge (> 20 secs) delayed response to accesses to 
/home every now and then which points to NFS issues. Most of the system 
stats counters just give the amount of access, not the latency of an 
access which is what I need to track down the problem as there are few 
disk and network accesses going on.


As I said, all has been fine on this system until about a month ago and 
the only obvious changes are the Fedora updates, so I wondered if anyone 
knew whether there have been changes to the NFS stack recently and/or how 
to log peak NFS latencies?


Terry
On 26/09/2021 18:06, Roger Heflin wrote:

Make sure you have sar/sysstat enabled and changed to do 1 minute samples.

sar -d will show disk perf.  If one of the disks "blips" at the
firmware level (working on a hard to read block maybe), the util% on
that device will be significantly higher than all other disks so will
stand out.  Then you can look deeper at the smart data.

sar generically will show your cpu/system time and sar -n DEV will
show detailed network traffic, sar -n EDEV will show network errors.

With it set to 1 minute you should be able to detect most blips.
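On newer Fedora releases sysstat samples via a systemd timer rather than 
cron, so switching to 1-minute samples would be a drop-in override along 
these lines (this assumes the timer-based setup; older releases used 
/etc/cron.d/sysstat instead):

```
# systemctl edit sysstat-collect.timer
[Timer]
OnCalendar=
OnCalendar=*:00/1
```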

On Sun, Sep 26, 2021 at 10:26 AM Jamie Fargen  wrote:

Are there network switches under your control? It sounds similar to what 
happens when the MTU on the systems does not match, or one system's MTU is 
set above the value on the switch ports.

Next time the issue occurs use ping with the do-not-fragment flag, e.g.:
$ ping -M do -s 8972 ip.address

This example should be the highest value that works in the case of an MTU 
of 9000; there is a 28-byte overhead for IPv4 packets.
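The 28 bytes here are the 20-byte IPv4 header plus the 8-byte ICMP header, 
so the largest unfragmented payload for any MTU can be computed the same 
way (a small illustrative helper, not from the original mail):

```shell
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# the interface MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
max_ping_payload() {
    echo $(( $1 - 20 - 8 ))
}
```

For a standard 1500-byte MTU this gives 1472, i.e. 
"ping -M do -s 1472 ip.address".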

Second, are you sure no one is attaching to the network and duplicating the 
MAC address of your NFS server, or perhaps of the system that is stalled? If 
the switches are manageable you would have to ensure that the MAC addresses 
are being learned on the correct ports.

-Jamie


On Sun, Sep 26, 2021 at 10:24 AM Tom Horsley  wrote:

On Sun, 26 Sep 2021 10:26:19 -0300
George N. White III wrote:


If you have cron jobs that use a lot of network bandwidth it may work
fine until some network issue causing lots of retransmits bogs it down.

Which is why you should check the dumb stuff first! Has a critter
chewed on the ethernet cable to the server?





Re: NFS mount lockups since about a month ago

2021-09-25 Thread Terry Barnaby

On 25/09/2021 09:00, Ed Greshko wrote:

On 25/09/2021 14:07, Terry Barnaby wrote:



A few questions.

1.  Are you saying your NFS server HW is the same for the past 25 
years.  Couldn't have been all Fedora, right?


No ( :) ), I was using previous Linux and Unix systems before then. 
Certainly OS versions and hardware have changed over the years, but the 
setup is the same and no hardware has changed in the last couple of 
years; certainly there have been no hardware/software changes in the 
last couple of months, when the problem started to occur, apart from 
Fedora33 updates.


OK.  Kinda sounded like the server HW was "old".





2.  How many clients?  Connected on a single or multiple switches?


5 clients, two switches, but clients on the single switch to the 
server have the issue as well as others. Pings still operate in 
locked up condition.


Since all clients are being affected in the same manner it would point 
more towards a server issue, as you've already concluded.





3.  Do the lockups happen during a given time of day, or random?


They appear to be random although they appear more frequently when 
first logging in (more /home accesses then ?).


Not that it matters, but everyone isn't logging in at the same time 
correct?  At login folks are getting lock ups most

frequently.





4.  Have you checked for possible disk errors on the server?
No disk related error messages and RAID file systems show as ok. 
Smartctl shows no issue on disks. 


Are you running sysstat and collecting system information?  You may 
want to consider doing that to see, for example, if "sar -n NFS" or 
"sar -n NFSD" show anything unusual, like excessive retransmission.



Thanks for the info. Yes, sysstat is running; I will try "sar -n NFS" 
and "sar -n NFSD", as well as "mountstats /home" which I have found, 
after a lockup has occurred. Although sysstat does not seem to list max 
latency, which would be the pointer to look for. Actually I was thinking 
it may be the clients rather than the server, as normally there are "NFS 
server not responding" messages on the clients if the server is down for 
some reason, but obviously it could be either.


Random login times and it occurs during the day as well.



Re: NFS mount lockups since about a month ago

2021-09-25 Thread Terry Barnaby

On 25/09/2021 06:42, Ed Greshko wrote:

On 25/09/2021 13:04, Terry Barnaby wrote:

Hi,

I use NFS-mounted (defaults, so v4) /home directories with a simple 
server over Gigabit Ethernet, all running Fedora33. This has been 
working fine for 25+ years through various Fedora versions. However 
in the last month or so all of the client computers are getting KDE 
GUI lockups every few hours that last for around 40 secs. /home is 
not accessible during this time and it feels/looks to be an NFS 
lockup issue. There are no "NFS server not responding" or similar 
messages in either the server's or clients' /var/log/messages and the 
network communications seem fine.


1. Have there been some changes to NFS recently in the kernel ?

2. Any idea where to begin to try and debug this ?






Thanks for the reply:


A few questions.

1.  Are you saying your NFS server HW is the same for the past 25 
years.  Couldn't have been all Fedora, right?


No ( :) ), I was using previous Linux and Unix systems before then. 
Certainly OS versions and hardware have changed over the years, but the 
setup is the same and no hardware has changed in the last couple of years; 
certainly there have been no hardware/software changes in the last couple 
of months, when the problem started to occur, apart from Fedora33 updates.




2.  How many clients?  Connected on a single or multiple switches?


5 clients, two switches, but clients on the single switch to the server 
have the issue as well as others. Pings still operate in locked up 
condition.




3.  Do the lockups happen during a given time of day, or random?


They appear to be random although they appear more frequently when first 
logging in (more /home accesses then ?).




4.  Have you checked for possible disk errors on the server?
No disk related error messages and RAID file systems show as ok. 
Smartctl shows no issue on disks.



NFS mount lockups since about a month ago

2021-09-24 Thread Terry Barnaby

Hi,

I use NFS-mounted (defaults, so v4) /home directories with a simple server 
over Gigabit Ethernet, all running Fedora33. This has been working fine 
for 25+ years through various Fedora versions. However in the last month 
or so all of the client computers are getting KDE GUI lockups every few 
hours that last for around 40 secs. /home is not accessible during this 
time and it feels/looks to be an NFS lockup issue. There are no "NFS 
server not responding" or similar messages in either the server's or 
clients' /var/log/messages and the network communications seem fine.


1. Have there been some changes to NFS recently in the kernel ?

2. Any idea where to begin to try and debug this ?

Terry


Dhcpd server usage with USB RNDIS interface that goes up and down

2021-08-19 Thread Terry Barnaby
I have an AM335x based IOT system where the CPU can initially boot over 
its USB interface using RNDIS and TCP/IP, DHCP and TFTP. I need to run a 
DHCP server on a Fedora33 host to support the DHCP requests from this 
CPU. In this scenario the CPU brings its USB RNDIS interface up and down 
about 3 times.


On an old Fedora system (probably Fedora 23) this worked fine; however I 
am trying to update our boot system to run on Fedora33 and am having 
problems with the DHCP server. It has a basic /etc/dhcp/dhcpd.conf file 
that I have used before, which dishes out IP addresses for the subnet 
10.0.0.0 netmask 255.255.255.0 (as below). I have configured a usb0 
interface with NetworkManager that uses a static IP address of 10.0.0.1.


I am having two problems:

1. If you try to start the DHCP server when the usb0 interface is down 
(not connected, or the CPU not driving an RNDIS interface), then the 
dhcpd server will not start, giving the error message "Not configured to 
listen on any interfaces!".


2. If the usb0 interface is up, the DHCP server runs and serves an IP 
address to the AM335x fine, but the AM335x then closes and reopens the 
usb0 interface, and I see the error message "receive_packet failed on 
usb0: Network is down"; the DHCP server no longer replies to requests on 
the newly come-up usb0 interface.


I am guessing that at some point the DHCP server was changed to listen 
on specific network interfaces only, rather than on a standard host 
socket listening on all networks, and that it does not try to reconnect 
to a network interface once that interface has gone down.


Does anyone know of a configuration option to get the dhcp server to 
listen on all network interfaces in a general way or to retry its 
network connection when the interface goes down and up ?


Or do I have to get NetworkManager to stop/start the DHCP server for 
this particular usb0 network interface (yuck!) ?
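Failing a dhcpd option for this, one way to do the NetworkManager 
stop/start without it being too ugly is a dispatcher script that restarts 
dhcpd whenever usb0 comes up. This is only a sketch: the service name 
dhcpd.service and the script path are assumptions, and the SYSTEMCTL 
override exists purely so the logic can be exercised without systemd.

```shell
# Sketch of /etc/NetworkManager/dispatcher.d/90-dhcpd-usb0 (assumed path
# and service name). NetworkManager invokes dispatcher scripts with the
# interface name and the action ("up", "down", ...) as arguments.
handle_event() {
    iface="$1" action="$2"
    [ "$iface" = "usb0" ] || return 0
    case "$action" in
        # restart dhcpd so it rebinds to the freshly created interface
        up)   ${SYSTEMCTL:-systemctl} restart dhcpd.service ;;
        # nothing to do on "down"; dhcpd errors until the next "up" restart
        down) : ;;
    esac
}

handle_event "$@"
```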


Terry



/etc/dhcp/dhcpd.conf

subnet 10.0.0.0 netmask 255.255.255.0 {
    default-lease-time      1209600;     # Two weeks
    max-lease-time          31557600;    # One year
    range                   dynamic-bootp 10.0.0.128 10.0.0.250;
    option subnet-mask          255.255.255.0;
    option domain-name          "usbnet";
    option nis-domain           "usbnet";
    option nis-servers          10.0.0.1;
    option domain-name-servers  10.0.0.1;
    option ntp-servers          10.0.0.1;
    next-server             10.0.0.1;
    use-host-decl-names on;
    allow bootp;
    allow booting;

    if substring (option vendor-class-identifier, 0, 10) = "AM335x ROM" {
        filename "vlim/u-boot-spl.bin";
    }
    elsif substring (option vendor-class-identifier, 0, 10) = "DM814x ROM" {
        filename "vlim/u-boot-spl.bin";
    }
    elsif substring (option vendor-class-identifier, 0, 17) = "AM335x U-Boot SPL" {
        filename "vlim/u-boot.img";
    }
    else {
        filename "uImage";
    }
}


Re: Cron sometimes starts two jobs from the same crontab entry

2020-04-09 Thread Terry Barnaby

On 09/04/2020 12:24, Ed Greshko wrote:

On 2020-04-09 17:57, Terry Barnaby wrote:

The script already has date/time and the log shows (the Wed 8th entries 
having the PID):

Bbackup / /usr/beam /home /src /srcOld /dist /opt /scratch /data/svn /data/www 
/data/vwt /data/www /data1/kvm /data/database /data/vwt /data/backup at Tue  7 
Apr 23:01:01 BST 2020
Bbackup / /usr/beam /home /src /srcOld /dist /opt /scratch /data/svn /data/www 
/data/vwt /data/www /data1/kvm /data/database /data/vwt /data/backup at Tue  7 
Apr 23:01:01 BST 2020
Backup aborted as locked by another backup

Just wondering, if you look at the journal output for that date and time do you 
see it being started twice?

For example, I have 2 jobs that start at 19:00 each day and for each time they 
are started I'd see...

Apr 09 19:00:01 meimei.greshko.com CROND[76409]: (egreshko) CMD 
(/home/egreshko/bin/getpics-vis)
Apr 09 19:00:01 meimei.greshko.com CROND[76411]: (egreshko) CMD 
(/home/egreshko/bin/getpics-vis2)

Are you getting 2 starts of the same script recorded?




No, the cron log in /var/log/cron only shows one start for the backup log below:

Bbackup 805476 / /usr/beam /home /src /srcOld /dist /opt /scratch 
/data/svn /data/www /data/vwt /data/www /data1/kvm /data/database 
/data/vwt /data/backup at Wed  8 Apr 23:01:01 BST 2020
Bbackup 27200 / /usr/beam /home /src /srcOld /dist /opt /scratch 
/data/svn /data/www /data/vwt /data/www /data1/kvm /data/database 
/data/vwt /data/backup at Wed  8 Apr 23:01:01 BST 2020

Backup aborted as locked by another backup
Bbackup completed status: 0 at Thu  9 Apr 01:29:24 BST 2020

#/var/log/cron

Apr  8 22:01:01 beam CROND[804585]: (root) CMD (/src/bbackup/bbackup-test1)
Apr  8 22:01:01 beam CROND[804586]: (root) CMD (run-parts /etc/cron.hourly)
Apr  8 22:01:01 beam CROND[804583]: (root) CMDOUT (Starting Bbackup-test1)
Apr  8 22:01:01 beam run-parts[804586]: (/etc/cron.hourly) starting 0anacron
Apr  8 22:01:01 beam run-parts[804586]: (/etc/cron.hourly) finished 0anacron
Apr  8 23:01:01 beam CROND[805474]: (root) CMD (/src/bbackup/bbackup-test1)
Apr  8 23:01:01 beam CROND[805475]: (root) CMD (run-parts /etc/cron.hourly)
Apr  8 23:01:01 beam CROND[805476]: (root) CMD (/src/bbackup/bbackup-beam)
Apr  8 23:01:01 beam CROND[805471]: (root) CMDOUT (Starting Bbackup-test1)
Apr  8 23:01:01 beam run-parts[805475]: (/etc/cron.hourly) starting 0anacron
Apr  8 23:01:01 beam CROND[805473]: (root) CMDOUT (Waking up 
70:85:c2:0f:68:07...)

Apr  8 23:01:01 beam run-parts[805475]: (/etc/cron.hourly) finished 0anacron
Apr  8 23:01:21 beam CROND[805473]: (root) CMDOUT (Starting Backup)
Apr  9 00:01:01 beam CROND[806518]: (root) CMD (run-parts /etc/cron.hourly)
Apr  9 00:01:01 beam CROND[806519]: (root) CMD (/src/bbackup/bbackup-test1)

Strange low PID for second bbackup-beam script run ...

Ahggh, found it! There was a VM running on our network, an older 
development Fedora platform based on an image of our old server, and it 
still had its backup system running as the cron table entry was still there!


Sorry for all the noise.


Re: Cron sometimes starts two jobs from the same crontab entry

2020-04-09 Thread Terry Barnaby

On 09/04/2020 07:00, francis.montag...@inria.fr wrote:

Hi

On Tue, 07 Apr 2020 07:07:36 +0100 Terry Barnaby wrote:


# Min Hour Day Month WeekDay
# Perform incremental backup to every work day
01 23 * * 1 root /src/bbackup/bbackup-beam
01 23 * * 2 root /src/bbackup/bbackup-beam
01 23 * * 3 root /src/bbackup/bbackup-beam
01 23 * * 4 root /src/bbackup/bbackup-beam
01 23 * * 5 root /src/bbackup/bbackup-beam
This system has been in use for 10 years or more on various Fedora
versions. However about 18 months ago I have seen a problem where cron
will start two backups with identical start times occasionally.

I have seen that also a few time, but years ago.


I have had to add a file lock system in the bbackup-beam to cope with this.

I did the same, also for frequent cron jobs that may be stuck for a
too long time, for example if a network outage occurs.

Then (years later) an alternative to cron appeared: systemd.timer.

Pros:
- Execution locking for free
  "systemctl start X" is a no-op if X is already running.

- Easier log management
  Output goes to the system log by default.
  No more need to redirect stdout and stderr to /dev/null as seen
  in so many crontabs.

- Easier process tracking
  With "systemctl status", "systemctl stop", ...

Cons:
- It's systemd-ish :-)

I can show you how to convert your crontab to systemd.{service,timer}
if you want.
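As an illustration of such a conversion, the five crontab lines quoted in this thread could become a service/timer pair. This is a hedged sketch: the unit names are assumptions, not configuration actually posted here.

```ini
# /etc/systemd/system/bbackup.service  (unit names are assumptions)
[Unit]
Description=Nightly incremental backup

[Service]
Type=oneshot
ExecStart=/src/bbackup/bbackup-beam

# /etc/systemd/system/bbackup.timer
[Unit]
Description=Run the nightly incremental backup on weekdays

[Timer]
# Mon-Fri at 23:01, matching the "01 23 * * 1-5" crontab schedule
OnCalendar=Mon..Fri 23:01
Persistent=true

[Install]
WantedBy=timers.target
```

Enabling with "systemctl enable --now bbackup.timer" replaces the five crontab lines; Persistent=true also catches a run missed while the machine was down.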

Many thanks. I do use systemd for some things, especially daemon 
management, but this is an old (and was reliable) system and it is also used 
on some much older systems. The cron config is very simple and 
understandable, and its configuration files are not lost in systemd's 
noise. I think I will stick with cron (while it is still available!) for 
now. As stated, it functions with the lock; I was just feeding back 
the issue in case there is actually a problem in the system here.



Re: Cron sometimes starts two jobs from the same crontab entry

2020-04-09 Thread Terry Barnaby

On 08/04/2020 23:11, Cameron Simpson wrote:

On 08Apr2020 14:54, Terry Barnaby  wrote:
Note this has happened a few times this year (approx 1 in 64 runs), so 
it is not related to DST changes anyway. It might be due to chrony clock 
resyncs I suppose, but I think something stranger is going on here, or 
something silly and obvious in what I am doing.


I will have a more detailed look at the system and my backup shell 
script and create a simple test cron job to see if it shows the same 
thing ...


How about sticking:

 echo "`date` [`date -u`]: start $0 $*" >>some_log_file

at the start. Makes sure the job is actually starting at the cron 
time.  Might tell you more about the system time situation.


Cheers,
Cameron Simpson 

The script already has date/time and the log shows (the Wed 8th entries 
having the PID):


Bbackup / /usr/beam /home /src /srcOld /dist /opt /scratch /data/svn 
/data/www /data/vwt /data/www /data1/kvm /data/database /data/vwt 
/data/backup at Tue  7 Apr 23:01:01 BST 2020
Bbackup / /usr/beam /home /src /srcOld /dist /opt /scratch /data/svn 
/data/www /data/vwt /data/www /data1/kvm /data/database /data/vwt 
/data/backup at Tue  7 Apr 23:01:01 BST 2020

Backup aborted as locked by another backup
Bbackup completed status: 0 at Wed  8 Apr 00:58:57 BST 2020
Bbackup 805476 / /usr/beam /home /src /srcOld /dist /opt /scratch 
/data/svn /data/www /data/vwt /data/www /data1/kvm /data/database 
/data/vwt /data/backup at Wed  8 Apr 23:01:01 BST 2020
Bbackup 27200 / /usr/beam /home /src /srcOld /dist /opt /scratch 
/data/svn /data/www /data/vwt /data/www /data1/kvm /data/database 
/data/vwt /data/backup at Wed  8 Apr 23:01:01 BST 2020

Backup aborted as locked by another backup
Bbackup completed status: 0 at Thu  9 Apr 01:29:24 BST 2020

I will change it to log times down to the microsecond.

Terry


Re: Cron sometimes starts two jobs from the same crontab entry

2020-04-08 Thread Terry Barnaby

On 08/04/2020 14:37, Patrick O'Callaghan wrote:

On Wed, 2020-04-08 at 20:28 +0800, Ed Greshko wrote:

I don't know how it works where you live, but in the UK the autumn
switch occurs at 2am, which reverts back to being 1am. Thus a time such
as 1:30am occurs twice, and events programmed for that time can be
triggered twice.


BUT, we are specifically addressing the OP's situation and the time in his 
crontab ISN'T within the
time frame of when BST (the OP is in the UK) starts/ends.

Yes, that's clear. I was simply answering the general question (since
you said "to/from DST"), even though it doesn't apparently affect cron
as such, as documented in the manual.

poc


Note this has happened a few times this year (approx 1 in 64 runs), so 
it is not related to DST changes anyway. It might be due to chrony clock 
resyncs I suppose, but I think something stranger is going on here, or 
something silly and obvious in what I am doing.


I will have a more detailed look at the system and my backup shell 
script and create a simple test cron job to see if it shows the same 
thing ...


Terry


Re: Cron sometimes starts two jobs from the same crontab entry

2020-04-07 Thread Terry Barnaby

On 07/04/2020 13:06, Iosif Fettich wrote:

Hi Terry,


Yes, there is nothing unusual in /var/log/cron:

Apr  6 22:01:01 beam CROND[651585]: (root) CMD (run-parts 
/etc/cron.hourly)

[...]


/var/log/messages

Feb 24 23:00:03 beam dhcpd[1743]: DHCPREQUEST for 192.168.201.214 
from 00:25:b3:e6:a9:18 via enp4s0

[...]


In the backup log:

Bbackup / /usr/beam /home /src /srcOld /dist /opt /scratch /data/svn 
/data/www /data/vwt /data/www /data1/kvm /data/database /data/vwt 
/data/backup at Mon  6 Apr 23:01:01 BST 2020

[...]

Maybe irrelevant, but just mentioning: the log extract from 
/var/log/messages is dated Feb 24, whereas the other two are dated 
Apr  6.


Best regards,

Iosif

Yes, missed that. I searched back for 23:01:00 and missed the fact that 
it was the wrong month. There was even less of interest in 
/var/log/messages around Apr  6 23:01:00, just some DHCP requests.


Terry


Re: Cron sometimes starts two jobs from the same crontab entry

2020-04-07 Thread Terry Barnaby

On 07/04/2020 09:03, Samuel Sieb wrote:

On 4/6/20 11:07 PM, Terry Barnaby wrote:
This system has been in use for 10 years or more on various Fedora 
versions. However about 18 months ago I have seen a problem where 
cron will start two backups with identical start times occasionally.


I have had to add a file lock system in the bbackup-beam to cope with 
this.


Any idea on why cron might start the same job off twice at the same 
time? Could there be a time change issue with chronyd?


Have you checked the logs to see when the jobs were started and if 
there are any relevant messages?



Yes, there is nothing unusual in /var/log/cron:

Apr  6 22:01:01 beam CROND[651585]: (root) CMD (run-parts /etc/cron.hourly)
Apr  6 22:01:01 beam run-parts[651585]: (/etc/cron.hourly) starting 0anacron
Apr  6 22:01:01 beam run-parts[651585]: (/etc/cron.hourly) finished 0anacron
Apr  6 23:01:01 beam CROND[652722]: (root) CMD (/src/bbackup/bbackup-beam)
Apr  6 23:01:01 beam CROND[652721]: (root) CMD (run-parts /etc/cron.hourly)
Apr  6 23:01:01 beam run-parts[652721]: (/etc/cron.hourly) starting 0anacron
Apr  6 23:01:01 beam CROND[652720]: (root) CMDOUT (Waking up 
70:85:c2:0f:68:07...)

Apr  6 23:01:01 beam run-parts[652721]: (/etc/cron.hourly) finished 0anacron
Apr  6 23:01:21 beam CROND[652720]: (root) CMDOUT (Starting Backup)

I do note that /etc/cron.hourly is performed at a minute past the hour 
as well though ...


/var/log/messages

Feb 24 23:00:03 beam dhcpd[1743]: DHCPREQUEST for 192.168.201.214 from 
00:25:b3:e6:a9:18 via enp4s0
Feb 24 23:00:03 beam dhcpd[1743]: DHCPACK on 192.168.201.214 to 
00:25:b3:e6:a9:18 via enp4s0

Feb 24 23:00:05 beam systemd[1]: Starting system activity accounting tool...
Feb 24 23:00:05 beam systemd[1]: sysstat-collect.service: Succeeded.
Feb 24 23:00:05 beam systemd[1]: Started system activity accounting tool.
Feb 24 23:00:05 beam audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 
ses=4294967295 msg='unit=sysstat-collect comm="systemd" 
exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Feb 24 23:00:05 beam audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 
ses=4294967295 msg='unit=sysstat-collect comm="systemd" 
exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Feb 24 23:00:41 beam dhcpd[1743]: DHCPREQUEST for 192.168.201.241 from 
70:85:c2:0f:68:07 via enp4s0
Feb 24 23:00:41 beam dhcpd[1743]: DHCPACK on 192.168.201.241 to 
70:85:c2:0f:68:07 via enp4s0
Feb 24 23:00:59 beam dhcpd[1743]: DHCPREQUEST for 192.168.201.26 from 
84:1b:5e:70:2a:c9 via enp4s0
Feb 24 23:00:59 beam dhcpd[1743]: DHCPACK on 192.168.201.26 to 
84:1b:5e:70:2a:c9 via enp4s0
Feb 24 23:01:01 beam dhcpd[1743]: DHCPREQUEST for 192.168.201.250 from 
4c:60:de:e4:29:2d via enp4s0
Feb 24 23:01:01 beam dhcpd[1743]: DHCPACK on 192.168.201.250 to 
4c:60:de:e4:29:2d via enp4s0
Feb 24 23:01:20 beam dhcpd[1743]: DHCPREQUEST for 192.168.201.28 from 
00:50:c2:e2:f0:db via enp4s0
Feb 24 23:01:20 beam dhcpd[1743]: DHCPACK on 192.168.201.28 to 
00:50:c2:e2:f0:db via enp4s0
Feb 24 23:01:32 beam dhcpd[1743]: DHCPREQUEST for 192.168.201.8 from 
00:d8:61:a0:a4:06 via enp4s0
Feb 24 23:01:32 beam dhcpd[1743]: DHCPACK on 192.168.201.8 to 
00:d8:61:a0:a4:06 via enp4s0
Feb 24 23:02:21 beam dhcpd[1743]: DHCPREQUEST for 192.168.201.214 from 
00:25:b3:e6:a9:18 via enp4s0
Feb 24 23:02:21 beam dhcpd[1743]: DHCPACK on 192.168.201.214 to 
00:25:b3:e6:a9:18 via enp4s0
Feb 24 23:03:10 beam dhcpd[1743]: DHCPREQUEST for 192.168.201.241 from 
70:85:c2:0f:68:07 via enp4s0


In the backup log:

Bbackup / /usr/beam /home /src /srcOld /dist /opt /scratch /data/svn 
/data/www /data/vwt /data/www /data1/kvm /data/database /data/vwt 
/data/backup at Mon  6 Apr 23:01:01 BST 2020
Bbackup / /usr/beam /home /src /srcOld /dist /opt /scratch /data/svn 
/data/www /data/vwt /data/www /data1/kvm /data/database /data/vwt 
/data/backup at Mon  6 Apr 23:01:01 BST 2020

Backup aborted as locked by another backup
Bbackup completed status: 0 at Tue  7 Apr 11:12:00 BST 2020

The bbackup-beam shell script is pretty basic and I can't see how this 
could have an issue like this.


The same system has been running since 2009 without any issues. This 
problem started happening around Fedora27 or perhaps Fedora29 I think.


Terry


Re: Cron sometimes starts two jobs from the same crontab entry

2020-04-07 Thread Terry Barnaby

On 07/04/2020 09:25, Iosif Fettich wrote:

Hi,


On 2020-04-07 14:07, Terry Barnaby wrote:
I have a simple backup system that starts off a backup once per 
night during the weekdays. There is a crontab file in /etc/cron.d 
with the following entries:


 


# Beam Bbackup cron setup   Backup to ...
 


#
# Min Hour Day Month WeekDay

# Perform incremental backup to every work day
01 23 * * 1 root /src/bbackup/bbackup-beam
01 23 * * 2 root /src/bbackup/bbackup-beam
01 23 * * 3 root /src/bbackup/bbackup-beam
01 23 * * 4 root /src/bbackup/bbackup-beam
01 23 * * 5 root /src/bbackup/bbackup-beam

This system has been in use for 10 years or more on various Fedora 
versions. However about 18 months ago I have seen a problem where 
cron will start two backups with identical start times occasionally.


I have had to add a file lock system in the bbackup-beam to cope 
with this.


Any idea on why cron might start the same job off twice at the same 
time? Could there be a time change issue with chronyd?


Are you sure that the jobs you saw started *twice*...?

If one of the jobs would not finish till next launch time, maybe just 
hanging around due to some error, you'll end up with multiple jobs out 
there, despite the fact that they started days apart.


Another possible issue might be the daylight saving time. If you're 
using a non-UTC timezone, there are datetimes that repeat or that are 
missing. 23:01 seems not to be among them, but what do I know...


Same thing might happen if there are periodic bigger time 
adjustments (NTP, manual, ...).


If it happens more than once that the system time is 23:01, launching 
the script each time is not only legit, but actually mandatory.


Best regards,

Iosif Fettich

I am pretty sure two jobs are started at the same time. The backup 
script that is run writes some logging text to a log file with the 
datetime it started. I see two entries with the same datetime, and 
without the file lock the backups get messed up due to two operating at 
once. This "fault" has happened about 4 times in the last year.


I have assumed the system time is always UTC-synchronised using chronyd. 
The server's user code is running under the GMT timezone. I was wondering 
if the tweaking of the time by chronyd could cause this issue, but I 
would have thought this situation would have been handled by crond if 
it could happen, and I have seen this issue about 4 times in a year, so 
it is not a totally sporadic event.


Terry



Re: Cron sometimes starts two jobs from the same crontab entry

2020-04-07 Thread Terry Barnaby

On 07/04/2020 08:21, Ed Greshko wrote:

On 2020-04-07 14:07, Terry Barnaby wrote:

I have a simple backup system that starts off a backup once per night during 
the weekdays. There is a crontab file in /etc/cron.d with the following entries:


# Beam Bbackup cron setup   Backup to ...

#
# Min Hour Day Month WeekDay

# Perform incremental backup to every work day
01 23 * * 1 root /src/bbackup/bbackup-beam
01 23 * * 2 root /src/bbackup/bbackup-beam
01 23 * * 3 root /src/bbackup/bbackup-beam
01 23 * * 4 root /src/bbackup/bbackup-beam
01 23 * * 5 root /src/bbackup/bbackup-beam

This system has been in use for 10 years or more on various Fedora versions. 
However about 18 months ago I have seen a problem where cron will start two 
backups with identical start times occasionally.

I have had to add a file lock system in the bbackup-beam to cope with this.

Any idea on why cron might start the same job off twice at the same time? 
Could there be a time change issue with chronyd?


I don't have an answer to your question.  But wonder if using just

01 23 * * 1-5 root /src/bbackup/bbackup-beam

would help avoid the problem.

I assume the system is up 24/7?




Yes, server is up 24/7.

Well I could try that, but I already have a workaround.

The reason for asking is to find out if I am doing something wrong, or if 
this is a known "feature" or a bug in Fedora31. It could be a nasty 
little bug for some.



Cron sometimes starts two jobs from the same crontab entry

2020-04-07 Thread Terry Barnaby
I have a simple backup system that starts off a backup once per night 
during the weekdays. There is a crontab file in /etc/cron.d with the 
following entries:



# Beam Bbackup cron setup   Backup to ...

#
# Min Hour Day Month WeekDay

# Perform incremental backup to every work day
01 23 * * 1 root /src/bbackup/bbackup-beam
01 23 * * 2 root /src/bbackup/bbackup-beam
01 23 * * 3 root /src/bbackup/bbackup-beam
01 23 * * 4 root /src/bbackup/bbackup-beam
01 23 * * 5 root /src/bbackup/bbackup-beam

This system has been in use for 10 years or more on various Fedora 
versions. However about 18 months ago I have seen a problem where cron 
will start two backups with identical start times occasionally.


I have had to add a file lock system in the bbackup-beam to cope with this.
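Such a file lock can be implemented with flock(1). The following is a minimal sketch: the lock file path and messages are assumptions, not the actual bbackup-beam code.

```shell
#!/bin/sh
# Minimal sketch of serializing a cron job with flock(1).
# Lock file path and messages are assumptions, not the real script.
LOCKFILE=/tmp/bbackup.lock

(
    # Take an exclusive lock on fd 9; -n makes a second, concurrent
    # invocation fail immediately instead of queueing behind the first.
    if ! flock -n 9; then
        echo "Backup aborted as locked by another backup"
        exit 1
    fi
    echo "Backup starting"
    # ... the real backup work would run here, holding the lock ...
) 9>"$LOCKFILE"
```

Because the subshell holds fd 9 for its whole body, the lock is released automatically when the backup finishes or the script dies.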

Any idea on why cron might start the same job off twice at the same time? 
Could there be a time change issue with chronyd?


Terry


Re: Please add OpenFOAM to the Fedora repositories.

2019-12-20 Thread Terry Barnaby

On 17/12/2019 17:21, Richard Shaw wrote:
Since I did a *LITTLE* CFD in college I decided to take a look and as 
I expected OpenFOAM will likely not be easy to package...


1. It's designed to install to $HOME or /opt (but I believe that can 
be modified).
2. It uses a custom build system (wmake) and the documentation seems 
to indicate that it builds and installs all in one shot. Not a 
roadblock but that does conflict with the %build %install process we 
use in spec files.


There's other smaller issues (for how little I looked into it, may be 
more big problems).


As the packaging would be non-trivial, someone who uses the package 
would be the best maintainer for said package.


Thanks,
Richard



I have been using OpenFOAM on Fedora and RedHat/CentOS 7 for many years 
using an RPM whose spec file was available somewhere. I had modified the 
spec file for the OpenFOAM 2.3.0 version. It hasn't been updated for 
some time. I suspect there is a later-version RPM spec file somewhere; 
if not, I could dig out the old SRPM I have if someone wants to generate 
an up-to-date RPM for Fedora.


Terry



Re: Fedora31: grub2 issues with grub2-mount using 100% CPU

2019-12-12 Thread Terry Barnaby

On 12/12/2019 12:06, ja wrote:

On Thu, 2019-12-12 at 10:01 +, Terry Barnaby wrote:

On 12/12/2019 06:33, Terry Barnaby wrote:

I have just started to try out Fedora31 on some of our systems.

I am using a bit of an unusual, manual install method copying an image
of the rootfs to the disk and configuring this, (this may be related
to my issue, but I have been using the same system for 6 years or
more). The particular hardware platform is a bit complicated using a
PCIe card with NVMe disk and two SATA disks. It has to boot (BIOS boot
and /boot) from one of the SATA disks and then uses the rootfs system
on the NVMe disk. I don't think that is related to the issue though.

The issue is that when I run "grub2-mkconfig -o /boot/grub2/grub.cfg"
to manually configure grub2, this process hangs. There is a process
grub2-mount that is sitting there using 100% of a CPU core. Not sure
what this process does (seems new after Fedora29) but it is passed the
/dev entry for a disk partition some presumably mounts the file system
and "probes" for what OS is installed there. The file system it hangs
on (if you kill the grub2-mount process it will hang on another
drive), is easily mountable without any issues.

It seems like there is some bug in this grub2-mount program. Has
anyone else seen this or have any ideas what might be the issue.


A bit more digging. It seems the issue is seen within os-prober.
Tracking this down:
I am running Fedora29 with the 5.3.11-100.fc29.x86_64 kernel.

I have a Fedora31 rootfs on the partition /dev/nvme0n1p2 and have
chroot'ed to this (with some /sys, /dev mounts etc). (Fedora31 would be
running the vmlinuz-5.3.15-300.fc31.x86_64 kernel).
Within the os-prober it effectively does the following at some point:
grub2-mount /dev/nvme0n1p2 /var/lib/os-prober/mount
ls /var/lib/os-prober/mount/lib*/lib*.so*

If I run:
mount /dev/nvme0n1p2 /var/lib/os-prober/mount
time ls /var/lib/os-prober/mount/lib*/lib*.so* > /dev/null
real0m0.089s
user0m0.028s
sys 0m0.044s

If I run:

grub2-mount /dev/nvme0n1p2 /var/lib/os-prober/mount
time ls /var/lib/os-prober/mount/lib*/lib*.so* > /dev/null
real0m59.593s
user0m0.190s
sys 0m0.254s

grub2-mount appears to be a FUSE driver to mount various file system
types. I am guessing that there is either a bug in grub2-mount or there
is a kernel level incompatibility (FUSE API) with the slightly older
5.3.11-100.fc29.x86_64 kernel or it is very very very very inefficient.
A minute of 100% of a CPU core to list a directory on a fast NVMe SSD!
Talk about bloatware and what about climate change folks ?

I guess this is my issue, in that grub2-mkconfig will take a "really" long
time on my system with a largely populated rootfs (it has all of the RPM
packages for development that I use, so is about 28 GBytes) and this
system has an SSD and two SATA disks with multiple partitions and
operating systems.


Terry


Maybe

https://bugzilla.redhat.com/show_bug.cgi?id=1744693

John




Thanks John.

Looks like the last bits of that bug are related. I had already added 
bug:  https://bugzilla.redhat.com/show_bug.cgi?id=1782773 for the slowness.
As a hack, if you change to use "mount" instead of "grub2-mount", 
grub2-mkconfig seems to work OK and is fast:

mv /usr/bin/grub2-mount /usr/bin/grub2-mount-orig
ln -sf mount /usr/bin/grub2-mount

Terry


Re: Fedora31: grub2 issues with grub2-mount using 100% CPU

2019-12-12 Thread Terry Barnaby

On 12/12/2019 06:33, Terry Barnaby wrote:

I have just started to try out Fedora31 on some of our systems.

I am using a bit of an unusual, manual install method copying an image 
of the rootfs to the disk and configuring this, (this may be related 
to my issue, but I have been using the same system for 6 years or 
more). The particular hardware platform is a bit complicated using a 
PCIe card with NVMe disk and two SATA disks. It has to boot (BIOS boot 
and /boot) from one of the SATA disks and then uses the rootfs system 
on the NVMe disk. I don't think that is related to the issue though.


The issue is that when I run "grub2-mkconfig -o /boot/grub2/grub.cfg" 
to manually configure grub2, this process hangs. There is a process 
grub2-mount that is sitting there using 100% of a CPU core. Not sure 
what this process does (seems new after Fedora29) but it is passed the 
/dev entry for a disk partition, so presumably mounts the file system 
and "probes" for what OS is installed there. The file system it hangs 
on (if you kill the grub2-mount process it will hang on another 
drive), is easily mountable without any issues.


It seems like there is some bug in this grub2-mount program. Has 
anyone else seen this or have any ideas what might be the issue.




A bit more digging. It seems the issue is seen within os-prober. 
Tracking this down:

I am running Fedora29 with the 5.3.11-100.fc29.x86_64 kernel.

I have a Fedora31 rootfs on the partition /dev/nvme0n1p2 and have 
chroot'ed to this (with some /sys, /dev mounts etc). (Fedora31 would be 
running the vmlinuz-5.3.15-300.fc31.x86_64 kernel).

Within the os-prober it effectively does the following at some point:
grub2-mount /dev/nvme0n1p2 /var/lib/os-prober/mount
ls /var/lib/os-prober/mount/lib*/lib*.so*

If I run:
mount /dev/nvme0n1p2 /var/lib/os-prober/mount
time ls /var/lib/os-prober/mount/lib*/lib*.so* > /dev/null
real    0m0.089s
user    0m0.028s
sys 0m0.044s

If I run:

grub2-mount /dev/nvme0n1p2 /var/lib/os-prober/mount
time ls /var/lib/os-prober/mount/lib*/lib*.so* > /dev/null
real    0m59.593s
user    0m0.190s
sys 0m0.254s

grub2-mount appears to be a FUSE driver to mount various file system 
types. I am guessing that there is either a bug in grub2-mount or there 
is a kernel level incompatibility (FUSE API) with the slightly older 
5.3.11-100.fc29.x86_64 kernel or it is very very very very inefficient. 
A minute of 100% of a CPU core to list a directory on a fast NVMe SSD! 
Talk about bloatware and what about climate change folks ?


I guess this is my issue, in that grub2-mkconfig will take a "really" long 
time on my system with a largely populated rootfs (it has all of the RPM 
packages for development that I use, so is about 28 GBytes) and this 
system has an SSD and two SATA disks with multiple partitions and 
operating systems.



Terry



Fedora31: grub2 issues with grub2-mount using 100% CPU

2019-12-11 Thread Terry Barnaby

I have just started to try out Fedora31 on some of our systems.

I am using a bit of an unusual, manual install method: copying an image 
of the rootfs to the disk and configuring it (this may be related to 
my issue, but I have been using the same method for 6 years or more). 
The particular hardware platform is a bit complicated, using a PCIe card 
with an NVMe disk and two SATA disks. It has to boot (BIOS boot and /boot) 
from one of the SATA disks and then uses the rootfs on the NVMe 
disk. I don't think that is related to the issue though.


The issue is that when I run "grub2-mkconfig -o /boot/grub2/grub.cfg" to 
manually configure grub2, the process hangs. A grub2-mount process is 
sitting there using 100% of a CPU core. I am not sure what this process 
does (it seems new after Fedora29), but it is passed the /dev entry for 
a disk partition, so presumably it mounts the file system and "probes" 
for what OS is installed there. The file system it hangs on (if you kill 
the grub2-mount process it will hang on another drive) is easily 
mountable without any issues.


It seems like there is some bug in this grub2-mount program. Has anyone 
else seen this, or does anyone have any ideas what the issue might be?
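
One generic way to see what a spinning grub2-mount is actually doing is 
to attach strace to it while grub2-mkconfig hangs. This is a diagnosis 
sketch of my own, not anything grub documents; it assumes strace is 
installed and needs root to attach:

```shell
# Find the busy grub2-mount process, if any, and summarise its
# syscalls with strace; interrupt with Ctrl-C to print the -c table.
pid=$(pgrep -x grub2-mount | head -n1)
if [ -n "$pid" ] && command -v strace >/dev/null; then
    sudo strace -f -c -p "$pid"
else
    echo "grub2-mount not running or strace not installed"
fi
```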



Re: Smallest Fedora box to use as gateway/firewall/VPN

2019-01-09 Thread Terry Barnaby

On 09/01/2019 08:19, John Harris wrote:

On Wednesday, January 9, 2019 3:14:25 AM EST Terry Barnaby wrote:

I know you asked for Fedora, but a standard, low cost router, running
OpenWRT, https://openwrt.org/, would likely be better for the tasks you
mention. OpenWRT is a minimal Linux system with the ability to install
extra packages. It has a simple to use WEB admin system and can do all
the things you mention.

I cannot think of any reason not to use one's distro of choice as their gateway
and/or VPN. I personally use a Fedora system (well, Fedora + Freed-ora
freedom) for my router and VPN. OpenWRT is not inherently better than Fedora,
and there are many benefits of using Fedora over OpenWRT.

I agree there are pros in using a system you know on as many 
things as possible. I use Fedora on multiple servers, workstations, 
webservers, backup servers etc. However there are a few cons in using 
Fedora for such tasks; my particular cons for this task are:


1. Fedora is big and bloated for small/low powered hardware that can be 
used for this task and low energy usage is important in my opinion for 
24/7 systems.


2. Fedora is complex for such a task.

3. Fedora hasn't a simple web interface to manage the particular 
functionality that a simple router like device needs.


4. Fedora's aggressive new "feature" release cycle is painful for such 
low level infrastructure.


5. Other Linux systems have been designed to install easily on small 
router-like hardware and to be easy to use. As long as it is 
OpenSource and Linux, most of someone's knowledge of Fedora will be 
applicable.


Terry


Re: Smallest Fedora box to use as gateway/firewall/VPN

2019-01-09 Thread Terry Barnaby
I know you asked for Fedora, but a standard, low cost router, running 
OpenWRT, https://openwrt.org/, would likely be better for the tasks you 
mention. OpenWRT is a minimal Linux system with the ability to install 
extra packages. It has a simple to use WEB admin system and can do all 
the things you mention.


I use cheap (£20 second hand on eBay) TP-Link TL-WDR3600 v1 routers with 
OpenWRT 18.06 at work and home. This particular router has 5 x 1Gbit 
Ethernet ports, WiFi (2.4 and 5 GHz), 2 USB ports, and uses power 
efficiently. It can connect to cable/FTTP/FTTC "modems" if needed etc. 
There are many other hardware platforms that would work with OpenWRT, 
but this one works well and has a good amount of FLASH/RAM.


Terry

On 08/01/2019 16:09, Alex wrote:

Hi,
I need a gateway for our new office. I'd like it to run Fedora. What
are my options? I'd like to be able to do the following:

   - provide VPN back to the main office
   - provide basic masquerading of hosts on inside network
   - be small enough to fit on a shelf. Preferably fanless
   - web-based administration
   - ssh access

We're experienced admins, so a simple interface isn't specifically
necessary, but desired.

It's only for a few remote office workers, so it doesn't have to be
particularly powerful, but should be responsive enough to support
regular ssh and VPN activity.

Thanks,
Alex


Re: Server goes catatonic after a few days

2019-01-08 Thread Terry Barnaby
You could try the memtester program: "dnf install memtester" then 
"memtester 1g". It is a user-level memory tester.
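
A minimal sketch of that suggestion (the package name and arguments are 
as used on my Fedora systems; the guard just avoids failing on machines 
where memtester is absent):

```shell
# Run memtester over 1 GiB for one pass if it is installed; it locks
# the memory it tests, so keep the size below the free RAM.
if command -v memtester >/dev/null; then
    sudo memtester 1G 1
else
    echo "memtester not installed; install it with: sudo dnf install memtester"
fi
```

Note that a user-level tester cannot reach all of RAM (the kernel and 
other processes hold the rest), so a clean pass narrows things down but 
does not fully rule out bad memory.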


I also have a server that occasionally dies. It started doing this late 
last year under Fedora27. I wasn't sure if it was a particular kernel 
change or hardware, but I replaced the motherboard/memory/CPU at that 
time as it was about 5 years old. However it has still crashed 
occasionally with the new hardware and with a fresh install of Fedora29.


The latest /var/log/messages entry when it crashed was:

Jan 8 10:42:52 king mosquitto[1435]: 1546944172: New connection from 
192.168.202.30 on port 1883.


Jan 8 10:42:52 king mosquitto[1435]: 1546944172: New client connected 
from 192.168.202.30 as DVES_00B2F8 (c1, k10, u'DVES_USER').


Jan 8 10:43:13 king mosquitto[1435]: 1546944193: Client DVES_00B2F8 has 
exceeded timeout, disconnecting.


Jan 8 10:43:13 king mosquitto[1435]: 1546944193: Socket error on client 
DVES_00B2F8, disconnecting.


#Jan 
8 18:03:39 king kernel: microcode: microcode updated early to revision 
0xc6, date = 2018-04-17


Jan 8 18:03:39 king kernel: Linux version 4.19.9-300.fc29.x86_64 
(mockbu...@bkernel03.phx2.fedoraproject.org) (gcc version 8.2.1 20181105 
(Red Hat 8.2.1-5) (GCC)) #1 SMP Thu Dec 13 17:25:01 UTC 2018


Jan 8 18:03:39 king kernel: Command line: 
BOOT_IMAGE=/boot/vmlinuz-4.19.9-300.fc29.x86_64 
root=UUID=5d3007f8-fa92-4fe6-98a8-e812b680198f ro rd.auto LANG=en_GB.UTF-8


Jan 8 18:03:39 king kernel: x86/fpu: Supporting XSAVE feature 0x001: 
'x87 floating point registers'


Jan 8 18:03:39 king kernel: x86/fpu: Supporting XSAVE feature 0x002: 
'SSE registers'


Jan 8 18:03:39 king kernel: x86/fpu: Supporting XSAVE feature 0x004: 
'AVX registers'


Jan 8 18:03:39 king kernel: x86/fpu: Supporting XSAVE feature 0x008: 
'MPX bounds registers'


Jan 8 18:03:39 king kernel: x86/fpu: Supporting XSAVE feature 0x010: 
'MPX CSR'


The "#" characters were in fact NUL (0x00) bytes, which is strange.


I do wonder if there is an obtuse kernel bug somewhere. This server has 
an Intel(R) Core(TM) i3-6100 CPU @ 3.70GHz and is doing DVB recording 
amongst other work. Other servers I have, though, seem fine.
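
One way to gather more evidence across these hangs, assuming the 
default journald Storage=auto behaviour, is to make the journal 
persistent so messages written just before the crash survive the 
reboot. DEST below is only so the directory creation can be shown 
outside /:

```shell
# journald stores logs persistently once /var/log/journal exists
# (with the default Storage=auto). On a real host use DEST=/ and then
# run: systemctl restart systemd-journald
DEST=${DEST:-./demo-root}
mkdir -p "$DEST/var/log/journal"
ls -d "$DEST/var/log/journal"
```

After the next hang, "journalctl -b -1 -e" then shows the tail of the 
previous boot's log.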


Terry

On 06/01/2019 22:15, Alex wrote:

Hi,
I have a fedora29 system in our colo that's a few years old now and
just goes catatonic and stops responding after a few days. It's
happened a few times now, even with different kernels, so I suspect
it's a memory or hardware problem.

Is it possible to run memtest without having physical access to the
machine to insert a USB stick or CDROM?

After the machine reboots (via IPMI access), there's nothing in the
logs and no abrt-cli info on a kernel crash or other info I can find
about why it died.

What else can I do to troubleshoot this without having to drive to the
colo to check on it?

The last entry from journalctl just before it stopped responding was
just a regular nrpe entry, unrelated to the crash.

I've pasted the current dmesg output here:
http://pasted.co/4b700ee1

Any ideas greatly appreciated.


F29: authselect with nis losing mdns4_minimal [NOTFOUND=return], breaks printing

2018-12-20 Thread Terry Barnaby
Having updated to F29 from F27 (which seems to be generally working well 
:) ), I have one issue with our systems due to the new authselect system.


When I enable the nis (YP) system for logins, the /etc/nsswitch.conf 
file loses the "mdns4_minimal [NOTFOUND=return]" entry for hosts 
searches. One result of this is that when you try to add a printer (at 
least using KDE/Plasma) that was discovered using DNS-SD, at the final 
adding stage the printer is not actually added (it does not appear in 
the list of printers etc). There are no error messages to the user 
(always a bad failing of Fedora/Linux/MsWindows in general), but looking 
under the hood it is because part of the CUPS system cannot resolve 
the .local network host address, as mdns4_minimal is 
not being used.


I have been setting up nis using: "authconfig --enablenis 
--nisdomain= --nisserver= --update". This appears to 
call authselect as:


"authselect select nis with-fingerprint with-silent-lastlog --force"

So should I be using a different, direct, authselect command to fix 
this, or is it a bug in the authselect system ?
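
Until the profile itself is fixed, one hedged workaround is to stop 
authselect managing the file ("authselect opt-out", as the generated 
file header suggests) and put the mDNS entry back by hand. The sed 
below is parameterised on NSS so it can be demonstrated against a demo 
file; point it at /etc/nsswitch.conf on a real host (after taking a 
backup):

```shell
# Re-insert "mdns4_minimal [NOTFOUND=return]" after "files" on the
# hosts line of an nsswitch.conf-style file.
NSS=${NSS:-./nsswitch.conf}
# Create a demo file if we are not editing a real one.
[ -f "$NSS" ] || printf 'hosts: files nis dns myhostname\n' > "$NSS"
sed -i 's/^hosts:\([[:space:]]*\)files /hosts:\1files mdns4_minimal [NOTFOUND=return] /' "$NSS"
grep '^hosts:' "$NSS"
```

The [NOTFOUND=return] action stops the lookup falling through to DNS 
for .local names that mDNS has authoritatively answered.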




Re: Fedora27: Ethernet interfaces set to 100MBits/sec half duplex

2018-02-13 Thread Terry Barnaby



On Tue, Feb 13, 2018 at 1:16 AM, Terry Barnaby <ter...@beam.ltd.uk> wrote:

On 12/02/18 21:51, Ed Greshko wrote:

On 02/13/18 05:43, Stephen Morris wrote:

I am using a home plug device to get ethernet access across the home
electrical wires. The home plug device provides 500 Mb/s, so having seen
this thread I've checked my ethernet configuration and, as Terry is
saying, my settings have auto negotiate unchecked and the link speed set
at 100 Mb/s and Half Duplex. I have not explicitly set that
configuration, but what I don't know, because I haven't really taken any
notice of it as I only use this connection as a backup to wireless, is
whether or not those settings have always been there. I have not done
any changes to the configuration since I set it up in F26.


One can always use

ethtool  to determine what is available and what the current
settings are...

This is the view from the Fedora side

This is what the Fedora side is telling the outside world what the HW
supports

  Advertised link modes:  10baseT/Half 10baseT/Full
  100baseT/Half 100baseT/Full
  1000baseT/Half 1000baseT/Full

This is what the device that the Fedora system is connected to is  saying
what it
supports.


  Link partner advertised link modes:  10baseT/Half 10baseT/Full
   100baseT/Half 100baseT/Full
   1000baseT/Full

FWIW, I have never seen an advertised link speed of 500 Mb/s.


Yes, I used ethtool to find out what was happening after the performance of
NFS went down to 10 MBytes/s.

I think what has happened is:

1. I updated these systems (5 off) from F25. This was a clean/new install
but some configuration files were copied from the previous systems.

2. The /etc/sysconfig/network-scripts/ifcfg-Wired_connection_1 file (or
appropriate named one) was copied from the previous system.

3. These files did not have the new ETHTOOL_OPTS="autoneg on" entry.

4. All was working fine (Gigabit full duplex) for a month or so until
someone/something updated the ifcfg file. If you use the KDE-Plasma
NetworkManager settings tool and it sees no ETHTOOL_OPTS entry, it sets the
GUI settings to manual 100Mbits/s half duplex rather than auto. So if you
don't notice this and save the settings the Ethernet will be set to this. I
will try and enter a bug report for this somewhere.

In my case I don't believe I changed the Ethernet settings using the
KDE-Plasma NetworkManager settings tool, certainly not on all of the
systems. I think something else may have written the ETHTOOL_OPTS="100mbps"
entry somehow with an RPM update within the last week.



On 13/02/18 14:53, Roger Heflin wrote:

If auto-negotiation is set on one end and not on the other, the
standard says to set things to 100mb/half (for gbit cards), since you
were unable to get information from the other end; I believe this was
judged the most likely to work reasonably by the people who wrote the
standard. On 100mbit adapters the fail-safe was 10mb/half.

If the device on the other end is not doing auto-neg then I would
expect 100/half.

Either set the other end to auto-neg or figure out what the other end
does and set the computer end to match how it is explicitly set (if
not auto-neg).

If both ends are set to auto-neg and you are getting 100/half then
something is being detected to be wrong with the wiring and both ends
are using what they believe will run on the given wiring.  This could
be broken wires, badly terminated wires, or not quite plugged in
right, or a number of other things.

In my case it is the KDE-Plasma/NetworkManager configuration that was 
forcing 100 MBits/s half duplex at the Linux end. There are 
configuration parameters for this now in the KDE-Plasma applet. It's 
just that these default to 100 MBits/s half duplex when there is no 
setting in the ifcfg-* file, rather than a more useful "auto negotiate" 
setting. So if you are using older ifcfg-* files from previous Linux 
system versions, you might have the same issue as I did.
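
A quick way to audit a fleet for this, assuming the classic 
/etc/sysconfig/network-scripts layout these Fedora releases use (the 
scan_ifcfg helper name is mine, and the directory is a parameter so 
the check can be demonstrated anywhere):

```shell
# List ifcfg-* files in a directory that force a fixed 100 Mb speed
# via ETHTOOL_OPTS instead of autonegotiation.
scan_ifcfg() {
    grep -l 'ETHTOOL_OPTS=.*100' "$1"/ifcfg-* 2>/dev/null
}

# Demonstrate against a demo directory:
mkdir -p ./demo-scripts
echo 'ETHTOOL_OPTS="speed 100 duplex half"' > ./demo-scripts/ifcfg-bad
echo 'ETHTOOL_OPTS="autoneg on"'            > ./demo-scripts/ifcfg-good
scan_ifcfg ./demo-scripts
```

On a real host, run it as scan_ifcfg /etc/sysconfig/network-scripts 
and check any files it reports.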



Re: Fedora27: Ethernet interfaces set to 100MBits/sec half duplex

2018-02-12 Thread Terry Barnaby

On 12/02/18 21:51, Ed Greshko wrote:

On 02/13/18 05:43, Stephen Morris wrote:

I am using a home plug device to get ethernet access across the home electrical
wires. The home plug device provides 500 Mb/s, so having seen this thread I've
checked my ethernet configuration and, as Terry is saying, my settings have auto
negotiate unchecked and the link speed is set at 100 Mb/s and Half Duplex. I 
have
not explicitly set that configuration, but what I don't know, because I haven't
really taken any notice of it as I only use this connection as a backup to
wireless, is whether or not those settings have always been there. I have not 
done any
changes to the configuration since I set it up in F26.


One can always use

ethtool  to determine what is available and what the current 
settings are...

This is the view from the Fedora side

This is what the Fedora side is telling the outside world what the HW supports

     Advertised link modes:  10baseT/Half 10baseT/Full
     100baseT/Half 100baseT/Full
     1000baseT/Half 1000baseT/Full

This is what the device that the Fedora system is connected to is  saying what 
it
supports.


     Link partner advertised link modes:  10baseT/Half 10baseT/Full
  100baseT/Half 100baseT/Full
  1000baseT/Full

FWIW, I have never seen an advertised link speed of 500 Mb/s.


Yes, I used ethtool to find out what was happening after the performance 
of NFS went down to 10 MBytes/s.


I think what has happened is:

1. I updated these systems (5 off) from F25. This was a clean/new 
install but some configuration files were copied from the previous systems.


2. The /etc/sysconfig/network-scripts/ifcfg-Wired_connection_1 file (or 
appropriate named one) was copied from the previous system.


3. These files did not have the new ETHTOOL_OPTS="autoneg on" entry.

4. All was working fine (Gigabit full duplex) for a month or so until 
someone/something updated the ifcfg file. If you use the KDE-Plasma 
NetworkManager settings tool and it sees no ETHTOOL_OPTS entry, it sets 
the GUI settings to manual 100Mbits/s half duplex rather than auto. So 
if you don't notice this and save the settings the Ethernet will be set 
to this. I will try and enter a bug report for this somewhere.


In my case I don't believe I changed the Ethernet settings using the 
KDE-Plasma NetworkManager settings tool, certainly not on all of the 
systems. I think something else may have written the 
ETHTOOL_OPTS="100mbps" entry somehow with an RPM update within the last 
week.




Fedora27: Ethernet interfaces set to 100MBits/sec half duplex

2018-02-12 Thread Terry Barnaby
I have just noticed that most of my systems now have their Ethernet 
interfaces running at 100 MBits/s half duplex rather than the expected 
1GBits/s.


I think some update has caused this to happen, probably about 5 days ago 
(noticed something was slow). These are KDE/Plasma GUI systems but I'm 
not sure if this is due to a change in KDE/Plasma, NetworkManager or 
something else.


The KDE/Plasma NetworkManager settings GUI now has an "Allow 
auto-negotiation" checkbox which I don't think was there before, and 
there is now the entry ETHTOOL_OPTS="autoneg on" in the 
/etc/sysconfig/network-scripts/ifcfg-* file if you set it. It appears 
that the default setting is 100MBits/s half duplex rather than auto 
negotiate...
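
A hedged fix sketch for ifcfg files carried forward from older 
releases: give each one an explicit autonegotiation setting so the GUI 
no longer falls back to 100/half. The file name below is illustrative; 
match your actual connection file under /etc/sysconfig/network-scripts:

```shell
# Add ETHTOOL_OPTS="autoneg on" to an ifcfg file that has no
# ETHTOOL_OPTS line at all, then bring the connection down/up.
CFG=${CFG:-./ifcfg-Wired_connection_1}
touch "$CFG"
grep -q '^ETHTOOL_OPTS=' "$CFG" || \
    echo 'ETHTOOL_OPTS="autoneg on"' >> "$CFG"
grep '^ETHTOOL_OPTS=' "$CFG"
```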



Re: Fedora27: Cannot set the default network route

2018-02-08 Thread Terry Barnaby

Hi,

Thanks for the info, though I am not sure I will remember that complex 
method! It is a bit strange that changing the "default" route has to be 
done on a particular connection rather than as a system route change.
If NetworkManager is now managing routes like this, it would be nice if 
it had a "route" command matching the system "ip route" command.


Anyway thanks for the info.

Terry
On 06/02/18 22:07, Rick Stevens wrote:

On 02/06/2018 01:16 PM, Terry Barnaby wrote:

On 06/02/18 20:21, James Hogarth wrote:

On 3 February 2018 at 22:20, Terry Barnaby <ter...@beam.ltd.uk> wrote:

On 02/02/18 16:40, Bill Shirley wrote:

You didn't post the command or its output.  How can anyone help you?

What's the output of these two commands?
ip -o -4 addr
ip -o -4 route

Bill

ip -o -4 addr
1: lo    inet 127.0.0.1/8 scope host lo\   valid_lft forever
preferred_lft forever
2: enp2s0    inet 192.168.202.2/24 brd 192.168.202.255 scope global
dynamic
enp2s0\   valid_lft 1205223sec preferred_lft 1205223sec

ip -o -4 route
default via 192.168.202.1 dev enp2s0 proto static metric 100
192.168.202.0/24 dev enp2s0 proto kernel scope link src 192.168.202.2
metric
100

These are when the route is up normally after a DHCP.
The system is fine normally, its just that I wanted to manually
change the
default route to test a different router.
I have managed to do this now by hardcoding the route on the next boot.
I think the issue must be NetworkManager doing something more than it
used
to.


Since NetworkManager was/is managing that interface, did you try using
nmcli conn modify or nmcli con edit to set the route in the connection
profile?

https://www.hogarthuk.com/?q=node/8

No, I didn't know you could change the default route with nmcli and it's
not obvious in the man page how to do this. Will have a look to see how
to do that.

Certainly up to Fedora25 changing the default route, temporarily, with
"route del default; route add default ..." worked.

First, get a list of the connections, such as:

sudo nmcli connection show    OR
sudo nmcli connection show --active (for only active ones)

and locate the connection you wish to modify, then:

sudo nmcli connection modify --temporary  gateway 

should change it temporarily. If you omit the "--temporary", it should
make a permanent change.

The command is buried in the nmcli man page, but the parameters are
hidden in the nm-settings(5) man page under the "ipv4" section.
--
- Rick Stevens, Systems Engineer, AllDigital    ri...@alldigital.com -
- AIM/Skype: therps2    ICQ: 22643734    Yahoo: origrps2 -
--
-  I won't rise to the occasion, but I'll slide over to it.  -
--




Re: Fedora27: Cannot set the default network route

2018-02-06 Thread Terry Barnaby

On 06/02/18 20:21, James Hogarth wrote:

On 3 February 2018 at 22:20, Terry Barnaby <ter...@beam.ltd.uk> wrote:

On 02/02/18 16:40, Bill Shirley wrote:

You didn't post the command or its output.  How can anyone help you?

What's the output of these two commands?
ip -o -4 addr
ip -o -4 route

Bill

ip -o -4 addr
1: lo    inet 127.0.0.1/8 scope host lo\   valid_lft forever
preferred_lft forever
2: enp2s0    inet 192.168.202.2/24 brd 192.168.202.255 scope global dynamic
enp2s0\   valid_lft 1205223sec preferred_lft 1205223sec

ip -o -4 route
default via 192.168.202.1 dev enp2s0 proto static metric 100
192.168.202.0/24 dev enp2s0 proto kernel scope link src 192.168.202.2 metric
100

These are when the route is up normally after a DHCP.
The system is fine normally, its just that I wanted to manually change the
default route to test a different router.
I have managed to do this now by hardcoding the route on the next boot.
I think the issue must be NetworkManager doing something more than it used
to.


Since NetworkManager was/is managing that interface, did you try using
nmcli conn modify or nmcli con edit to set the route in the connection
profile?

https://www.hogarthuk.com/?q=node/8


No, I didn't know you could change the default route with nmcli and it's 
not obvious in the man page how to do this. Will have a look to see how 
to do that.


Certainly up to Fedora25 changing the default route, temporarily, with 
"route del default; route add default ..." worked.
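
For reference, the nmcli change described above looks roughly like 
this. The connection name and gateway address are placeholders, 
ipv4.gateway is the nm-settings(5) property involved, and the modify 
command is left commented out because it alters live routing state:

```shell
# List the active connection names read-only (ignore errors if
# NetworkManager is not running on this host).
if command -v nmcli >/dev/null; then
    nmcli -g NAME connection show --active || true
else
    echo "nmcli not available on this host"
fi
# Then, to change the gateway until the next reboot/reactivation:
#   sudo nmcli connection modify --temporary "Wired connection 1" ipv4.gateway 192.168.202.254
```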



Re: Fedora27: Cannot set the default network route

2018-02-03 Thread Terry Barnaby

On 02/02/18 16:40, Bill Shirley wrote:

You didn't post the command or its output.  How can anyone help you?

What's the output of these two commands?
ip -o -4 addr
ip -o -4 route

Bill

ip -o -4 addr
1: lo    inet 127.0.0.1/8 scope host lo\   valid_lft forever 
preferred_lft forever
2: enp2s0    inet 192.168.202.2/24 brd 192.168.202.255 scope global 
dynamic enp2s0\   valid_lft 1205223sec preferred_lft 1205223sec


ip -o -4 route
default via 192.168.202.1 dev enp2s0 proto static metric 100
192.168.202.0/24 dev enp2s0 proto kernel scope link src 192.168.202.2 
metric 100


These are when the route is up normally after a DHCP.
The system is fine normally, its just that I wanted to manually change 
the default route to test a different router.

I have managed to do this now by hardcoding the route on the next boot.
I think the issue must be NetworkManager doing something more than it 
used to.


Re: Fedora27: Cannot set the default network route

2018-02-02 Thread Terry Barnaby

I tried using "ip route"; it had the same effect.

On 02/02/18 14:21, Bill Shirley wrote:

Use 'ip' and add the dev parameter:
ip route add default via 173.xxx.yyy.zzz dev ccast

man route:
NOTE
   This program is obsolete. For replacement check ip route.

Bill

On 2/2/2018 8:42 AM, Terry Barnaby wrote:
A strange one this. I was trying to change the default route of a 
machine for testing a different gateway.


I ran:

route del default gw 

route add default gw 

However the second command did not add the route and I could not add 
the old default route back. There were no errors on stdout or in 
/var/log/messages.


Now I found that if there is already a default route you can add 
another one, so I could do:


route add default gw 

route del default gw 

And these worked fine. It seems you cannot set a default route if 
there isn't one set.


Any ideas ?










Fedora27: Cannot set the default network route

2018-02-02 Thread Terry Barnaby
A strange one this. I was trying to change the default route of a 
machine for testing a different gateway.


I ran:

route del default gw 

route add default gw 

However the second command did not add the route and I could not add the 
old default route back. There were no errors on stdout or in 
/var/log/messages.


Now I found that if there is already a default route you can add 
another one, so I could do:


route add default gw 

route del default gw 

And these worked fine. It seems you cannot set a default route if there 
isn't one set.


Any ideas ?
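
With modern iproute2 the "no default route to add over" problem can be 
sidestepped: "ip route replace" adds the route if it is absent and 
swaps it if present. The gateway below is a placeholder; only the show 
command is executed here because the replace needs root and changes 
live state:

```shell
# Read-only: print the current IPv4 default route (empty if none set).
if command -v ip >/dev/null; then
    ip -4 route show default
else
    echo "iproute2 not installed"
fi
# As root, this works whether or not a default route already exists:
#   ip route replace default via 192.168.202.1
```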



Re: fedora27: ypbind intermittent startup

2018-02-01 Thread Terry Barnaby

On 02/02/18 07:42, Terry Barnaby wrote:

On 02/02/18 00:41, Ed Greshko wrote:


I've not tried this since I don't have a need for ypbind.

One may also consider copying /lib/systemd/system/ypbind.service to
/etc/systemd/system and then inserting the line,

ExecStartPre=/usr/bin/sleep 5

 From the systemd documentation

ExecStart= commands are only run after all ExecStartPre= commands 
that were not

prefixed with a "-" exit successfully.

Thanks I will try this out, but it would be nice to get it fixed 
properly in Fedora27.


Could it be due to the "Network is Online" target being triggered by 
IPv6 being configured first? I don't use IPv6, but I guess the 
system could be setting this up somehow by default.



Actually, I just noticed in the KDE/Plasma NetworkManager settings there 
is now an "IPv4 is required for this connection" setting, whose 
tooltip states "Allows the connection to complete if IPv4 
configuration fails but IPv6 configuration succeeds". This tooltip 
appears to be written in the negative, but assuming the actual button 
text is correct then this may do what I need. I will try that. The 
Ethernet interface appears to have an IPv6 config; I am not sure where 
that is coming from.



Re: fedora27: ypbind intermittent startup

2018-02-01 Thread Terry Barnaby

On 02/02/18 00:41, Ed Greshko wrote:


I've not tried this since I don't have a need for ypbind.

One may also consider copying /lib/systemd/system/ypbind.service to
/etc/systemd/system and then inserting the line,

ExecStartPre=/usr/bin/sleep 5

 From the systemd documentation

ExecStart= commands are only run after all ExecStartPre= commands that were not
prefixed with a "-" exit successfully.

Thanks I will try this out, but it would be nice to get it fixed 
properly in Fedora27.


Could it be due to the "Network is Online" target being triggered by 
IPv6 being configured first? I don't use IPv6, but I guess the system 
could be setting this up somehow by default.
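
Ed's ExecStartPre idea can also be applied as a standard systemd 
drop-in instead of copying the whole unit file; drop-ins extend the 
packaged unit and survive package updates. DEST parameterises the 
target so the snippet can be demonstrated outside /etc:

```shell
# Create a drop-in that delays ypbind start by 5 seconds, giving
# dhclient-script time to write /etc/yp.conf. On a real host use
# DEST=/etc/systemd/system and then run: systemctl daemon-reload
DEST=${DEST:-./systemd-demo}
mkdir -p "$DEST/ypbind.service.d"
cat > "$DEST/ypbind.service.d/10-wait-ypconf.conf" <<'EOF'
[Service]
ExecStartPre=/usr/bin/sleep 5
EOF
cat "$DEST/ypbind.service.d/10-wait-ypconf.conf"
```

A fixed sleep is a workaround rather than a fix; the underlying race 
between DHCP writing yp.conf and network-online.target remains.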



fedora27: ypbind intermittent startup

2018-02-01 Thread Terry Barnaby
I am finding on my systems that ypbind is failing occasionally at boot 
(about 30% of the time).


[  OK  ] Started Network Manager Wait Online.
[  OK  ] Reached target Network is Online.
 Mounting /src...
 Mounting /scratch...
 Starting NIS/YP (Network Information Service) Clients to NIS 
Domain Binder...

 Mounting /home...
 Mounting /opt...
 Mounting /dist...
 Starting Notify NFS peers of a restart...
 Starting Beam BOAP Name Server...
 Mounting /usr/beam...
 Mounting /var/cache/dnf...
[  OK  ] Started Notify NFS peers of a restart.
[  OK  ] Started Beam BOAP Name Server.
[FAILED] Failed to start NIS/YP (Network Information Service) Clients to 
NIS Domain Binder.

See 'systemctl status ypbind.service' for details.
[  OK  ] Reached target User and Group Name Lookups.

The error is:

Feb  1 10:34:36 beam1 ypbind[788]: No NIS server and no -broadcast 
option specified.
Feb  1 10:34:36 beam1 ypbind[788]: Add a NIS server to the /etc/yp.conf 
configuration file,
Feb  1 10:34:36 beam1 ypbind[788]: or start ypbind with the -broadcast 
option.


But /etc/yp.conf has:

# generated by /sbin/dhclient-script
domain beamnet server 192.168.201.1

ypbind starts fine after the system has booted.

I assume that ypbind is being started by systemd before the dhcp client 
has actually written into /etc/yp.conf.


Does the "Network is Online" target get reached after DHCP 
configuration?



Re: Fedora27: NFS v4 terrible write performance

2018-01-29 Thread Terry Barnaby

On 30/01/18 00:32, Ed Greshko wrote:

On 01/29/18 17:40, Terry Barnaby wrote:

Now I understand that NFS's latency with writes is a performance bottleneck, 
but in
the past I have used the "async" mount option to good effect to minimise this. 
It
does not appear to have any effect on my systems. The "async" mount option is 
not
listed when you run "mount" to get a list of the mounts on the client.


Pardon the brevity of this response.

I see pretty much the same numbers as you're seeing. Server and client are both 
F27,
and in my case both ends have SSDs and the links are 1000Mb/s.

However, I'm not convinced the "issue" is related to write performance.  The 
reason I
say this is that if I do this on the client side

tar -zcf lin.tar f27k/linux-4.14.15/

meaning I'm reading from the server to create the tar, the numbers were nearly
identical.  I think it may be more that the tar file has many (61337) files,  
most of
them rather small.

FWIW, I also performed the tests using vers=3 of nfs with slightly better 
numbers on
average.





Thanks for the reply and trying. With your example it's a bit different 
as you are creating the tar and compressing. The compression will take 
quite a lot of CPU and is probably the bottleneck in your case.


If I un-tar directly on the server's disks in question (SATA hard drives, 
software RAID-1) the untar takes 13 seconds rather than the NFS 3 
minutes ...


time tar -xf linux-4.14.15.tar.gz -C /data/tmp
7.65user 4.87system 0:13.22elapsed 94%CPU (0avgtext+0avgdata 
3404maxresident)k

305392inputs+1796584outputs (2major+311minor)pagefaults 0swaps
[terry@king nfs]$ sync
[terry@king nfs]$ rm -fr /data/tmp/linux-4.14.15
[terry@king nfs]$ sync
[terry@king nfs]$ make test1
sync; time (tar -xf linux-4.14.15.tar.gz -C /data/tmp; sync)

real    0m13.260s
user    0m7.652s
sys 0m4.914s
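For scale, a back-of-envelope on the NFS case (using the ~61337-file count Ed quoted and the roughly 3 minute NFS untar time reported in this thread):

```shell
# Rough per-file cost of the NFS untar: ~61337 small files in ~180 s.
echo "$((61337 / 180)) files/s"         # integer division: prints "340 files/s"
echo "$((180 * 1000 / 61337)) ms/file"  # ~2.9 ms each; integer division prints "2 ms/file"
```

At a few milliseconds of synchronous round trips per file, the small-file case looks latency bound rather than bandwidth bound, which would match the very low overall network rate observed.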




Re: Fedora27: NFS v4 terrible write performance

2018-01-29 Thread Terry Barnaby

On 29/01/18 09:05, Ed Greshko wrote:

On 01/29/18 15:47, Terry Barnaby wrote:

On 19/01/18 15:11, Terry Barnaby wrote:

When doing a tar -xzf ... of a big source tar on an NFSv4 file system the time
taken is huge. I am seeing an overall data rate of about 1 MByte per second 
across
the network interface.

If I copy a single large file I see a network data rate of about 110 MBytes/sec
which is about the limit of the Gigabit Ethernet interface I am using.

Now, in the past I have used the NFS "async" mount option to help with write 
speed
(lots of small files in the case of an untar of a set of source files).

However, this does not seem to speed this up in Fedora27 and also I don't see 
the
"async" option listed when I run the "mount" command. When I use the "sync" 
option
it does show up in the "mount" list.

The question is, is the "async" option actually working with the NFS v4 in 
Fedora27 ?


Anyone using NFS these days ?

Yes, but only as a client at the moment.


Server is a Fedora27 as well. vers=4.2 the default. Same issue at other sites 
with
Fedora27.

Server export: "/data *.kingnet(rw,async,fsid=17)"

Client fstab: "king.kingnet:/data /data nfs async,nocto 0 0"

Client mount: "king.kingnet:/data on /data type nfs4
(rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,nocto,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.202.2,local_lock=none,addr=192.168.202.1)"




If I have time (my) tomorrow I'll have a look at testing it.  But, could you 
define
what you mean by "big source tar" and "single large file"?

In particular the "tar" procedure.  I'm not sure if you have an nfs mounted file
system and you are creating a tar from data on that file system, which would 
need to
be read and then compressed locally and written back to the nfs mounted file 
system.
So, I just want to get the data flow to match.

And what is the size of the single large file and are you doing a copy from a 
local
file system to the nfs partition?


Thanks for the reply.

As a simple test I am using a Linux kernel tar archive such as: 
https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.14.15.tar.gz


Untaring this "tar -xf linux-4.14.15.tar.gz" while in an NFS mounted 
directory across a Gigabit LAN takes about 3 minutes. Ksysguard reports 
an overall network rate of about 2.4 MBytes per second.


The single large "file" I have been using is 1 GByte. This is created in 
the NFS directory with a simple 'C' test program that opens the file and 
writes 1 GByte to it followed by an fsync(), timing the procedure. 
Ksysguard reports an overall network rate of about 110 MBytes per second 
during this.
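The C test program itself isn't shown in the thread; a shell sketch with the same shape (a large sequential write followed by an fsync, via GNU dd's conv=fsync) would be:

```shell
#!/bin/sh
# write_test TARGET SIZE_MB: write SIZE_MB megabytes sequentially to TARGET
# and fsync before dd exits (conv=fsync), mimicking the write()+fsync()
# pattern of the 1 GByte C test described above.
write_test() {
    dd if=/dev/zero of="$1" bs=1M count="$2" conv=fsync 2>/dev/null
}
```

For example `time write_test /data/tmp/big 1024` on the NFS mount (the path is illustrative).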


Now I understand that NFS's latency with writes is a performance 
bottleneck, but in the past I have used the "async" mount option to good 
effect to minimise this. It does not appear to have any effect on my 
systems. The "async" mount option is not listed when you run "mount" to 
get a list of the mounts on the client.



Re: Fedora27: NFS v4 terrible write performance

2018-01-28 Thread Terry Barnaby

On 19/01/18 15:11, Terry Barnaby wrote:
When doing a tar -xzf ... of a big source tar on an NFSv4 file system 
the time taken is huge. I am seeing an overall data rate of about 1 
MByte per second across the network interface.


If I copy a single large file I see a network data rate of about 110 
MBytes/sec which is about the limit of the Gigabit Ethernet interface 
I am using.


Now, in the past I have used the NFS "async" mount option to help with 
write speed (lots of small files in the case of an untar of a set of 
source files).


However, this does not seem to speed this up in Fedora27 and also I 
don't see the "async" option listed when I run the "mount" command. 
When I use the "sync" option it does show up in the "mount" list.


The question is, is the "async" option actually working with the NFS 
v4 in Fedora27 ?




Anyone using NFS these days ?

Server is a Fedora27 as well. vers=4.2 the default. Same issue at other 
sites with Fedora27.


Server export: "/data *.kingnet(rw,async,fsid=17)"

Client fstab: "king.kingnet:/data /data nfs async,nocto 0 0"

Client mount: "king.kingnet:/data on /data type nfs4 
(rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,nocto,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.202.2,local_lock=none,addr=192.168.202.1)" 





Fedora27: NFS v4 terrible write performance

2018-01-19 Thread Terry Barnaby
When doing a tar -xzf ... of a big source tar on an NFSv4 file system 
the time taken is huge. I am seeing an overall data rate of about 1 
MByte per second across the network interface.


If I copy a single large file I see a network data rate of about 110 
MBytes/sec which is about the limit of the Gigabit Ethernet interface I 
am using.


Now, in the past I have used the NFS "async" mount option to help with 
write speed (lots of small files in the case of an untar of a set of 
source files).


However, this does not seem to speed this up in Fedora27 and also I 
don't see the "async" option listed when I run the "mount" command. When 
I use the "sync" option it does show up in the "mount" list.


The question is, is the "async" option actually working with the NFS v4 
in Fedora27 ?
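One way to check what is actually in force, rather than what the mount command chooses to print, is /proc/mounts (a general sketch; since async is the client-side kernel default it is typically omitted from option lists even when active):

```shell
#!/bin/sh
# mount_opts MNT: print the option string the kernel recorded for a mount
# point. /proc/mounts reflects the options in force, whereas "mount" output
# can omit defaults such as async.
mount_opts() {
    awk -v m="$1" '$2 == m { print $4 }' /proc/mounts
}
```

For example `mount_opts /data` on the client.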




Re: Fedora27: Firefox freezes

2018-01-03 Thread Terry Barnaby

On 01/01/18 20:53, Ed Greshko wrote:

On 01/02/18 04:42, Terry Barnaby wrote:

On 01/01/18 20:38, Joe Zeff wrote:

On 01/01/2018 12:15 PM, Terry Barnaby wrote:

I don't think it is that one. The display is fine; all other applications run
fine. It's just that the Firefox tab does not load any content for a while. I can
still operate Firefox clicking on things etc, but no tab will update its 
contents.

Have you tried the Firefox Help Site?  This may not be Linux related, and it
doesn't hurt to ask.

Will do so, just trying to find out if it is well known on Fedora/KDE/Plasma or 
if
it could be a one off due to something in my setup/config somewhere.


FWIW, I do not use FF extensively.  It is configured to use a proxy and I only 
use it
to access US based media sites.  When I do use it I've not noticed any problems.

You mention your home directory is NFS mounted.  You may want to consider 
moving and
creating a symbolic link to ~/.cache/mozilla to local storage.





Hi, I tried symbolically linking ~/.cache/mozilla to local storage. This had no 
effect.

I have also seen some strange freezing on a separate F27 server. The SDDM login 
GUI was frozen, an ssh to the system would freeze, and a web access to port 80 
would time out. NFS was still working, although the remote systems had NFS server 
UP/Down messages in their logs. Accessing a MariaDB MySQL server on the system 
was fine though. A hard reset was needed. Very strange ...



Re: Fedora27: Firefox freezes

2018-01-01 Thread Terry Barnaby

On 01/01/18 20:38, Joe Zeff wrote:

On 01/01/2018 12:15 PM, Terry Barnaby wrote:


I don't think it is that one. The display is fine; all other 
applications run fine. It's just that the Firefox tab does not load 
any content for a while. I can still operate Firefox clicking on 
things etc, but no tab will update its contents.


Have you tried the Firefox Help Site?  This may not be Linux related, 
and it doesn't hurt to ask.



Will do so, just trying to find out if it is well known on 
Fedora/KDE/Plasma or if it could be a one off due to something in my 
setup/config somewhere.



Re: Fedora27: Firefox freezes

2018-01-01 Thread Terry Barnaby

On 01/01/18 18:08, Stephen Perkins wrote:
I had the same problem with FF 57 on Fedora 26; found solution on FF 
forum:


go to about:config
and set accessibility.force_disabled = 1 (it's 0 by default)

worked for me.

On Mon, Jan 1, 2018 at 11:29 AM, Terry Barnaby <ter...@beam.ltd.uk> wrote:


On 01/01/18 15:07, George N. White III wrote:

On 1 January 2018 at 09:27, Fred Smith <fre...@fcshome.stoneham.ma.us> wrote:

On Mon, Jan 01, 2018 at 12:00:59PM +, Terry Barnaby wrote:
> Is anyone else seeing issues with Firefox freezing up for
30secs or
> more when web pages are opened in TABS ?


I had problems with Firefox freezes (very long delays responding
to any mouse
click) on several systems (linux and macOS).  Removing the Mr.
Robot add-on
(https://www.engadget.com/2017/12/16/firefox-mr-robot-extension/) seems
to have restored normal operation.

> Since the latest Firefox 57 this is happening a lot for me
(every 10
> mins or so). Seems to lockup while downloading the page.
You can
> still operate Firefox by clicking on menu's/Tab's etc, but
no tab
> windows have thier web content updated. Normally the new tabs
> content are blank, but sometimes a small amount of the web
page is
> rendered before the freeze. No obvious high CPU usage when it
> happens. Its as if the network has stopped, but other
applications
> such as google-chrome are still working.
>
> This is with Fedora27, KDE Plasma and with an NFS mounted
home directory.

I see something similar with FF57 on Centos-7. it tends to go
on for
much longer than 30 seconds, and the HD activity light is
pegged ON
for the duration. I'm guessing it may be a swap storm, but I
don't see
unusually  high memory usage in top.

--
 Fred Smith -- fre...@fcshome.stoneham.ma.us
-
                        The Lord is like a strong tower.
             Those who do what is right can run to him for
safety.
--- Proverbs 18:10 (niv)
-




-- 
George N. White III <aa...@chebucto.ns.ca>
Head of St. Margarets Bay, Nova Scotia




Well, in my case it is also often longer than 30 secs; not sure
how long, as I normally go and do something else while it's loading
the pages. I don't think it is disk related: no obvious activity
(SSD disk and 16 GB of RAM). No addons, and I have reset all Firefox
settings etc. Internet access here is 80/20 Mbps FTTP and seems
solid, so I don't think it is Internet connection related.

But the problem does seem to be network access related, maybe
multiple thread locking issues in Firefox ?

Terry






--
Stephen E. Perkins, RN    -- Poetry and Community Health
RuralTechnologies.net     -- Linux since Red Hat 5.1, 1998
Open-source Collaboration -- Fedora since 2003

“In theory there is no difference between theory and practice. In 
practice there is."

   - Walter J. Savitch, Pascal: An Introduction to the Art and Science 
of Programming






Re: Fedora27: Firefox freezes

2018-01-01 Thread Terry Barnaby

On 01/01/18 17:30, Peter Gueckel wrote:

It sounds like this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1529922


I don't think it is that one. The display is fine; all other applications 
run fine. It's just that the Firefox tab does not load any content for a 
while. I can still operate Firefox clicking on things etc, but no tab 
will update its contents.




Re: Fedora27: Firefox freezes

2018-01-01 Thread Terry Barnaby

On 01/01/18 15:07, George N. White III wrote:
On 1 January 2018 at 09:27, Fred Smith <fre...@fcshome.stoneham.ma.us> wrote:


On Mon, Jan 01, 2018 at 12:00:59PM +, Terry Barnaby wrote:
> Is anyone else seeing issues with Firefox freezing up for 30secs or
> more when web pages are opened in TABS ?


I had problems with Firefox freezes (very long delays responding to 
any mouse
click) on several systems (linux and macOS).  Removing the Mr. Robot 
add-on

(https://www.engadget.com/2017/12/16/firefox-mr-robot-extension/) seems
to have restored normal operation.

> Since the latest Firefox 57 this is happening a lot for me (every 10
> mins or so). It seems to lock up while downloading the page. You can
> still operate Firefox by clicking on menus/tabs etc, but no tab
> windows have their web content updated. Normally the new tabs'
> content is blank, but sometimes a small amount of the web page is
> rendered before the freeze. No obvious high CPU usage when it
> happens. It's as if the network has stopped, but other applications
> such as google-chrome are still working.
>
> This is with Fedora27, KDE Plasma and with an NFS mounted home
directory.

I see something similar with FF57 on Centos-7. it tends to go on for
much longer than 30 seconds, and the HD activity light is pegged ON
for the duration. I'm guessing it may be a swap storm, but I don't see
unusually  high memory usage in top.

--
 Fred Smith -- fre...@fcshome.stoneham.ma.us
-
                        The Lord is like a strong tower.
             Those who do what is right can run to him for safety.
--- Proverbs 18:10 (niv)
-




--
George N. White III <aa...@chebucto.ns.ca>
Head of St. Margarets Bay, Nova Scotia




Well, in my case it is also often longer than 30 secs; not sure how long, 
as I normally go and do something else while it's loading the pages. I 
don't think it is disk related: no obvious activity (SSD disk and 16 GB 
of RAM). No addons, and I have reset all Firefox settings etc. Internet 
access here is 80/20 Mbps FTTP and seems solid, so I don't think it is 
Internet connection related.


But the problem does seem to be network access related, maybe multiple 
thread locking issues in Firefox ?


Terry



Fedora27: Firefox freezes

2018-01-01 Thread Terry Barnaby
Is anyone else seeing issues with Firefox freezing up for 30secs or more 
when web pages are opened in TABS ?


Since the latest Firefox 57 this is happening a lot for me (every 10 
mins or so). It seems to lock up while downloading the page. You can still 
operate Firefox by clicking on menus/tabs etc, but no tab windows have 
their web content updated. Normally the new tabs' content is blank, but 
sometimes a small amount of the web page is rendered before the freeze. 
No obvious high CPU usage when it happens. It's as if the network has 
stopped, but other applications such as google-chrome are still working.


This is with Fedora27, KDE Plasma and with an NFS mounted home directory.


Re: Fedora25: SDDM not showing NIS accounts

2016-12-05 Thread Terry Barnaby

On 05/12/16 08:38, Ralf Corsepius wrote:

On 12/05/2016 09:12 AM, Samuel Sieb wrote:

On 12/04/2016 11:54 PM, Terry Barnaby wrote:

On 05/12/16 06:43, Samuel Sieb wrote:

On 12/04/2016 07:42 AM, Terry Barnaby wrote:

In F25 the SDDM no longer shows a list of users when NIS (ypbind) is
enabled and also it does not default to the last user logged in. kdm
appears to be the same (with UserList=true in /etc/kde/kdm/kdmrc).


Are you expecting all the users to be listed or just the ones that have
used that computer before?

I would have assumed all users (uid 1000 up).

Displaying all users might be applicable in very small networks with
very small numbers of users, but would not be helpful behaviour in larger
networks with larger numbers (100s/1000s) of users/accounts.

I agree; on larger or more security-sensitive networks this is likely to 
be the desired behaviour. On a small home network it would be nice to 
have options to at least fill in the username of the last user and/or 
list the users as well as provide a username box.





It might show all local users, but at least with LDAP, it only shows
users that have actually logged in at some point.  The AccountService is
involved in that.  I don't know if it's possible for the login manager
to find all the users from an LDAP source, but if it could, that would
most likely be too large a list to be displaying like that anyway.  I
don't know if NIS works differently, but I expect it would have the same
issues.

I don't know much about LDAP, but in NIS-enabled networks, all non-local
users/accounts usually are available at once.
F23's (and before) SDDM presented a list/pictures of all users in the 
NIS list with no obvious way of turning this off, which wasn't good either.




Ralf




Re: Fedora25: SDDM not showing NIS accounts

2016-12-04 Thread Terry Barnaby

On 05/12/16 06:43, Samuel Sieb wrote:

On 12/04/2016 07:42 AM, Terry Barnaby wrote:

In F25 the SDDM no longer shows a list of users when NIS (ypbind) is
enabled and also it does not default to the last user logged in. kdm
appears to be the same (with UserList=true in /etc/kde/kdm/kdmrc).


Are you expecting all the users to be listed or just the ones that have
used that computer before?

I would have assumed all users (uid 1000 up).




Note this is on a KDE spin install, but I assume the SDDM would be the
same as for a Gnome install.


A Gnome install uses GDM as the login manager and it still shows the
list of users that have logged in using freeipa.

Ah, ok I will go ask this in the KDE list.


Fedora25: SDDM not showing NIS accounts

2016-12-04 Thread Terry Barnaby
In F25 the SDDM no longer shows a list of users when NIS (ypbind) is 
enabled and also it does not default to the last user logged in. kdm 
appears to be the same (with UserList=true in /etc/kde/kdm/kdmrc).


Note this is on a KDE spin install, but I assume the SDDM would be the 
same as for a Gnome install.


Although being able to hide the user list and enter a user name is 
generally good for a server-based system with many users, it would be 
good to have this as an option for smaller networks.


Is this a bug or a new feature ?



F19: thunderbird-24.5.0-1.fc19.i686.rpm package issue

2014-05-06 Thread Terry Barnaby
Hasn't anyone else come across the problem with a corrupted
current thunderbird-24.5.0-1.fc19.i686.rpm package in the updates
repository ?

https://bugzilla.redhat.com/show_bug.cgi?id=1093927

Big issue here is that it has a duff MD5 checksum but
yum/rpm still tries to install it ...
-- 
users mailing list
users@lists.fedoraproject.org
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
Have a question? Ask away: http://ask.fedoraproject.org


Re: F19: thunderbird-24.5.0-1.fc19.i686.rpm package issue

2014-05-06 Thread Terry Barnaby

On 05/06/2014 08:02 PM, Joe Zeff wrote:

On 05/06/2014 12:26 AM, Terry Barnaby wrote:

Hasn't anyone else come across the problem with a corrupted
current thunderbird-24.5.0-1.fc19.i686.rpm package in the updates
repository ?


Yeah.  It's an inconvenience, but as long as yumex handles the rest of the daily
updates correctly, I'm not too worried.
It's not really the inconvenience of the bad package that is an issue here; 
those things do happen.


It is the fact that the package is signed with an MD5 checksum to validate its 
contents. The MD5 signature showed the package was incorrect, but yum tried
to install it anyway. This seems to suggest that a package's contents can be
modified after signing and still be installed without any error notified ...

Shouldn't yum/rpm refuse to install if the MD5 checksum is wrong ?


Re: Building X11R6.8 in Fedora 15 - Flex not found

2011-09-27 Thread Terry Barnaby
On 09/27/2011 04:42 PM, Doug Kuvaas wrote:
 On Mon, Sep 26, 2011 at 11:56 PM, Don Quixote de la Mancha
 quix...@dulcineatech.com  wrote:
 X11 is supposed to be upwards binary compatible.  If it won't run old
 applications, you should file a bug with Fedora.  If you can figure
 out what the difference is between the old X and the new X that
 enables your app to run on the old X, mention that in your bug report.

 --
 Don Quixote de la Mancha
 quix...@dulcineatech.com

 Custom Software Development for the iPhone and Mac OS X
 http://www.dulcineatech.com/custom-software-development/


 The problem that I am running into is not actually a bug in X, it
 appears to be an incorrect assumption made in our X client. Namely,
 the Xserver is not required to provide the backing store, and may
 refuse it if resources are unavailable.  I have also been posting to
 the Xorg mailing list and have received an answer stating this.  Xorg
 seems to be moving away from supporting some of the old technologies
 that were around a long time again and are now mostly dead.  The
 legacy application I am trying to run was written back when 4 MBit
 Token Ring networks were considered High-speed, and serial terminals
 were not uncommon.  This isn't a problem going forward, as we are
 moving away from using this display technology in favor of using a web
 application, however we have numerous customers still utilizing the
 old system.  Since our software application is more or less custom
 built for each customer, fixing our software may be a larger task than
 finding a way to make new hardware work.  Someone in this thread did
 mention a resources file.  This might be something to look at,
 especially since it doesn't involve changing source code.
Depending on the exact issue with needing backing store, you might be able to 
get around this by using one of the newer X-Server features, compositing.
If your graphics driver supports this it will normally need to be enabled from 
the desktop settings. Alternatively, if that is not suitable, modifying
the code to use a set of pixmap buffers whose window contents are saved/restored 
on raise/lower events, or whatever is needed, shouldn't be too hard.

Also it may be possible to rebuild the F15 X-Server with backing store enabled. 
It used to be a configurable item during the build ...

Cheers


Terry


Re: Developers responsibillity to Fedora Users

2011-09-27 Thread Terry Barnaby
On 09/27/2011 02:00 PM, Ian Malone wrote:
 On 27 September 2011 10:59, Andrew Haleya...@redhat.com  wrote:
 On 09/26/2011 11:59 PM, Roger wrote:

 Some say that the new Fedora GUI is unhelpful and possibly difficult to
 use, preferring a simpler desktop.

 Ahh, this is all about GNOME 3.  It's very unfair to describe the
 actions/attitude of GNOME developers at that of all Fedora developers.

 Yes, the KDE spin has been around for a long time and I see there are
 now XFCE and LXDE spins.


 GNOME 3 is, to say the least, controversial. Ubuntu Unity hasn't had a
 uniformly great reception either.  GNOME 3 is still in its first
 release, and it'll be interesting to see how it develops.


 3.2 is out September/October 2011
 (http://library.gnome.org/misc/release-notes/3.0/#rnlookingforward)
 while the new features and enhancements listed don't resemble the
 problems people have had we might hope some of the issues will have
 been addressed.

I don't use Gnome myself, mainly KDE.
But it seems like a lot of people would like Gnome2 back. Why doesn't someone
who has a problem with it just rebuild and release Gnome2 for F15 (with a 
different package name) ?



Re: Fedora14: Very very slow NFS write performance

2011-09-26 Thread Terry Barnaby
On 09/26/2011 01:08 PM, Dr. Michael J. Chudobiak wrote:
 On 09/25/2011 05:45 AM, Terry Barnaby wrote:
 Anyone know why the NFS write performance with Fedora14 may be slow (without
 async) ?
 I have Gigabit networking which is all working fine and the systems in 
 question
 have been running Fedora in various forms for many years.

 I found NFSv4 to be completely unusable on my F14 systems. I could never
 figure out where the problem was. It was unusable on my F14 clients when
 I had a F13 server, and once I upgraded the server to F14, my F13
 clients became unusable...

 In desperation, I had to move to glusterfs, which fixed all my problems:

 http://www.mail-archive.com/users@lists.fedoraproject.org/msg40298.html

 I've been very happy with it since the forced migration! It has a lot
 fewer oddities than NFS, and it handles sqlite/firefox file-locking just
 fine, unlike some (all?) NFS variants.

 - Mike
Hi,

Thanks for the info. Might try that, but I still need NFS for other systems
and, having used it for almost 30 years, am a bit used to it !
Actually I haven't really had any real problems with NFSv4 under F13 or F14,
both at work and home (each network has one server and about 6 clients mounting 
/home and /data). The only real issues have been performance over an OpenVPN 
connection over ADSL, which is not too surprising (although an ls -l takes
significantly longer than it really should), and this performance issue when 
writing multiple files.

Does glusterfs support client-side file and attribute caching (cachefilesd is 
used with NFS) ? I use that with NFS over the OpenVPN/ADSL link, which helps a
bit (although I think it should work much better than it actually does).

Cheers



Terry


Re: Building X11R6.8 in Fedora 15 - Flex not found

2011-09-26 Thread Terry Barnaby
On 09/26/2011 09:49 PM, Doug Kuvaas wrote:
 I am trying to build an older version of X Server on Fedora 15 to
 allow me to run a legacy application.  Fedora 8, which functions
 correctly for our application, will not install on a new computer for
 some reason, presumably because the new hardware uses UEFI, and Fedora
 15 does not correctly run the application.

 When I try to build X11R6.8 from source, I get the following error
 ld:  cannot find -lfl.  Doing some digging, I found that this error
 is because the linker cannot find the flex library, however flex is
 installed.  I am running a minimal install of Fedora 15 i686 with the
 dev libraries and tools installed, nothing else.

 Kernel Version is 2.6.38.6-26.rc1.fc15.i686.PAE

 Does anyone have any ideas on how to clear this error?  All the
 answers I found just said to install flex.
You may have to install the package flex-static to get a static version of
the flex library.

However, I don't think you will have much luck building all of
X11R6.8 and getting it to work under F15, at least not the XServer.
The graphics driver, DRM and kernel API interfaces have changed significantly
over the years. Also it is likely that the graphics board you are using
isn't even supported ...


Re: Building X11R6.8 in Fedora 15 - Flex not found

2011-09-26 Thread Terry Barnaby
On 09/26/2011 10:15 PM, Doug Kuvaas wrote:
 On Mon, Sep 26, 2011 at 4:02 PM, Terry Barnabyter...@beam.ltd.uk  wrote:
 On 09/26/2011 09:49 PM, Doug Kuvaas wrote:

 I am trying to build an older version of X Server on Fedora 15 to
 allow me to run a legacy application.  Fedora 8, which functions
 correctly for our application, will not install on a new computer for
 some reason, presumably because the new hardware uses UEFI, and Fedora
 15 does not correctly run the application.

 When I try to build X11R6.8 from source, I get the following error
 ld:  cannot find -lfl.  Doing some digging, I found that this error
 is because the linker cannot find the flex library, however flex is
 installed.  I am running a minimal install of Fedora 15 i686 with the
 dev libraries and tools installed, nothing else.

 Kernel Version is 2.6.38.6-26.rc1.fc15.i686.PAE

 Does anyone have any ideas on how to clear this error?  All the
 answers I found just said to install flex.

 You may have to install the package flex-static to get a static version of
 the flex library.

 However, I don't think you will have much luck in building all of
 X11R6.8 and getting it to work under F15, at least not the XServer.
 The graphics driver, DRM and kernel API interfaces have significantly
 changed
 over the years. Also it is likely that the graphics board you are using
 isn't even supported ...


 Thanks, this did fix the error that I was getting.  I don't have a
 whole lot of faith in this as a possible solution, but it seems like
 it may be about the only thing I can do.  Since the application I am
 looking to run is 2D graphics only, just using the VESA driver is good
 enough.
Is there any mileage in fixing whatever stops your application running
correctly under F15?
Do you know if it is an X Windows library issue or an X server issue (will it run
under F15 with its GUI displayed on an F8 system, or vice versa, for example)?
Is it a desktop issue? Try using it under different desktop systems...
Terry


Fedora14: Very very slow NFS write performance

2011-09-25 Thread Terry Barnaby
Does anyone know why NFS write performance with Fedora 14 might be this slow
(without async)?
I have Gigabit networking which is all working fine and the systems in question
have been running Fedora in various forms for many years.

Writing a single large file across NFS is fine: about 32 MBytes/sec without
async and 51 MBytes/sec with async. However, if I untar a 30 MByte tar file
that has 2466 files in it into an NFS-mounted directory, with async it takes
about 8 seconds but without async it takes over 9 minutes !!

Is this expected ?
If not, any ideas what is wrong?

I am seeing the following:

data.bin binary file 100MB
jpgraph-3.5.0b1.tar tar archive 29MB 2466 files

# Test1, defaults: nfs version 4,
Server /etc/exports: /data *.kingnet(rw)
Client /etc/fstab:  king.kingnet:/data /data nfs defaults 0 0

dd if=/tmp/data.bin of=/data/tmp/data.bin bs=102400
32.9 MB/s

dd if=/data/tmp/data.bin of=/tmp/data1.bin bs=102400
66.5 MB/s

time tar -xf /tmp/jpgraph-3.5.0b1.tar
real    9m35.235s

# Test2, nfs version 4, async
Server /etc/exports: /data *.kingnet(rw,async)
Client /etc/fstab:  king.kingnet:/data /data nfs defaults 0 0

dd if=/tmp/data.bin of=/data/tmp/data.bin bs=102400
51.3 MB/s

dd if=/data/tmp/data.bin of=/tmp/data1.bin bs=102400
126 MB/s

time tar -xf /tmp/jpgraph-3.5.0b1.tar
real    0m7.938s

# Test3, nfs version 3, async
Server /etc/exports: /data *.kingnet(rw,async)
Client /etc/fstab:  king.kingnet:/data /data nfs defaults,nfsvers=3 0 0

dd if=/tmp/data.bin of=/data/tmp/data.bin bs=102400
52.6 MB/s

dd if=/data/tmp/data.bin of=/tmp/data1.bin bs=102400
146 MB/s

time tar -xf /tmp/jpgraph-3.5.0b1.tar
real    0m4.920s
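For anyone wanting to reproduce the small-file case, the untar test above is easy to script. This is a minimal sketch with made-up file counts; it builds its tarball in a local temporary directory so it is self-contained. Point mnt at an NFS-mounted path (e.g. /data/tmp) to measure a real export.

```shell
#!/bin/sh
# Sketch of the small-file untar benchmark.  The target directory here is
# a local mktemp stand-in; set mnt to an NFS-mounted directory to test a
# real export.  The jpgraph archive in the tests above had 2466 files;
# 100 is used here just to keep the sketch quick.
workdir=$(mktemp -d)
mnt=$workdir/target          # replace with an NFS mount to reproduce
mkdir -p "$mnt" "$workdir/src"

# Build a tarball of many small files.
i=0
while [ "$i" -lt 100 ]; do
    head -c 512 /dev/zero > "$workdir/src/file$i"
    i=$((i + 1))
done
tar -cf "$workdir/files.tar" -C "$workdir/src" .

# Time the metadata-heavy extraction: one file create per archive member.
start=$(date +%s)
tar -xf "$workdir/files.tar" -C "$mnt"
end=$(date +%s)
nfiles=$(ls "$mnt" | wc -l)
echo "extracted $nfiles files in $((end - start))s"
```

On a local directory this completes almost instantly; the interesting numbers come from pointing mnt at sync and async NFS exports in turn.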

Terry


Re: Fedora14: Very very slow NFS write performance

2011-09-25 Thread Terry Barnaby
On 09/25/2011 04:39 PM, JB wrote:
 Terry Barnaby terry1 at beam.ltd.uk writes:

 ...

 It would be useful to publish a test for
 # Test4, defaults: nfs version 3
 that is, with the sync option, so we could see if similar degradation is
 present with the older protocol.

 One other thing: it would be interesting to see the results if, instead of
 in-place (/data/...) execution of
 time tar -xf ...

 you first untarred your files into a non-NFS dir
 mkdir -p /tmp/untared
 cd /tmp/untared
 tar -xf /tmp/jpgraph-3.5.0b1.tar

 and then executed
 'time cp /tmp/untared/* /data/tmp/'

 JB


Thanks for the reply. The extra test data is:

# Test1, defaults: nfs version 4, sync
Server /etc/exports: /data *.kingnet(rw)
Client /etc/fstab:  king.kingnet:/data /data nfs defaults 0 0

dd if=/tmp/data.bin of=/data/tmp/data.bin bs=102400
32.9 MB/s

dd if=/data/tmp/data.bin of=/tmp/data1.bin bs=102400
66.5 MB/s

time tar -xf /tmp/jpgraph-3.5.0b1.tar
real    9m35.235s

time cp -a /tmp/jpgraph-3.5.0b1 /data/tmp
real    6m6.373s

# Test4, nfs version 3, sync
Server /etc/exports: /data *.kingnet(rw)
Client /etc/fstab:  king.kingnet:/data /data nfs defaults,nfsvers=3 0 0

dd if=/tmp/data.bin of=/data/tmp/data.bin bs=102400
34.6 MB/s

dd if=/data/tmp/data.bin of=/tmp/data1.bin bs=102400
120 MB/s

time tar -xf /tmp/jpgraph-3.5.0b1.tar
real    4m20.355s

time cp -a /tmp/jpgraph-3.5.0b1 /data/tmp
real    6m8.394s

Basically, NFS with sync is very slow for both NFS versions 3 and 4, with
NFSv4 seeming the slowest.

Note that CPU usage on both client and server is virtually nil during the
tests (the server is waiting on I/O 48% of the time on a dual-core system).
The server has 2 x SATA disks in RAID-1 for /data.
I wonder if NFS is doing a complete sync() to disk on each file close ??
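That suspicion can be illustrated locally. Forcing a flush per file with dd's conv=fsync is a rough, local stand-in for what a sync export makes the server do before replying to each small-file write; this is a sketch of the cost model, not a claim about the actual NFS server internals, and the directory and counts are made up.

```shell
#!/bin/sh
# Rough local illustration of per-file flush cost.  conv=fsync makes dd
# call fsync on each file before exiting -- loosely analogous to a sync
# NFS export committing every small file to stable storage before it
# acknowledges the write.
dir=$(mktemp -d)
i=0
while [ "$i" -lt 20 ]; do
    dd if=/dev/zero of="$dir/f$i" bs=1k count=1 conv=fsync 2>/dev/null
    i=$((i + 1))
done
count=$(ls "$dir" | wc -l)
echo "wrote $count files with one fsync each"
rm -rf "$dir"
```

Timing this loop with and without conv=fsync on a rotating disk shows the same shape of slowdown as the sync-vs-async tar tests, scaled down.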

Cheers


Terry


Re: Fedora14: Very very slow NFS write performance

2011-09-25 Thread Terry Barnaby
On 09/25/2011 06:13 PM, Don Quixote de la Mancha wrote:
 Are you using userspace NFS or the kernel NFS?  The kernel NFS
 _should_ be faster.

 On Sun, Sep 25, 2011 at 10:02 AM, Terry Barnaby ter...@beam.ltd.uk wrote:
 I wonder if NFS is doing a complete sync() to disk on each file close ??

 If you are using the userspace NFS, the strace command will show all
 the system calls it makes.  Run strace against the NFS daemon and
 untar a small tarball with just a few files, otherwise it will spew
 mountains of logs.

 strace -f will trace child processes in the event that the daemon
 forks when you use it.

As far as I'm aware, kernel-space NFS has been the default for many years.
I will check to see if user-space NFS is somehow being used ...


Re: Fedora14: Very very slow NFS write performance

2011-09-25 Thread Terry Barnaby
On 09/25/2011 06:41 PM, JB wrote:
 Terry Barnaby terry1 at beam.ltd.uk writes:

 ...
 # Test1, defaults: nfs version 4, sync
 Server /etc/exports: /data *.kingnet(rw)
 Client /etc/fstab:   king.kingnet:/data /data nfs defaults 0 0

 dd if=/tmp/data.bin of=/data/tmp/data.bin bs=102400
 32.9 MB/s

 dd if=/data/tmp/data.bin of=/tmp/data1.bin bs=102400
 66.5 MB/s  +++

 time tar -xf /tmp/jpgraph-3.5.0b1.tar
 real    9m35.235s  ===

 time cp -a /tmp/jpgraph-3.5.0b1 /data/tmp
 real    6m6.373s

 # Test4, nfs version 3, sync
 Server /etc/exports: /data *.kingnet(rw,async)
 Client /etc/fstab:   king.kingnet:/data /data nfs defaults,nfsvers=3 0 0

 dd if=/tmp/data.bin of=/data/tmp/data.bin bs=102400
 34.6 MB/s

 dd if=/data/tmp/data.bin of=/tmp/data1.bin bs=102400
 120 MB/s  +++

 time tar -xf /tmp/jpgraph-3.5.0b1.tar
 real    4m20.355s  ===

 time cp -a /tmp/jpgraph-3.5.0b1 /data/tmp
 real    6m8.394s
 ...

 The results marked (===, +++) show a regression for NFS v4 versus NFS v3:
 roughly a 2x slowdown in each marked test case.
 I think you should let them know (BZ report) - the type of test performed
 would be of interest to them.

 I would suggest this as well:
 to rule out a possible network problem, you could perform the NFS
 tests (as above) locally (on the client machine in question); that is,
 install the NFS server and client software locally, export some dir
 locally, and mount the exported dir locally.
 That way, you would mostly be testing the NFS software only.

 JB


I don't think it is a network issue: iperf shows the network is OK, as do
the NFS tests with async set.
Anyway a test with local nfs:
time tar -xf /tmp/jpgraph-3.5.0b1.tar
real    3m27.320s

The local disk access speeds in both systems are fine.
time (tar -xf /tmp/jpgraph-3.5.0b1.tar; sync)
real    0m2.816s

Cheers


Terry



Re: Fedora14: Very very slow NFS write performance

2011-09-25 Thread Terry Barnaby
On 09/25/2011 07:03 PM, Don Quixote de la Mancha wrote:
 Try writing to a different type of filesystem.  It might be the
 filesystem's fault.  MacTCP on the Classic Mac OS got a real bad rap
 because FTP writes were very slow, but it was easy to show that the
 problem was in the Hierarchical File System.

 Is your destination filesystem journaled?  Maybe flushing the journal
 is causing your hit.

I don't think it is the file system's fault, at least not directly; NFS
with async is fine. Both systems are using ext4, so they are journaled.

Cheers

Terry


Fedora14: udev and $tempnode

2011-04-19 Thread Terry Barnaby
Hi,

I am trying to get the Xilinx FPGA tools USB cables running under Fedora14
and have come up against a problem. The Xilinx system includes
a udev rules file that has a line entry like:

SUBSYSTEMS=="usb", ACTION=="add", ATTRS{idVendor}=="03fd",
ATTRS{idProduct}=="000d", RUN+="/sbin/fxload -v -t fx2 -I
/usr/share/xusb_emb.hex -D $tempnode"

After a bit of debugging: when I plug in the USB device the rule gets
activated, but the value of $tempnode is empty (i.e. the null string). The
udev documentation states that this gets filled in with the name of a
temporary device node.

Any ideas on why this is not happening with F14 (it works fine under RHEL5)?
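For what it's worth: $tempnode was deprecated in later udev releases in favour of $devnode, which may be why newer udev substitutes nothing for it. A possible (untested, hypothetical) rewrite of the Xilinx rule, assuming the F14 udev supports $devnode:

```
SUBSYSTEMS=="usb", ACTION=="add", ATTRS{idVendor}=="03fd", ATTRS{idProduct}=="000d", RUN+="/sbin/fxload -v -t fx2 -I /usr/share/xusb_emb.hex -D $devnode"
```

If that substitutes nothing either, running udevadm test against the device's sysfs path should show what each variable actually expands to on that udev version.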


Cheers


Terry



Re: Advanced format drives with block errors

2011-02-06 Thread Terry Barnaby
On 02/05/2011 08:28 PM, compdoc wrote:
 One of these drives had a faulty block which the drive had not been able
 to automatically relocate.


 Just curious, what's the reallocated sector count for the drive? And how
 many bad sectors do you feel comfortable with?

 I used to live with drives with a low reallocated sector count, but I
 recently noticed a problem: when a drive develops a new bad sector it can
 cause one of two things to happen to a server:

 1) if the drive is part of a 3ware raid5 array, it is dropped by the
 controller, causing a degraded array.

 2) if the drive is connected directly to the motherboard's onboard sata
 ports, it causes the system to hang for a period of time. Completely
 unresponsive.


 So now I change the drive if a drive's reallocated sector count is greater
 than zero.

 By the way, I hear the WD green drives spin down after a period of time with
 no access. Is this a problem for you?







The Reallocated_Event_Count is now 0 ...
I suspect that the drive managed to write over the duff sector.
When I wrote over the sector with dd, the Current_Pending_Sector
went down to 0 and short tests returned no errors, but
Offline_Uncorrectable and Multi_Zone_Error_Rate went up to 1.
After some time (10 hours?), the drive reset the
Offline_Uncorrectable and Multi_Zone_Error_Rate values to 0.

As to how many I feel comfortable with: 0 ...
I haven't had any known hard bad sectors on disks for a number of years
now, but I've just had a spate with WD20EARS drives. One relatively
new one (3 months old) reported 17 bad sectors all in a row. This sounded
like a head crash to me, so I asked Western Digital and they replaced
the drive. The replacement had this single bad sector after copying on
1.5 TB of data, and I thought it was worth fixing this one. The drive is
going to be used as a non-powered backup drive for my MythTV video archive
and won't be backing up any important data.
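As an aside, the counts being compared here can be pulled out of smartctl output mechanically. A sketch parsing a canned attribute table: the here-string below stands in for real `smartctl -A /dev/sdb` output and its values are invented, but the column layout is the usual smartmontools one.

```shell
#!/bin/sh
# Extract SMART attribute raw values from smartctl -A style output.
# The sample text is hypothetical; on a real drive replace it with:
#   sample=$(smartctl -A /dev/sdb)
sample='  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1'

# The raw value is the last column of each attribute row.
realloc=$(printf '%s\n' "$sample" | awk '/Reallocated_Sector_Ct/ {print $NF}')
pending=$(printf '%s\n' "$sample" | awk '/Current_Pending_Sector/ {print $NF}')
echo "reallocated=$realloc pending=$pending"
```

Something like this makes it easy to cron a check and swap a drive as soon as the reallocated count goes above whatever threshold you are comfortable with.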

Yes, the WD green drives park the heads and, I think, lower the spin speed
after a time. They cannot be used in RAID arrays, at least if you want
more than a few megabytes/sec out of them. Been there, done that.

You have to make sure the drives are rated as suitable for RAID to
use them reliably under RAID. There are features needed, such as TLER
(time-limited error recovery). Modern desktop drives have many
optimisations and other features, such as spinning faster or slower,
parking heads, and trying for long periods to recover data from duff
sectors. The TLER issue will cause the problems you mentioned.
A RAID-rated drive should only try for a short time to recover a duff
sector; if it takes too long, the OS will assume the drive is dead and
take it out of the array. The OS is happy for the drive to report that
the sector is duff, as it may be able to work around this: with a RAID
array the OS can re-write the block, since it knows the data from the
other drive or drives. A desktop drive's own recovery system will just
get in the way.

At work we only use RAID rated drives. At home I have used normal drives
in the past.

I did try to use some Green drives in a RAID array on my home server
to lower power usage, and it caused no end of problems, the main one being
performance. The data access speed would drop from around 90 MBytes/sec
to around 5 MBytes/sec and stay there.
I'm not sure exactly what was happening, but I suspect interaction between
the Linux kernel and the two Green drives as they sped up and down, or the
drives re-ordering blocks to their own scheme and conflicting with the
kernel's ordering. Whatever it was, after reading about TLER and trying to
sort out my performance issues, I will always make sure the drives are
rated for RAID now ...

Terry




Advanced format drives with block errors

2011-02-05 Thread Terry Barnaby
Hi,

Just a bit of info. I have some Western Digital Caviar Green (Adv. Format),
WD20EARS drives. These have the new 4096 byte physical sector.
One of these drives had a faulty block which the drive had not been able
to automatically relocate.

I tried to force a relocation by overwriting the block with dd:
dd if=/dev/zero of=/dev/sdb count=8 seek=694341800

This failed with a write error and a kernel message:
sd 3:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed

Eventually I tried:
dd if=/dev/zero of=/dev/sdb bs=4096 count=1 seek=86792725

This worked. It makes sense, I guess: with the first dd it may have tried
to do a single 512-byte block write using a read/modify/write cycle, which
would fail as the drive could not read in the 4096-byte block to modify
the 512 bytes contained within.
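The arithmetic behind the two commands can be checked directly. A small sketch, using the LBA from the failing dd above:

```shell
#!/bin/sh
# Convert the failing 512-byte LBA into the 4096-byte-block seek used by
# the second (working) dd command: there are 8 x 512-byte sectors per
# 4 KiB physical block.
lba=694341800
rem=$((lba % 8))        # 0 means the sector sits on a 4 KiB boundary
seek4k=$((lba / 8))
echo "remainder=$rem seek4k=$seek4k"
echo "dd if=/dev/zero of=/dev/sdb bs=4096 count=1 seek=$seek4k"
```

If the remainder were non-zero you would also need to round down to the start of the 4 KiB block and overwrite the whole block, since the drive can only rewrite complete physical sectors.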

I wonder what would happen if a program creates a file that ends up spanning
a duff block on one of these drives? With a 512-byte-sector drive, the drive
would automatically relocate the sector and no one would notice. What would
happen with a 4096-byte-sector drive?
Will the kernel issue 4096-byte writes or multiple 512-byte writes? If the
latter (and I guess it depends on the program), then the file
write will fail and manual block repair would be needed. That would not
be good ...

Perhaps this is one thing to watch out for when using these 4096-byte-sector
drives.


Re: Fedora 14: Shutdown problem

2011-02-04 Thread Terry Barnaby
On 02/04/2011 12:21 PM, Tim wrote:
 On Thu, 2011-02-03 at 20:47 +, Terry Barnaby wrote:
 There is no need for NetworkManager in a home network anyway, it
 just adds complications, gets in the way and consumes resources.

 Maybe on *your* home network.  But it's good for mine, and when I take
 the laptop around to visit other places.  I just connect to it, be it
 wireless or ethernet.  I don't have to fiddle around with reconfiguring
 the network for each one.

True, for laptops NetworkManager is good. As I said, on my laptop I do
have a separate init script that automatically chooses between
NetworkManager and the network service depending on where I am (it does
an iwlist | grep on the WiFi).

However, in my case, at home most of my systems are desktops, set-top
boxes, kitchen radios etc. with no WiFi, just hard-wired Ethernet. They
all, including the laptop, have /home and /data mounted with NFS and use
NIS, NTPD etc. They all run from a local Linux server. So all the users
and other data (videos, music, documents etc.) are on the server; the PCs
just have the OS, and anyone can log in from anywhere.

I did try to use NetworkManager for this, but it had problems with NFS,
NIS and sleeping, and in the end I could not use it (there are a few bug
reports in Bugzilla which I suspect are still there).



Re: Fedora 14: Shutdown problem

2011-02-03 Thread Terry Barnaby
On 02/01/2011 06:30 PM, JB wrote:
 Terry Barnaby terry1 at beam.ltd.uk writes:

 ...
 Note I am using the network not NetworkManager service. The 
 NetworkManager
 service does not work well for me with systems using networked /home and
 other file systems.
 ...

 Looks like the reason for this is:

 # ls /etc/rc0.d/*
 ...
 S00killall
 S01halt

 # cat /etc/rc0.d/S00killall
 ...
  # Networking could be needed for NFS root.
  [ $subsys = network ] && continue
 ...

 The NetworkManager is set in
 /var/lock/subsys/NetworkManager
 but it is not skipped as well, so the network is brought down.

 Next, the halt script is executed that will try to unmount external fs shares
 (e.g. nfs), and of course will fail.

 The fix:
  # Networking could be needed for NFS root.
  [ $subsys = NetworkManager ] && continue

 Will that make you use NetworkManager now ? :-)

 JB


No, it won't make me use NetworkManager!

I am not using NFS root, only /home and /data. When I used NetworkManager
it had problems on my laptop and desktop PCs, when going into and coming
out of sleep, with the NFS mounts. I can't remember the exact details, but
it was thought to be unsolvable at the time.
There is no need for NetworkManager on a home network anyway; it
just adds complications, gets in the way and consumes resources.

I do use it on my laptop when roaming, however, to allow easy access
to WiFi networks.

Cheers


Terry


Re: Fedora 14: Shutdown problem

2011-01-30 Thread Terry Barnaby
On 01/29/2011 11:42 PM, JB wrote:
 Terry Barnaby terry1 at beam.ltd.uk writes:

 ...

 Give us unedited outputs:
 $ cat /etc/fstab
 $ cat /etc/mtab
 $ cat /proc/mounts

 JB




The above files:


/etc/fstab
=

#
# /etc/fstab
# Created by anaconda on Fri Nov 26 19:45:43 2010
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=5f18c8b8-2817-47cd-90d0-44f4dcf063de /           ext4    defaults        1 1
UUID=450baf62-09d3-4618-bcd7-79bd440a6c71 swap        swap    defaults        0 0
tmpfs                   /dev/shm        tmpfs   defaults        0 0
devpts                  /dev/pts        devpts  gid=5,mode=620  0 0
sysfs                   /sys            sysfs   defaults        0 0
proc                    /proc           proc    defaults        0 0
king.kingnet:/home      /home           nfs     defaults        0 0
king.kingnet:/data      /data           nfs     defaults        0 0
#king.kingnet:/data/video /data/video   nfs     defaults        0 0
king.kingnet:/var/cache/yum /var/cache/yum nfs  defaults        0 0
/dev/sdb1               /datal          auto    defaults        1 2

/etc/mtab
=
/dev/sda1 / ext4 rw 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
devpts /dev/pts devpts rw,gid=5,mode=620 0 0
tmpfs /dev/shm tmpfs rw 0 0
/dev/sdb1 /datal ext3 rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
king.kingnet:/home /home nfs rw,addr=192.168.2.1 0 0
king.kingnet:/data /data nfs rw,vers=4,addr=192.168.2.1,clientaddr=192.168.2.2 0 0
king.kingnet:/var/cache/yum /var/cache/yum nfs rw,vers=4,addr=192.168.2.1,clientaddr=192.168.2.2 0 0
fusectl /sys/fs/fuse/connections fusectl rw 0 0
gvfs-fuse-daemon /home/dawn/.gvfs fuse.gvfs-fuse-daemon rw,nosuid,nodev,user=dawn 0 0

/proc/mounts
=
rootfs / rootfs rw 0 0
/proc /proc proc rw,relatime 0 0
/sys /sys sysfs rw,relatime 0 0
udev /dev devtmpfs rw,relatime,size=506280k,nr_inodes=126570,mode=755 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,relatime 0 0
/dev/sda1 / ext4 rw,relatime,barrier=1,data=ordered 0 0
/proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
/dev/sdb1 /datal ext3 rw,relatime,errors=continue,barrier=0,data=ordered 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
king.kingnet:/home /home nfs rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.2.1,mountvers=3,mountport=48450,mountproto=udp,addr=192.168.2.1 0 0
king.kingnet:/data/ /data nfs4 rw,relatime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.2.2,minorversion=0,addr=192.168.2.1 0 0
king.kingnet:/var/cache/yum/ /var/cache/yum nfs4 rw,relatime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.2.2,minorversion=0,addr=192.168.2.1 0 0
/etc/auto.misc /misc autofs rw,relatime,fd=6,pgrp=1539,timeout=300,minproto=5,maxproto=5,indirect 0 0
-hosts /net autofs rw,relatime,fd=12,pgrp=1539,timeout=300,minproto=5,maxproto=5,indirect 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
gvfs-fuse-daemon /home/dawn/.gvfs fuse.gvfs-fuse-daemon rw,nosuid,nodev,relatime,user_id=1020,group_id=1020 0 0



Cheers


Terry


Re: Fedora 14: Shutdown problem

2011-01-30 Thread Terry Barnaby
On 01/30/2011 02:11 PM, JB wrote:
 JB jb.1234abcd at gmail.com writes:


 # 
 # debugging snapshot statements
 # 
 date >> /halt.debug
 cat /etc/mtab >> /halt.debug
 cat /proc/mounts >> /halt.debug
 # 


 I think a correction is needed, as /proc is not available any more: it
 was unmounted immediately prior to our debugging statements.
 So, remove this line:
 cat /proc/mounts >> /halt.debug

 JB


I added the debug, and basically the output was the same when it shut down
cleanly and when it failed.

# A bad one
Sun Jan 30 17:12:08 GMT 2011
/dev/sda1 / ext4 rw 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
Mount:
/dev/sda1 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
fstab-decode mount -n -o ro,remount /dev/sda1 /
fstab-decode mount -n -o ro,remount proc /proc
fstab-decode mount -n -o ro,remount sysfs /sys

# A good one: / has been remounted ro, and so the last two remount commands
are not present
Sun Jan 30 17:18:16 GMT 2011
/dev/sda1 / ext4 rw 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
Mount:
/dev/sda1 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
fstab-decode mount -n -o ro,remount /dev/sda1 /

I put a /bin/sh after this so I could have a look at the system's state at
this point when the remount failed. The last few lines of the ps ax listing
are shown:

  1282 ?S  0:00 [rpciod/1]
  1378 ?S  0:00 [nfsiod]
  1381 ?S  0:00 [lockd]
  1960 ?D  0:00 [flush-0:19]
  2006 ?Zl 0:00 [akonadi_control] defunct
  2008 ?Z  0:00 [akonadiserver] defunct
  2010 ?Zl 0:00 [mysqld] defunct
  2125 ?Ds 0:00 [pulseaudio]
  2332 ?Z  0:00 [gconf-helper] defunct
  2365 ?D  0:00 [dcopserver]
  2448 ?Ss 0:00 /bin/bash /etc/rc0.d/S01halt start
  3001 ?S  0:00 /bin/sh
  3019 ?R  0:00 ps ax

It looks like some processes are left over from the GUI (KDE).
I suspect they have log files or something else opened on /
in write mode, and this is stopping the remount to ro from working.
Running mount -o remount,ro / at this point fails with "/ is busy".
They are probably waiting on /home, an NFS file system that
was unmounted earlier in the shutdown process.
I restarted the network and netfs services and these processes disappeared.
After shutting down netfs and network again, as well as some other leftover
processes, the remount command worked fine and the system shut down.

Note I am using the network service, not NetworkManager. The NetworkManager
service does not work well for me on systems using a networked /home and
other file systems.

I suspect an issue further up the shutdown chain: the system should
wait for all of the processes to shut down before unmounting the NFS file
systems. I will have a look here; any ideas?
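One way to look for the culprits just before the remount is to scan /proc for processes that still hold files open under a given mount point, roughly what `fuser -vm /home` reports. A self-contained sketch: the temporary directory and background sleep are stand-ins for /home and a leftover KDE process, not real system paths.

```shell
#!/bin/sh
# Find processes holding files open under a directory by scanning
# /proc/*/fd symlinks, similar in spirit to `fuser -vm <mountpoint>`.
target=$(mktemp -d)
sleep 30 > "$target/held.log" &   # stand-in for a leftover desktop process
holder=$!

found=""
for fdlink in /proc/[0-9]*/fd/*; do
    dest=$(readlink "$fdlink" 2>/dev/null) || continue
    case "$dest" in
        "$target"/*) found="$found ${fdlink%%/fd/*}" ;;   # keep /proc/<pid>
    esac
done
echo "processes holding $target:$found"

kill "$holder"
rm -rf "$target"
```

Dropping a loop like this (pointed at /home) into S01halt just before the remount would show exactly which PIDs are blocking it; only processes owned by the same user (or run as root) are visible through /proc this way.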

Terry



Re: Fedora 14: Shutdown problem

2011-01-30 Thread Terry Barnaby
On 01/30/2011 05:55 PM, Terry Barnaby wrote:
 On 01/30/2011 02:11 PM, JB wrote:
 JB jb.1234abcd at gmail.com writes:


 # 
 # debugging snapshot statements
 # 
 date >> /halt.debug
 cat /etc/mtab >> /halt.debug
 cat /proc/mounts >> /halt.debug
 # 


 I think correction is needed as /proc is not available any more because it
 was unmounted immediatelly prior to our debugging statements.
 So, remove that:
 cat /proc/mounts >> /halt.debug

 JB


 I added the debug, and basically it was the same when it shutdown cleanly
 and when it failed.

 # A bad one
 Sun Jan 30 17:12:08 GMT 2011
 /dev/sda1 / ext4 rw 0 0
 proc /proc proc rw 0 0
 sysfs /sys sysfs rw 0 0
 Mount:
 /dev/sda1 on / type ext4 (rw)
 proc on /proc type proc (rw)
 sysfs on /sys type sysfs (rw)
 fstab-decode mount -n -o ro,remount /dev/sda1 /
 fstab-decode mount -n -o ro,remount proc /proc
 fstab-decode mount -n -o ro,remount sysfs /sys

 # A good one, / has been remounted ro and so the last two unmount messages are
 not present
 Sun Jan 30 17:18:16 GMT 2011
 /dev/sda1 / ext4 rw 0 0
 proc /proc proc rw 0 0
 sysfs /sys sysfs rw 0 0
 Mount:
 /dev/sda1 on / type ext4 (rw)
 proc on /proc type proc (rw)
 sysfs on /sys type sysfs (rw)
 fstab-decode mount -n -o ro,remount /dev/sda1 /

 I put a /bin/sh after this so I could have a look at the systems state at this
 point when the remount failed. The last few items of the ps ax list is 
 shown:

1282 ?S  0:00 [rpciod/1]
1378 ?S  0:00 [nfsiod]
1381 ?S  0:00 [lockd]
1960 ?D  0:00 [flush-0:19]
2006 ?Zl 0:00 [akonadi_control] defunct
2008 ?Z  0:00 [akonadiserver] defunct
2010 ?Zl 0:00 [mysqld] defunct
2125 ?Ds 0:00 [pulseaudio]
2332 ?Z  0:00 [gconf-helper] defunct
2365 ?D  0:00 [dcopserver]
2448 ?Ss 0:00 /bin/bash /etc/rc0.d/S01halt start
3001 ?S  0:00 /bin/sh
3019 ?R  0:00 ps ax

 It looks like some processes are left over from the GUI (KDE).
 I suspect they have log files or something else opened on /
 in write mode and this is stopping the remount to ro working.
 Running mount -o remount,ro / at this point fails with / is busy.
 They are probably waiting for /home, which is an NFS files system, that
 was unmounted earlier on in the shutdown process.
 I restarted the network and netfs and these processes disappeared. After
 shuting down netfs and network as well as some other processes left over
 the remount command worked fine and the system shutdown.

 Note I am using the network not NetworkManager service. The NetworkManager
 service does not work well for me with systems using networked /home and
 other file systems.

 I suspect an issue further up the shudown chain where the system should
 wait for all of the processes to shutdown before unmounting the NFS files
 systems. I will have a look here, any ideas ?

 Terry

I am guessing this is primarily a KDE problem (although the system should
still shut down cleanly even if processes are still there waiting on NFS). I
presume the KDE shutdown should wait for all of its session processes to
exit before it asks init to shut down the system ...

Terry


Re: Fedora 14: Shutdown problem

2011-01-30 Thread Terry Barnaby
On 01/30/2011 06:51 PM, JB wrote:
 Terry Barnaby terry1 at beam.ltd.uk writes:

 ...

 Firstly, I have to re-correct myself - my original debugging statements
 were correct.
 I checked on my machine and /proc/mounts is still available, so we should
 include it, as it has more info than /etc/mtab. It could give us a clue
 about any other mount-related things.

 ...
 # 
 # debugging snapshot statements
 # 
 echo date >> /halt.debug
 date >> /halt.debug
 echo cat /etc/mtab >> /halt.debug
 cat /etc/mtab >> /halt.debug
 echo cat /proc/mounts >> /halt.debug
 cat /proc/mounts >> /halt.debug
 # 
 ...

 Secondly, I have few things to check with regard to all of this. Perhaps
 something pops up.

 JB






I am fairly sure the problem is the akonadi/pulseaudio/gconf-helper/dcopserver
processes that are still hanging around because the NFS mounts they
were using have gone away.

As I said, remounting the NFS /home allows them to exit, which in turn allows
/ to be remounted ro and the system to shut down.

I think there are three bugs here:
1. KDE is not waiting for all of its session's processes to exit before
telling init to halt the system.
2. The rc0 scripts are not making sure all processes using the NFS file
systems have exited prior to unmounting them.
3. The final rc0 remount-ro of / should make sure all processes have
been killed prior to issuing the remount command. (Will the kernel allow
them to be killed while waiting on unmounted NFS? Kernel bug?)

Cheers


Terry


Re: Fedora 14: Shutdown problem

2011-01-30 Thread Terry Barnaby
On 01/30/2011 08:40 PM, JB wrote:
 Terry Barnaby terry1 at beam.ltd.uk writes:

 ...

 Your analysis is very plausible.
 I remember from Slackware (many years ago ...) - it took explicit steps to
 TERM active processes, reasonably waited for them, and then killed them.

 I tried to follow the selinux line as well.
 The support for nfs home dirs caused problems in the past.
 You have a mix of nfs3 and nfs4, and the nfs4 may be buggy (some selinux and
 'mount' related features are scheduled to be ironed out in F15).

 I would dive in, just for kicks, and try both cases:
 - switch selinux to permissive mode; this may not be enough, so ...
 - disable selinux entirely
 You can do it on the kernel command line or /etc/sysconfig/selinux - but you
 have to shutdown twice in order to test the halt script.

 JB






Hi,

Thanks for the info. selinux is actually disabled on all of these
systems. I'm not sure why /home uses NFS 3 while the others use NFS4.
They are from the same server and there is no specific config
for 3 or 4, so on Fedora 14 I would have expected them to be 4.
The server is Fedora 14 as well.

I think it is the /home mount that is likely to be causing the
problem (as the GUI programs are probably accessing files in the user's
directory) and this uses NFS V3. So I wouldn't have expected the
NFS V4 code to be much involved here.

Note this is a Fedora 14 issue. Fedora 12 has been running in this environment 
for more than a year with the same setup without this issue.

I could add a delay before unmounting the NFS file systems and see if this
reduces the problem.

Terry


Fedora 14: Shutdown problem

2011-01-29 Thread Terry Barnaby
Hi,

About 1 in 4 times Fedora 14 hangs during shutdown on at least 4 of my systems.
Looking at the shutdown messages (ESC in the splash screen) and adding some 
debug statements to /etc/rc.d/rc0.d/S01halt, it hangs after the messages:

Unmounting file systems
init: Re-executing /sbin/init

with the message:

mount: you must specify the file system type.

Adding some debug, this appears after the following command is executed:
fstab-decode mount -n -o ro,remount /dev/sda1 /

The file system is ext4 on all of the systems and that command looks ok.
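One workaround I could try (untested, just a sketch): since /proc/mounts is
normally still readable at that point, look the root file system type up
there and pass it explicitly so mount does not need /etc/mtab.

```shell
# Read the root file system type from /proc/mounts; take the last "/"
# entry since a rootfs pseudo-entry often comes first.
fstype=$(awk '$2 == "/" { t = $3 } END { print t }' /proc/mounts)
echo "root fs type: $fstype"
# in the halt script this would then become something like:
# fstab-decode mount -n -t "$fstype" -o ro,remount /dev/sda1 /
```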

Any ideas ?

Terry


Re: Fedora14: Strange and intermittent very slow disks on server

2010-12-19 Thread Terry Barnaby
On 12/18/2010 04:02 PM, Lamar Owen wrote:
 On Saturday, December 18, 2010 03:08:45 am Terry Barnaby wrote:
 It is strange, however, how the system can run perfectly fine with good
 fast disk IO for a while and then go into this slow mode. In the slow
 mode a command can take 30 seconds or more to run on an unloaded system.
 It smacks of some Linux kernel SATA driver/RAID1 versus WD EARS drive
 interaction to me.

 It's definitely something; the TLER discussions I've seen are just partial 
 explanations at best.

 However, I think I will change the drives. I was hoping to try some WD10EADS
 ones I have, but after your issues I will look at the RE series or
 another make ...

 The RE series is WD's 'RAID Enterprise' or 'RAID Enabled' (depending on how 
 you look at it) drives, and cost more.  They should work fine in RAID.  The 
 lower cost WD drives have been giving problems in RAID, and not just on 
 Linux.  WD even says they are not designed for RAID.

 Please see the responses at:
 http://community.wdc.com/t5/Other-Internal-Drives/1-TB-WD10EARS-desynch-issues-in-RAID/m-p/11559
 Also see:
 http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1397

 That last link is to WD's FAQ; it explains the root cause of the issue, that 
 of deep cycle recovery (saying point blank that the drive could take *2* 
 *minutes* to recover *one* *sector* in error).  So basically any time the 
 drive hits an error, things slow to a crawl as the iowaits pile up.  This is 
 the info iostat -x 1 will give you; watch the await time (given in 
 milliseconds); I saw awaits of up to 20,000 ms while trying to use my 
 WD15EADS drive in RAID1.

Yes, I have used the WD RE drives in RAID servers for a number of years, with
no issues. The system in question is now running on one of these awaiting a
second.

All of the explanations, TLER included, make sense and point against RAID use,
but I don't think they are causing the issues I am seeing at the moment. I'm
sure they would hit at some time in the future though.

The WD10EARS drives I am using are new and the SMART reports indicate no bad
sectors, so no error recovery should be going on at the moment. Also there
is no problem with drives being kicked out of the array etc.
The disk system actually can work well (60 MBytes/s write, 95 MBytes/s read).
It's just that the system goes into very slow mode occasionally. Also,
all disks on the system are affected, not just the two RAID'ed WD10EARS
ones; a third WD20EARS also goes slow when the system gets into the fault
mode.

I noted it starting to happen once. I have a number of separately RAID'ed
partitions on these disks. The rootfs is at the start and there is a
video storage (MythTV) partition near the end. The system went slow when
MythTV was recording video at about 2.5 MBytes/s.

I strongly feel there is a Linux kernel/disk interaction going on here.
Maybe something like this: the Linux driver orders block requests based on
head position, blocking some requests until a reasonable number have been
completed. I suspect the drive also orders requests by head position with a
similar algorithm; these drives have a 64M buffer. Perhaps the two systems
interact with one another and this results in huge delays for particular
block reads/writes although most are done quickly. iostat does not show
overly long average wait times (await) when things are slow (~1000 ms).
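To catch it in the act, the awaits from iostat could be filtered
automatically. A rough sketch; I'm assuming await is field 10 of
`iostat -x` output, which matches the classic sysstat layout, but check the
header line of your version.

```shell
# Print device name and await for any `iostat -x` line whose await
# (assumed to be field 10) exceeds a threshold in milliseconds.
flag_await() {
    awk -v max="$1" 'NF >= 10 && $10 ~ /^[0-9.]+$/ && $10+0 > max { print $1, $10 }'
}

# live use would be: iostat -x 1 | flag_await 1000
```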

If I had the time I would investigate further, but the system is fine with
a different disk at the moment.



Re: Fedora14: Strange and intermittent very slow disks on server

2010-12-18 Thread Terry Barnaby
On 12/18/2010 12:00 AM, Lamar Owen wrote:
 On Friday, December 17, 2010 06:14:31 pm Terry Barnaby wrote:
 The two main RAID1 disks are WD10EARS (Green). I have seen reported some
 issues with the performance of these but in my case they appear to work
 fine when the system is running ok.
 [snip]
 Anyone seen this sort of behaviour before ?
 Any ideas on where to look ?

 Yes, I have.

 Use a different drive.  Use iostat -x 1 to trace which disk in the RAID1 is 
 causing problems; you'll likely find that the WD10EARS are throwing long 
 awaits.  Rumor is that this is by design; WD has enterprise 'RAID ready' 
 drives and don't rate the lower priced drives for RAID.  I have a WD15EADS 
 that does this.  At least the EARS version can possibly be put in a 'TLER' 
 mode that allows RAID use.

 In my case, I had the WD15EADS drive as one half of a RAID1, with the other 
 half being a Seagate 1.5TB drive of the same LBA.  Every once in a while, 
 performance would absolutely go to pot, and stay that way for minutes at a 
 time (load averages >10 on a single core system).  Using iostat -x 1 I was 
 able to isolate the issue to that particular drive (I swapped controller 
 channels, swapped cables, swapped out the power supply, swapped to a 
 different controller chip on the motherboard, swapped motherboards, and the 
 issue was always on this drive).

 When I replaced the WD15EADS with another Seagate 1.5TB, performance came 
 back to normal.  I'm using the WD15EADS in a single mode, now, with much 
 lighter usage, and realizing that performance is not its strong suite.

 Also, the EARS version might use 4K sectors, exposing 512 byte sectors in an 
 'emulation' mode; properly aligning partitions to 4K boundaries solves that.

 Google 'WD EARS TLER' and get the whole story.  You'll also want to disable 
 the 'green' mode, as that will also negatively impact performance.  There are 
 tools out there to do that.

Thanks for the info.

I did play with setting a partition on a 4096-byte (8 x 512-byte sector)
boundary, but saw no change in random 512-byte block write speed with
a simple test program. These are recent drives so I wondered if things
had changed in this regard.
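For reference, the alignment check amounts to this: a partition start sector
(in 512-byte units, e.g. from /sys/block/sda/sda1/start or fdisk -lu) must
be divisible by 8 to land on a 4 KiB physical boundary. A trivial sketch:

```shell
# True if a start sector (in 512-byte units) is aligned to 4 KiB (8 x 512).
aligned_4k() {
    [ $(( $1 % 8 )) -eq 0 ]
}

aligned_4k 2048 && echo "sector 2048: aligned"    # modern partitioner default
aligned_4k 63 || echo "sector 63: misaligned"     # old fdisk default
```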

It is strange, however, how the system can run perfectly fine with good
fast disk IO for a while and then go into this slow mode. In the slow
mode a command can take 30 seconds or more to run on an unloaded system.
It smacks of some Linux kernel SATA driver/RAID1 versus WD EARS drive
interaction to me.

However, I think I will change the drives. I was hoping to try some WD10EADS
ones I have, but after your issues I will look at the RE series or
another make ...

Cheers


Terry


Fedora14: Strange and intermittent very slow disks on server

2010-12-17 Thread Terry Barnaby
This is a strange one.

I have a home server (Pentium Core 2, Intel ICH10, 1G RAM, 2x SATA 1TByte
disks in Raid1, 1x 2TByte SATA and 1 x 1TByte SATA). This is used for normal
NFS and MythTv usage as well as httpd, network routing openvpn etc. It has been
running for about 2 years and 1 year under Fedora12 with no issues.
I have recently updated it to Fedora14 and updated the disks to the above 
configuration (Was 3 x 320G SATA in Raid5 + 2x 1TByte SATA).
Generally all works fine, but the system has recently started going into a
very, very slow running mode. Once running slow it can take 5 mins to login
on an NFS mounted client. Running commands through ssh takes an age. Top
reports low CPU usage (<5%, but the wait time is above 80%). There is
obviously an issue with disk IO and processes being locked out of disk
access for large periods. Rebooting does not normally clear the issue
but sometimes does.

There do not appear to be any processes doing large amounts of disk IO,
ksysguard reports low disk IO bandwidth in use. I have killed off most of
the processes when the system was running slow with no real effect.
When the problem occurs the disk I/O is very, very slow: normally
hdparm -t /dev/sd... gives around 90 MBytes/sec, but in slow mode it can
be as low as 2 MBytes/sec. Disk I/O tests at the ext4 file system level are
just the same (writes being even worse). This slowness applies to all of the
disks most of which are not being accessed otherwise.
A yum update (on a reasonably fast Internet link) took 6 hours (it should
have been more like 10 mins).
There are no messages in /var/log/messages or from dmesg.

It is also intermittent. I have rebooted it a few times, sometimes
when it comes up it is fine, and other times it is not.
The two main RAID1 disks are WD10EARS (Green). I have seen some reported
issues with the performance of these but in my case they appear to work
fine when the system is running ok. Also the system has a WD10EVD disk and
this also goes slow when the problem occurs.

This is with kernel 2.6.35.9-64.fc14.i686.PAE.
I tried installing 2.6.35.6-48.fc14.i686.PAE and booted with that and all was 
fine, however when I went back to 2.6.35.9-64.fc14.i686.PAE it was still fine.
I will run with 2.6.35.6-48.fc14.i686.PAE for a while and see if the problem
re-occurs with that kernel.

Anyone seen this sort of behaviour before ?
Any ideas on where to look ?

Cheers


Terry


Re: FC14 good/bad news

2010-11-29 Thread Terry Barnaby
On 11/28/2010 10:52 PM, Hiisi wrote:
 su, 2010-11-28 kello 20:06 +, Terry Barnaby kirjoitti:
 This is not so important as running 3D apps though. Blender will not
 run on 
 either of these systems (huge delays when posting menus etc). Mind you
 on
 Intel I945 hardware blender just causes the X-Server to lock-up hard,
 so its
 not just ATI cards. 
 
 It probably has nothing in common with your problem, but on my system
 (2.6.32.23-170.fc12.i686, upgraded from F11) I've aliased blender as
 follows:
 ~]$ which blender
 alias blender='LIBGL_ALWAYS_SOFTWARE=1 blender'
 My videocard is:
 product: RV350 AP [Radeon 9600]
 vendor: ATI Technologies Inc
 (driver is radeon)
 Before doing that I had some problems with blender (i.e. white screen
 instead of coordinate grid, empty boxes where different menus and popus
 should be, etc.). Now it works like a charm.

Thanks for the info. But I think using software rendering will make blender
too slow for the sort of 3D performance I need. Will try it though.
But really, 3D hardware support for major graphics chipsets should be
relatively solid by now...


Re: FC14 good/bad news

2010-11-28 Thread Terry Barnaby
On 11/28/2010 07:22 PM, Bruno Wolff III wrote:
 On Sun, Nov 28, 2010 at 10:48:34 -0500,
Bill Davidsen <david...@tmr.com> wrote:
 It absolutely is, possibly I should have mentioned that. But it is not
 intuitive that accelerated video drivers would map to lower frame rates
 and less smooth screen updates. If the old drivers could use horizontal
 retrace sync (or perhaps none at all?) why are better drivers slower?

 I think the answer is glxgears isn't really a benchmark tool and being able
 to sync with vertical refresh is good for preventing tearing.

 And more importantly, why do the gears visibly twitch and change
 rotational rate when they appeared smooth with the old drivers?

 That doesn't sound good, but I don't have an answer.

 I have said before that too much effort was going into eye candy like
 wobbly windows, and not enough into support for older video hardware.

 OpenGL support is important and is fundamental to better 3d support for all
 things. This is especially true for cards with shaders. But working 2d support
 is supposed to be important to the Red Hat support people. I expect that
 noticing regressions is hard because there is so much different hardware out
 there and one doesn't typically do a lot of testing on old hardware unless it
 is all you have.

 The 2nd part HAS improved, at least in the case of my Radeon hardware, I
 don't have to use vesa or vendor drivers to have stable operation. In
 the next few days I will be trying some non-Radeon video, I look forward
 to the information.

 My rv530 is working nicely on F14. My rv280 had an issue triggered by a
 suboptimal aperture setting in the bios that I hope I have provided enough
 info to have a fix made upstream. I haven't retested a problem I had where
 the output went to the unused output by default. When using a monitor without
 EDID support on the DVI port, output appeared to have gone to the (unused)
 VGA port instead. Rawhide desktops are a bit messy now, but possibly not due
 to the drivers. And it seems to be getting better. I saw some crashes when
 testing a 2.6.37-rc3 kernel that looked graphics driver related. But I haven't
 collected any data to verify that. My nv28 doesn't get much use, but for
 non-3d stuff it seems OK modulo the rawhide desktop issues I see with my
 rv280. It still breaks some games (notably warzone2000 doesn't work if
 you install the experimental driver package).
Unfortunately for me F14 has regressed even further than F12 as far as
3D/compositing goes (straight 2D does appear fine so far). As soon as you turn
on compositing with my ATI Mobility Radeon X300 or RV535 [Radeon X1650 Series]
systems the graphics becomes really sluggish and unusable (glxgears judders
at about a 30 Hz frame rate; it should be 60/75 Hz on these systems, and is
without compositing).
This is not as important as running 3D apps though. Blender will not run on
either of these systems (huge delays when posting menus etc). Mind you, on
Intel i945 hardware blender just causes the X server to lock up hard, so it's
not just ATI cards.
It's been nearly 2 years now (since about Fedora 8) that blender has been
unable to run on any of my Fedora systems!



Fedora14 Audio mixers in KDE

2010-11-26 Thread Terry Barnaby
I have just updated some systems from Fedora 12 to Fedora 14.
All has gone well and is working well, so far, except that
I don't seem to be able to control the mixer on the sound
hardware. Sound is working, in that I get the login music,
beeps etc.

Specifically I want to listen to audio from the Line input. The
KDE sound icon has an overall level control but no other
controls. The Configure Channels settings in kmix do not
list any other channels. This is on three systems with different sound
hardware, although all Intel I think.

Any ideas, has something changed with the pulseaudio config ?


Fedora14 3D graphics with ATI, very very slow

2010-11-26 Thread Terry Barnaby
Hi,

I have just updated from Fedora12 to Fedora14 on a few systems.
There is a problem with 3D graphics on at least two of these systems.
The one I am currently using has:

08:00.0 VGA compatible controller: ATI Technologies Inc RV535 [Radeon X1650 
Series] (rev 9e)
OpenGL renderer string: Gallium 0.4 on RV530

If I try and run blender, it comes up but there appear to be multiple-second
delays when trying to click on menus etc.
If I try and turn compositing on then graphics operations such as moving
windows are very slow. Is this due to the new Gallium 3D driver ?

A laptop with a Mobility Radeon X300 appears the same.

Any ideas ?


F14: system-config-kickstart: Could not open display because no X server is running

2010-11-09 Thread Terry Barnaby
The system-config-kickstart utility gives the error "Could not open display
because no X server is running" when started on the command line from a
terminal in X. (It does not work from the menus either.)

Any ideas ?


Re: F14: system-config-kickstart: Could not open display because no X server is running

2010-11-09 Thread Terry Barnaby
On 11/09/2010 03:13 PM, Terry Barnaby wrote:
 The system-config-kickstart utility gives the error Could not open display
 because no X server is running when started on the command line from a 
 terminal
 in X. (Does not work from the menus either).
 
 Any ideas ?
Just found that there is an updated rpm in updates-testing that fixes this.



Re: Fedora 14 Kickstart Problems

2010-11-08 Thread Terry Barnaby
On 11/08/2010 01:09 PM, Bernd Nies wrote:
 Hi,
 
 I'm trying to do automated installation with Fedora 14 kickstart and the 
 configuration [2] shown at 
 the end of this message. The installation never runs through. Anaconda still 
 displays these screens 
 and waits for user input:
 
- Welcome
- License Information
- Create User
- Date and Time
- Hardware Profile
 
 According to the Anaconda/Kickstart configuration guide [1] there is no 
 further option I know of. 
 Clicking through these screens while setting up some hundred workstations 
 is quite annoying.
 
 Furthermore the password of the created admin user is empty in /etc/shadow 
 although I give it as an 
 option in the kickstart config file.
 
 What am I missing? Or is there another better way to install some hundred 
 Workstations with as 
 little user intervention as possible. Currently we're using autoyast for 
 setting up openSUSE 11.0 
 installations.
 
 
 Best regards,
 Bernd Nies
 
 
 [1] http://fedoraproject.org/wiki/Anaconda/Kickstart
 
 [2] Kickstart config:
 
 == CUT ==
 install
 nfs --server=installserver.example.com --dir=/export/Fedora-14-i386-DVD
 logging --host=logserver.example.com --port=514 --level=debug
 reboot --eject
 
 ignoredisk --only-use sda
 clearpart --all --initlabel
 partition /boot --asprimary --fstype=ext4 --label BOOT --size=128
 partition swap  --asprimary --fstype=swap --label SWAP --size=2048
 partition / --asprimary --fstype=ext4 --label ROOT --size=1 --grow
 bootloader --location=mbr
 
 lang en_US.UTF-8
 keyboard us
 
 network --bootproto=static --hostname=myhost --ip=192.168.12.48 
 --netmask=255.255.252.0 
 --gateway=192.168.12.1 --nameserver=192.168.4.7,192.168.4.8
 timezone --utc Europe/Zurich
 
 firewall --disabled
 authconfig --enableshadow --enablemd5
 rootpw --plaintext gaga11
 user --name=admin --gecos=Administrator --homedir=/var/home/admin 
 --shell=/bin/bash --password=changeme
 selinux --disabled
 
 graphical
 xconfig --defaultdesktop=GNOME --startxonboot
 
 repo --name Fedora 14 DVD 
 --baseurl=nfs://installserver.example.com://export/Fedora-14-i386-DVD 
 --cost=0
 
 %packages
 @Administration Tools
 @Base
 @Books and Guides
 @Buildsystem building group
 @Core
 @Critical Path (Base)
 @Critical Path (GNOME)
 @Development Libraries
 @Development Tools
 @Editors
 @Educational Software
 @Electronic Lab
 @Engineering and Scientific
 @Fedora Eclipse
 @Fedora Packager
 @Filesystems
 @Font design and packaging
 @Fonts
 @GNOME Desktop Environment
 @GNOME Software Development
 @Games and Entertainment
 @Graphical Internet
 @Graphics
 @Hardware Support
 @Input Methods
 @Java
 @Java Development
 @KDE Software Compilation
 @KDE Software Development
 @LXDE
 @Office/Productivity
 @Online Help and Documentation
 @Perl Development
 @Printing Support
 @Sound and Video
 @Sugar Desktop Environment
 @System Tools
 @Text-based Internet
 @Web Development
 @Window Managers
 @X Software Development
 @X Window System
 @XFCE
 @XFCE Software Development
 %end
 == CUT ==
Hi,

In the past I have used the following to enable/disable the post-installation
screens (with YES/NO):

%post
echo RUN_FIRSTBOOT=NO > /etc/sysconfig/firstboot

The kickstart password setting command:

rootpw --iscrypted <encrypted password>

works for me, never tried the --plaintext option though.
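If you do want --iscrypted, the hash itself can be generated with openssl
(MD5-crypt here, matching the authconfig --enablemd5 line above; the salt
and password below are just placeholders, not real values):

```shell
# Produce an MD5-crypt hash suitable for: rootpw --iscrypted <hash>
# "xyz" and "changeme" are example salt/password only.
openssl passwd -1 -salt xyz changeme
```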

Terry


Re: Fedora 14 Kickstart Problems

2010-11-08 Thread Terry Barnaby
On 11/08/2010 01:48 PM, Terry Barnaby wrote:
 Hi,
 
 In the past I have used the following to enable/disable the post installation
 screens. (With YES/NO).
 
 %post
 echo RUN_FIRSTBOOT=NO > /etc/sysconfig/firstboot
 
 The kickstart password setting command:
 
 rootpw --iscrypted <encrypted password>
 
 works for me, never tried the --plaintext option though.
 
 Terry
Actually, I note in the Fedora 14 installation guide:


firstboot (optional)
Determine whether Firstboot starts the first time the system is booted.
If enabled, the firstboot package must be installed. If not specified, this
option is disabled by default.

  * --enable or --enabled — The Setup Agent is started the first time the
    system boots.
  * --disable or --disabled — The Setup Agent is not started the first
    time the system boots.
  * --reconfig — Enable the Setup Agent to start at boot time in
    reconfiguration mode. This mode enables the language, mouse, keyboard,
    root password, security level, time zone, and networking configuration
    options in addition to the default ones.
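So presumably the screens can be suppressed directly from the kickstart file
these days, with something like this (untested fragment):

```
# kickstart fragment: do not start the Setup Agent on first boot
firstboot --disabled
```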

Terry


Re: Fedora14: Problem with pungi ?

2010-11-06 Thread Terry Barnaby
On 11/05/2010 06:52 PM, mike cloaked wrote:
 On Fri, Nov 5, 2010 at 5:09 PM, Terry Barnaby ter...@beam.ltd.uk wrote:
 
 I am trying to track this down. Running the low level
 /usr/libexec/anaconda/buildinstall script gives me errors saying that 
 strip is
 missing (No errors seen running pungi ...)
 This could be causing the binary install to the initrd to fail prior to
 setting permissions.
 I am installing binutils and going to try again. I have been using a really
 basic Fedora install from the livecd with the pungi package and some others
 installed. If this works I suspect that pungi should have the package 
 binutils
 added as a dependency and a bit more error checking/reporting
 should be in pungi 

 Cheers
 
 It is not entirely clear what exactly you are doing - are you running
 this from an f14 system? Are you running a livecd for f14 and once
 that is running to then run pungi from the livecd?  If so have you
 executed a yum update before running pungi (you can do this even
 running a livecd)?  Have you pulled the latest kickstart file to
 execute with pungi (from git)?  Was the kickstart file edited from
 that provided from a standard livecd? Are you running this from
 another system such as f13 or f12?
 
 I have been running f14 builds from a mock chroot and executing pungi
 there with no problems once the kickstart files have been customised.
 
 Clearly since I have builds that work and you do not there must be
 something different about how the build procedure has been done!
 
What I did was to install a system from the Fedora14 live disk. I then
updated that system and installed the pungi package along with a few others.
The idea was to download a basic Fedora 14 system so I could produce a
full Fedora14 DVD with all updates and my extra packages.

This all appeared to work fine, but the DVD produced by pungi had the errors
I reported.

I now have it working fine.
The problem was that the strip utility, which is in the binutils package, was
not installed on the system. OK, most people would probably have this, but
the anaconda package used by pungi did not have it as a package dependency
and no error status was returned when it was not found.

So there are two small bugs:

1. The anaconda package should have binutils as a dependency, as
   buildinstall requires strip.
2. The buildinstall utility should have better error checking and return an
   error status when strip fails.
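Bug 2 could be as simple as a guard near the top of buildinstall; a sketch of
the idea (the `require` helper is mine, not actual anaconda code):

```shell
# Fail early with a clear message when a required tool is missing, instead
# of silently producing an initrd with wrong contents/permissions.
require() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "buildinstall: required tool '$1' not found (install binutils?)" >&2
        return 1
    }
}

require sh && echo "sh found"
# buildinstall would then do: require strip || exit 1
```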

I will try and file the bug against the anaconda package.

Cheers

Terry

