[Bug 1893109] Re: [plugin][ceph] collect ceph balancer and pr-autoscale status

2024-04-18 Thread Dan Hill
** Changed in: sosreport (Ubuntu Bionic)
   Status: In Progress => Fix Released

** Changed in: sosreport (Ubuntu Xenial)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1893109

Title:
  [plugin][ceph] collect ceph balancer and pr-autoscale status

To manage notifications about this bug go to:
https://bugs.launchpad.net/sosreport/+bug/1893109/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1969670] Re: Connection reuse sometimes corrupts status line

2022-04-20 Thread Dan Hill
** Tags added: sts

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1969670

Title:
  Connection reuse sometimes corrupts status line

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/1969670/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1904580] Re: Permissions 0644 for '/var/lib/nova/.ssh/id_rsa' are too open

2022-04-04 Thread Dan Hill
** Tags added: sts

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1904580

Title:
  Permissions 0644 for '/var/lib/nova/.ssh/id_rsa' are too open

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-compute/+bug/1904580/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1919261] Re: Upgrading Ceph from 14.2.11-0ubuntu0.19.10.1~cloud4 to 15.2.8-0ubuntu0.20.04.1~cloud0 fails when ceph-mds is installed

2021-03-15 Thread Dan Hill
** Tags added: sts

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1919261

Title:
  Upgrading Ceph from 14.2.11-0ubuntu0.19.10.1~cloud4 to
  15.2.8-0ubuntu0.20.04.1~cloud0 fails when ceph-mds is installed

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1919261/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1908375] Re: ceph-volume lvm list calls blkid numerous times for differrent devices

2021-01-11 Thread Dan Hill
** Also affects: ceph (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: ceph (Ubuntu Groovy)
   Importance: Undecided
   Status: New

** Changed in: ceph (Ubuntu Groovy)
   Status: New => Fix Released

** Changed in: ceph (Ubuntu Focal)
   Status: New => Fix Released

** Changed in: ceph (Ubuntu)
   Importance: Undecided => Medium

** Changed in: ceph (Ubuntu Groovy)
   Importance: Undecided => Medium

** Changed in: ceph (Ubuntu Focal)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1908375

Title:
  ceph-volume lvm list  calls blkid numerous times for
  differrent devices

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1908375/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1909162] [NEW] cluster log slow request spam

2020-12-23 Thread Dan Hill
Public bug reported:

[Impact]

A recent change (issue#43975 [0]) was made to slow request logging to
include detail on each operation in the cluster logs. With this change,
detail for every slow request is always sent to the monitors and added
to the cluster logs.

This does not scale. Large, high-throughput clusters can overwhelm their
monitors with spurious logs in the event of a performance issue.
Disrupting the monitors can then cause further instability in the
cluster.

This SRU reverts the cluster logging of every slow request the osd is
processing.

The slow request clog change was added in nautilus (14.2.10) and octopus
(15.2.0).

[Test Case]

Stress the cluster with a benchmarking tool to generate slow requests
and observe the cluster logs.
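
As a rough sketch (not taken from the bug report; the pool name "bench" is a
placeholder for any throwaway test pool):

# generate sustained write load to provoke slow requests
rados bench -p bench 120 write --no-cleanup
# watch the cluster log for slow request entries while the bench runs
ceph -w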

[Where problems could occur]

The cluster logs contain detailed debug information on slow requests
that is useful for smaller, low-throughput clusters. While these logs
are not used by ceph, they may be used by the cluster administrators
(for monitoring or alerts). Changing this logging behavior may be
unexpected.

[Other Info]

The intent is to re-enable this feature behind a configurable setting,
but the solution must be discussed upstream.

The same slow request detail can be enabled for each osd by raising the
"debug osd" log level to 20.

[0] https://tracker.ceph.com/issues/43975

** Affects: cloud-archive
 Importance: High
 Status: In Progress

** Affects: cloud-archive/train
 Importance: High
 Assignee: gerald.yang (gerald-yang-tw)
 Status: In Progress

** Affects: cloud-archive/ussuri
 Importance: High
 Assignee: gerald.yang (gerald-yang-tw)
 Status: In Progress

** Affects: ceph (Ubuntu)
 Importance: High
 Assignee: gerald.yang (gerald-yang-tw)
 Status: In Progress

** Affects: ceph (Ubuntu Focal)
 Importance: High
 Assignee: gerald.yang (gerald-yang-tw)
 Status: In Progress

** Affects: ceph (Ubuntu Groovy)
 Importance: High
 Assignee: gerald.yang (gerald-yang-tw)
 Status: In Progress

** Affects: ceph (Ubuntu Hirsute)
 Importance: High
 Assignee: gerald.yang (gerald-yang-tw)
 Status: In Progress


** Tags: seg sts

** Also affects: ceph (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: ceph (Ubuntu Hirsute)
   Importance: Undecided
   Status: New

** Also affects: ceph (Ubuntu Groovy)
   Importance: Undecided
   Status: New

** Tags added: seg sts

** Also affects: cloud-archive
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/train
   Importance: Undecided
   Status: New

** Also affects: cloud-archive/ussuri
   Importance: Undecided
   Status: New

** Changed in: ceph (Ubuntu Hirsute)
   Status: New => In Progress

** Changed in: ceph (Ubuntu Hirsute)
   Importance: Undecided => High

** Changed in: ceph (Ubuntu Groovy)
   Importance: Undecided => High

** Changed in: ceph (Ubuntu Focal)
   Importance: Undecided => High

** Changed in: cloud-archive/ussuri
   Importance: Undecided => High

** Changed in: cloud-archive/train
   Importance: Undecided => High

** Changed in: cloud-archive
   Importance: Undecided => High

** Changed in: ceph (Ubuntu Groovy)
   Status: New => In Progress

** Changed in: ceph (Ubuntu Focal)
   Status: New => In Progress

** Changed in: cloud-archive/ussuri
   Status: New => In Progress

** Changed in: cloud-archive/train
   Status: New => In Progress

** Changed in: cloud-archive
   Status: New => In Progress

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1909162

Title:
  cluster log slow request spam

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1909162/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-12-17 Thread Dan Hill
** Changed in: cloud-archive/train
   Importance: Undecided => Medium

** Changed in: cloud-archive/stein
   Importance: Undecided => Medium

** Changed in: cloud-archive/queens
   Importance: Undecided => Medium

** Changed in: cloud-archive
   Importance: Undecided => Medium

** Changed in: cloud-archive/rocky
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-12-08 Thread Dan Hill
** Changed in: cloud-archive/train
   Status: New => Fix Released

** Changed in: cloud-archive/rocky
   Status: New => Invalid

** Changed in: cloud-archive/stein
   Status: New => In Progress

** Changed in: cloud-archive/train
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: cloud-archive/stein
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: cloud-archive/rocky
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: cloud-archive/queens
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: ceph (Ubuntu Bionic)
   Status: Confirmed => In Progress

** Changed in: cloud-archive/queens
   Status: New => In Progress

** Changed in: cloud-archive
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1906496] Re: mgr can be very slow in a large ceph cluster

2020-12-04 Thread Dan Hill
Just a quick note:

This bug is causing sosreport to time out commands. This can truncate
important items like `ceph pg dump` on larger clusters.
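
As a hedged workaround sketch (not part of this bug), the affected output can
be gathered manually alongside the sosreport, with a generous timeout:

timeout 600 ceph pg dump > /tmp/ceph_pg_dump.txt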

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1906496

Title:
  mgr can be very slow in a large ceph cluster

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1906496/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-12-02 Thread Dan Hill
** Also affects: cloud-archive
   Importance: Undecided
   Status: New

** Changed in: cloud-archive
 Assignee: (unassigned) => Dan Hill (hillpd)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-11-30 Thread Dan Hill
The SRUs for 15.2.5 [0] and 14.2.11 [1] have been released and contain a
fix for this issue.

We are currently evaluating the need for a fix in Luminous
(Bionic/Queens) and Mimic (Stein).

[0] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1898200
[1] https://bugs.launchpad.net/cloud-archive/+bug/1891077

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-11-30 Thread Dan Hill
** Changed in: ceph (Ubuntu Focal)
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1819437] Re: transient mon<->osd connectivity HEALTH_WARN events don't self clear in 13.2.4

2020-11-13 Thread Dan Hill
The 12.2.13 SRU for bionic, and queens is available in -updates (bug
1861793).

** Changed in: ceph (Ubuntu Bionic)
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1819437

Title:
  transient mon<->osd connectivity HEALTH_WARN events don't self clear
  in 13.2.4

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1819437/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1893109] Re: [plugin][ceph] collect ceph balancer and pr-autoscale status

2020-10-08 Thread Dan Hill
Verified sos collects the two new commands on focal.

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.1 LTS
Release:        20.04
Codename:       focal

# dpkg -l | grep sos
ii  sosreport  4.0-1~ubuntu0.20.04.2  amd64  Set of tools to gather troubleshooting data from a system

# sos report -e ceph
...
Your sosreport has been generated and saved in:
  /tmp/sosreport-juju-ffa3a9-lp1893109-0-2020-10-09-nfxoyto.tar.xz
...

/tmp/sosreport-juju-ffa3a9-lp1893109-0-2020-10-09-nfxoyto/sos_commands/ceph# cat ceph_osd_pool_autoscale-status
POOL  SIZE  TARGET SIZE  RATE  RAW CAPACITY  RATIO  TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
device_health_metrics  0  3.0  30708M  0.  1.0  1  on
glance  0  3.0  30708M  0.  0.0500  1.  1.0  128  on

/tmp/sosreport-juju-ffa3a9-lp1893109-0-2020-10-09-nfxoyto/sos_commands/ceph# cat ceph_balancer_status
{
"active": false,
"last_optimize_duration": "",
"last_optimize_started": "",
"mode": "none",
"optimize_result": "",
"plans": []
}


** Tags removed: verification-needed-focal
** Tags added: verification-done-focal

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1893109

Title:
  [plugin][ceph] collect ceph balancer and pr-autoscale status

To manage notifications about this bug go to:
https://bugs.launchpad.net/sosreport/+bug/1893109/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1893109] Re: [plugin][ceph] collect ceph balancer and pr-autoscale status

2020-09-30 Thread Dan Hill
** Description changed:

  [Impact]
  
  It would be nice to collect:
  
  ceph osd pool autoscale-status
  ceph balancer status
  
  https://docs.ceph.com/docs/master/rados/operations/placement-groups/
  
  VIEWING PG SCALING RECOMMENDATIONS
  You can view each pool, its relative utilization, and any suggested changes 
to the PG count with this command:
  
  ceph osd pool autoscale-status
  https://docs.ceph.com/docs/mimic/mgr/balancer/
  
  STATUS
  The current status of the balancer can be check at any time with:
  
  ceph balancer status
  
  [Test Case]
  
  * Install latest sosreport found in -updates
  * Run sosreport -o ceph (version 3.X and/or 4.X) or sos report -o ceph (4.X 
only)
  * Look content inside /path_to_sosreport/sos_command/ceph/
- * Make sure the 2 new commands are found there. 
+ * Make sure the 2 new commands are found there.
  * There will be 3 additional files, as the autoscale-status is also captured 
in JSON format.
  
  [Regression Potential]
- This patch adds two commands to the collected command output. Potential 
regressions would include a command typo or a code typo. A command typo would 
result in a failed command which should capture the command error output. A 
code typo will raise an exception in the ceph plug-in halting further ceph data 
capture. 
+ This patch adds two commands to the collected command output. Potential 
regressions would include a command typo, a command hang, a code typo. 
+ - A command typo would result in a failed command which should capture the 
command error output.
+ - A command hang will result in the ceph plug-in taking a long time to 
complete (hitting the default sos timeout). 
+ - A code typo will raise an exception in the ceph plug-in halting further 
ceph data capture.
  
  [Other Info]
+ Both commands are querying the ceph's internal state, without grabbing any 
locks or performing any modifications. The commands are expected to return very 
quickly. 
  
  [Original Description]
  It would be nice to collect:
  
  ceph osd pool autoscale-status
  ceph balancer status
  
  Upstream report: https://github.com/sosreport/sos/issues/2211
  Upstream commit: 
https://github.com/sosreport/sos/commit/52f4661e2b594134b98e2967b02cc860d7963fef

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1893109

Title:
  [plugin][ceph] collect ceph balancer and pr-autoscale status

To manage notifications about this bug go to:
https://bugs.launchpad.net/sosreport/+bug/1893109/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1893109] Re: [plugin][ceph] collect ceph balancer and pr-autoscale status

2020-09-30 Thread Dan Hill
** Description changed:

  [Impact]
  
  It would be nice to collect:
  
  ceph osd pool autoscale-status
  ceph balancer status
  
  https://docs.ceph.com/docs/master/rados/operations/placement-groups/
  
  VIEWING PG SCALING RECOMMENDATIONS
  You can view each pool, its relative utilization, and any suggested changes 
to the PG count with this command:
  
  ceph osd pool autoscale-status
  https://docs.ceph.com/docs/mimic/mgr/balancer/
  
  STATUS
  The current status of the balancer can be check at any time with:
  
  ceph balancer status
  
  [Test Case]
  
  * Install latest sosreport found in -updates
- * Run sosreport -o ceph (version 3.X and/or 4.X) or sos report -o ceph (4.X 
only) 
+ * Run sosreport -o ceph (version 3.X and/or 4.X) or sos report -o ceph (4.X 
only)
  * Look content inside /path_to_sosreport/sos_command/ceph/
- * Make sure the 2 new commands are found there.
+ * Make sure the 2 new commands are found there. 
+ * There will be 3 additional files, as the autoscale-status is also captured 
in JSON format.
  
  [Regression Potential]
+ This patch adds two commands to the collected command output. Potential 
regressions would include a command typo or a code typo. A command typo would 
result in a failed command which should capture the command error output. A 
code typo will raise an exception in the ceph plug-in halting further ceph data 
capture. 
  
  [Other Info]
  
  [Original Description]
  It would be nice to collect:
  
  ceph osd pool autoscale-status
  ceph balancer status
  
  Upstream report: https://github.com/sosreport/sos/issues/2211
  Upstream commit: 
https://github.com/sosreport/sos/commit/52f4661e2b594134b98e2967b02cc860d7963fef

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1893109

Title:
  [plugin][ceph] collect ceph balancer and pr-autoscale status

To manage notifications about this bug go to:
https://bugs.launchpad.net/sosreport/+bug/1893109/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1891567] Re: [SRU] ceph_osd crash in _committed_osd_maps when failed to encode first inc map

2020-09-10 Thread Dan Hill
** Tags removed: verification-needed verification-needed-done
** Tags added: verification-done verification-done-focal

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1891567

Title:
  [SRU] ceph_osd crash in _committed_osd_maps when failed to encode
  first inc map

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1891567/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1891567] Re: [SRU] ceph_osd crash in _committed_osd_maps when failed to encode first inc map

2020-08-19 Thread Dan Hill
** Description changed:

  [Impact]
  Upstream tracker: issue#46443 [0].
  
  The ceph-osd service can crash when processing osd map updates.
  
  When the osd encounters a CRC error while processing an incremental map
  update, it will request a full map update from its peers. In this code
  path, an uninitialized variable was recently introduced and that will
  get de-referenced causing a crash.
  
  The uninitialized variable was introduced in nautilus 14.2.10, and
  octopus 15.2.1.
  
  [Test Case]
  # Inject osd_inject_bad_map_crc_probability = 1
  sudo ceph daemon osd.{id} config set osd_inject_bad_map_crc_probability 1
  
  # Trigger some osd map updates by restarting a different osd
  sudo systemctl restart osd@{diff-id}
  
  [Regression Potential]
- The code has been updated to leave handle_osd_maps() early if a CRC error is 
encountered, therefore preventing the commit if the failure is encountered 
while processing an incremental map update. This will make the full map update 
take longer but should prevent the crash that resulted in this bug. 
Additionally _committed_osd_maps() is now coded to abort if first <= last, but 
it is assumed that code should never be reached.
+ The code has been updated to leave handle_osd_maps() early if a CRC error is 
encountered, therefore preventing the map commit if the failure is encountered 
while processing an incremental map update. This will make the full map update 
take longer but should prevent the crash that resulted in this bug. 
Additionally, _committed_osd_maps() is now coded to assert if first <= last, 
but it is assumed that code should never be reached.
  
  [Other Info]
  Upstream has released a fix for this issue in Nautilus 14.2.11. The SRU for 
this point release is being tracked by LP: #1891077
  
  Upstream has merged a fix for this issue in Octopus [1], but there is no
  current release target. The ceph packages in focal, groovy, and the
  ussuri cloud archive are exposed to this critical regression.
  
  [0] https://tracker.ceph.com/issues/46443
  [1] https://github.com/ceph/ceph/pull/36340

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1891567

Title:
  [SRU] ceph_osd crash in _committed_osd_maps when failed to encode
  first inc map

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1891567/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1891567] [NEW] [SRU] ceph_osd crash in _committed_osd_maps when failed to encode first inc map

2020-08-13 Thread Dan Hill
Public bug reported:

[Impact]
Upstream tracker: issue#46443 [0].

The ceph-osd service can crash when processing osd map updates.

When the osd encounters a CRC error while processing an incremental map
update, it will request a full map update from its peers. In this code
path, an uninitialized variable was recently introduced and that will
get de-referenced causing a crash.

The uninitialized variable was introduced in nautilus 14.2.10, and
octopus 15.2.1.

[Test Case]
# Inject osd_inject_bad_map_crc_probability = 1
sudo ceph daemon osd.{id} config set osd_inject_bad_map_crc_probability 1

# Trigger some osd map updates by restarting a different osd
sudo systemctl restart osd@{diff-id}

[Other Info]
Upstream has released a fix for this issue in Nautilus 14.2.11. The SRU for 
this point release is being tracked by LP: #1891077

Upstream has merged a fix for this issue in Octopus [1], but there is no
current release target. The ceph packages in focal, groovy, and the
ussuri cloud archive are exposed to this critical regression.

[0] https://tracker.ceph.com/issues/46443
[1] https://github.com/ceph/ceph/pull/36340

** Affects: ceph (Ubuntu)
 Importance: Undecided
 Status: New

** Affects: ceph (Ubuntu Focal)
 Importance: Undecided
 Status: New

** Affects: ceph (Ubuntu Groovy)
 Importance: Undecided
 Status: New


** Tags: seg sts

** Description changed:

  [Impact]
  Upstream tracker: issue#46443 [0].
  
  The ceph-osd service can crash when processing osd map updates.
  
  When the osd encounters a CRC error while processing an incremental map
  update, it will request a full map update from its peers. In this code
  path, an uninitialized variable was recently introduced and that will
  get de-referenced causing a crash.
  
  The uninitialized variable was introduced in nautilus 14.2.10, and
  octopus 15.2.1.
  
  [Test Case]
  # Inject osd_inject_bad_map_crc_probability = 1
  sudo ceph daemon osd.{id} config set osd_inject_bad_map_crc_probability 1
  
  # Trigger some osd map updates by restarting a different osd
  sudo systemctl restart osd@{diff-id}
  
  [Other Info]
  Upstream has released a fix for this issue in Nautilus 14.2.11. The SRU for 
this point release is being tracked by LP: #1891077
  
- Upstream has merged a fix for this issue in Octopus, but there is no
+ Upstream has merged a fix for this issue in Octopus [1], but there is no
  current release target. The ceph packages in focal, groovy, and the
  ussuri cloud archive are exposed to this critical regression.
  
  [0] https://tracker.ceph.com/issues/46443
  [1] https://github.com/ceph/ceph/pull/36340

** Also affects: ceph (Ubuntu Groovy)
   Importance: Undecided
   Status: New

** Also affects: ceph (Ubuntu Focal)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1891567

Title:
  [SRU] ceph_osd crash in _committed_osd_maps when failed to encode
  first inc map

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1891567/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1890334] Re: ceph: nautilus: backport fixes for msgr/eventcenter

2020-08-13 Thread Dan Hill
Due to a critical regression in 14.2.10 [0], this will be fixed in the SRU
for 14.2.11.

[0] https://bugs.launchpad.net/cloud-archive/+bug/1891077/comments/3

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1890334

Title:
  ceph: nautilus: backport fixes for msgr/eventcenter

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1890334/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1873368] Re: ssshuttle server fails to connect endpoints with python 3.8

2020-06-18 Thread Dan Hill
** Changed in: sshuttle (Ubuntu Focal)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: sshuttle (Ubuntu Groovy)
 Assignee: (unassigned) => Dan Hill (hillpd)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1873368

Title:
  ssshuttle server fails to connect endpoints with python 3.8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sshuttle/+bug/1873368/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1873368] Re: ssshuttle server fails to connect endpoints with python 3.8

2020-06-18 Thread Dan Hill
This is fixed by pr#431 [0], which landed in v1.0.1.

[0] https://github.com/sshuttle/sshuttle/pull/431

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1873368

Title:
  ssshuttle server fails to connect endpoints with python 3.8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sshuttle/+bug/1873368/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1883469] [NEW] idmap_sss: improve man page

2020-06-14 Thread Dan Hill
Public bug reported:

The idmap_sss man page for bionic [0] contains misleading instructions. This
was improved by pr#731 [1]. Opening this bug to cherry-pick the improvement.

[0] http://manpages.ubuntu.com/manpages/bionic/en/man5/sssd.conf.5.html
[1] https://github.com/SSSD/sssd/pull/731

** Affects: sssd (Ubuntu)
 Importance: Undecided
 Assignee: Dan Hill (hillpd)
 Status: New


** Tags: sts

** Tags added: sts

** Changed in: sssd (Ubuntu)
 Assignee: (unassigned) => Dan Hill (hillpd)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1883469

Title:
   idmap_sss: improve man page

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1883469/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1842656] Re: Bitmap allocator returns duplicate entries

2020-06-11 Thread Dan Hill
This has been fixed in 12.2.13-0ubuntu0.18.04.2, available in bionic-
updates.

** Changed in: ceph (Ubuntu Bionic)
   Status: Won't Fix => Fix Released

** Changed in: ceph (Ubuntu)
   Status: Won't Fix => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1842656

Title:
  Bitmap allocator returns duplicate entries

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1842656/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1880084] Re: [SRU] ceph 15.2.2

2020-06-02 Thread Dan Hill
Upstream has addressed a regression in 15.2.2 causing potential OSD
corruption [0]. 15.2.3 was just released to address this issue [1].

The recommendation is to upgrade to 15.2.3 directly, skipping 15.2.2, or apply 
a workaround disabling bluefs_preextend_wal_files while on 15.2.2:
ceph config set osd bluefs_preextend_wal_files false

[0] https://tracker.ceph.com/issues/45613
[1] https://ceph.io/community/v15-2-3-octopus-released/

** Bug watch added: tracker.ceph.com/issues #45613
   http://tracker.ceph.com/issues/45613

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1880084

Title:
  [SRU] ceph 15.2.2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1880084/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1873368] Re: ssshuttle server fails to connect endpoints with python 3.8

2020-05-06 Thread Dan Hill
RHEL 7.7 packages python 3.6. That's why it works.

If the host is running python 3.8+, then you have to work around this
issue by pointing to a different version of python that's available on
the host.
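
For example (a sketch; it assumes an older interpreter such as python3.6 is
actually installed on the server), sshuttle's --python option selects the
remote interpreter:

sshuttle --python=python3.6 -r ubuntu@{ip-addr} {subnet-1} {subnet-2}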

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1873368

Title:
  ssshuttle server fails to connect endpoints with python 3.8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sshuttle/+bug/1873368/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1873368] Re: ssshuttle server fails to connect endpoints with python 3.8

2020-04-28 Thread Dan Hill
** Changed in: sshuttle (Ubuntu)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1873368

Title:
  ssshuttle server fails to connect endpoints with python 3.8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sshuttle/+bug/1873368/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1874939] Re: ceph-osd can't connect after upgrade to focal

2020-04-27 Thread Dan Hill
The same guidelines apply to hyper-converged architectures.

Package updates are not applied until their corresponding service
restarts. Ceph packaging does not automatically restart any services.
This is by design so you can safely install on a hyper-converged host,
and then control the order in which service updates are applied.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1874939

Title:
  ceph-osd can't connect after upgrade to focal

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1874939/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1874939] Re: ceph-osd can't connect after upgrade to focal

2020-04-24 Thread Dan Hill
Eoan packages Nautilus, while Focal packages Octopus:
 ceph | 14.2.2-0ubuntu3  | eoan
 ceph | 14.2.4-0ubuntu0.19.10.2  | eoan-security   
 ceph | 14.2.8-0ubuntu0.19.10.1  | eoan-updates
 ceph | 15.2.1-0ubuntu1  | focal   
 ceph | 15.2.1-0ubuntu2  | focal-proposed  

When upgrading your cluster, make sure to follow the Octopus upgrade
guidelines [0]. Specifically, the Mon and Mgr nodes must be upgraded and
their services restarted before upgrading OSD nodes.

[0] https://docs.ceph.com/docs/master/releases/octopus/#upgrading-from-mimic-or-nautilus
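
A rough restart-order sketch for that sequence (using the systemd targets as
shipped by the Ubuntu ceph packages; adapt per node):

# on every mon/mgr node first, after installing the new packages
sudo systemctl restart ceph-mon.target ceph-mgr.target
# then, one node at a time, the OSD nodes
sudo systemctl restart ceph-osd.target
# confirm every daemon reports the expected release
ceph versions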

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1874939

Title:
  ceph-osd can't connect after upgrade to focal

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1874939/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1873368] Re: ssshuttle server fails to connect endpoints with python 3.8

2020-04-16 Thread Dan Hill
** Tags added: seg

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1873368

Title:
  ssshuttle server fails to connect endpoints with python 3.8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sshuttle/+bug/1873368/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1873368] Re: ssshuttle server fails to connect endpoints with python 3.8

2020-04-16 Thread Dan Hill
The python docs [0] indicate that the file descriptor should be a socket:
"The file descriptor should refer to a socket, but this is not checked — 
subsequent operations on the object may fail if the file descriptor is invalid."

The docs do need to be corrected. bpo#35415 now explicitly checks fd to
ensure they are sockets.

The fix for this issue likely needs to be in sshuttle, not python.

[0] https://docs.python.org/3/library/socket.html#socket.fromfd

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1873368

Title:
  ssshuttle server fails to connect endpoints with python 3.8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sshuttle/+bug/1873368/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1873368] [NEW] ssshuttle server fails to connect endpoints with python 3.8

2020-04-16 Thread Dan Hill
Public bug reported:

Client
$ python3 --version
Python 3.8.2
$ lsb_release -rd
Description:Ubuntu Focal Fossa (development branch)
Release:20.04
$ apt-cache policy sshuttle 
sshuttle:
  Installed: 0.78.5-1
  Candidate: 0.78.5-1

Server
$ python3 --version
Python 3.8.2
$ lsb_release -rd
Description:Ubuntu 20.04 LTS
Release:20.04
$ apt-cache policy openssh-server
openssh-server:
  Installed: 1:8.2p1-4
  Candidate: 1:8.2p1-4

$ sshuttle -r ubuntu@{ip-addr} {subnet-1} {subnet-2}
assembler.py:3: DeprecationWarning: the imp module is deprecated in favour of 
importlib; see the module's documentation for alternative uses
client: Connected.
Traceback (most recent call last):
  File "", line 1, in 
  File "assembler.py", line 38, in 
  File "sshuttle.server", line 298, in main
  File "/usr/lib/python3.8/socket.py", line 544, in fromfd
return socket(family, type, proto, nfd)
  File "/usr/lib/python3.8/socket.py", line 231, in __init__
_socket.socket.__init__(self, family, type, proto, fileno)
OSError: [Errno 88] Socket operation on non-socket
client: fatal: server died with error code 1

The sshuttle upstream tracker is issue#381 [0]. They are waiting on a response 
to bpo#39685 [1].
  
This regression was introduced in python 3.8 by bpo#35415 [2], which restricts 
socket.fromfd() calls to provide valid socket family file descriptors.

[0] https://github.com/sshuttle/sshuttle/issues/381
[1] https://bugs.python.org/issue39685
[2] https://bugs.python.org/issue35415

** Affects: sshuttle (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1873368

Title:
  ssshuttle server fails to connect endpoints with python 3.8

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sshuttle/+bug/1873368/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-04-15 Thread Dan Hill
** Description changed:

  [Impact]
  The Sharded OpWQ will opportunistically wait for more work when processing an 
empty queue. While waiting, the heartbeat timeout and suicide_grace values are 
modified. The `threadpool_default_timeout` grace is left applied and 
suicide_grace is disabled.
  
  After finding work, the original work queue grace/suicide_grace values
  are not re-applied. This can result in hung operations that do not
  trigger an OSD suicide recovery.
  
  The missing suicide recovery was observed on Luminous 12.2.11. The
  environment was consistently hitting a known authentication race
  condition (issue#37778 [0]) due to repeated OSD service restarts on a
  node exhibiting MCEs from a faulty DIMM.
  
  The auth race condition would stall pg operations. In some cases, the
  hung ops would persist for hours without suicide recovery.
  
  [Test Case]
- - In-Progress -
- Haven't landed on a reliable reproducer. Currently testing the fix by 
exercising I/O. Since the fix applies to all version of Ceph, the plan is to 
let this bake in the latest release before considering a back-port.
+ I have not identified a reliable reproducer. Currently testing the fix by 
exercising I/O. 
+ 
+ Recommend letting this bake upstream before considering a back-port.
  
  [Regression Potential]
  This fix improves suicide_grace coverage of the Sharded OpWq.
  
  This change is made in a critical code path that drives client I/O. An
  OSD suicide will trigger a service restart and repeated restarts
  (flapping) will adversely impact cluster performance.
  
  The fix mitigates risk by keeping the applied suicide_grace value
  consistent with the value applied before entering
  `OSD::ShardedOpWQ::_process()`. The fix is also restricted to the empty
  queue edge-case that drops the suicide_grace timeout. The suicide_grace
  value is only re-applied when work is found after waiting on an empty
  queue.
  
  - In-Progress -
- The fix needs to bake upstream on later levels before back-port consideration.
+ Opened upstream tracker for issue#45076 [1] and fix pr#34575 [2]
+ 
+ [0] https://tracker.ceph.com/issues/37778
+ [1] https://tracker.ceph.com/issues/45076
+ [2] https://github.com/ceph/ceph/pull/34575

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1819437] Re: transient mon<->osd connectivity HEALTH_WARN events don't self clear in 13.2.4

2020-04-15 Thread Dan Hill
Posting an update with recent SRU activity.

The 12.2.13 SRU is in progress; the package is held up due to a regression
tracked by bug 1871820.
The 13.2.8 SRU for rocky and stein is now available in -updates (bug 1864514).
The 14.2.8 SRU for eoan and train is now available in -updates (bug 1861789).

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1819437

Title:
  transient mon<->osd connectivity HEALTH_WARN events don't self clear
  in 13.2.4

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1819437/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1819437] Re: transient mon<->osd connectivity HEALTH_WARN events don't self clear in 13.2.4

2020-04-15 Thread Dan Hill
** Changed in: ceph (Ubuntu Eoan)
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1819437

Title:
  transient mon<->osd connectivity HEALTH_WARN events don't self clear
  in 13.2.4

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1819437/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-04-13 Thread Dan Hill
** Description changed:

  [Impact]
- The Sharded OpWQ will opportunistically wait for more work when processing an 
empty queue. While waiting, the heartbeat timeout and suicide_grace values are 
modified. On Luminous, the `threadpool_default_timeout` grace is left applied 
and suicide_grace is left disabled. On later releases both the grace and 
suicide_grace are left disabled.
+ The Sharded OpWQ will opportunistically wait for more work when processing an 
empty queue. While waiting, the heartbeat timeout and suicide_grace values are 
modified. The `threadpool_default_timeout` grace is left applied and 
suicide_grace is disabled.
  
  After finding work, the original work queue grace/suicide_grace values
  are not re-applied. This can result in hung operations that do not
  trigger an OSD suicide recovery.
  
  The missing suicide recovery was observed on Luminous 12.2.11. The
  environment was consistently hitting a known authentication race
  condition (issue#37778 [0]) due to repeated OSD service restarts on a
  node exhibiting MCEs from a faulty DIMM.
  
  The auth race condition would stall pg operations. In some cases, the
  hung ops would persist for hours without suicide recovery.
  
  [Test Case]
  - In-Progress -
  Haven't landed on a reliable reproducer. Currently testing the fix by 
exercising I/O. Since the fix applies to all version of Ceph, the plan is to 
let this bake in the latest release before considering a back-port.
  
  [Regression Potential]
  This fix improves suicide_grace coverage of the Sharded OpWq.
  
  This change is made in a critical code path that drives client I/O. An
  OSD suicide will trigger a service restart and repeated restarts
  (flapping) will adversely impact cluster performance.
  
  The fix mitigates risk by keeping the applied suicide_grace value
  consistent with the value applied before entering
  `OSD::ShardedOpWQ::_process()`. The fix is also restricted to the empty
  queue edge-case that drops the suicide_grace timeout. The suicide_grace
  value is only re-applied when work is found after waiting on an empty
  queue.
  
  - In-Progress -
  The fix needs to bake upstream on later levels before back-port consideration.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-04-10 Thread Dan Hill
** Also affects: ceph (Ubuntu Focal)
   Importance: Medium
 Assignee: Dan Hill (hillpd)
   Status: Triaged

** Also affects: ceph (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: ceph (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Changed in: ceph (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: ceph (Ubuntu Bionic)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: ceph (Ubuntu Eoan)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: ceph (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: ceph (Ubuntu Eoan)
   Importance: Undecided => Medium

** Changed in: ceph (Ubuntu Eoan)
   Status: New => Confirmed

** Changed in: ceph (Ubuntu Focal)
   Status: Triaged => Confirmed

** Description changed:

  [Impact]
  The Sharded OpWQ will opportunistically wait for more work when processing an 
empty queue. While waiting, the heartbeat timeout and suicide_grace values are 
modified. On Luminous, the `threadpool_default_timeout` grace is left applied 
and suicide_grace is left disabled. On later releases both the grace and 
suicide_grace are left disabled.
  
  After finding work, the original work queue grace/suicide_grace values
  are not re-applied. This can result in hung operations that do not
  trigger an OSD suicide recovery.
  
  The missing suicide recovery was observed on Luminous 12.2.11. The
  environment was consistently hitting a known authentication race
  condition (issue#37778 [0]) due to repeated OSD service restarts on a
  node exhibiting MCEs from a faulty DIMM.
  
  The auth race condition would stall pg operations. In some cases, the
  hung ops would persist for hours without suicide recovery.
  
  [Test Case]
  - In-Progress -
  Haven't landed on a reliable reproducer. Currently testing the fix by 
exercising I/O. Since the fix applies to all version of Ceph, the plan is to 
let this bake in the latest release before considering a back-port.
  
  [Regression Potential]
  This fix improves suicide_grace coverage of the Sharded OpWq.
  
  This change is made in a critical code path that drives client I/O. An
  OSD suicide will trigger a service restart and repeated restarts
  (flapping) will adversely impact cluster performance.
  
  The fix mitigates risk by keeping the applied suicide_grace value
  consistent with the value applied before entering
  `OSD::ShardedOpWQ::_process()`. The fix is also restricted to the empty
  queue edge-case that drops the suicide_grace timeout. The suicide_grace
  value is only re-applied when work is found after waiting on an empty
  queue.
  
  - In-Progress -
- The fix will bake upstream on later levels before back-port consideration.
+ The fix needs to bake upstream on later levels before back-port consideration.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-04-10 Thread Dan Hill
Attaching the proposed fix for 12.2.13 that I am testing.

** Patch added: "ceph_12.2.13-0ubuntu0.18.04.1+20200409sf00238701b1.debdiff"
   
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+attachment/5351517/+files/ceph_12.2.13-0ubuntu0.18.04.1+20200409sf00238701b1.debdiff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-04-10 Thread Dan Hill
There are two edge-cases in 12.2.11 where a worker thread's suicide_grace value 
gets dropped:
[0] In the Threadpool context, Threadpool:worker() drops suicide_grace while 
waiting on an empty work queue.
[1] In the ShardedThreadpool context, OSD::ShardedOpWQ::_process() drops 
suicide_grace while opportunistically waiting for more work (to prevent 
additional lock contention).

The Threadpool context always re-assigns suicide_grace before driving
any work. The ShardedThreadpool context does not follow this pattern.
After delaying to find additional work, the default sharded work queue
timeouts are not re-applied.

This oversight exists in Luminous onwards. Mimic and Nautilus have each
reworked the ShardedOpWQ code path, but did not address the problem.

[0] https://github.com/ceph/ceph/blob/v12.2.11/src/common/WorkQueue.cc#L137
[1] https://github.com/ceph/ceph/blob/v12.2.11/src/osd/OSD.cc#L10476

** Description changed:

- Multiple incidents have been seen where ops were blocked for various
- reasons and the suicide_grace timeout was not observed, meaning that the
- OSD failed to suicide as expected.
+ [Impact]
+ The Sharded OpWQ will opportunistically wait for more work when processing an
+ empty queue. While waiting, the heartbeat timeout and suicide_grace values are
+ modified. On Luminous, the `threadpool_default_timeout` grace is left applied
+ and suicide_grace is left disabled. On later releases both the grace and
+ suicide_grace are left disabled. 
+ 
+ After finding work, the original work queue grace/suicide_grace values are
+ not re-applied. This can result in hung operations that do not trigger an OSD
+ suicide recovery.
+ 
+ The missing suicide recovery was observed on Luminous 12.2.11. The environment
+ was consistently hitting a known authentication race condition (issue#37778
+ [0]) due to repeated OSD service restarts on a node exhibiting MCEs from a
+ faulty DIMM. 
+ 
+ The auth race condition would stall pg operations. In some cases the hung ops
+ would persist for hours without suicide recovery.
+ 
+ [Test Case]
+ - In-Progress -
+ Haven't landed on a reliable reproducer. Currently testing the fix by
+ exercising I/O. Since the fix applies to all version of Ceph, the plan is to
+ let this bake in the latest release before considering a back-port. 
+ 
+ [Regression Potential]
+ This fix improves suicide_grace coverage of the Sharded OpWq. 
+ 
+ This change is made in a critical code path that drives client I/O. An OSD
+ suicide will trigger a service restart and repeated restarts (flapping) will
+ adversely impact cluster performance. 
+ 
+ The fix mitigates risk by keeping the applied suicide_grace value consistent
+ with the value applied before entering `OSD::ShardedOpWQ::_process()`. The fix
+ is also restricted to the empty queue edge-case that drops the suicide_grace
+ timeout. The suicide_grace value is only re-applied when work is found after
+ waiting on an empty queue. 
+ 
+ - In-Progress -
+ The fix will bake upstream on later levels before back-port consideration.

** Description changed:

  [Impact]
- The Sharded OpWQ will opportunistically wait for more work when processing an
- empty queue. While waiting, the heartbeat timeout and suicide_grace values are
- modified. On Luminous, the `threadpool_default_timeout` grace is left applied
- and suicide_grace is left disabled. On later releases both the grace and
- suicide_grace are left disabled. 
+ The Sharded OpWQ will opportunistically wait for more work when processing an 
empty queue. While waiting, the heartbeat timeout and suicide_grace values are 
modified. On Luminous, the `threadpool_default_timeout` grace is left applied 
and suicide_grace is left disabled. On later releases both the grace and 
suicide_grace are left disabled.
  
- After finding work, the original work queue grace/suicide_grace values are
- not re-applied. This can result in hung operations that do not trigger an OSD
- suicide recovery.
+ After finding work, the original work queue grace/suicide_grace values
+ are not re-applied. This can result in hung operations that do not
+ trigger an OSD suicide recovery.
  
- The missing suicide recovery was observed on Luminous 12.2.11. The environment
- was consistently hitting a known authentication race condition (issue#37778
- [0]) due to repeated OSD service restarts on a node exhibiting MCEs from a
- faulty DIMM. 
+ The missing suicide recovery was observed on Luminous 12.2.11. The
+ environment was consistently hitting a known authentication race
+ condition (issue#37778 [0]) due to repeated OSD service restarts on a
+ node exhibiting MCEs from a faulty DIMM.
  
- The auth race condition would stall pg operations. In some cases the hung ops
- would persist for hours without suicide recovery.
+ The auth race condition would stall pg operations. In some cases, the
+ hung ops would persist for hours without suicide recovery.
  
  [Test Case]
  - In-Progress -
- Haven't landed on a reliable reproducer. 

[Bug 1840348] Re: Sharded OpWQ drops suicide_grace after waiting for work

2020-04-10 Thread Dan Hill
** Summary changed:

- Ceph 12.2.11-0ubuntu0.18.04.2 doesn't honor suicide_grace
+ Sharded OpWQ drops suicide_grace after waiting for work

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Sharded OpWQ drops suicide_grace after waiting for work

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1833593] Re: fstrim.timer always triggers at 00:00, use RandomizedDelaySec

2020-04-09 Thread Dan Hill
Also reported in sf#00272957

** Tags added: sts

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1833593

Title:
  fstrim.timer always triggers at 00:00, use RandomizedDelaySec

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/util-linux/+bug/1833593/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1844173] Re: libwbclient-sssd package installation does not re-create expected symlink to libwbclient.so.0

2020-03-06 Thread Dan Hill
Marking this invalid. Upstream sssd is deprecating the libwbclient
library and does not recommend its usage.

** Changed in: sssd (Ubuntu)
   Status: Triaged => Invalid

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1844173

Title:
  libwbclient-sssd package installation does not re-create expected
  symlink to libwbclient.so.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/samba/+bug/1844173/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1862830] Re: Update sosreport to 3.9

2020-03-02 Thread Dan Hill
Tested sosreport collection on a bionic-rocky mimic ceph cluster.
Everything worked as expected.

Regarding comments #8 and #12: upstream adds 8 random characters to the
filename in 'friendly' mode. In practice, this should make collisions
extremely unlikely. If we need to approximate the legacy formatting for
some reason, the label can be used to include the missing timestamp:
`--label=$(date +%T%Z)`
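
For anyone who wants something closer to the old naming, a minimal invocation
along those lines might be (flags as in sos 3.9; purely illustrative):
```
# the label is folded into the archive name alongside the random suffix
sudo sosreport --batch --label=$(date +%T%Z)
```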

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1862830

Title:
  Update sosreport to 3.9

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sosreport/+bug/1862830/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1862226] Re: /usr/sbin/sss_obfuscate fails to run: ImportError: No module named pysss

2020-02-19 Thread Dan Hill
** Patch added: "lp1862226-bionic.debdiff"
   
https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1862226/+attachment/5329605/+files/lp1862226-bionic.debdiff

** Changed in: sssd (Ubuntu Bionic)
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1862226

Title:
  /usr/sbin/sss_obfuscate fails to run: ImportError: No module named
  pysss

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1862226/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1844173] Re: libwbclient-sssd package installation does not re-create expected symlink to libwbclient.so.0

2020-02-18 Thread Dan Hill
A recent sssd issue was opened re: libwbclient API version 15 [0].

Upstream sssd plans to deprecate the alternative winbind client library,
which is why the 0.15 interface has not been implemented. The sssd
library hasn't maintained feature parity for some years [1]. Upstream
recommends using idmap_sss as a winbind plug-in instead, which is
available in the sssd-common package [2].

[0] https://pagure.io/SSSD/sssd/issue/4158
[1] 
https://lists.fedorahosted.org/archives/list/sssd-us...@lists.fedorahosted.org/message/BY3FRMUHXNV3OMGYL5WDYCIIPIR3UJSB/
[2] https://manpages.ubuntu.com/manpages/eoan/man8/idmap_sss.8.html
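
For completeness, a minimal smb.conf excerpt for the idmap_sss route suggested
above might look like this (the domain name and id range are placeholders, not
taken from this bug):
```
[global]
   # route winbind id-mapping for the EXAMPLE domain through sssd's idmap_sss
   idmap config EXAMPLE : backend = sss
   idmap config EXAMPLE : range = 200000-2147483647
```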

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1844173

Title:
  libwbclient-sssd package installation does not re-create expected
  symlink to libwbclient.so.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/samba/+bug/1844173/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1861793] Re: [SRU] ceph 12.2.13

2020-02-14 Thread Dan Hill
Upstream 12.2.13 did not pick up lp#1838109 [0].

Please ensure that our SRU includes the civetweb back-port supporting a
configurable 'max_connections' listen backlog.

[0] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1838109
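
Once the back-port is in, the knob should be settable on the rgw frontend
line; a sketch of the expected ceph.conf usage (section name, port and value
are illustrative only):
```
[client.rgw.myhost]
# 'max_connections' is the option added by the civetweb back-port in lp#1838109
rgw frontends = civetweb port=7480 max_connections=1024
```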

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1861793

Title:
  [SRU] ceph 12.2.13

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1861793/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1702777] Re: ceph package is not built with jemalloc support

2020-02-13 Thread Dan Hill
** Changed in: ceph (Ubuntu)
   Status: Triaged => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1702777

Title:
  ceph package is not built with jemalloc support

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1702777/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1819437] Re: transient mon<->osd connectivity HEALTH_WARN events don't self clear in 13.2.4

2020-02-13 Thread Dan Hill
** Changed in: ceph (Ubuntu Bionic)
   Status: New => In Progress

** Changed in: ceph (Ubuntu Eoan)
   Status: New => In Progress

** Changed in: ceph (Ubuntu Bionic)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: ceph (Ubuntu Eoan)
 Assignee: (unassigned) => Dan Hill (hillpd)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1819437

Title:
  transient mon<->osd connectivity HEALTH_WARN events don't self clear
  in 13.2.4

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1819437/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1855859] Re: [SRU] ceph 13.2.7

2020-02-13 Thread Dan Hill
Due to a missed backport, clusters in the process of being upgraded from
13.2.6 to 13.2.7 might suffer an OSD crash in build_incremental_map_msg.
This regression was reported upstream [0] and is fixed in 13.2.8.

Users of 13.2.6 can upgrade to 13.2.8 directly - i.e., skip 13.2.7 - to
avoid this.

[0] https://tracker.ceph.com/issues/43106

** Bug watch added: tracker.ceph.com/issues #43106
   http://tracker.ceph.com/issues/43106

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1855859

Title:
  [SRU] ceph 13.2.7

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1855859/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1819437] Re: transient mon<->osd connectivity HEALTH_WARN events don't self clear in 13.2.4

2020-02-13 Thread Dan Hill
This issue has been resolved upstream:
pr#30519 in 12.2.13
pr#30481 in 13.2.7
pr#30480 in 14.2.5

The mimic fix has been released, but be advised that upgrading from
13.2.6 -> 13.2.7 may cause OSD crashes [0]. We will be updating our
packaging to 13.2.8 to address this issue.

The 12.2.13 and 14.2.7 point releases landed upstream last week. We are
working on stable release updates (SRUs) for these packages. You can
follow and contribute to the SRU progress at [1], and [2] respectively.

[0] https://tracker.ceph.com/issues/43106
[1] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1861793
[2] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1861789


** Bug watch added: tracker.ceph.com/issues #43106
   http://tracker.ceph.com/issues/43106

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1819437

Title:
  transient mon<->osd connectivity HEALTH_WARN events don't self clear
  in 13.2.4

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1819437/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1861789] Re: [SRU] ceph 14.2.7

2020-02-13 Thread Dan Hill
Should this target ussuri?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1861789

Title:
  [SRU] ceph 14.2.7

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1861789/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1848950] Re: pg not [deep] scrubbed since 0.000000 -- w/fix

2020-02-13 Thread Dan Hill
pr#28869 fixed this upstream in 14.2.3. This version is in the eoan-updates
and focal pockets.

For UCA, both train-proposed and ussuri-proposed have this fixed, but these 
have not yet landed:
 ceph | 14.2.2-0ubuntu3~cloud0         | train           | bionic-updates  | source
 ceph | 14.2.4-0ubuntu0.19.10.1~cloud0 | train-proposed  | bionic-proposed | source
 ceph | 14.2.2-0ubuntu3~cloud0         | ussuri          | bionic-updates  | source
 ceph | 14.2.5-3ubuntu4~cloud0         | ussuri-proposed | bionic-proposed | source


** Also affects: ceph (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: ceph (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: ceph (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: ceph (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Changed in: ceph (Ubuntu Xenial)
   Status: New => Won't Fix

** Changed in: ceph (Ubuntu Bionic)
   Status: New => Won't Fix

** Changed in: ceph (Ubuntu Eoan)
   Status: New => Fix Released

** Changed in: ceph (Ubuntu Focal)
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1848950

Title:
  pg not [deep] scrubbed since 0.000000 -- w/fix

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1848950/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1702777] Re: ceph package is not built with jemalloc support

2020-02-13 Thread Dan Hill
The use of jemalloc is not possible with rocksdb [0], and the option
has been removed in luminous+, which uses bluestore by default [1].

[0] https://tracker.ceph.com/issues/20557
[1] https://github.com/ceph/ceph/pull/18486

** Bug watch added: tracker.ceph.com/issues #20557
   http://tracker.ceph.com/issues/20557

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1702777

Title:
  ceph package is not built with jemalloc support

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1702777/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840348] Re: Ceph 12.2.11-0ubuntu0.18.04.2 doesn't honor suicide_grace

2020-02-13 Thread Dan Hill
** Changed in: ceph (Ubuntu)
   Status: New => Triaged

** Changed in: ceph (Ubuntu)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: ceph (Ubuntu)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840348

Title:
  Ceph 12.2.11-0ubuntu0.18.04.2 doesn't honor suicide_grace

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840348/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1862850] Re: ceph-mds dependency

2020-02-11 Thread Dan Hill
** Tags added: seg

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1862850

Title:
  ceph-mds dependency

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1862850/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1844173] Re: libwbclient-sssd package installation does not re-create expected symlink to libwbclient.so.0

2020-02-07 Thread Dan Hill
Alternatives is an option, but would require updating both samba and
libwbclient-sssd packaging to cleanly manage the registered wbclient
library versions and their associated priorities.

For eoan and focal the sssd libwbclient version mismatch needs to be
addressed. Either sssd should be updated to match the 0.15 interface, or
the libwbclient.so.0 library links should be split out to include both
major + minor version.

The latter is certainly not appealing. Mixing library sources that
implement the same interface is already problematic. Managing multiple
versions adds unnecessary complexity.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1844173

Title:
  libwbclient-sssd package installation does not re-create expected
  symlink to libwbclient.so.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/samba/+bug/1844173/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1844173] Re: libwbclient-sssd package installation does not re-create expected symlink to libwbclient.so.0

2020-02-07 Thread Dan Hill
Dug into this a bit further.
  
Fedora packages both sssd [0] and samba [1] with scripts that use the
update-alternatives '--install' action. The sssd wbclient library gets
configured with a lower priority (5) compared to samba's (10). This gives
samba's library preference when both packages are installed.

This solution provides a consistent install configuration regardless of package
ordering. It also properly registers both package libraries as alternatives. The
update-alternatives '--remove' action will only revert symlinks to installed
alternatives. If an unregistered symlink exists prior to an '--install' action,
it gets clobbered by the first '--install' action and cannot be recovered.
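
A rough Debian-side sketch of what those Fedora scriptlets do, using bionic's
library paths (paths and priorities are illustrative, mirroring Fedora's
10-vs-5 preference):
```
# register samba's implementation as the preferred alternative...
sudo update-alternatives --install \
    /usr/lib/x86_64-linux-gnu/libwbclient.so.0 libwbclient.so.0 \
    /usr/lib/x86_64-linux-gnu/libwbclient.so.0.14 10
# ...and sssd's implementation at a lower priority
sudo update-alternatives --install \
    /usr/lib/x86_64-linux-gnu/libwbclient.so.0 libwbclient.so.0 \
    /usr/lib/x86_64-linux-gnu/sssd/modules/libwbclient.so.0.14.0 5
```
The matching '--remove' calls would live in each package's prerm so the link
reverts cleanly when either package is removed.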

We could consider adopting a similar model, but it is worth noting that
Fedora is packaging two different wbclient interface versions. With this
commit [2], samba updated the wbclient interface to minor version 15. The
upstream sssd wbclient library has not yet followed suit [3].

Their linking is setup as follows:
/usr/lib64/libwbclient.so.0.14 -> /etc/alternatives/libwbclient.so.0.14-64
/usr/lib64/libwbclient.so.0.15 -> /etc/alternatives/libwbclient.so.0.15-64

With the generic names pointing to:
/etc/alternatives/libwbclient.so.0.14-64 -> 
/usr/lib64/sssd/modules/libwbclient.so.0.14
/etc/alternatives/libwbclient.so.0.15-64 -> 
/usr/lib64/samba/wbclient/libwbclient.so.0.15

In bionic, the sssd and samba wbclient interface versions are identical (0.14), 
although implemented differently. There is a version mismatch present in eoan 
and focal. 
ubuntu@bionic-sssd:~$ apt-file search libwbclient.so.0
libwbclient-sssd: /usr/lib/x86_64-linux-gnu/sssd/modules/libwbclient.so.0
libwbclient-sssd: /usr/lib/x86_64-linux-gnu/sssd/modules/libwbclient.so.0.14.0
libwbclient0: /usr/lib/x86_64-linux-gnu/libwbclient.so.0
libwbclient0: /usr/lib/x86_64-linux-gnu/libwbclient.so.0.14
  
ubuntu@eoan-sssd:~$ apt-file search libwbclient.so.0
libwbclient-sssd: /usr/lib/x86_64-linux-gnu/sssd/modules/libwbclient.so.0
libwbclient-sssd: /usr/lib/x86_64-linux-gnu/sssd/modules/libwbclient.so.0.14.0
libwbclient0: /usr/lib/x86_64-linux-gnu/libwbclient.so.0
libwbclient0: /usr/lib/x86_64-linux-gnu/libwbclient.so.0.15

ubuntu@focal-sssd:~$ apt-file search libwbclient.so.0
libwbclient-sssd: /usr/lib/x86_64-linux-gnu/sssd/modules/libwbclient.so.0
libwbclient-sssd: /usr/lib/x86_64-linux-gnu/sssd/modules/libwbclient.so.0.14.0
libwbclient0: /usr/lib/x86_64-linux-gnu/libwbclient.so.0
libwbclient0: /usr/lib/x86_64-linux-gnu/libwbclient.so.0.15
  
[0] https://src.fedoraproject.org/rpms/sssd/blob/master/f/sssd.spec#_1062
[1] https://src.fedoraproject.org/rpms/samba/blob/master/f/samba.spec#_1151
[2] 
https://github.com/samba-team/samba/commit/1834513ebe394f4e5111665a21df652e59b3b0b6#diff-0ac0c581f17b5fa5e558397e97fb574d
[3] 
https://pagure.io/SSSD/sssd/blob/master/f/src/sss_client/libwbclient/wbclient_sssd.h#_81

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1844173

Title:
  libwbclient-sssd package installation does not re-create expected
  symlink to libwbclient.so.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/samba/+bug/1844173/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1844173] Re: libwbclient-sssd package installation does not re-create expected symlink to libwbclient.so.0

2020-02-06 Thread Dan Hill
** Changed in: samba (Ubuntu)
   Status: Invalid => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1844173

Title:
  libwbclient-sssd package installation does not re-create expected
  symlink to libwbclient.so.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/samba/+bug/1844173/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Backport of zero-length gc chain fixes to Luminous

2020-01-16 Thread Dan Hill
Sorry, I should have clearly indicated that the test cases were
exercised.

Verification has been completed on both bionic and queens.

** Tags removed: verification-needed verification-needed-bionic
** Tags added: verification-done verification-done-bionic

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Backport of zero-length gc chain fixes to Luminous

2020-01-15 Thread Dan Hill
** Tags removed: verification-needed
** Tags added: verification-done

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Backport of zero-length gc chain fixes to Luminous

2020-01-14 Thread Dan Hill
** Description changed:

  [Impact]
  Cancelling large S3/Swift object puts may result in garbage collection 
entries with zero-length chains. Rados gateway garbage collection does not 
efficiently process and clean up these zero-length chains.
  
  A large number of zero-length chains will result in rgw processes
  quickly spinning through the garbage collection lists doing very little
  work. This can result in abnormally high cpu utilization and op
  workloads.
  
  [Test Case]
  Modify garbage collection parameters by editing ceph.conf on the target rgw:
  ```
- [client.rgw.juju-29f238-sf00242079-4]
  rgw enable gc threads = false
  rgw gc obj min wait = 60
  rgw gc processor period = 60
  ```
  
  Restart the ceph-radosgw service to apply the new configuration:
- `sudo systemctl restart ceph-rado...@rgw.juju-29f238-sf00242079-4`
+ `sudo systemctl restart ceph-radosgw@rgw.$HOSTNAME`
  
  Repeatedly interrupt 512MB object put requests for randomized object names:
  ```
  for i in {0..1000}; do 
f=$(mktemp); fallocate -l 512M $f
-   s3cmd put $f s3://test_bucket.juju-29f238-sf00242079-4 --disable-multipart &
+   s3cmd put $f s3://test_bucket --disable-multipart &
pid=$!
sleep $((RANDOM % 7 + 3)); kill $pid
rm $f
  done
  ```
  
  Delete all objects in the bucket index:
  ```
- for f in $(s3cmd ls s3://test_bucket.juju-29f238-sf00242079-4 | awk '{print 
$4}'); do
+ for f in $(s3cmd ls s3://test_bucket | awk '{print $4}'); do
s3cmd del $f
  done
  ```
  
  By default rgw_max_gc_objs splits the garbage collection list into 32 shards.
  Capture omap detail and verify zero-length chains were left over:
  ```
+ export CEPH_ARGS="--id=rgw.$HOSTNAME"
  for i in {0..31}; do 
-   sudo rados -p default.rgw.log --namespace gc listomapvals gc.$i
+   sudo -E rados -p default.rgw.log --namespace gc listomapvals gc.$i
  done
  ```
  
  Confirm the garbage collection list contains expired objects by listing 
expiration timestamps:
- `sudo radosgw-admin gc list | grep time; date`
+ `sudo -E radosgw-admin gc list | grep time; date`
  
  Raise the debug level and process the garbage collection list:
- `CEPH_ARGS="--debug-rgw=20 --err-to-stderr" sudo -E radosgw-admin gc process`
+ `sudo -E radosgw-admin --debug-rgw=20 --err-to-stderr gc process`
  
  Use the logs to verify the garbage collection process iterates through all 
remaining omap entry tags. Then confirm all rados objects have been cleaned up:
- `sudo rados -p default.rgw.buckets.data ls`
- 
+ `sudo -E rados -p default.rgw.buckets.data ls`
  
  [Regression Potential]
  Backport has been accepted into the Luminous release stable branch upstream.
  
  [Other Information]
  This issue has been reported upstream [0] and was fixed in Nautilus alongside 
a number of other garbage collection issues/enhancements in pr#26601 [1]:
  * adds additional logging to make future debugging easier.
  * resolves bug where the truncated flag was not always set correctly in 
gc_iterate_entries
  * resolves bug where marker in RGWGC::process was not advanced
  * resolves bug in which gc entries with a zero-length chain were not trimmed
  * resolves bug where same gc entry tag was added to list for deletion 
multiple times
  
  These fixes were slated for back-port into Luminous and Mimic, but the
  Luminous work was not completed because of a required dependency: AIO GC
  [2]. This dependency has been resolved upstream, and is pending SRU
  verification in Ubuntu packages [3].
  
  [0] https://tracker.ceph.com/issues/38454
  [1] https://github.com/ceph/ceph/pull/26601
  [2] https://tracker.ceph.com/issues/23223
  [3] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1838858

** Tags removed: verification-queens-needed
** Tags added: verification-queens-done

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Backport of zero-length gc chain fixes to Luminous

2020-01-13 Thread Dan Hill
** Description changed:

  [Impact]
  Cancelling large S3/Swift object puts may result in garbage collection 
entries with zero-length chains. Rados gateway garbage collection does not 
efficiently process and clean up these zero-length chains.
  
  A large number of zero-length chains will result in rgw processes
  quickly spinning through the garbage collection lists doing very little
  work. This can result in abnormally high cpu utilization and op
  workloads.
  
  [Test Case]
- Disable garbage collection:
- `juju config ceph-radosgw config-flags='{"rgw": {"rgw enable gc threads": 
"false"}}'`
+ Modify garbage collection parameters by editing ceph.conf on the target rgw:
+ ```
+ [client.rgw.juju-29f238-sf00242079-4]
+ rgw enable gc threads = false
+ rgw gc obj min wait = 60
+ rgw gc processor period = 60
+ ```
  
- Repeatedly kill 256MB object put requests for randomized object names.
- `for i in {0.. 1000}; do f=$(mktemp); fallocate -l 256M $f; s3cmd put $f 
s3://test_bucket &; pid=$!; sleep $((RANDOM % 3)); kill $pid; rm $f; done`
+ Restart the ceph-radosgw service to apply the new configuration:
+ `sudo systemctl restart ceph-rado...@rgw.juju-29f238-sf00242079-4`
  
- Capture omap detail. Verify zero-length chains were created:
- `for i in $(seq 0 ${RGW_GC_MAX_OBJS:-32}); do rados -p default.rgw.log 
--namespace gc listomapvals gc.$i; done`
+ Repeatedly interrupt 512MB object put requests for randomized object names:
+ ```
+ for i in {0..1000}; do 
+   f=$(mktemp); fallocate -l 512M $f
+   s3cmd put $f s3://test_bucket.juju-29f238-sf00242079-4 --disable-multipart &
+   pid=$!
+   sleep $((RANDOM % 7 + 3)); kill $pid
+   rm $f
+ done
+ ```
  
- Raise radosgw debug levels, and enable garbage collection:
- `juju config ceph-radosgw config-flags='{"rgw": {"rgw enable gc threads": 
"false"}}' loglevel=20`
+ Delete all objects in the bucket index:
+ ```
+ for f in $(s3cmd ls s3://test_bucket.juju-29f238-sf00242079-4 | awk '{print 
$4}'); do
+   s3cmd del $f
+ done
+ ```
  
- Verify zero-length chains are processed correctly by inspecting radosgw
- logs.
+ By default rgw_max_gc_objs splits the garbage collection list into 32 shards.
+ Capture omap detail and verify zero-length chains were left over:
+ ```
+ for i in {0..31}; do 
+   sudo rados -p default.rgw.log --namespace gc listomapvals gc.$i
+ done
+ ```
+ 
+ Confirm the garbage collection list contains expired objects by listing 
expiration timestamps:
+ `sudo radosgw-admin gc list | grep time; date`
+ 
+ Raise the debug level and process the garbage collection list:
+ `CEPH_ARGS="--debug-rgw=20 --err-to-stderr" sudo -E radosgw-admin gc process`
+ 
+ Use the logs to verify the garbage collection process iterates through all 
remaining omap entry tags. Then confirm all rados objects have been cleaned up:
+ `sudo rados -p default.rgw.buckets.data ls`
+ 
  
  [Regression Potential]
  Backport has been accepted into the Luminous release stable branch upstream.
  
  [Other Information]
  This issue has been reported upstream [0] and was fixed in Nautilus alongside 
a number of other garbage collection issues/enhancements in pr#26601 [1]:
  * adds additional logging to make future debugging easier.
  * resolves bug where the truncated flag was not always set correctly in 
gc_iterate_entries
  * resolves bug where marker in RGWGC::process was not advanced
  * resolves bug in which gc entries with a zero-length chain were not trimmed
  * resolves bug where same gc entry tag was added to list for deletion 
multiple times
  
  These fixes were slated for back-port into Luminous and Mimic, but the
  Luminous work was not completed because of a required dependency: AIO GC
  [2]. This dependency has been resolved upstream, and is pending SRU
  verification in Ubuntu packages [3].
  
  [0] https://tracker.ceph.com/issues/38454
  [1] https://github.com/ceph/ceph/pull/26601
  [2] https://tracker.ceph.com/issues/23223
  [3] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1838858

** Tags removed: verification-needed-bionic
** Tags added: verification-done-bionic

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Backport of zero-length gc chain fixes to Luminous

2019-11-21 Thread Dan Hill
The upstream back-port is being tracked by issue#38714 [0], and pr#31664 [1]
is pending upstream review.

[0] https://tracker.ceph.com/issues/38714
[1] https://github.com/ceph/ceph/pull/31664

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1842656] Re: Bitmap allocator returns duplicate entries

2019-11-07 Thread Dan Hill
issue#40080 has been addressed upstream and the fix will be delivered in
the next point release of luminous (12.2.13).

The bitmap allocator is an experimental feature that has been reported as
unstable by upstream [0]. There isn't an urgent need to drive this fix
ahead of upstream.

Marking this bug as 'won't fix' and will update to 'fix released' when
an SRU for 12.2.13 lands.

[0] https://www.spinics.net/lists/ceph-devel/msg44622.html
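
For reference, the allocator actually in use on a given OSD can be checked over
the admin socket (osd id is a placeholder; luminous is expected to report the
default 'stupid' allocator unless bitmap was explicitly enabled):
```
sudo ceph daemon osd.0 config get bluestore_allocator
```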

** Changed in: ceph (Ubuntu)
   Status: New => Won't Fix

** Changed in: ceph (Ubuntu Bionic)
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1842656

Title:
  Bitmap allocator returns duplicate entries

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1842656/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Backport of zero-length gc chain fixes to Luminous

2019-09-18 Thread Dan Hill
Want to clearly state that while AIO GC is a dependency, these fixes do
not address anything introduced by that feature.

The fixes address bugs that existed prior to AIO GC.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1838858] Re: backport GC AIO to Luminous

2019-09-18 Thread Dan Hill
The issues and fixes discussed in comments #3 and #4 are being tracked
under lp#1843085.

They have AIO GC as a dependency, but are not required for this SRU.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1838858

Title:
  backport GC AIO to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1838858/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Backport of zero-length gc chain fixes to Luminous

2019-09-17 Thread Dan Hill
** Changed in: ceph (Ubuntu Bionic)
   Status: New => In Progress

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Backport of zero-length gc chain fixes to Luminous

2019-09-17 Thread Dan Hill
pr#30367 [0] is currently pending upstream review, but needs to have
build issues resolved.

[0] https://github.com/ceph/ceph/pull/30367

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Need backport of 0-length gc chain fixes to Luminous

2019-09-17 Thread Dan Hill
** Description changed:

- This issue in the Ceph tracker has been encountered repeatedly with 
significant adverse effects on Ceph 12.2.11/12 in Bionic:
- https://tracker.ceph.com/issues/38454
+ [Impact]
+ Cancelling large S3/Swift object puts may result in garbage collection 
entries with zero-length chains. Rados gateway garbage collection does not 
efficiently process and clean up these zero-length chains.
  
- This PR is the likely candidate for backporting to correct the issue:
- https://github.com/ceph/ceph/pull/26601
+ A large number of zero-length chains will result in rgw processes
+ quickly spinning through the garbage collection lists doing very little
+ work. This can result in abnormally high cpu utilization and op
+ workloads.
+ 
+ [Test Case]
+ Disable garbage collection:
+ `juju config ceph-radosgw config-flags='{"rgw": {"rgw enable gc threads": 
"false"}}'`
+ 
+ Repeatedly kill 256MB object put requests for randomized object names.
+ `for i in {0.. 1000}; do f=$(mktemp); fallocate -l 256M $f; s3cmd put $f 
s3://test_bucket &; pid=$!; sleep $((RANDOM % 3)); kill $pid; rm $f; done`
+ 
+ Capture omap detail. Verify zero-length chains were created:
+ `for i in $(seq 0 ${RGW_GC_MAX_OBJS:-32}); do rados -p default.rgw.log 
--namespace gc listomapvals gc.$i; done`
+ 
+ Raise radosgw debug levels, and enable garbage collection:
+ `juju config ceph-radosgw config-flags='{"rgw": {"rgw enable gc threads": 
"false"}}' loglevel=20`
+ 
+ Verify zero-length chains are processed correctly by inspecting radosgw
+ logs.
+ 
+ [Regression Potential]
+ {Pending} Back-port still needs to be accepted upstream. Need complete fix to 
assess regression potential.
+ 
+ [Other Information]
+ This issue has been reported upstream [0] and was fixed in Nautilus alongside 
a number of other garbage collection issues/enhancements in pr#26601 [1]:
+ * adds additional logging to make future debugging easier.
+ * resolves bug where the truncated flag was not always set correctly in 
gc_iterate_entries
+ * resolves bug where marker in RGWGC::process was not advanced
+ * resolves bug in which gc entries with a zero-length chain were not trimmed
+ * resolves bug where same gc entry tag was added to list for deletion 
multiple times
+ 
+ These fixes were slated for back-port into Luminous and Mimic, but the
+ Luminous work was not completed because of a required dependency: AIO GC
+ [2]. This dependency has been resolved upstream, and is pending SRU
+ verification in Ubuntu packages [3].
+ 
+ [0] https://tracker.ceph.com/issues/38454
+ [1] https://github.com/ceph/ceph/pull/26601
+ [2] https://tracker.ceph.com/issues/23223
+ [3] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1838858

** Also affects: cloud-archive
   Importance: Undecided
   Status: New

** Summary changed:

- Need backport of 0-length gc chain fixes to Luminous
+ Backport of zero-length gc chain fixes to Luminous

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Backport of zero-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Need backport of 0-length gc chain fixes to Luminous

2019-09-16 Thread Dan Hill
** Also affects: ceph (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: ceph (Ubuntu)
 Assignee: Dan Hill (hillpd) => (unassigned)

** Changed in: ceph (Ubuntu Bionic)
 Assignee: (unassigned) => Dan Hill (hillpd)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Need backport of 0-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1843085] Re: Need backport of 0-length gc chain fixes to Luminous

2019-09-16 Thread Dan Hill
** Changed in: ceph (Ubuntu)
 Assignee: (unassigned) => Dan Hill (hillpd)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1843085

Title:
  Need backport of 0-length gc chain fixes to Luminous

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1843085/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1838109] Re: civetweb does not allow tuning of maximum socket connections

2019-08-30 Thread Dan Hill
** Description changed:

  [Impact]
  RADOS gateway can run out of sockets prior to consuming the CPU and memory 
resources on the server on which it is running.
  
  [Test Case]
  Deploy RGW to a large server; scale test - RGW processes will only be able to 
service around 100 open connections.
  
  [Regression Potential]
  Medium; the fix introduces a new configuration option for civetweb (the web 
connector for RGW) to allow the max connections to be set via configuration, 
rather than being set during compilation; improvement has been accepted 
upstream in the civetweb project.
  
  [Original Bug Report]
  Civetweb does not offer an option for configuring the maximum number of 
sockets available. Some users run out of sockets and are left with no 
workaround.
  
- This patch adds a new user-configurable parameter, "so_max_connections".
+ This patch adds a new user-configurable parameter, "max_connections".
  
  See:
  https://github.com/civetweb/civetweb/issues/775

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1838109

Title:
  civetweb does not allow tuning of maximum socket connections

To manage notifications about this bug go to:
https://bugs.launchpad.net/ceph/+bug/1838109/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840347] Re: Ceph 12.2.12 restarts services during upgrade

2019-08-15 Thread Dan Hill
The expected behavior during a package upgrade is to leave all the ceph
service states unmodified. They should not be enabled/disabled or
stopped/started.
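
One way to check that on a test node is to compare unit start times across the
upgrade (unit glob and package list are illustrative):
```
# record OSD unit start times, upgrade, and diff; any change means a restart
systemctl show -p Id,ActiveEnterTimestamp 'ceph-osd@*' > /tmp/osd.before
sudo apt-get install --only-upgrade ceph-osd ceph-common ceph-base
systemctl show -p Id,ActiveEnterTimestamp 'ceph-osd@*' > /tmp/osd.after
diff /tmp/osd.before /tmp/osd.after && echo "no OSD restarts detected"
```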

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840347

Title:
  Ceph 12.2.12  restarts services during upgrade

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840347/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1840347] Re: Ceph 12.2.12 restarts services during upgrade

2019-08-15 Thread Dan Hill
** Description changed:

  Upgrading from 12.2.11-0ubuntu0.18.04.2 to 12.2.12-0ubuntu0.18.04.1 on
  Ubuntu 18.04 causes the ceph-osd services to be restarted without
  prompting.
  
- This appears to be in the configure section on the postinst:
+ This appears to be in the configure section of ceph-common.postinst:
  # Automatically added by dh_systemd_start/11.1.6ubuntu2
  if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = 
"abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
- if [ -d /run/systemd/system ]; then
- systemctl --system daemon-reload >/dev/null || true
- if [ -n "$2" ]; then
- _dh_action=restart
- else
- _dh_action=start
- fi
- deb-systemd-invoke $_dh_action 'ceph.target' >/dev/null || 
true
- fi
+ if [ -d /run/systemd/system ]; then
+ systemctl --system daemon-reload >/dev/null || true
+ if [ -n "$2" ]; then
+ _dh_action=restart
+ else
+ _dh_action=start
+ fi
+ deb-systemd-invoke $_dh_action 'ceph.target' >/dev/null || 
true
+ fi
  fi
  # End automatically added section
- ```
  
+ dpkg.log after the upgrade shows that "configure" was exercised:
  2019-08-15 10:49:18 upgrade ceph-common:amd64 12.2.11-0ubuntu0.18.04.2 
12.2.12-
  ...
  2019-08-15 10:49:29 configure ceph-common:amd64 12.2.12-0ubuntu0.18.04.1 

  ..
  2019-08-15 10:49:56 status installed ceph-common:amd64 
12.2.12-0ubuntu0.18.04.1

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1840347

Title:
  Ceph 12.2.12  restarts services during upgrade

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840347/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1828467] Re: [sru] remove juju-db stop/start service interactions

2019-05-09 Thread Dan Hill
** Changed in: sosreport (Ubuntu Eoan)
   Status: New => In Progress

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1828467

Title:
  [sru] remove juju-db stop/start service interactions

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sosreport/+bug/1828467/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1828467] Re: [sru] remove juju-db stop/start service interactions

2019-05-09 Thread Dan Hill
** Also affects: sosreport (Ubuntu Cosmic)
   Importance: Undecided
   Status: New

** Also affects: sosreport (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: sosreport (Ubuntu Trusty)
   Importance: Undecided
   Status: New

** Also affects: sosreport (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: sosreport (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Also affects: sosreport (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: sosreport (Ubuntu Xenial)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: sosreport (Ubuntu Bionic)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: sosreport (Ubuntu Cosmic)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: sosreport (Ubuntu Disco)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: sosreport (Ubuntu Eoan)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: sosreport (Ubuntu Trusty)
 Assignee: (unassigned) => Dan Hill (hillpd)

** Changed in: sosreport (Ubuntu Trusty)
   Importance: Undecided => Low

** Changed in: sosreport (Ubuntu Xenial)
   Importance: Undecided => Low

** Changed in: sosreport (Ubuntu Bionic)
   Importance: Undecided => Low

** Changed in: sosreport (Ubuntu Cosmic)
   Importance: Undecided => Low

** Changed in: sosreport (Ubuntu Disco)
   Importance: Undecided => Low

** Changed in: sosreport (Ubuntu Eoan)
   Importance: Undecided => Low

** Tags added: sts

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1828467

Title:
  [sru] remove juju-db stop/start service interactions

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sosreport/+bug/1828467/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1828467] [NEW] [sru] remove juju-db stop/start service interactions

2019-05-09 Thread Dan Hill
Public bug reported:

[Impact]

The juju plugin will stop and start the juju-db service during data collection.
sosreport should not impact running services, or attempt to recover them.

This has been reported upstream and will be fixed by the juju 2.x refactor:
https://github.com/sosreport/sos/issues/1653

This is a stop-gap tracking the removal of the juju-db service restart code in
existing sosreport releases.
 
[Test Case]

 * Install sosreport
 * Run sosreport, ensuring that the juju plugin is exercised.
 * Confirm the juju-db service was not restarted, and that the mongoexport data was captured.

Check for errors while running, or in /tmp/sosreport-*/sos_logs/
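
A minimal pass over those steps could look like the following (plugin name per
the upstream issue; assumes a systemd-managed juju-db on the controller):
```
# capture juju-db state, run only the juju plugin, then confirm no restart
systemctl status juju-db | grep -E 'Active:'
sudo sosreport -o juju --batch --build
systemctl status juju-db | grep -E 'Active:'
grep -ri error /tmp/sosreport-*/sos_logs/ || true
```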

[Regression Potential]

 * Risk is low.
 * Change is limited in scope to the juju plugin.
 * Worst-case scenario is that the mongoexport command will fail to collect
   any data.

** Affects: sosreport (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1828467

Title:
  [sru] remove juju-db stop/start service interactions

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sosreport/+bug/1828467/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1189721] Re: Ralink RT3290 doesn't have a bluetooth driver

2015-03-13 Thread Dan Hill
I do not have Bluetooth here either. None of the proposed drivers would
compile.

Ubuntu 14.10
3.16.0-31-generic 
HP Envy Touchsmart j009wm

$ rfkill list all
0: phy0: Wireless LAN
Soft blocked: no
Hard blocked: no

$ lspci | grep Bluetooth
04:00.1 Bluetooth: Ralink corp. RT3290 Bluetooth

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1189721

Title:
  Ralink RT3290 doesn't have a bluetooth driver

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1189721/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs