Public bug reported:

After installing cephadm on Resolute:

ii cephadm 20.2.0-0ubuntu2 all

Trying to bootstrap a cluster:

$ sudo cephadm bootstrap --mon-ip 10.3.1.190
Creating directory /etc/ceph for ceph.conf
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chrony.service is enabled and running
Repeating the final host check...
docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chrony.service is enabled and running
Host looks OK
Cluster fsid: e4638df2-4eaf-11f1-8000-fa163ebbf652
Verifying IP 10.3.1.190 port 3300 ...
Verifying IP 10.3.1.190 port 6789 ...
Mon IP `10.3.1.190` is in CIDR network `10.3.0.0/21`
Mon IP `10.3.1.190` is in CIDR network `10.3.0.0/21`
Mon IP `10.3.1.190` is in CIDR network `10.3.0.1/32`
Mon IP `10.3.1.190` is in CIDR network `10.3.0.1/32`
Mon IP `10.3.1.190` is in CIDR network `10.3.1.100/32`
Mon IP `10.3.1.190` is in CIDR network `10.3.1.100/32`
Internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image quay.io/ceph/ceph:v20...
Ceph version: ceph version 20.2.1 (6a49aff47758778a5f5951e731d437c317f72fb2) tentacle (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Non-zero exit code 1 from install -d -m0770 -o 167 -g 167 /var/run/ceph/e4638df2-4eaf-11f1-8000-fa163ebbf652
install: stderr install: invalid user: '167'
RuntimeError: Failed command: install -d -m0770 -o 167 -g 167 /var/run/ceph/e4638df2-4eaf-11f1-8000-fa163ebbf652: install: invalid user: '167'

        ***************
        Cephadm hit an issue during cluster installation. Current cluster files will be deleted automatically.
        To disable this behaviour you can pass the --no-cleanup-on-failure flag. In case of any previous
        broken installation, users must use the following command to completely delete the broken cluster:

        > cephadm rm-cluster --force --zap-osds --fsid <fsid>

        for more information please refer to https://docs.ceph.com/en/latest/cephadm/operations/#purging-a-cluster
        ***************

Deleting cluster with fsid: e4638df2-4eaf-11f1-8000-fa163ebbf652
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 5288, in <module>
    main()
    ~~~~^^
  File "/usr/sbin/cephadm", line 5276, in main
    r = ctx.func(ctx)
  File "/usr/sbin/cephadm", line 2524, in _rollback
    return func(ctx)
  File "/usr/sbin/cephadm", line 434, in _default_image
    return func(ctx)
  File "/usr/sbin/cephadm", line 2684, in command_bootstrap
    make_var_run(ctx, fsid, uid, gid)
    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "/usr/sbin/cephadm", line 508, in make_var_run
    call_throws(ctx, ['install', '-d', '-m0770', '-o', str(uid), '-g', str(gid),
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                      '/var/run/ceph/%s' % fsid])
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cephadmlib/call_wrappers.py", line 307, in call_throws
    raise RuntimeError(
        f'Failed command: {" ".join(command)}: {s}'
    )

RuntimeError: Failed command: install -d -m0770 -o 167 -g 167 /var/run/ceph/e4638df2-4eaf-11f1-8000-fa163ebbf652: install: invalid user: '167'
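
The `invalid user: '167'` message suggests the host could not map ID 167 to an account. A minimal Python sketch for checking this on the affected machine follows. The helper name `resolve_ids` is hypothetical and is not part of cephadm; it only reads the local user and group databases through the standard `pwd` and `grp` modules.

```python
import grp
import pwd
from typing import Dict, Optional


def resolve_ids(uid: int, gid: int) -> Dict[str, Optional[str]]:
    """Return the user and group names for the numeric IDs, or None if unknown."""
    try:
        user = pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this uid on the host.
        user = None
    try:
        group = grp.getgrgid(gid).gr_name
    except KeyError:
        # No group entry for this gid on the host.
        group = None
    return {'user': user, 'group': group}


if __name__ == '__main__':
    # 167 is the uid/gid taken from the failing install command in the log.
    print(resolve_ids(167, 167))
```

A `None` for either entry would mean that no account with that ID is defined on the host.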

On RHEL, uid/gid 167 is reserved for the ceph user. Ubuntu has no such
reservation, so this needs to be fixed in the next SRU.
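
One possible direction, shown only as a sketch and not as the actual cephadm or packaging fix: create the directory from Python and set ownership with `os.chown`, which takes numeric IDs and does not look up account names. The function name `make_owned_dir` is hypothetical; the mode 0770 mirrors the failing `install` command. Changing ownership to another user's ID requires root.

```python
import os
import tempfile


def make_owned_dir(path: str, uid: int, gid: int, mode: int = 0o770) -> None:
    """Create path (and any missing parents), then set its mode and owner."""
    os.makedirs(path, exist_ok=True)
    # makedirs is subject to the umask, so set the final mode explicitly.
    os.chmod(path, mode)
    # os.chown works on raw numeric IDs; no passwd or group entry is needed.
    os.chown(path, uid, gid)


if __name__ == '__main__':
    # Demo in a scratch directory with the caller's own IDs, so it runs unprivileged.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, 'ceph', 'example-fsid')
        make_owned_dir(target, os.getuid(), os.getgid())
        print(oct(os.stat(target).st_mode & 0o777))
```

In the bootstrap case the call would be made as root with the uid and gid extracted from the container image, in place of the caller's own IDs used in the demo.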

Thanks,
Alan

** Affects: ceph (Ubuntu)
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2152460

Title:
  Ceph 20.2.0 - cephadm cluster bootstrap fails due to hardcoded uid and
  gid

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2152460/+subscriptions

