Dear Ceph Team,
Our cluster consists of three Ceph nodes, each running 1 MON and 1 OSD. All nodes
are CentOS 6.5 (kernel 2.6.32) VMs in a testing cluster, not production. The
script we're using is a simplified sequence of steps that does more or less what
the ceph-cookbook does. Using OpenStack Cinder, we attached a 10 GB block volume
to each node in order to set up the OSD. After running our Ceph cluster
initialization script (pasted below), the cluster reports HEALTH_WARN and all PGs
are incomplete. Additionally, every PG on every node has the same up and acting
set: [0]. Is this an indicator that the PGs have not even entered the creating
state, since not every OSD has ID 0, yet they all report 0 as their up and acting
OSD? The weight of all OSDs is also 0, although the OSDs themselves appear to be
up and in. The network appears to be fine; we can ping and telnet to each server
from the others.
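Since the zero CRUSH weights stood out, one thing we are considering (but have
not yet applied) is manually giving each OSD a small non-zero weight and watching
whether peering starts. A rough sketch only; the 0.01 value is just a guess based
on the 10 GB volume size:

# Sketch only -- not yet applied to the failing cluster.
# Give each OSD a small non-zero CRUSH weight (0.01 ~ 10 GB expressed in TB)
# and watch whether the PGs start peering across all three OSDs.
for id in 0 1 2; do
    ceph osd crush reweight osd.${id} 0.01
done
ceph osd tree    # weights should now show 0.01 instead of 0
ceph -s          # check whether the PGs leave the incomplete state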
To isolate the problem, we tried replacing the attached Cinder volume with a
10 GB XFS-formatted file mounted at /ceph-data. We set OSD_PATH=/ceph-data and
JOURNAL_PATH=/ceph-data/journal and kept the rest of our setup_ceph.sh script the
same. With that configuration the cluster reached HEALTH_OK and all PGs became
active+clean.
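For reference, the file-backed mount in that working test was created roughly
along these lines (reconstructed from memory; the backing file name is just a
placeholder):

# Rough reconstruction of the working file-backed setup (placeholder file name).
truncate -s 10G /ceph-backing.img            # sparse 10 GB backing file
mkfs.xfs -f /ceph-backing.img                # format it as XFS
mkdir -p /ceph-data
mount -o loop /ceph-backing.img /ceph-data
# setup_ceph.sh was then run with OSD_PATH=/ceph-data and
# JOURNAL_PATH=/ceph-data/journal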
What seems to be missing is communication between the OSDs to create/replicate
the PGs correctly. Any advice on what is blocking the PGs from reaching an
active+clean state? We are stumped as to why the cluster fails to reach HEALTH_OK
only when the OSDs are backed by attached Cinder volumes.
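In case it helps narrow things down, the next check we plan to run on each node
is the sketch below, to confirm that the OSD ports (not just the monitor port
6789 we already tested with telnet) are reachable between hosts; 6800-6803 are
the ports our OSDs are currently listening on according to "ceph osd dump":

# Check OSD-port reachability from every node to every other node.
# Peering and heartbeats use the OSD ports (6800 and up), not the MON port.
for host in 10.98.66.235 10.98.66.229 10.98.66.226; do
  for port in 6800 6801 6802 6803; do
    if timeout 2 bash -c "echo > /dev/tcp/${host}/${port}" 2>/dev/null; then
      echo "${host}:${port} open"
    else
      echo "${host}:${port} CLOSED"
    fi
  done
done
iptables -L -n    # also look for rules that could block the 6800-7300 range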
If I have left out any important information about how the Ceph cluster was
created, please let me know. Thank you!
Sincerely,
Johanni B. Thunstrom
Health Output:
ceph -s
cluster cbbcfd09-9e8e-4cd1-905f-4b8e0fdb48cf
health HEALTH_WARN 192 pgs incomplete; 192 pgs stuck inactive; 192 pgs
stuck unclean
monmap e3: 3 mons at
{cephscriptdeplcindervol01=10.98.66.235:6789/0,cephscriptdeplcindervol02=10.98.66.229:6789/0,cephscriptdeplcindervol03=10.98.66.226:6789/0},
election epoch 6, quorum 0,1,2
cephscriptdeplcindervol03,cephscriptdeplcindervol02,cephscriptdeplcindervol01
osdmap e11: 3 osds: 3 up, 3 in
pgmap v23: 192 pgs, 3 pools, 0 bytes data, 0 objects
101608 kB used, 15227 MB / 15326 MB avail
192 incomplete
ceph health detail
HEALTH_WARN 192 pgs incomplete; 192 pgs stuck inactive; 192 pgs stuck unclean
pg 1.2c is stuck inactive since forever, current state incomplete, last acting
[0]
pg 0.2d is stuck inactive since forever, current state incomplete, last acting
[0]
..
…
..
pg 0.2e is stuck unclean since forever, current state incomplete, last acting
[0]
pg 1.2f is stuck unclean since forever, current state incomplete, last acting
[0]
pg 2.2c is stuck unclean since forever, current state incomplete, last acting
[0]
pg 2.2f is incomplete, acting [0] (reducing pool rbd min_size from 2 may help;
search ceph.com/docs for 'incomplete')
..
….
..
pg 1.30 is incomplete, acting [0] (reducing pool metadata min_size from 2 may
help; search ceph.com/docs for 'incomplete')
pg 0.31 is incomplete, acting [0] (reducing pool data min_size from 2 may help;
search ceph.com/docs for 'incomplete')
pg 2.32 is incomplete, acting [0] (reducing pool rbd min_size from 2 may help;
search ceph.com/docs for 'incomplete')
pg 1.31 is incomplete, acting [0] (reducing pool metadata min_size from 2 may
help; search ceph.com/docs for 'incomplete')
pg 0.30 is incomplete, acting [0] (reducing pool data min_size from 2 may help;
search ceph.com/docs for 'incomplete')
pg 2.2d is incomplete, acting [0] (reducing pool rbd min_size from 2 may help;
search ceph.com/docs for 'incomplete')
pg 1.2e is incomplete, acting [0] (reducing pool metadata min_size from 2 may
help; search ceph.com/docs for 'incomplete')
pg 0.2f is incomplete, acting [0] (reducing pool data min_size from 2 may help;
search ceph.com/docs for 'incomplete')
pg 2.2c is incomplete, acting [0] (reducing pool rbd min_size from 2 may help;
search ceph.com/docs for 'incomplete')
pg 1.2f is incomplete, acting [0] (reducing pool metadata min_size from 2 may
help; search ceph.com/docs for 'incomplete')
pg 0.2e is incomplete, acting [0] (reducing pool data min_size from 2 may help;
search ceph.com/docs for 'incomplete')
ceph mon dump
dumped monmap epoch 3
epoch 3
fsid cbbcfd09-9e8e-4cd1-905f-4b8e0fdb48cf
last_changed 2015-05-18 23:10:39.218552
created 0.000000
0: 10.98.66.226:6789/0 mon.cephscriptdeplcindervol03
1: 10.98.66.229:6789/0 mon.cephscriptdeplcindervol02
2: 10.98.66.235:6789/0 mon.cephscriptdeplcindervol01
ceph osd dump
epoch 11
fsid cbbcfd09-9e8e-4cd1-905f-4b8e0fdb48cf
created 2015-05-18 22:35:14.823379
modified 2015-05-18 23:10:59.037467
flags
pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 1 flags hashpspool crash_replay_interval 45
stripe_width 0
pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins
pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
max_osd 3
osd.0 up in weight 1 up_from 4 up_thru 5 down_at 0 last_clean_interval [0,0)
10.98.66.235:6800/3959 10.98.66.235:6801/3959 10.98.66.235:6802/3959
10.98.66.235:6803/3959 exists,up 71c866d3-2163-4574-a0aa-a4e0fa8c3569
osd.1 up in weight 1 up_from 8 up_thru 0 down_at 0 last_clean_interval [0,0)
10.98.66.229:6800/4137 10.98.66.229:6801/4137 10.98.66.229:6802/4137
10.98.66.229:6803/4137 exists,up 1ee644fc-3fc7-4f3b-9e5b-96ba6a8afb99
osd.2 up in weight 1 up_from 11 up_thru 0 down_at 0 last_clean_interval
[0,0) 10.98.66.226:6800/4139 10.98.66.226:6801/4139 10.98.66.226:6802/4139
10.98.66.226:6803/4139 exists,up 6bee9a39-b909-483f-a5a0-ed4e1b016638
ceph osd tree
# id  weight  type name                          up/down  reweight
-1    0       root default
-2    0         host cephscriptdeplcindervol01
0     0           osd.0                          up       1
-3    0         host cephscriptdeplcindervol02
1     0           osd.1                          up       1
-4    0         host cephscriptdeplcindervol03
2     0           osd.2                          up       1
*on second ceph node
ceph pg map 0.1f
osdmap e11 pg 0.1f (0.1f) -> up [0] acting [0]
*on first (bootstrap) ceph node
ceph pg map 0.1f
osdmap e11 pg 0.1f (0.1f) -> up [0] acting [0]
ceph pg 0.1f query
{ "state": "incomplete",
"epoch": 11,
"up": [
0],
"acting": [
0],
"info": { "pgid": "0.1f",
"last_update": "0'0",
"last_complete": "0'0",
"log_tail": "0'0",
"last_user_version": 0,
"last_backfill": "MAX",
"purged_snaps": "[]",
"history": { "epoch_created": 1,
"last_epoch_started": 0,
"last_epoch_clean": 1,
"last_epoch_split": 0,
"same_up_since": 4,
"same_interval_since": 4,
"same_primary_since": 4,
"last_scrub": "0'0",
"last_scrub_stamp": "2015-05-18 22:35:38.460878",
"last_deep_scrub": "0'0",
"last_deep_scrub_stamp": "2015-05-18 22:35:38.460878",
"last_clean_scrub_stamp": "0.000000"},
"stats": { "version": "0'0",
"reported_seq": "10",
"reported_epoch": "11",
"state": "incomplete",
"last_fresh": "2015-05-18 23:10:59.047056",
"last_change": "2015-05-18 22:35:38.461314",
"last_active": "0.000000",
"last_clean": "0.000000",
"last_became_active": "0.000000",
"last_unstale": "2015-05-18 23:10:59.047056",
"mapping_epoch": 4,
"log_start": "0'0",
"ondisk_log_start": "0'0",
"created": 1,
"last_epoch_clean": 1,
"parent": "0.0",
"parent_split_bits": 0,
"last_scrub": "0'0",
"last_scrub_stamp": "2015-05-18 22:35:38.460878",
"last_deep_scrub": "0'0",
"last_deep_scrub_stamp": "2015-05-18 22:35:38.460878",
"last_clean_scrub_stamp": "0.000000",
"log_size": 0,
"ondisk_log_size": 0,
"stats_invalid": "0",
"stat_sum": { "num_bytes": 0,
"num_objects": 0,
"num_object_clones": 0,
"num_object_copies": 0,
"num_objects_missing_on_primary": 0,
"num_objects_degraded": 0,
"num_objects_unfound": 0,
"num_objects_dirty": 0,
"num_whiteouts": 0,
"num_read": 0,
"num_read_kb": 0,
"num_write": 0,
"num_write_kb": 0,
"num_scrub_errors": 0,
"num_shallow_scrub_errors": 0,
"num_deep_scrub_errors": 0,
"num_objects_recovered": 0,
"num_bytes_recovered": 0,
"num_keys_recovered": 0,
"num_objects_omap": 0,
"num_objects_hit_set_archive": 0},
"stat_cat_sum": {},
"up": [
0],
"acting": [
0],
"up_primary": 0,
"acting_primary": 0},
"empty": 1,
"dne": 0,
"incomplete": 0,
"last_epoch_started": 0,
"hit_set_history": { "current_last_update": "0'0",
"current_last_stamp": "0.000000",
"current_info": { "begin": "0.000000",
"end": "0.000000",
"version": "0'0"},
"history": []}},
"peer_info": [],
"recovery_state": [
{ "name": "Started\/Primary\/Peering",
"enter_time": "2015-05-18 22:35:38.461150",
"past_intervals": [
{ "first": 1,
"last": 3,
"maybe_went_rw": 0,
"up": [],
"acting": [
-1,
-1]}],
"probing_osds": [
"0"],
"down_osds_we_would_probe": [],
"peering_blocked_by": []},
{ "name": "Started",
"enter_time": "2015-05-18 22:35:38.461070"}],
"agent_state": {}}
ceph pg dump
….
..
.
0'0 2015-05-18 22:35:38.469318
2.2c    0  0  0  0  0  0  0  incomplete  2015-05-18 22:35:43.268681  0'0  11:10  [0]  0  [0]  0  0'0  2015-05-18 22:35:43.268216  0'0  2015-05-18 22:35:43.268216
1.2f    0  0  0  0  0  0  0  incomplete  2015-05-18 22:35:40.405908  0'0  11:10  [0]  0  [0]  0  0'0  2015-05-18 22:35:40.405527  0'0  2015-05-18 22:35:40.405527
0.2e    0  0  0  0  0  0  0  incomplete  2015-05-18 22:35:38.469270  0'0  11:10  [0]  0  [0]  0  0'0  2015-05-18 22:35:38.468833  0'0  2015-05-18 22:35:38.468833
pool 0  0  0  0  0  0  0  0
pool 1  0  0  0  0  0  0  0
pool 2  0  0  0  0  0  0  0
sum     0  0  0  0  0  0  0
osdstat  kbused  kbavail   kb        hb in  hb out
0        34704   5196892   5231596   []     []
1        33452   5198144   5231596   [0]    []
2        33452   5198144   5231596   [0,1]  []
sum      101608  15593180  15694788
===========================
$ cat setup_ceph.sh
#!/bin/bash
# ------------------------------------------------------------------------------
# This script sets up a Ceph node as part of a Ceph cluster.
#
# This is part of an experiment to ensure we run a Ceph cluster on Docker
# containers as well as use this cluster as a back-end storage for OpenStack
# services that are also running on containers. A fully-automated deployment
# will later be implemented using Chef cookbooks.
# ------------------------------------------------------------------------------
set -e
set -x
if [ "$1" == "-h" ] || [ "$1" == "--help" ]; then
cat << END_USAGE_INFO
Usage: $0 [ <initial monitors> [ <bootstrap IP> [ <monitor secret> [ <fsid> [
<monitor list> [ <data path> [ <journal path> ]]]]]]]
Where: initial monitors - comma-separated list of IDs of monitors allowed
to start the cluster (default: <this host's name>)
bootstrap IP - IP address of any other monitor that is already
part of the cluster (default: none)
monitor secret - monitor secret (randomly generated if not given)
fsid - cluster ID (randomly generated if not given)
monitor list - comma-separated list of FQDN of known monitors
(default: <this host's FQDN>)
data path - path to OSD data device or directory
(default: /ceph-data/osd)
journal path - path to OSD journal device or file
(default: /ceph-data/journal)
END_USAGE_INFO
exit 0
fi
if [ "$(id -u)" != "0" ]; then
echo "This script must be run as root."
exit 1
fi
INITIAL_MONS=$1
BOOTSTRAP_IP=$2
MON_SECRET=$3
FSID=$4
MON_LIST=$5
OSD_PATH=$6
JOURNAL_PATH=$7
CLUSTER_NAME="ceph"
THIS_HOST_FQDN=$(hostname -f)
THIS_HOST_NAME=$(hostname -s)
THIS_HOST_IP=$(hostname -i)
yum install -y ceph xfsprogs
if [ -z "${INITIAL_MONS}" ]; then
INITIAL_MONS=${THIS_HOST_NAME}
fi
if [ -z "${MON_SECRET}" ]; then
MON_SECRET=$(ceph-authtool /dev/stdout --name=mon. --gen-key | awk -F "key = " '/key/{print $2}')
fi
if [ -z "${FSID}" ]; then
FSID=$(uuidgen)
NEW_CLUSTER=true
else
NEW_CLUSTER=false
fi
if [ -z "${MON_LIST}" ]; then
MON_LIST=${THIS_HOST_FQDN}
fi
if [ -z "${OSD_PATH}" ]; then
OSD_PATH="/ceph-data/osd"
rm -rf ${OSD_PATH}
mkdir -p ${OSD_PATH}
fi
if [ -z "${JOURNAL_PATH}" ]; then
JOURNAL_PATH="/ceph-data/journal"
rm -f ${JOURNAL_PATH}
fi
cat > /etc/ceph/ceph.conf << END_CEPH_CONF
[global]
fsid = ${FSID}
mon initial members = ${INITIAL_MONS}
mon host = ${MON_LIST}
END_CEPH_CONF
mkdir -p /var/run/ceph
chmod 755 /var/run/ceph
mkdir -p /var/lib/ceph/mon/${CLUSTER_NAME}-${THIS_HOST_NAME}
chmod 755 /var/lib/ceph/mon/${CLUSTER_NAME}-${THIS_HOST_NAME}
TMP_KEY=/tmp/${CLUSTER_NAME}-${THIS_HOST_NAME}.mon.keyring
ceph-authtool ${TMP_KEY} --create-keyring --name=mon. --add-key=${MON_SECRET} --cap mon 'allow *'
if $NEW_CLUSTER ; then
ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow'
ceph-authtool ${TMP_KEY} --import-keyring /etc/ceph/ceph.client.admin.keyring
fi
ceph-mon --mkfs -i ${THIS_HOST_NAME} --keyring ${TMP_KEY}
rm -f ${TMP_KEY}
touch /var/lib/ceph/mon/${CLUSTER_NAME}-${THIS_HOST_NAME}/{done,sysvinit}
/etc/init.d/ceph start mon.${THIS_HOST_NAME}
if [ ! -z "${BOOTSTRAP_IP}" ]; then
ceph --admin-daemon /var/run/ceph/ceph-mon.${THIS_HOST_NAME}.asok add_bootstrap_peer_hint ${BOOTSTRAP_IP}
else
ceph --admin-daemon /var/run/ceph/ceph-mon.${THIS_HOST_NAME}.asok add_bootstrap_peer_hint ${THIS_HOST_IP}
fi
/usr/sbin/ceph-disk -v prepare ${OSD_PATH} ${JOURNAL_PATH}
sleep 20
# A call to "ceph-disk activate" is not necessary, since udev will trigger
# activation once the disk is prepared.
# /usr/sbin/ceph-disk -v activate ${OSD_PATH}
set +x
cat << INSTALL_END
Installation completed successfully
Monitor secret: ${MON_SECRET}
Cluster FSID: ${FSID}
This node name: ${THIS_HOST_NAME}
This node IP address: ${THIS_HOST_IP}
INSTALL_END
exit 0
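
For completeness, the invocations on each node look roughly like the following;
the host names, the /dev/vdb device path, and the secret/fsid values are
placeholders, not our real values:

# Hypothetical invocation sketch with placeholder values.
# First (bootstrap) node -- monitor secret and fsid are generated and printed:
./setup_ceph.sh node01 "" "" "" node01.example.com /dev/vdb /dev/vdb

# Each additional node -- pass the bootstrap node's IP plus the secret and
# fsid printed by the first run (omitted here):
./setup_ceph.sh node01 10.98.66.235 "<monitor secret>" "<fsid>" node01.example.com /dev/vdb /dev/vdb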
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com