On Thu, 14 Mar 2019 at 17:00, Zhenshi Zhou wrote:
> I think I've found the root cause of the monmap containing no
> features. When I moved the servers from one place to another, I modified
> the monmap once.
If this was the empty cluster that you refused to redo from scratch, then I
feel it
Hi huang,
I think I've found the root cause of the monmap containing no
features. When I moved the servers from one place to another, I modified
the monmap once.
However, the monmap is not the same on all mons. I modified the monmap
on one of the mons, and created it from scratch on the other two
Hi,
I'll try that command soon.
It's a new cluster installed with mimic. I'm not sure of the exact reason, but
as far as I can tell, two things may have caused this issue. One is that I moved
these servers from another datacenter to this one, following the steps in [1].
The other is that I created a bridge using the
You can try those commands, but you may need to find the root cause of
why the current monmap contains no features at all. Did you upgrade
the cluster from luminous to mimic,
or is it a new cluster installed with mimic?
On Thu, 14 Mar 2019 at 2:37 PM, Zhenshi Zhou wrote:
>
> Hi huang,
>
> It's a pre-production
Hi huang,
It's a pre-production environment. If everything is fine, I'll use it for
production.
My cluster runs mimic. Should I set all the features you listed in the
command?
Thanks
On Thu, 14 Mar 2019 at 2:11 PM, huang jun wrote:
> sorry, the script should be
> for f in kraken luminous mimic
sorry, the script should be
for f in kraken luminous mimic osdmap-prune; do
ceph mon feature set $f --yes-i-really-mean-it
done
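
The correction comes down to shell word splitting: a quoted comma-separated string is a single word, so a loop over it runs once with the whole string as $f, while the unquoted space-separated list runs once per feature. A minimal dry-run sketch that only prints the commands instead of running them (print_feature_cmds is a hypothetical helper, not part of ceph):

```shell
# Hypothetical dry-run helper: prints one "ceph mon feature set" command
# per argument instead of executing anything against the cluster.
print_feature_cmds() {
  for f in "$@"; do
    echo "ceph mon feature set $f --yes-i-really-mean-it"
  done
}

# Quoted comma list: a single word, so a single command is printed.
print_feature_cmds 'kraken,luminous,mimic,osdmap-prune'

# Space-separated list: four words, so four commands are printed.
print_feature_cmds kraken luminous mimic osdmap-prune
```

Reviewing the printed commands before running the real loop is a cheap way to confirm that each feature name is passed separately.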
On Thu, 14 Mar 2019 at 2:04 PM, huang jun wrote:
>
> ok, if this is a **test environment**, you can try
> for f in 'kraken,luminous,mimic,osdmap-prune'; do
> ceph mon feature set
ok, if this is a **test environment**, you can try
for f in 'kraken,luminous,mimic,osdmap-prune'; do
ceph mon feature set $f --yes-i-really-mean-it
done
If it is a production environment, you should evaluate the risk first, and
maybe set up a test cluster to try it on first.
Zhenshi Zhou
# ceph mon feature ls
all features
supported: [kraken,luminous,mimic,osdmap-prune]
persistent: [kraken,luminous,mimic,osdmap-prune]
on current monmap (epoch 2)
persistent: [none]
required: [none]
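
The output above shows the gap: the monitors support four features, but none is persistent on the current monmap. A minimal sketch of a filter that reports that gap, assuming the text layout shown above (missing_mon_features is a hypothetical helper name, and the parsing relies only on the "all features" and "on current monmap" section headings seen in this thread):

```shell
# Hypothetical helper: reads text in the layout of "ceph mon feature ls"
# from stdin and prints every supported feature that is not listed as
# persistent on the current monmap, one per line.
missing_mon_features() {
  awk '
    /^all features/      { section = "all" }
    /^on current monmap/ { section = "monmap" }
    /supported:/  && section == "all"    { supported = $0 }
    /persistent:/ && section == "monmap" { persistent = $0 }
    END {
      # Keep only the comma-separated list between the brackets.
      gsub(/.*\[|\].*/, "", supported)
      gsub(/.*\[|\].*/, "", persistent)
      n = split(supported, want, ",")
      split(persistent, have_list, ",")
      for (i in have_list) have[have_list[i]] = 1
      for (i = 1; i <= n; i++)
        if (!(want[i] in have)) print want[i]
    }'
}
```

It could be used as `ceph mon feature ls | missing_mon_features`; an empty result would mean every supported feature is already persistent on the monmap.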
On Thu, 14 Mar 2019 at 1:50 PM, huang jun wrote:
> what's the output of 'ceph mon
What's the output of 'ceph mon feature ls'?
From the code, maybe the mon features do not contain luminous:
void OSD::send_beacon(const ceph::coarse_mono_clock::time_point& now)
{
  const auto& monmap = monc->monmap;
  // send beacon to mon even if we are just connected, and the
Hi,
One of the logs shows the beacon not being sent, as below:
2019-03-14 12:41:15.722 7f3c27684700 10 osd.5 17032 tick_without_osd_lock
2019-03-14 12:41:15.722 7f3c27684700 20 osd.5 17032 can_inc_scrubs_pending
0 -> 1 (max 1, active 0)
2019-03-14 12:41:15.722 7f3c27684700 20 osd.5 17032
An OSD will not send beacons to the mon if it's not in the ACTIVE state,
so you may want to turn on debug_osd=20 on one OSD to see what is going on.
On Thu, 14 Mar 2019 at 11:07 AM, Zhenshi Zhou wrote:
>
> What's more, I find that the osds don't send beacons all the time, some osds
> send beacons
> for a period of time and then
Hi
I set the config on every OSD and checked whether all OSDs send beacons
to the monitors.
The result shows that only some of the OSDs send beacons, and the monitor
receives every beacon that an OSD does send.
But why do some OSDs not send beacons?
On Wed, 13 Mar 2019 at 11:02 PM, huang jun wrote:
> sorry for
Sorry for not making it clear. You may need to set, on one of your OSDs,
osd_beacon_report_interval = 5
and debug_ms=1, then restart the OSD process and check the OSD
log with 'grep beacon /var/log/ceph/ceph-osd.$id.log'
to make sure the OSD sends beacons to the mon. If the OSD sends beacons to the mon, you
should
And now new errors are appearing.
[image: image.png]
On Wed, 13 Mar 2019 at 2:58 PM, Zhenshi Zhou wrote:
> Hi,
>
> I hadn't set osd_beacon_report_interval, so it must have been the default value.
> I have now set osd_beacon_report_interval to 60 and debug_mon to 10.
>
> Attachment is the leader monitor log, the
Can you get the value of osd_beacon_report_interval? The default
is 300; you can set it to 60, or turn on debug_ms=1 and debug_mon=10
to get more information.
On Wed, 13 Mar 2019 at 1:20 PM, Zhenshi Zhou wrote:
>
> Hi,
>
> The servers are connected to the same switch.
> I can ping from any one of the servers
Hi,
The servers are connected to the same switch.
I can ping from any one of the servers to the other servers
without packet loss, and the average round-trip time
is under 0.1 ms.
Thanks
On Wed, 13 Mar 2019 at 12:06 PM, Ashley Merrick wrote:
> Can you ping all your OSD servers from all your mons, and ping your
Can you ping all your OSD servers from all your mons, and ping your mons
from all your OSD servers?
I’ve seen this where a route wasn’t working in one direction, so it made OSDs
flap when they used that mon to check availability:
On Wed, 13 Mar 2019 at 11:50 AM, Zhenshi Zhou wrote:
> After checking
After checking the network and syslog/dmesg, I don't think it's a network or
hardware issue. Now some
OSDs are being marked down every 15 minutes.
Here is ceph.log:
2019-03-13 11:06:26.290701 mon.ceph-mon1 mon.0 10.39.0.34:6789/0 6756 :
cluster [INF] Cluster is now healthy
2019-03-13
Hi Kevin,
I'm sure firewalld is disabled on each host.
Well, the network is not a problem. The servers are connected
to the same switch and the connection is good when the OSDs
are marked as down. There was no interruption or delay.
I restarted the leader monitor daemon and it seems to return
Are you sure that firewalld is stopped and disabled?
Looks exactly like that when I missed one host in a test cluster.
Kevin
On Tue, 12 Mar 2019 at 09:31, Zhenshi Zhou wrote:
> Hi,
>
> I deployed a ceph cluster with good performance. But the logs
> indicate that the cluster is not as
Yep, I think it may be a network issue as well. I'll check the connections.
Thanks Eugen:)
On Tue, 12 Mar 2019 at 4:35 PM, Eugen Block wrote:
> Hi,
>
> my first guess would be a network issue. Double-check your connections
> and make sure the network setup works as expected. Check syslogs,
> dmesg, switches
Hi,
my first guess would be a network issue. Double-check your connections
and make sure the network setup works as expected. Check syslogs,
dmesg, switches etc. for hints that a network interruption may have
occurred.
Regards,
Eugen
Quoting Zhenshi Zhou:
Hi,
I deployed a ceph