Hi Jan,
2015-06-01 15:43 GMT+08:00 Jan Schermer :
> We had to disable deep scrub or the cluster would be unusable - we need to
> turn it back on sooner or later, though.
> With minimal scrubbing and recovery settings, everything is mostly good.
> Turned out many issues we had were due to too few
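The "minimal scrubbing and recovery settings" mentioned above might look something like the following ceph.conf fragment; the specific values are illustrative assumptions, not taken from the original mail:

```ini
[osd]
osd_max_backfills = 1             ; throttle concurrent backfills per OSD
osd_recovery_max_active = 1       ; throttle concurrent recovery ops per PG
osd_scrub_begin_hour = 22         ; only scrub during off-peak hours
osd_scrub_end_hour = 6
osd_scrub_sleep = 0.1             ; pause between scrub chunks to reduce IO impact
```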
Hi all,
Recently I did some experiments on OSD data distribution.
We set up a cluster with 72 OSDs, all 2TB SATA disks;
the Ceph version is v0.94.3, the Linux kernel version is 3.18,
and we set "ceph osd crush tunables optimal".
There are 3 pools:
pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset
After searching the source code, I found the ceph_psim tool, which can
simulate object distribution,
but it seems a little simplistic.
2015-09-01 22:58 GMT+08:00 huang jun :
> hi,all
>
> Recently, i did some experiments on OSD data distribution,
> we set up a cluster with 72 OSDs,all 2TB sata
They rotate every week by default; you can see the logrotate file
/etc/logrotate.d/ceph
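For reference, a sketch of what the packaged logrotate policy typically looks like; check your distribution's actual file, as details vary by version and the exact daemons signaled are an assumption here:

```
/var/log/ceph/*.log {
    rotate 7
    daily
    compress
    sharedscripts
    postrotate
        killall -q -1 ceph-mon ceph-osd ceph-mds || true
    endscript
    missingok
    notifempty
}
```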
2015-12-03 12:37 GMT+08:00 Wukongming :
> Hi ,All
> Does anyone know how long (how many days) the logs.gz files
> (mon/osd/mds) are kept before being flushed?
>
> --
In SimpleMessenger, client ops like OSD_OP are dispatched by
ms_fast_dispatch and are not queued in the PrioritizedQueue in the Messenger.
2015-12-03 22:14 GMT+08:00 Wukongming :
> Hi, All:
> I've got a question about priority. We defined
> osd_client_op_priority = 63, while CEPH_MSG_PRIO_LOW = 64.
>
2016-09-01 17:25 GMT+08:00 한승진 :
> Hi all.
>
> I'm very confused about the Ceph journal system.
>
> Some people said the Ceph journal system works like a Linux journaling filesystem.
>
> Also some people said all data are written to the journal first and then written to
> the OSD data store.
>
> Journal of Ceph storage also writ
./init-ceph start mon.a
2016-10-12 14:54 GMT+08:00 agung Laksono :
> Hi Ceph Users,
>
> I deployed a development cluster using vstart with 3 MONs and 3 OSDs.
> In my experiment, I kill one of the monitor nodes by its pid, like this:
>
> $ kill -SIGSEGV 27557
>
> After a new monitor leader is chosen, I
EC pools only support write-full and append operations, not partial writes;
you can try it by doing random writes and see whether the OSD crashes.
2016-10-18 10:10 GMT+08:00 Liuxuan :
> Hello:
>
>
>
> I have created a CephFS whose data pool is EC and whose metadata pool is replicated.
> The cluster reported an error
> On Mon, Oct 17, 2016 at 9:23 PM, huang jun wrote:
>
>> EC pools only support write-full and append operations, not partial writes;
>> you can try it by doing random writes and see whether the OSD crashes.
>>
>> 2016-10-18 10:10 GMT+08:00 Liuxuan :
>> > Hell
You can copy the osdmap file from osd.1 (replacing the corrupted one) and then restart the OSD;
we met this before, and that worked for us.
2017-02-23 22:33 GMT+08:00 tao chang :
> HI,
>
> I have a ceph cluster (ceph 10.2.5) with 3 nodes, each with two OSDs.
>
> There was a power outage last night and all the servers resta
Hi all,
We met a problem related to an erasure pool with k:m=3:1 and stripe_unit=64k*3.
We have a cluster with 96 OSDs on 4 hosts (hosts are: srv1, srv2, srv3,
srv4); each host has 24 OSDs,
12-core processors (Intel(R) Xeon(R) CPU E5-2620 v2 @
2.10GHz), and 48GB memory.
cluster configure
You can find it at http://ceph.com/pgcalc/
2016-03-15 23:41 GMT+08:00 Martin Palma :
> Hi all,
>
> The documentation [0] gives us the following formula for calculating
> the number of PGs if the cluster is bigger than 50 OSDs:
>
> Total PGs = (OSDs * 100) / pool size
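The documentation's formula (OSDs * 100 / pool size, with pgcalc recommending rounding up to the next power of two) can be sketched as below; the function name and rounding helper are mine, not from the docs:

```python
def target_pg_count(num_osds, pool_size, pgs_per_osd=100):
    """Estimate total PGs for a pool: (OSDs * target PGs per OSD) / replica
    size, rounded up to the next power of two per the pgcalc guidance."""
    raw = num_osds * pgs_per_osd / pool_size
    power = 1
    while power < raw:
        power *= 2
    return power

# 72 OSDs with size=2, as in the cluster described earlier in this digest
print(target_pg_count(72, 2))  # -> 4096 (raw value is 3600)
```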
If your cache mode is writeback, reads will cache the object in the
cache tier.
You can try the readproxy mode, which will not cache the object:
the read request is sent to the primary OSD, and the primary OSD collects the
shards from the base tier (in your case, the erasure-coded pool);
you need to read at lea
> A has 7 chunks in storage tier, to recover file A a client needs 4 chunks,
> will it be possible that 2 chunks of file A are copied to and stored in
> cache, when file A is requested, only another 2 chunks are needed from the
> storage tier? )
>
> Thanks!
>
>
>
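Switching a cache tier to the readproxy mode discussed above can be sketched as follows; the pool name 'cachepool' is an illustrative assumption:

```shell
# switch the tier to readproxy so reads are proxied to the base tier,
# not promoted into the cache
ceph osd tier cache-mode cachepool readproxy
# confirm: the pool's cache mode shows up in the osd dump output
ceph osd dump | grep cachepool
```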
Can you set 'debug_osd = 20' in ceph.conf, restart the OSD again,
and post the log from the crash?
I suspect it's a problem related to the "0 byte osdmap" decode issue.
2016-04-16 12:14 GMT+08:00 hjcho616 :
> I've been successfully running cephfs on my Debian Jessie machines for a while and
> one day after a power ou
For striped objects, the main benefit is that your cluster's OSD capacity
usage will be more balanced,
and read/write requests will spread across the whole cluster, which
will improve read/write performance.
2016-04-15 22:17 GMT+08:00 Chandan Kumar Singh :
> Hi
>
> Is it a good practice to store striped ob
Regarding your cluster warning message: some objects in a PG are
inconsistent between the primary and the replicas,
so you can try 'ceph pg repair $PGID'.
2016-04-16 9:04 GMT+08:00 Oliver Dzombic :
> Hi,
>
> i meant of course
>
> 0.e6_head
> 0.e6_TEMP
>
> in
>
> /var/lib/ceph/osd/ceph-12/current
>
> sry...
>
>
First, you should check whether the file osdmap.16024 exists in your
osd.3/current/meta dir;
if not, you can copy it from another OSD that has it.
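A sketch of that recovery procedure (paths and the healthy-OSD id are assumptions based on this thread; stop the daemon first and double-check file ownership — this is an outline, not an official procedure):

```shell
# on a healthy OSD host, locate the map files for epoch 16024
find /var/lib/ceph/osd/ceph-0/current/meta -name '*osdmap.16024*'
# stop the broken OSD, copy the file into the same relative path under
# /var/lib/ceph/osd/ceph-3/current/meta/, fix ownership, then restart
systemctl stop ceph-osd@3
chown ceph:ceph /var/lib/ceph/osd/ceph-3/current/meta/<copied-file>
systemctl start ceph-osd@3
```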
2016-04-16 12:36 GMT+08:00 hjcho616 :
> Here is what I get with debug_osd = 20.
>
> 2016-04-15 23:28:24.429063 7f9ca0a5b800 0 set uid:gid to 1001:1001
> (ce
eta# find ./ | grep osdmap |
> grep 16024
> ./DIR_E/DIR_3/inc\uosdmap.16024__0_46887E3E__none
>
> Regards,
> Hong
>
>
> On Friday, April 15, 2016 11:53 PM, huang jun wrote:
>
>
> First, you should check whether file osdmap.16024 exists in your
> osd.3/current/met
interpret this.
>
> Regards,
> Hong
>
>
> On Saturday, April 16, 2016 12:11 AM, hjcho616 wrote:
>
>
> Is this it?
>
> root@OSD2:/var/lib/ceph/osd/ceph-3/current/meta# find ./ | grep osdmap |
> grep 16024
> ./DIR_E/DIR_3/inc\uosdmap.16024__0_46887E3E__none
Hi, can you post the output of 'modinfo rbd' and your cluster state from 'ceph -s'?
2016-04-18 16:35 GMT+08:00 席智勇 :
> hi cephers:
>
> I created an rbd volume (image) on the Jewel release; when I exec rbd map, I get the
> error message that follows. I cannot find any useful message in
> syslog/kern.log/messages.
> anyone c
Can you get the value of the osd_beacon_report_interval option? The default
is 300; you can set it to 60, or turn on debug_ms=1 and debug_mon=10
to get more info.
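The debug settings suggested above can be injected at runtime without a restart; a sketch (injected values are transient and lost on daemon restart):

```shell
# inject debug levels on the running monitors
ceph tell mon.* injectargs '--debug_ms 1 --debug_mon 10'
# check the beacon interval an OSD is actually using
ceph daemon osd.0 config get osd_beacon_report_interval
```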
Zhenshi Zhou wrote on Wed, Mar 13, 2019 at 1:20 PM:
>
> Hi,
>
> The servers are connected to the same switch.
> I can ping from any one of the servers to
>>
>> Hi,
>>
>> I didn't set osd_beacon_report_interval as it must be the default value.
>> I have set osd_beacon_report_interval to 60 and debug_mon to 10.
>>
>> Attachment is the leader monitor log; the "mark-down" operations are at 14:22
>>
>> Thanks
tim taler wrote on Wed, Mar 13, 2019 at 11:05 PM:
>
> Hi all,
> what are your experiences with different disk sizes in one pool
> regarding the overall performance?
> I hope someone could shed some light on the following scenario:
>
> Let's say I mix an equal amount of 2TB and 8TB disks in one pool,
> with a crus
send beacons and the monitor
>> receives all the beacons that the osds send out.
>>
>> But why do some osds not send beacons?
>>
>> huang jun wrote on Wed, Mar 13, 2019 at 11:02 PM:
>>>
>>> Sorry for not making it clear; you may need to set one of your osd
and 0 B; target
> 25 obj/sec or 5 MiB/sec
> 2019-03-14 12:41:15.722 7f3c27684700 20 osd.5 17032
> promote_throttle_recalibrate new_prob 1000
> 2019-03-14 12:41:15.722 7f3c27684700 10 osd.5 17032
> promote_throttle_recalibrate actual 0, actual/prob ratio 1, adjusted
> new_prob 10
wrote on Thu, Mar 14, 2019 at 1:56 PM:
>
> # ceph mon feature ls
> all features
> supported: [kraken,luminous,mimic,osdmap-prune]
> persistent: [kraken,luminous,mimic,osdmap-prune]
> on current monmap (epoch 2)
> persistent: [none]
> required: [none]
>
Sorry, the script should be:
for f in kraken luminous mimic osdmap-prune; do
ceph mon feature set $f --yes-i-really-mean-it
done
huang jun wrote on Thu, Mar 14, 2019 at 2:04 PM:
>
> ok, if this is a **test environment**, you can try
> for f in 'kraken,luminous,mimic,osdmap-prune'; do
>
production environment. If everything is fine, I'll use it for
> production.
>
> My cluster is version mimic, should I set all features you listed in the
> command?
>
> Thanks
>
> huang jun wrote on Thu, Mar 14, 2019 at 2:11 PM:
>>
>> sorry, the script should be
&
Marc Roos wrote on Mon, Mar 18, 2019 at 5:46 AM:
>
>
>
>
> 2019-03-17 21:59:58.296394 7f97cbbe6700 0 --
> 192.168.10.203:6800/1614422834 >> 192.168.10.43:0/1827964483
> conn(0x55ba9614d000 :6800 s=STATE_OPEN pgs=8 cs=1 l=0).fault server,
> going to standby
>
> What does this mean?
That means the connection is i
Is the time really spent on the DB compact operation?
You can turn on debug_osd=20 to see what happens.
What is the disk utilization like during startup?
Nikhil R wrote on Thu, Mar 28, 2019 at 4:36 PM:
>
> CEPH osd restarts are taking too long a time
> below is my ceph.conf
> [osd]
> osd_compact_leveldb_on_mount = false
>
>
>
>
> On Thu, Mar 28, 2019 at 3:58 PM huang jun wrote:
>>
>> Is the time really spent on the DB compact operation?
>> You can turn on debug_osd=20 to see what happens.
>> What is the disk utilization like during startup?
>>
>> Nikhil R
long start time,
leveldb compact or filestore split?
>
>
>
> On Fri, Mar 29, 2019 at 6:55 AM huang jun wrote:
>>
>> It seems like the split settings caused the problem;
>> what about commenting out those settings and then seeing whether it still
What's the output of 'ceph osd dump' and 'ceph osd crush dump' and
'ceph health detail'?
Andrew J. Hutton wrote on Sat, Mar 30, 2019 at 7:05 AM:
>
> I have tried to create erasure pools for CephFS using the examples given
> at
> https://swamireddy.wordpress.com/2016/01/26/ceph-diff-between-erasure-and-replicate
The force-recovery/backfill command was introduced in the Luminous release,
if I remember right.
Nikhil R wrote on Sun, Mar 31, 2019 at 7:59 AM:
>
> Team,
> Is there a way to force backfill of a PG in Ceph Jewel? I know this is available
> in Mimic. Is it available in Ceph Jewel?
> I tried ceph pg backfill & ceph pg bac
It seems like CRUSH cannot get enough OSDs for this PG.
What is the output of 'ceph osd crush dump', especially the 'tunables'
section values?
Vladimir Prokofev wrote on Wed, Mar 27, 2019 at 4:02 AM:
>
> CEPH 12.2.11, pool size 3, min_size 2.
>
> One node went down today(private network interface started flapp
Can you provide detailed error logs from when the mon crashes?
Pardhiv Karri wrote on Tue, Apr 2, 2019 at 9:02 AM:
>
> Hi,
>
> Our ceph production cluster is down when updating the crushmap. Now we can't get
> our monitors to come online, and when they come online for a fraction of a
> second we see crush map errors in the logs.
mj wrote on Thu, Apr 25, 2019 at 6:34 PM:
>
> Hi all,
>
> On our three-node cluster, we have setup chrony for time sync, and even
> though chrony reports that it is synced to ntp time, at the same time
> ceph occasionally reports time skews that can last several hours.
>
> See for example:
>
> > root@ceph2:~# ce
Yes, 'ceph osd reweight-by-xxx' uses the OSD crush weight (which
represents how much data the OSD can hold)
in its calculation.
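To see what a reweight would do before committing, there is a dry-run variant; the 120% threshold below is an illustrative value, not from this thread:

```shell
# dry run: report the proposed reweights without applying them
ceph osd test-reweight-by-utilization
# apply, targeting OSDs above 120% of mean utilization
ceph osd reweight-by-utilization 120
```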
Igor Podlesny wrote on Mon, Apr 29, 2019 at 2:56 PM:
>
> Say, some nodes have OSDs that are 1.5 times bigger, than other nodes
> have, meanwhile weights of all the nodes in question is almost
Did the OSDs' crush location change after the reboot?
kas wrote on Wed, May 15, 2019 at 10:39 PM:
>
> kas wrote:
> : Marc,
> :
> : Marc Roos wrote:
> : : Are you sure your osd's are up and reachable? (run ceph osd tree on
> : : another node)
> :
> : They are up, because all three mons see them as u
Stuart Longland wrote on Sat, May 18, 2019 at 9:26 AM:
>
> On 16/5/19 8:55 pm, Stuart Longland wrote:
> > As this is Bluestore, it's not clear what I should do to resolve that,
> > so I thought I'd "RTFM" before asking here:
> > http://docs.ceph.com/docs/luminous/rados/operations/pg-repair/
> >
> > Maybe there's
EDH - Manuel Rios Fernandez wrote on Fri, May 17, 2019 at 3:23 PM:
>
> Did you check your KVM host RAM usage?
>
>
>
> We saw this on host very very loaded with overcommit in RAM causes a random
> crash of VM.
>
>
>
> As you said for solve must be remounted externaly and fsck. You can prevent
> it disabled ceph
That may be a problem with your disk.
Did you check the syslog or the dmesg log?
From the code, it returns 'read_error' only when the read returns EIO,
so I suspect your disk has a sector error.
Stuart Longland wrote on Sat, May 18, 2019 at 9:43 AM:
>
> On 18/5/19 11:34 am, huang jun wrote:
>
OK, so I think if you use 'rados -p pool get
7:581d78de:::rbd_data.b48c7238e1f29.1b34:head -o obj',
the OSD may crash.
Stuart Longland wrote on Sat, May 18, 2019 at 10:05 AM:
>
> On 18/5/19 11:56 am, huang jun wrote:
> > That may have problem with your disk?
> >
From the error message, I'm inclined to think 'mon_max_pg_per_osd' was exceeded.
You can check its value; the default is 250, so you
can have at most 1500 PG instances (250 * 6 OSDs),
and for replicated pools with size=3 that means 500 PGs across all pools.
You already have 448 PGs, so the next pool ca
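The arithmetic in that reply can be sketched as follows; the helper name is mine:

```python
def max_total_pgs(num_osds, replica_size, mon_max_pg_per_osd=250):
    """Upper bound on the number of PGs the monitors will allow:
    every PG counts once per replica against the per-OSD budget."""
    return (mon_max_pg_per_osd * num_osds) // replica_size

# 6 OSDs, size-3 pools, default limit of 250 PG instances per OSD
print(max_total_pgs(6, 3))  # -> 500, so 448 existing PGs leave little headroom
```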
Was your OSD OOM-killed while the cluster was doing recovery/backfill, or just
under client IO?
The config items you mentioned are for BlueStore, and OSD memory
includes many other
things, like the pglog; it's important to know whether your cluster is doing recovery.
Sergei Genchev wrote on Sat, Jun 8, 2019 at 5:35 AM:
>
> Hi
I think the written data will also go to osd.4 in this case.
Because your osd.4 is not down, Ceph doesn't think the PG has an OSD
down,
and it will replicate the data to all OSDs in the actingbackfill set.
Tarek Zegar wrote on Fri, Jun 7, 2019 at 10:37 PM:
> Paul / All
>
> I'm not sure what warning your a
What's your 'ceph osd df tree' output? Does the OSD have the expected PGs?
Josh Haft wrote on Fri, Jun 7, 2019 at 9:23 PM:
>
> 95% of usage is CephFS. Remaining is split between RGW and RBD.
>
> On Wed, Jun 5, 2019 at 3:05 PM Gregory Farnum wrote:
> >
> > I think the mimic balancer doesn't include omap data wh
can you show us the output of 'ceph osd dump' and 'ceph health detail'?
Luk wrote on Fri, Jun 14, 2019 at 8:02 PM:
>
> Hello,
>
> All kudos are going to friends from Wroclaw, PL :)
>
> It was as simple as a typo...
>
> There was an OSD added twice to the crushmap due to (these commands were
> run over a week ago
OSDs send beacons every 300s, and they are used to let the mon know that
the OSD is alive.
In some cases the OSD doesn't have peers, e.g. when no pools have been created.
Rafał Wądołowski wrote on Fri, Jun 14, 2019 at 12:53 PM:
>
> Hi,
>
> Is it normal that osd beacon could be without pgs? Like below. This
> drive contain data, but I c
you should add this to your ceph.conf
[client]
log file = /var/log/ceph/$name.$pid.log
debug client = 20
?? ?? wrote on Tue, Jun 18, 2019 at 11:18 AM:
>
> I am a student new to cephfs. I want to see the ldout log in
> /src/client/Client.cc (for example, ldout(cct, 20) << " no cap on " <<
> dn->inode->vino() << d
try: rbd create backup2/teste --size 5T --data-pool ec_pool
Fabio Abreu wrote on Fri, Jul 5, 2019 at 1:49 AM:
>
> Hi Everybody,
>
> I have a doubt about the usability of rbd with an EC pool. I tried to use this
> in my CentOS lab, but I just receive some errors when I try to create an rbd
> image inside this pool.
How long did you monitor after the r/w finished?
There is a config option named 'ms_connection_idle_timeout' whose
default value is 900.
fengyd wrote on Mon, Aug 19, 2019 at 4:10 PM:
>
> Hi,
>
> I have a question about tcp connection.
> In the test environment, openstack uses ceph RBD as backend storage.
> I created a
Try to restart some of the down OSDs shown in 'ceph osd tree', and see
what happens.
nokia ceph wrote on Fri, Nov 8, 2019 at 6:24 PM:
>
> Adding my official mail id
>
> -- Forwarded message -
> From: nokia ceph
> Date: Fri, Nov 8, 2019 at 3:57 PM
> Subject: OSD's not coming up in Nautilus
> To:
08 10:33:05
> cn1.chn8be1c1.cdn numactl[219218]: 2019-11-08 10:33:05.474 7f9ad14df700 -1
> osd.0 1795 set_numa_affinity unable to identify public interface 'dss-client'
> numa n...r directory
>
> Hint: Some lines were ellipsized, use -l to show in full.
>
>
>
>
850792e700 -1 osd.0 1795 set_numa_affinity unable to identify public
> interface 'dss-client' numa node: (2) No such file or directory
>
> On Fri, Nov 8, 2019 at 4:48 PM huang jun wrote:
>>
>> Is osd.0 still in the down state after the restart? If so, maybe the
>> pr
> fsid's are coming from still. Is this creating the problem? Because I am
> seeing that the OSDs in the fifth node are showing up in the ceph status
> whereas the other nodes' OSDs are showing down.
>
> On Fri, Nov 8, 2019 at 7:25 PM huang jun wrote:
>>
:31 wrote:
>
> Hi,
>
> Please find the ceph osd tree output in the pastebin
> https://pastebin.com/Gn93rE6w
>
> On Fri, Nov 8, 2019 at 7:58 PM huang jun wrote:
>>
>> can you post your 'ceph osd tree' in pastebin?
>> do you mean the osds report fsid m
f you want
> I will send you the logs of the mon once again by restarting the osd.0
>
> On Sun, Nov 10, 2019 at 10:17 AM huang jun wrote:
>>
>> The mon log shows that all the mismatched-fsid OSDs are from node 10.50.11.45;
>> maybe that's the fifth node?
>> BTW I don'
psized, use -l to show in full.
>
>
> # ceph tell mon.cn1 injectargs '--debug-mon 1/5'
> injectargs:
>
> cn1.chn8be1c1.cdn ~# ceph daemon /var/run/ceph/ceph-mon.cn1.asok config
> show|grep debug_mon
> "debug_mon": "1/5",
> "debug_monc"
What about the pool's backfill_full_ratio value?
Simone Lazzaris wrote on Mon, Dec 9, 2019 at 6:38 PM:
>
> Hi all;
>
> Long story short, I have a cluster of 26 OSDs in 3 nodes (8+9+9). One of the
> disks is showing some read errors, so I've added an OSD in the faulty node
> (OSD.26) and set the (re)weight of t
Hi all,
Are apt-mirror.sepia.ceph.com and gitbuilder.ceph.com down?
I can't ping them.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com