Re: [ceph-users] Pool Max Avail and Ceph Dashboard Pool Usage on Nautilus giving different percentages

2020-01-14 Thread ceph
Does anyone know if this also respects the nearfull values?

Thank you in advance
Mehmet

On 14 January 2020 15:20:39 CET, Stephan Mueller  wrote:
>Hi,
>I sent out this message on the 19th of December and somehow it didn't
>get into the list, and I just noticed it now. Sorry for the delay.
>I tried to resend it, but it just returned the same error that the mail
>was not deliverable to the ceph mailing list. I will send the message
>below as soon as it's finally possible, but for now this should help you
>out.
>
>Stephan
>
>--
>
>Hi,
>
>if "MAX AVAIL" displays the wrong data, the bug is just made more
>visible through the dashboard, as the calculation is correct.
>
>To get the right percentage you have to divide the used space through
>the total, and the total can only consist of two states used and not
>used space, so both states will be added together to get the total.
>
>Or in short:
>
>used / (avail + used)
>
>Just looked into the C++ code - Max avail will be calculated the
>following way:
>
>avail_res = avail / raw_used_rate (
>https://github.com/ceph/ceph/blob/nautilus/src/mon/PGMap.cc#L905)
>
>raw_used_rate *= (sum.num_object_copies - sum.num_objects_degraded) /
>sum.num_object_copies
>(https://github.com/ceph/ceph/blob/nautilus/src/mon/PGMap.cc#L892)
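>
>Or, as a small standalone sketch (untested; the numbers and variable names
>below are made up for illustration, and quota/compression handling is left
>out):
>
>#include <stdio.h>
>
>int main(void)
>{
>    /* example inputs, not taken from a real cluster */
>    double raw_avail            = 150.0; /* raw free space (TiB)          */
>    double pool_used            = 300.0; /* pool USED incl. all replicas  */
>    double raw_used_rate        = 3.0;   /* replicated pool, size 3       */
>    double num_object_copies    = 3.0;
>    double num_objects_degraded = 0.0;
>
>    /* PGMap.cc: scale the replication factor by the healthy fraction */
>    raw_used_rate *= (num_object_copies - num_objects_degraded)
>                     / num_object_copies;
>
>    double avail_res = raw_avail / raw_used_rate;          /* "MAX AVAIL" */
>
>    /* dashboard: used / (avail + used); note that pool_used counts all
>     * replicas while avail_res does not - that mismatch is what the rest
>     * of this mail is about */
>    double percent = pool_used / (avail_res + pool_used) * 100.0;
>
>    printf("MAX AVAIL %.1f TiB, pool usage %.1f%%\n", avail_res, percent);
>    return 0;
>}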
>
>
>On Tuesday, 17.12.2019, 07:07 +0100, c...@elchaka.de wrote:
>> I have observed this in the ceph nautilus dashboard too - and think
>> it is a display bug... but sometimes it shows the right values
>> 
>> 
>> Which nautilus version do you use?
>> 
>> 
>> On 10 December 2019 14:31:05 CET, "David Majchrzak, ODERLAND
>> Webbhotell AB" wrote:
>> > Hi!
>> > 
>> > While browsing /#/pool in nautilus ceph dashboard I noticed it said
>> > 93%
>> > used on the single pool we have (3x replica).
>> > 
>> > ceph df detail however shows 81% used on the pool and 67% raw
>> > usage.
>> > 
>> > # ceph df detail
>> > RAW STORAGE:
>> >     CLASS     SIZE        AVAIL       USED        RAW USED    %RAW USED
>> >     ssd       478 TiB     153 TiB     324 TiB     325 TiB     67.96
>> >     TOTAL     478 TiB     153 TiB     324 TiB     325 TiB     67.96
>> > 
>> > POOLS:
>> >     POOL    ID    STORED     OBJECTS    USED       %USED    MAX AVAIL    QUOTA OBJECTS    QUOTA BYTES    DIRTY     USED COMPR    UNDER COMPR
>> >     echo    3     108 TiB    29.49M     324 TiB    81.61    24 TiB       N/A              N/A            29.49M    0 B           0 B
>
>I manually calculated the used percentage to get "avail"; in your case
>it seems to be 73 TiB. That means the total space available for
>your pool would be 397 TiB.
>I'm not sure why that is, but it's what the math behind those
>calculations says.
>(Found a thread regarding that on the new mailing list (ceph-
>us...@ceph.io) -> 
>
>
>https://lists.ceph.io/hyperkitty/list/ceph-us...@ceph.io/thread/NH2LMMX5KVRWCURI3BARRUAETKE2T2QN/#JDHXOQKWF6NZLQMOGEPAQCLI44KB54A3
> )
>
>0.8161 = used (324) / total => total = 397
>
>Then I looked at the remaining calculations:
>
>raw_used_rate *= (sum.num_object_copies - sum.num_objects_degraded) /
>sum.num_object_copies
>
>and
>
>avail_res = avail / raw_used_rate 
>
>First I looked up the initial value of "raw_used_rate" for replicated
>pools. It's their size, so we can put in 3 here, and "avail_res" is
>24.
>
>So I calculated the final "raw_used_rate", which is 3.042. That
>means that you have around 4.2% degraded PGs in your pool.
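>
>Put differently, with the numbers above: 1 - 24 / (324 + 24) ≈ 0.93 is the
>93% the dashboard shows; 324 / 0.8161 ≈ 397 TiB is the total implied by the
>%USED column; 397 - 324 = 73 TiB is the "avail" I mentioned above; and
>73 / 24 ≈ 3.04 is the final raw_used_rate.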
>
>> > 
>> > 
>> > I know we're looking at the most full OSD (210PGs, 79% used, 1.17
>> > VAR)
>> > and count max avail from that. But where's the 93% full from in
>> > dashboard?
>
>As said above, the calculation is right but the data is wrong... It
>uses the real data that can still be put into the selected pool, but
>everywhere else it uses sizes that include all pool replicas.
>
>I created an issue to fix this https://tracker.ceph.com/issues/43384
>
>> > 
>> > My guess is that it comes from calculating: 
>> > 
>> > 1 - Max Avail / (Used + Max avail) = 0.93
>> > 
>> > 
>> > Kind Regards,
>> > 
>> > David Majchrzak
>> > 
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>Hope I could clarify some things and thanks for your feedback :)
>
>BTW this problem currently still exists, as there hasn't been any
>change to the mentioned lines since the nautilus release.
>
>Stephan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] One Mon out of Quorum

2020-01-12 Thread nokia ceph
Hi,

When installing Nautilus on a five node cluster, we tried to install one
node first and then the remaining four nodes. After that we saw that the
fifth node is out of quorum, and we found that the fsid was different on
the fifth node. We then copied the ceph.conf file from the four nodes to
the fifth node and restarted the ceph services, but we are still unable to
make the fifth node join the quorum.

# ceph -s
  cluster:
id: 92e8e879-041f-49fd-a26a-027814e0255b
health: HEALTH_WARN
1/5 mons down, quorum cn1,cn2,cn3,cn4

  services:
mon: 5 daemons, quorum cn1,cn2,cn3,cn4 (age 44m), out of quorum: cn5

But when we check the monmap we see that it is correct on all five
nodes.

# monmaptool --print /tmp/monmap
monmaptool: monmap file /tmp/monmap
epoch 2
fsid 92e8e879-041f-49fd-a26a-027814e0255b
last_changed 2020-01-13 05:47:12.846861
created 2020-01-10 16:19:21.340371
min_mon_release 14 (nautilus)
0: [v2:10.50.11.11:3300/0,v1:10.50.11.11:6789/0] mon.cn1
1: [v2:10.50.11.12:3300/0,v1:10.50.11.12:6789/0] mon.cn2
2: [v2:10.50.11.13:3300/0,v1:10.50.11.13:6789/0] mon.cn3
3: [v2:10.50.11.14:3300/0,v1:10.50.11.14:6789/0] mon.cn4
4: [v2:10.50.11.15:3300/0,v1:10.50.11.15:6789/0] mon.cn5

Regards,
Sridhar S
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd du command

2020-01-06 Thread ceph
Hi,

RBD images are thin provisioned; you need to trim at the upper level,
either via the fstrim command or the discard mount option (on Linux).

Unless you trim, the rbd layer does not know that data has been removed
and is thus no longer needed.
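
For example, inside the VM / on the client that has the filesystem mounted,
something like "fstrim -v /mountpoint", or mounting the filesystem with the
"discard" option, should release the freed blocks back to the cluster, after
which "rbd du" should shrink again. (Just an illustration - the exact command
depends on your filesystem and how the image is mapped.)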



On 1/6/20 10:30 AM, M Ranga Swami Reddy wrote:
> Hello,
> I ran the "rbd du /image" command. Its shows increasing, when I add
> data to the image. That looks good. But when I removed data from the image,
> its not showing the decreasing the size.
> 
> Is this expected with "rbd du" or its not implemented?
> 
> NOTE: Expected behavior is the same as " Linux du command"
> 
> Thanks
> Swami
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Pool Max Avail and Ceph Dashboard Pool Usage on Nautilus giving different percentages

2019-12-16 Thread ceph
I have observed this in the ceph nautilus dashboard too - and think it is a
display bug... but sometimes it shows the right values


Which nautilus version do you use?


On 10 December 2019 14:31:05 CET, "David Majchrzak, ODERLAND Webbhotell
AB" wrote:
>Hi!
>
>While browsing /#/pool in nautilus ceph dashboard I noticed it said 93%
>used on the single pool we have (3x replica).
>
>ceph df detail however shows 81% used on the pool and 67% raw usage.
>
># ceph df detail
>RAW STORAGE:
>    CLASS     SIZE        AVAIL       USED        RAW USED    %RAW USED
>    ssd       478 TiB     153 TiB     324 TiB     325 TiB     67.96
>    TOTAL     478 TiB     153 TiB     324 TiB     325 TiB     67.96
> 
>POOLS:
>    POOL    ID    STORED     OBJECTS    USED       %USED    MAX AVAIL    QUOTA OBJECTS    QUOTA BYTES    DIRTY     USED COMPR    UNDER COMPR
>    echo    3     108 TiB    29.49M     324 TiB    81.61    24 TiB       N/A              N/A            29.49M    0 B           0 B
>
>
>I know we're looking at the most full OSD (210PGs, 79% used, 1.17 VAR)
>and count max avail from that. But where's the 93% full from in
>dashboard?
>
>My guess is that it comes from calculating: 
>
>1 - Max Avail / (Used + Max avail) = 0.93
>
>
>Kind Regards,
>
>David Majchrzak
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rados_ioctx_selfmanaged_snap_set_write_ctx examples

2019-12-02 Thread nokia ceph
Hi Team,

We would like to create multiple snapshots inside the ceph cluster,
initiating the request from a librados client, and came across this rados
API: rados_ioctx_selfmanaged_snap_set_write_ctx

Can someone give us sample code on how to use this API?
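
For reference, the rough call sequence we are experimenting with looks like
the sketch below (untested, error handling omitted; the pool and object
names are only placeholders):

#include <rados/librados.h>
#include <string.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t io;
    rados_snap_t snap_id;

    /* connect as the default client using the local ceph.conf */
    rados_create(&cluster, NULL);
    rados_conf_read_file(cluster, "/etc/ceph/ceph.conf");
    rados_connect(cluster);
    rados_ioctx_create(cluster, "testpool", &io);   /* placeholder pool */

    /* allocate a new self-managed snapshot id from the cluster */
    rados_ioctx_selfmanaged_snap_create(io, &snap_id);

    /* tell librados which snapshots subsequent writes must preserve:
     * "seq" is the most recent snap id and "snaps" lists the snap ids
     * in descending order */
    rados_snap_t snaps[1] = { snap_id };
    rados_ioctx_selfmanaged_snap_set_write_ctx(io, snap_id, snaps, 1);

    /* writes issued after this point preserve the snapshot above */
    const char *data = "hello";
    rados_write_full(io, "testobject", data, strlen(data));

    rados_ioctx_destroy(io);
    rados_shutdown(cluster);
    return 0;
}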

Thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph osd's crashing repeatedly

2019-11-13 Thread nokia ceph
Hi,

We have upgraded a 5 node ceph cluster from Luminous to Nautilus and the
cluster was running fine. Yesterday, when we tried to add one more OSD to
the cluster, the OSD was created, but suddenly some of the other OSDs
started to crash and we are not able to restart any of the OSDs on that
particular node where we found this issue. Due to this we are not able to
add the OSDs on the other nodes and we are not able to bring up the
cluster.

The logs shown during the crash are below.


Nov 13 16:26:13 cn5 numactl: ceph version 14.2.2
(4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
Nov 13 16:26:13 cn5 numactl: 1: (()+0xf5d0) [0x7f488bb0f5d0]
Nov 13 16:26:13 cn5 numactl: 2: (gsignal()+0x37) [0x7f488a8ff207]
Nov 13 16:26:13 cn5 numactl: 3: (abort()+0x148) [0x7f488a9008f8]
Nov 13 16:26:13 cn5 numactl: 4: (ceph::__ceph_assert_fail(char const*, char
const*, int, char const*)+0x199) [0x5649f7348d43]
Nov 13 16:26:13 cn5 numactl: 5: (ceph::__ceph_assertf_fail(char const*,
char const*, int, char const*, char const*, ...)+0) [0x5649f7348ec2]
Nov 13 16:26:13 cn5 numactl: 6: (()+0x8e7e60) [0x5649f77c3e60]
Nov 13 16:26:13 cn5 numactl: 7:
(CallClientContexts::finish(std::pair&)+0x6b9) [0x5649f77d5bf9]
Nov 13 16:26:13 cn5 numactl: 8:
(ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x8c)
[0x5649f77ab02c]
Nov 13 16:26:13 cn5 numactl: 9:
(ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&,
RecoveryMessages*, ZTracer::Trace const&)+0xd57) [0x5649f77c5627]
Nov 13 16:26:13 cn5 numactl: 10:
(ECBackend::_handle_message(boost::intrusive_ptr)+0x9f)
[0x5649f77c60af]
Nov 13 16:26:13 cn5 numactl: 11:
(PGBackend::handle_message(boost::intrusive_ptr)+0x87)
[0x5649f76a3467]
Nov 13 16:26:13 cn5 numactl: 12:
(PrimaryLogPG::do_request(boost::intrusive_ptr&,
ThreadPool::TPHandle&)+0x695) [0x5649f764f365]
Nov 13 16:26:13 cn5 numactl: 13: (OSD::dequeue_op(boost::intrusive_ptr,
boost::intrusive_ptr, ThreadPool::TPHandle&)+0x1a9)
[0x5649f7489ea9]
Nov 13 16:26:13 cn5 numactl: 14: (PGOpItem::run(OSD*, OSDShard*,
boost::intrusive_ptr&, ThreadPool::TPHandle&)+0x62) [0x5649f77275d2]
Nov 13 16:26:13 cn5 numactl: 15: (OSD::ShardedOpWQ::_process(unsigned int,
ceph::heartbeat_handle_d*)+0x9f4) [0x5649f74a6ef4]
Nov 13 16:26:13 cn5 numactl: 16:
(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x433)
[0x5649f7aa5ce3]
Nov 13 16:26:13 cn5 numactl: 17:
(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5649f7aa8d80]
Nov 13 16:26:13 cn5 numactl: 18: (()+0x7dd5) [0x7f488bb07dd5]
Nov 13 16:26:13 cn5 numactl: 19: (clone()+0x6d) [0x7f488a9c6ead]
Nov 13 16:26:13 cn5 numactl: NOTE: a copy of the executable, or `objdump
-rdS ` is needed to interpret this.
Nov 13 16:26:13 cn5 systemd: ceph-osd@279.service: main process exited,
code=killed, status=6/ABRT


Could you please let us know what might be the issue and how to debug this?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Osd operation slow

2019-11-12 Thread nokia ceph
Hi Team,

In one of our ceph clusters we observe that there are many slow IO
operations on all our OSDs, and most of the latency is happening between
the two events shown below.

{
"time": "2019-11-12 08:29:58.128669",
"event": "sub_op_committed"
},
{
"time": "2019-11-12 08:30:08.484235",
"event": "commit_sent"

Is there any way to know what is causing this issue and to rectify the
problem? As you can see in the above output of historical_slow_ops, there is
a 10 second delay between the two events, which is impacting the
performance. I am attaching the historic slow ops output of one of the OSDs
to this mail for reference.
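
(In the snippet above that is 08:29:58.128669 -> 08:30:08.484235, roughly
10.4 seconds; the first op in the attached dump below shows
08:29:50.419974 -> 08:30:02.568999 between the same two events, roughly
12.1 seconds, which matches its reported duration of 12.16 seconds.)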
{
"num to keep": 20,
"threshold to keep": 10,
"Ops": [
{
"description": "osd_op(client.71027356.0:194654 1.7s0 
1:e005f8bb:::%2fc20%2fvx039%2fpot%2fsridhar_106%2fhls%2fh_63538bf124417ce82e34317a6c95097a%2fvar2493000%2fseg20271245_w1573279521.ts:head
 [delete] snapc 0=[] ondisk+write+peerstat_old+known_if_redirected e30337)",
"initiated_at": "2019-11-12 08:29:50.410364",
"age": 77561.712867374998,
"duration": 12.158651442,
"type_data": {
"flag_point": "commit sent; apply or cleanup",
"client_info": {
"client": "client.71027356",
"client_addr": "10.50.62.163:0/478395830",
"tid": 194654
},
"events": [
{
"time": "2019-11-12 08:29:50.410364",
"event": "initiated"
},
{
"time": "2019-11-12 08:29:50.410364",
"event": "header_read"
},
{
"time": "2019-11-12 08:29:50.410361",
"event": "throttled"
},
{
"time": "2019-11-12 08:29:50.410368",
"event": "all_read"
},
{
"time": "2019-11-12 08:29:50.410369",
"event": "dispatched"
},
{
"time": "2019-11-12 08:29:50.410374",
"event": "queued_for_pg"
},
{
"time": "2019-11-12 08:29:50.410392",
"event": "reached_pg"
},
{
"time": "2019-11-12 08:29:50.410562",
"event": "started"
},
{
"time": "2019-11-12 08:29:50.410684",
"event": "sub_op_started"
},
{
"time": "2019-11-12 08:29:50.419974",
"event": "sub_op_committed"
},
{
"time": "2019-11-12 08:30:02.568999",
"event": "commit_sent"
},
{
"time": "2019-11-12 08:30:02.569015",
"event": "done"
}
]
}
},
{
"description": "osd_op(client.71027356.0:195247 1.7s0 
1:e0013cdb:::%2fc20%2fvx039%2fpot%2fsridhar_84%2fhls%2fh_598d9980480259996ca22bd53af09f02%2fvar37%2fseg20315902_w1573547390.ts:head
 [writefull 0~220148,setxattr mode (7),setxattr uid (5),setxattr gid 
(5),setxattr size (7),setxattr mtime (11),setxattr xattr (2),setxattr 
meal_flags (3),setxattr expire (7),setxattr md5 (33)] snapc 0=[] 
ondisk+write+known_if_redirected e30337)",
"initiated_at": "2019-11-12 08:29:56.741549",
"age": 77555.381682584004,
"duration": 11.74263874601,
"type_data": {
"flag_point": "commit 

Re: [ceph-users] Fwd: OSD's not coming up in Nautilus

2019-11-10 Thread nokia ceph
Please find the below output.

cn1.chn8be1c1.cdn ~# ceph osd metadata 0
{
"id": 0,
"arch": "x86_64",
"back_addr": "[v2:10.50.12.41:6883/12650,v1:10.50.12.41:6887/12650]",
"back_iface": "dss-private",
"bluefs": "1",
"bluefs_single_shared_device": "1",
"bluestore_bdev_access_mode": "blk",
"bluestore_bdev_block_size": "4096",
"bluestore_bdev_dev_node": "/dev/dm-23",
"bluestore_bdev_driver": "KernelDevice",
"bluestore_bdev_partition_path": "/dev/dm-23",
    "bluestore_bdev_rotational": "1",
"bluestore_bdev_size": "4000749453312",
"bluestore_bdev_support_discard": "0",
"bluestore_bdev_type": "hdd",
"ceph_release": "nautilus",
"ceph_version": "ceph version 14.2.2
(4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)",
"ceph_version_short": "14.2.2",
"cpu": "Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz",
"default_device_class": "hdd",
"device_ids": "sdb=HP_LOGICAL_VOLUME_PDNLL0CRH4208F",
"devices": "sdb",
"distro": "centos",
"distro_description": "CentOS Linux 7 (Core)",
"distro_version": "7",
"front_addr": "[v2:10.50.11.41:6882/12650,v1:10.50.11.41:6887/12650]",
"front_iface": "dss-client",
"hb_back_addr": "[v2:10.50.12.41:6888/12650,v1:10.50.12.41:6890/12650]",
"hb_front_addr": "[v2:10.50.11.41:6889/12650,v1:10.50.11.41:6890/12650
]",
"hostname": "cn1.chn8be1c1.cdn",
"journal_rotational": "1",
"kernel_description": "#1 SMP Thu Nov 8 23:39:32 UTC 2018",
"kernel_version": "3.10.0-957.el7.x86_64",
"mem_swap_kb": "0",
"mem_total_kb": "272036636",
"network_numa_unknown_ifaces": "dss-client,dss-private",
"objectstore_numa_unknown_devices": "sdb",
"os": "Linux",
"osd_data": "/var/lib/ceph/osd/ceph-0",
"osd_objectstore": "bluestore",
"rotational": "1"
}
cn1.chn8be1c1.cdn ~# cat /var/lib/ceph/osd/ceph-0/fsid
a1ea2ea3-984d-4c91-86cf-29f452f5a952

On Sun, Nov 10, 2019 at 12:54 PM huang jun  wrote:

> The same problem:
> 2019-11-10 05:26:33.215 7fbfafeef700  7 mon.cn1@0(leader).osd e1819
> preprocess_boot from osd.0 v2:10.50.11.41:6814/2022032 clashes with
> existing osd: different fsid (ours:
> ccfdbd54-fcd2-467f-ab7b-c152b7e422fb ; theirs: a1ea2ea3-984d
> -4c91-86cf-29f452f5a952)
> maybe the osd uuid is wrong.
> what the output of 'ceph osd metadata 0' and 'cat
> /var/lib/ceph/osd/ceph-0/fsid'?
>
> nokia ceph  wrote on Sunday, 10 November 2019 at 2:47 PM:
> >
> > Hi,
> >
> > yes still the cluster unrecovered. Not able to even up the osd.0 yet.
> >
> > osd logs: https://pastebin.com/4WrpgrH5
> >
> > Mon logs:
> https://drive.google.com/open?id=1_HqK2d52Cgaps203WnZ0mCfvxdcjcBoE
> >
> > # ceph daemon /var/run/ceph/ceph-mon.cn1.asok config show|grep debug_mon
> > "debug_mon": "20/20",
> > "debug_monc": "0/0",
> >
> >
> > # date; systemctl restart ceph-osd@0.service;date
> > Sun Nov 10 05:25:54 UTC 2019
> > Sun Nov 10 05:25:55 UTC 2019
> >
> >
> > cn1.chn8be1c1.cdn ~# systemctl status ceph-osd@0.service
> > ● ceph-osd@0.service - Ceph object storage daemon osd.0
> >Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service;
> enabled-runtime; vendor preset: disabled)
> >   Drop-In: /etc/systemd/system/ceph-osd@.service.d
> >└─90-ExecStart_NUMA.conf
> >Active: active (running) since Sun 2019-11-10 05:25:55 UTC; 8s ago
> >   Process: 2022026 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh
> --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
> >  Main PID: 2022032 (ceph-osd)
> >CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
> >└─2022032 /usr/bin/ceph-osd -f --cluster ceph --id 0
> --setuser ceph --setgroup ceph
> >
> > Nov 10 05:25:55 cn1.chn8be1c1.cdn systemd[1]: Starting Ceph object
> storage daemon osd.0...
> > Nov 10 05:25:55 cn1.chn8be1c1.cdn systemd[1]: Started Ceph object
> storage daemon osd.0.
> > Nov 10 05:26:03 cn1.chn8be1c1.cd

Re: [ceph-users] Fwd: OSD's not coming up in Nautilus

2019-11-09 Thread nokia ceph
Hi,

Yes, the cluster is still unrecovered. We are not able to even bring up osd.0 yet.

osd logs: https://pastebin.com/4WrpgrH5

Mon logs: https://drive.google.com/open?id=1_HqK2d52Cgaps203WnZ0mCfvxdcjcBoE

# ceph daemon /var/run/ceph/ceph-mon.cn1.asok config show|grep debug_mon
"debug_mon": "20/20",
"debug_monc": "0/0",


# date; systemctl restart ceph-osd@0.service;date
Sun Nov 10 05:25:54 UTC 2019
Sun Nov 10 05:25:55 UTC 2019


cn1.chn8be1c1.cdn ~# systemctl status ceph-osd@0.service
● ceph-osd@0.service - Ceph object storage daemon osd.0
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service;
enabled-runtime; vendor preset: disabled)
  Drop-In: /etc/systemd/system/ceph-osd@.service.d
   └─90-ExecStart_NUMA.conf
   Active: active (running) since Sun 2019-11-10 05:25:55 UTC; 8s ago
  Process: 2022026 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh
--cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 2022032 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
   └─2022032 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser
ceph --setgroup ceph

Nov 10 05:25:55 cn1.chn8be1c1.cdn systemd[1]: Starting Ceph object storage
daemon osd.0...
Nov 10 05:25:55 cn1.chn8be1c1.cdn systemd[1]: Started Ceph object storage
daemon osd.0.
Nov 10 05:26:03 cn1.chn8be1c1.cdn numactl[2022032]: 2019-11-10 05:26:03.131
7fbef7bb5d80 -1 osd.0 1795 log_to_monitors {default=true}
Nov 10 05:26:03 cn1.chn8be1c1.cdn numactl[2022032]: 2019-11-10 05:26:03.372
7fbeea1c0700 -1 osd.0 1795 set_numa_affinity unable to identify public
interface 'dss-client' numa node: (2) No such file or directory
Hint: Some lines were ellipsized, use -l to show in full.


# ceph tell mon.cn1 injectargs '--debug-mon 1/5'
injectargs:

cn1.chn8be1c1.cdn ~# ceph daemon /var/run/ceph/ceph-mon.cn1.asok config
show|grep debug_mon
"debug_mon": "1/5",
"debug_monc": "0/0",




On Sun, Nov 10, 2019 at 11:05 AM huang jun  wrote:

> good, please send me the mon and osd.0 log.
> the cluster still un-recovered?
>
> nokia ceph  wrote on Sunday, 10 November 2019 at 1:24 PM:
> >
> > Hi Huang,
> >
> > Yes the node 10.50.10.45 is the fifth node which is replaced. Yes I have
> set the debug_mon to 20 and still it is running with that value only. If
> you want I will send you the logs of the mon once again by restarting the
> osd.0
> >
> > On Sun, Nov 10, 2019 at 10:17 AM huang jun  wrote:
> >>
> >> The mon log shows that the all mismatch fsid osds are from node
> 10.50.11.45,
> >> maybe that the fith node?
> >> BTW i don't found the osd.0 boot message in ceph-mon.log
> >> do you set debug_mon=20 first and then restart osd.0 process, and make
> >> sure the osd.0 is restarted.
> >>
> >>
> >> nokia ceph  wrote on Sunday, 10 November 2019 at 12:31 PM:
> >>
> >> >
> >> > Hi,
> >> >
> >> > Please find the ceph osd tree output in the pastebin
> https://pastebin.com/Gn93rE6w
> >> >
> >> > On Fri, Nov 8, 2019 at 7:58 PM huang jun  wrote:
> >> >>
> >> >> can you post your 'ceph osd tree' in pastebin?
> >> >> do you mean the osds report fsid mismatch is from old removed nodes?
> >> >>
> >> >> nokia ceph  wrote on Friday, 8 November 2019 at 10:21 PM:
> >> >> >
> >> >> > Hi,
> >> >> >
> >> >> > The fifth node in the cluster was affected by hardware failure and
> hence the node was replaced in the ceph cluster. But we were not able to
> replace it properly and hence we uninstalled the ceph in all the nodes,
> deleted the pools and also zapped the osd's and recreated them as new ceph
> cluster. But not sure where from the reference for the old fifth
> nodes(failed nodes) osd's fsid's are coming from still. Is this creating
> the problem. Because I am seeing that the OSD's in the fifth node are
> showing up in the ceph status whereas the other nodes osd's are showing
> down.
> >> >> >
> >> >> > On Fri, Nov 8, 2019 at 7:25 PM huang jun 
> wrote:
> >> >> >>
> >> >> >> I saw many lines like that
> >> >> >>
> >> >> >> mon.cn1@0(leader).osd e1805 preprocess_boot from osd.112
> >> >> >> v2:10.50.11.45:6822/158344 clashes with existing osd: different
> fsid
> >> >> >> (ours: 85908622-31bd-4728-9be3-f1f6ca44ed98 ; theirs:
> >> >> >> 127fdc44-c17e-42ee-bcd4-d577c0ef4479)
> >> >> >> the osd boot will be ignored if the fsid mismatch
> >> >> >> what do you do before this happen?
> &g

Re: [ceph-users] Fwd: OSD's not coming up in Nautilus

2019-11-09 Thread nokia ceph
Hi Huang,

Yes, the node 10.50.10.45 is the fifth node which was replaced. Yes, I have
set debug_mon to 20 and it is still running with that value. If you want, I
will send you the mon logs once again after restarting osd.0.

On Sun, Nov 10, 2019 at 10:17 AM huang jun  wrote:

> The mon log shows that the all mismatch fsid osds are from node
> 10.50.11.45,
> maybe that the fith node?
> BTW i don't found the osd.0 boot message in ceph-mon.log
> do you set debug_mon=20 first and then restart osd.0 process, and make
> sure the osd.0 is restarted.
>
>
> nokia ceph  wrote on Sunday, 10 November 2019 at 12:31 PM:
>
> >
> > Hi,
> >
> > Please find the ceph osd tree output in the pastebin
> https://pastebin.com/Gn93rE6w
> >
> > On Fri, Nov 8, 2019 at 7:58 PM huang jun  wrote:
> >>
> >> can you post your 'ceph osd tree' in pastebin?
> >> do you mean the osds report fsid mismatch is from old removed nodes?
> >>
> >> nokia ceph  wrote on Friday, 8 November 2019 at 10:21 PM:
> >> >
> >> > Hi,
> >> >
> >> > The fifth node in the cluster was affected by hardware failure and
> hence the node was replaced in the ceph cluster. But we were not able to
> replace it properly and hence we uninstalled the ceph in all the nodes,
> deleted the pools and also zapped the osd's and recreated them as new ceph
> cluster. But not sure where from the reference for the old fifth
> nodes(failed nodes) osd's fsid's are coming from still. Is this creating
> the problem. Because I am seeing that the OSD's in the fifth node are
> showing up in the ceph status whereas the other nodes osd's are showing
> down.
> >> >
> >> > On Fri, Nov 8, 2019 at 7:25 PM huang jun  wrote:
> >> >>
> >> >> I saw many lines like that
> >> >>
> >> >> mon.cn1@0(leader).osd e1805 preprocess_boot from osd.112
> >> >> v2:10.50.11.45:6822/158344 clashes with existing osd: different fsid
> >> >> (ours: 85908622-31bd-4728-9be3-f1f6ca44ed98 ; theirs:
> >> >> 127fdc44-c17e-42ee-bcd4-d577c0ef4479)
> >> >> the osd boot will be ignored if the fsid mismatch
> >> >> what do you do before this happen?
> >> >>
> >> >> nokia ceph  wrote on Friday, 8 November 2019 at 8:29 PM:
> >> >> >
> >> >> > Hi,
> >> >> >
> >> >> > Please find the osd.0 which is restarted after the debug_mon is
> increased to 20.
> >> >> >
> >> >> > cn1.chn8be1c1.cdn ~# date;systemctl restart ceph-osd@0.service
> >> >> > Fri Nov  8 12:25:05 UTC 2019
> >> >> >
> >> >> > cn1.chn8be1c1.cdn ~# systemctl status ceph-osd@0.service -l
> >> >> > ● ceph-osd@0.service - Ceph object storage daemon osd.0
> >> >> >Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service;
> enabled-runtime; vendor preset: disabled)
> >> >> >   Drop-In: /etc/systemd/system/ceph-osd@.service.d
> >> >> >└─90-ExecStart_NUMA.conf
> >> >> >Active: active (running) since Fri 2019-11-08 12:25:06 UTC; 29s
> ago
> >> >> >   Process: 298505 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh
> --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
> >> >> >  Main PID: 298512 (ceph-osd)
> >> >> >CGroup:
> /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
> >> >> >└─298512 /usr/bin/ceph-osd -f --cluster ceph --id 0
> --setuser ceph --setgroup ceph
> >> >> >
> >> >> > Nov 08 12:25:06 cn1.chn8be1c1.cdn systemd[1]: Starting Ceph object
> storage daemon osd.0...
> >> >> > Nov 08 12:25:06 cn1.chn8be1c1.cdn systemd[1]: Started Ceph object
> storage daemon osd.0.
> >> >> > Nov 08 12:25:11 cn1.chn8be1c1.cdn numactl[298512]: 2019-11-08
> 12:25:11.538 7f8515323d80 -1 osd.0 1795 log_to_monitors {default=true}
> >> >> > Nov 08 12:25:11 cn1.chn8be1c1.cdn numactl[298512]: 2019-11-08
> 12:25:11.689 7f850792e700 -1 osd.0 1795 set_numa_affinity unable to
> identify public interface 'dss-client' numa node: (2) No such file or
> directory
> >> >> >
> >> >> > On Fri, Nov 8, 2019 at 4:48 PM huang jun 
> wrote:
> >> >> >>
> >> >> >> the osd.0 is still in down state after restart? if so, maybe the
> >> >> >> problem is in mon,
> >> >> >> can you set the leader mon's debug_mon=20 and restart one of the
> down
> >> >> >> 

Re: [ceph-users] Fwd: OSD's not coming up in Nautilus

2019-11-09 Thread nokia ceph
Hi,

Please find the ceph osd tree output in the pastebin
https://pastebin.com/Gn93rE6w

On Fri, Nov 8, 2019 at 7:58 PM huang jun  wrote:

> can you post your 'ceph osd tree' in pastebin?
> do you mean the osds report fsid mismatch is from old removed nodes?
>
> nokia ceph  wrote on Friday, 8 November 2019 at 10:21 PM:
> >
> > Hi,
> >
> > The fifth node in the cluster was affected by hardware failure and hence
> the node was replaced in the ceph cluster. But we were not able to replace
> it properly and hence we uninstalled the ceph in all the nodes, deleted the
> pools and also zapped the osd's and recreated them as new ceph cluster. But
> not sure where from the reference for the old fifth nodes(failed nodes)
> osd's fsid's are coming from still. Is this creating the problem. Because I
> am seeing that the OSD's in the fifth node are showing up in the ceph
> status whereas the other nodes osd's are showing down.
> >
> > On Fri, Nov 8, 2019 at 7:25 PM huang jun  wrote:
> >>
> >> I saw many lines like that
> >>
> >> mon.cn1@0(leader).osd e1805 preprocess_boot from osd.112
> >> v2:10.50.11.45:6822/158344 clashes with existing osd: different fsid
> >> (ours: 85908622-31bd-4728-9be3-f1f6ca44ed98 ; theirs:
> >> 127fdc44-c17e-42ee-bcd4-d577c0ef4479)
> >> the osd boot will be ignored if the fsid mismatch
> >> what do you do before this happen?
> >>
> >> nokia ceph  wrote on Friday, 8 November 2019 at 8:29 PM:
> >> >
> >> > Hi,
> >> >
> >> > Please find the osd.0 which is restarted after the debug_mon is
> increased to 20.
> >> >
> >> > cn1.chn8be1c1.cdn ~# date;systemctl restart ceph-osd@0.service
> >> > Fri Nov  8 12:25:05 UTC 2019
> >> >
> >> > cn1.chn8be1c1.cdn ~# systemctl status ceph-osd@0.service -l
> >> > ● ceph-osd@0.service - Ceph object storage daemon osd.0
> >> >Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service;
> enabled-runtime; vendor preset: disabled)
> >> >   Drop-In: /etc/systemd/system/ceph-osd@.service.d
> >> >└─90-ExecStart_NUMA.conf
> >> >Active: active (running) since Fri 2019-11-08 12:25:06 UTC; 29s ago
> >> >   Process: 298505 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh
> --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
> >> >  Main PID: 298512 (ceph-osd)
> >> >CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
> >> >└─298512 /usr/bin/ceph-osd -f --cluster ceph --id 0
> --setuser ceph --setgroup ceph
> >> >
> >> > Nov 08 12:25:06 cn1.chn8be1c1.cdn systemd[1]: Starting Ceph object
> storage daemon osd.0...
> >> > Nov 08 12:25:06 cn1.chn8be1c1.cdn systemd[1]: Started Ceph object
> storage daemon osd.0.
> >> > Nov 08 12:25:11 cn1.chn8be1c1.cdn numactl[298512]: 2019-11-08
> 12:25:11.538 7f8515323d80 -1 osd.0 1795 log_to_monitors {default=true}
> >> > Nov 08 12:25:11 cn1.chn8be1c1.cdn numactl[298512]: 2019-11-08
> 12:25:11.689 7f850792e700 -1 osd.0 1795 set_numa_affinity unable to
> identify public interface 'dss-client' numa node: (2) No such file or
> directory
> >> >
> >> > On Fri, Nov 8, 2019 at 4:48 PM huang jun  wrote:
> >> >>
> >> >> the osd.0 is still in down state after restart? if so, maybe the
> >> >> problem is in mon,
> >> >> can you set the leader mon's debug_mon=20 and restart one of the down
> >> >> state osd.
> >> >> and then attach the mon log file.
> >> >>
> >> >> nokia ceph  wrote on Friday, 8 November 2019 at 6:38 PM:
> >> >> >
> >> >> > Hi,
> >> >> >
> >> >> >
> >> >> >
> >> >> > Below is the status of the OSD after restart.
> >> >> >
> >> >> >
> >> >> >
> >> >> > # systemctl status ceph-osd@0.service
> >> >> >
> >> >> > ● ceph-osd@0.service - Ceph object storage daemon osd.0
> >> >> >
> >> >> >Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service;
> enabled-runtime; vendor preset: disabled)
> >> >> >
> >> >> >   Drop-In: /etc/systemd/system/ceph-osd@.service.d
> >> >> >
> >> >> >└─90-ExecStart_NUMA.conf
> >> >> >
> >> >> >Active: active (running) since Fri 2019-11-08 10:32:51 UTC;
> 1min 1s ago
> >> >> >
> >> >> >   Process: 219213

Re: [ceph-users] Fwd: OSD's not coming up in Nautilus

2019-11-08 Thread nokia ceph
Hi,

The fifth node in the cluster was affected by a hardware failure and hence
the node was replaced in the ceph cluster. But we were not able to replace
it properly, so we uninstalled ceph on all the nodes, deleted the pools,
zapped the OSDs and recreated them as a new ceph cluster. We are not sure
where the reference to the old (failed) fifth node's OSD fsids is still
coming from. Is this creating the problem? I am seeing that the OSDs on the
fifth node are showing up in the ceph status, whereas the other nodes' OSDs
are showing down.

On Fri, Nov 8, 2019 at 7:25 PM huang jun  wrote:

> I saw many lines like that
>
> mon.cn1@0(leader).osd e1805 preprocess_boot from osd.112
> v2:10.50.11.45:6822/158344 clashes with existing osd: different fsid
> (ours: 85908622-31bd-4728-9be3-f1f6ca44ed98 ; theirs:
> 127fdc44-c17e-42ee-bcd4-d577c0ef4479)
> the osd boot will be ignored if the fsid mismatch
> what do you do before this happen?
>
> nokia ceph  wrote on Friday, 8 November 2019 at 8:29 PM:
> >
> > Hi,
> >
> > Please find the osd.0 which is restarted after the debug_mon is
> increased to 20.
> >
> > cn1.chn8be1c1.cdn ~# date;systemctl restart ceph-osd@0.service
> > Fri Nov  8 12:25:05 UTC 2019
> >
> > cn1.chn8be1c1.cdn ~# systemctl status ceph-osd@0.service -l
> > ● ceph-osd@0.service - Ceph object storage daemon osd.0
> >Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service;
> enabled-runtime; vendor preset: disabled)
> >   Drop-In: /etc/systemd/system/ceph-osd@.service.d
> >└─90-ExecStart_NUMA.conf
> >    Active: active (running) since Fri 2019-11-08 12:25:06 UTC; 29s ago
> >   Process: 298505 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh
> --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
> >  Main PID: 298512 (ceph-osd)
> >CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
> >└─298512 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser
> ceph --setgroup ceph
> >
> > Nov 08 12:25:06 cn1.chn8be1c1.cdn systemd[1]: Starting Ceph object
> storage daemon osd.0...
> > Nov 08 12:25:06 cn1.chn8be1c1.cdn systemd[1]: Started Ceph object
> storage daemon osd.0.
> > Nov 08 12:25:11 cn1.chn8be1c1.cdn numactl[298512]: 2019-11-08
> 12:25:11.538 7f8515323d80 -1 osd.0 1795 log_to_monitors {default=true}
> > Nov 08 12:25:11 cn1.chn8be1c1.cdn numactl[298512]: 2019-11-08
> 12:25:11.689 7f850792e700 -1 osd.0 1795 set_numa_affinity unable to
> identify public interface 'dss-client' numa node: (2) No such file or
> directory
> >
> > On Fri, Nov 8, 2019 at 4:48 PM huang jun  wrote:
> >>
> >> the osd.0 is still in down state after restart? if so, maybe the
> >> problem is in mon,
> >> can you set the leader mon's debug_mon=20 and restart one of the down
> >> state osd.
> >> and then attach the mon log file.
> >>
> >> nokia ceph  wrote on Friday, 8 November 2019 at 6:38 PM:
> >> >
> >> > Hi,
> >> >
> >> >
> >> >
> >> > Below is the status of the OSD after restart.
> >> >
> >> >
> >> >
> >> > # systemctl status ceph-osd@0.service
> >> >
> >> > ● ceph-osd@0.service - Ceph object storage daemon osd.0
> >> >
> >> >Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service;
> enabled-runtime; vendor preset: disabled)
> >> >
> >> >   Drop-In: /etc/systemd/system/ceph-osd@.service.d
> >> >
> >> >└─90-ExecStart_NUMA.conf
> >> >
> >> >    Active: active (running) since Fri 2019-11-08 10:32:51 UTC; 1min
> 1s ago
> >> >
> >> >   Process: 219213 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh
> --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)  Main PID:
> 219218 (ceph-osd)
> >> >
> >> >CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
> >> >
> >> >└─219218 /usr/bin/ceph-osd -f --cluster ceph --id 0
> --setuser ceph --setgroup ceph
> >> >
> >> >
> >> >
> >> > Nov 08 10:32:51 cn1.chn8be1c1.cdn systemd[1]: Starting Ceph object
> storage daemon osd.0...
> >> >
> >> > Nov 08 10:32:51 cn1.chn8be1c1.cdn systemd[1]: Started Ceph object
> storage daemon osd.0.
> >> >
> >> > Nov 08 10:33:03 cn1.chn8be1c1.cdn numactl[219218]: 2019-11-08
> 10:33:03.785 7f9adeed4d80 -1 osd.0 1795 log_to_monitors {default=true} Nov
> 08 10:33:05 cn1.chn8be1c1.cdn numactl[219218]: 2019-11-08 10:33:05.474
> 7f9ad14df700 -1 osd.0 1795 set_numa_affinity u

Re: [ceph-users] Fwd: OSD's not coming up in Nautilus

2019-11-08 Thread nokia ceph
Hi,



Below is the status of the OSD after restart.



# systemctl status ceph-osd@0.service

● ceph-osd@0.service - Ceph object storage daemon osd.0

   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service;
enabled-runtime; vendor preset: disabled)

  Drop-In: /etc/systemd/system/ceph-osd@.service.d

   └─90-ExecStart_NUMA.conf

   Active: active (running) since Fri 2019-11-08 10:32:51 UTC; 1min 1s ago

  Process: 219213 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster
${CLUSTER} --id %i (code=exited, status=0/SUCCESS)  Main PID: 219218
(ceph-osd)

   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service

   └─219218 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser
ceph --setgroup ceph



Nov 08 10:32:51 cn1.chn8be1c1.cdn systemd[1]: Starting Ceph object storage
daemon osd.0...

Nov 08 10:32:51 cn1.chn8be1c1.cdn systemd[1]: Started Ceph object storage
daemon osd.0.

Nov 08 10:33:03 cn1.chn8be1c1.cdn numactl[219218]: 2019-11-08 10:33:03.785
7f9adeed4d80 -1 osd.0 1795 log_to_monitors {default=true} Nov 08 10:33:05
cn1.chn8be1c1.cdn numactl[219218]: 2019-11-08 10:33:05.474 7f9ad14df700 -1
osd.0 1795 set_numa_affinity unable to identify public interface
'dss-client' numa n...r directory

Hint: Some lines were ellipsized, use -l to show in full.





And I have attached to this mail a file with the logs collected while this
restart was initiated.



On Fri, Nov 8, 2019 at 3:59 PM huang jun  wrote:

> try to restart some of the down osds in 'ceph osd tree', and to see
> what happened?
>
> nokia ceph  wrote on Friday, 8 November 2019 at 6:24 PM:
> >
> > Adding my official mail id
> >
> > -- Forwarded message -
> > From: nokia ceph 
> > Date: Fri, Nov 8, 2019 at 3:57 PM
> > Subject: OSD's not coming up in Nautilus
> > To: Ceph Users 
> >
> >
> > Hi Team,
> >
> > There is one 5 node ceph cluster which we have upgraded from Luminous to
> Nautilus and everything was going well until yesterday when we noticed that
> the ceph osd's are marked down and not recognized by the monitors as
> running eventhough the osd processes are running.
> >
> > We noticed that the admin.keyring and the mon.keyring are missing in the
> nodes which we have recreated it with the below commands.
> >
> > ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring
> --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds
> allow
> >
> > ceph-authtool --create_keyring /etc/ceph/ceph.mon.keyring --gen-key -n
> mon. --cap mon 'allow *'
> >
> > In logs we find the below lines.
> >
> > 2019-11-08 09:01:50.525 7ff61722b700  0 log_channel(audit) log [DBG] :
> from='client.? 10.50.11.44:0/2398064782' entity='client.admin'
> cmd=[{"prefix": "df", "format": "json"}]: dispatch
> > 2019-11-08 09:02:37.686 7ff61722b700  0 log_channel(cluster) log [INF] :
> mon.cn1 calling monitor election
> > 2019-11-08 09:02:37.686 7ff61722b700  1 mon.cn1@0(electing).elector(31157)
> init, last seen epoch 31157, mid-election, bumping
> > 2019-11-08 09:02:37.688 7ff61722b700 -1 mon.cn1@0(electing) e3 failed
> to get devid for : udev_device_new_from_subsystem_sysname failed on ''
> > 2019-11-08 09:02:37.770 7ff61722b700  0 log_channel(cluster) log [INF] :
> mon.cn1 is new leader, mons cn1,cn2,cn3,cn4,cn5 in quorum (ranks 0,1,2,3,4)
> > 2019-11-08 09:02:37.857 7ff613a24700  0 log_channel(cluster) log [DBG] :
> monmap e3: 5 mons at {cn1=[v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0
> ],cn2=[v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0],cn3=[v2:
> 10.50.11.43:3300/0,v1:10.50.11.43:6789/0],cn4=[v2:
> 10.50.11.44:3300/0,v1:10.50.11.44:6789/0],cn5=[v2:
> 10.50.11.45:3300/0,v1:10.50.11.45:6789/0]}
> >
> >
> >
> > # ceph mon dump
> > dumped monmap epoch 3
> > epoch 3
> > fsid 9dbf207a-561c-48ba-892d-3e79b86be12f
> > last_changed 2019-09-03 07:53:39.031174
> > created 2019-08-23 18:30:55.970279
> > min_mon_release 14 (nautilus)
> > 0: [v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0] mon.cn1
> > 1: [v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0] mon.cn2
> > 2: [v2:10.50.11.43:3300/0,v1:10.50.11.43:6789/0] mon.cn3
> > 3: [v2:10.50.11.44:3300/0,v1:10.50.11.44:6789/0] mon.cn4
> > 4: [v2:10.50.11.45:3300/0,v1:10.50.11.45:6789/0] mon.cn5
> >
> >
> > # ceph -s
> >   cluster:
> > id: 9dbf207a-561c-48ba-892d-3e79b86be12f
> > health: HEALTH_WARN
> > 85 osds down
> > 3 hosts (72 osds) down
> > 1 nearfull osd(s)
> > 1 pool(s) nearfull
> > Reduced data availability: 2048 pgs inactive
> > too few PGs per OSD (17 < min 30)

[ceph-users] Fwd: OSD's not coming up in Nautilus

2019-11-08 Thread nokia ceph
Adding my official mail id

-- Forwarded message -
From: nokia ceph 
Date: Fri, Nov 8, 2019 at 3:57 PM
Subject: OSD's not coming up in Nautilus
To: Ceph Users 


Hi Team,

There is one 5 node ceph cluster which we have upgraded from Luminous to
Nautilus, and everything was going well until yesterday, when we noticed
that the ceph OSDs are marked down and not recognized by the monitors as
running even though the OSD processes are running.

We noticed that the admin.keyring and the mon.keyring are missing on the
nodes; we have recreated them with the commands below.

ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring
--gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds
allow

ceph-authtool --create_keyring /etc/ceph/ceph.mon.keyring --gen-key -n mon.
--cap mon 'allow *'

In logs we find the below lines.

2019-11-08 09:01:50.525 7ff61722b700  0 log_channel(audit) log [DBG] :
from='client.? 10.50.11.44:0/2398064782' entity='client.admin'
cmd=[{"prefix": "df", "format": "json"}]: dispatch
2019-11-08 09:02:37.686 7ff61722b700  0 log_channel(cluster) log [INF] :
mon.cn1 calling monitor election
2019-11-08 09:02:37.686 7ff61722b700  1 mon.cn1@0(electing).elector(31157)
init, last seen epoch 31157, mid-election, bumping
2019-11-08 09:02:37.688 7ff61722b700 -1 mon.cn1@0(electing) e3 failed to
get devid for : udev_device_new_from_subsystem_sysname failed on ''
2019-11-08 09:02:37.770 7ff61722b700  0 log_channel(cluster) log [INF] :
mon.cn1 is new leader, mons cn1,cn2,cn3,cn4,cn5 in quorum (ranks 0,1,2,3,4)
2019-11-08 09:02:37.857 7ff613a24700  0 log_channel(cluster) log [DBG] :
monmap e3: 5 mons at {cn1=[v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0
],cn2=[v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0],cn3=[v2:
10.50.11.43:3300/0,v1:10.50.11.43:6789/0],cn4=[v2:
10.50.11.44:3300/0,v1:10.50.11.44:6789/0],cn5=[v2:
10.50.11.45:3300/0,v1:10.50.11.45:6789/0]}



# ceph mon dump
dumped monmap epoch 3
epoch 3
fsid 9dbf207a-561c-48ba-892d-3e79b86be12f
last_changed 2019-09-03 07:53:39.031174
created 2019-08-23 18:30:55.970279
min_mon_release 14 (nautilus)
0: [v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0] mon.cn1
1: [v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0] mon.cn2
2: [v2:10.50.11.43:3300/0,v1:10.50.11.43:6789/0] mon.cn3
3: [v2:10.50.11.44:3300/0,v1:10.50.11.44:6789/0] mon.cn4
4: [v2:10.50.11.45:3300/0,v1:10.50.11.45:6789/0] mon.cn5


# ceph -s
  cluster:
id: 9dbf207a-561c-48ba-892d-3e79b86be12f
health: HEALTH_WARN
85 osds down
3 hosts (72 osds) down
1 nearfull osd(s)
1 pool(s) nearfull
Reduced data availability: 2048 pgs inactive
too few PGs per OSD (17 < min 30)
1/5 mons down, quorum cn2,cn3,cn4,cn5

  services:
mon: 5 daemons, quorum cn2,cn3,cn4,cn5 (age 57s), out of quorum: cn1
mgr: cn1(active, since 73m), standbys: cn2, cn3, cn4, cn5
osd: 120 osds: 35 up, 120 in; 909 remapped pgs

  data:
pools:   1 pools, 2048 pgs
objects: 0 objects, 0 B
usage:   176 TiB used, 260 TiB / 437 TiB avail
pgs: 100.000% pgs unknown
 2048 unknown


The osd logs show the below logs.

2019-11-08 09:05:33.332 7fd1a36eed80  0 _get_class not permitted to load kvs
2019-11-08 09:05:33.332 7fd1a36eed80  0 _get_class not permitted to load lua
2019-11-08 09:05:33.337 7fd1a36eed80  0 _get_class not permitted to load sdk
2019-11-08 09:05:33.337 7fd1a36eed80  0 osd.0 1795 crush map has features
43262930805112, adjusting msgr requires for clients
2019-11-08 09:05:33.337 7fd1a36eed80  0 osd.0 1795 crush map has features
43262930805112 was 8705, adjusting msgr requires for mons
2019-11-08 09:05:33.337 7fd1a36eed80  0 osd.0 1795 crush map has features
1009090060360105984, adjusting msgr requires for osds

Please let us know what might be the issue. There seem to be no network
issues on any of the servers' public and private interfaces.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD's not coming up in Nautilus

2019-11-08 Thread nokia ceph
Hi Team,

There is one 5 node ceph cluster which we have upgraded from Luminous to
Nautilus, and everything was going well until yesterday, when we noticed
that the ceph OSDs are marked down and not recognized by the monitors as
running even though the OSD processes are running.

We noticed that the admin.keyring and the mon.keyring are missing on the
nodes; we have recreated them with the commands below.

ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring
--gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds
allow

ceph-authtool --create_keyring /etc/ceph/ceph.mon.keyring --gen-key -n mon.
--cap mon 'allow *'

In logs we find the below lines.

2019-11-08 09:01:50.525 7ff61722b700  0 log_channel(audit) log [DBG] :
from='client.? 10.50.11.44:0/2398064782' entity='client.admin'
cmd=[{"prefix": "df", "format": "json"}]: dispatch
2019-11-08 09:02:37.686 7ff61722b700  0 log_channel(cluster) log [INF] :
mon.cn1 calling monitor election
2019-11-08 09:02:37.686 7ff61722b700  1 mon.cn1@0(electing).elector(31157)
init, last seen epoch 31157, mid-election, bumping
2019-11-08 09:02:37.688 7ff61722b700 -1 mon.cn1@0(electing) e3 failed to
get devid for : udev_device_new_from_subsystem_sysname failed on ''
2019-11-08 09:02:37.770 7ff61722b700  0 log_channel(cluster) log [INF] :
mon.cn1 is new leader, mons cn1,cn2,cn3,cn4,cn5 in quorum (ranks 0,1,2,3,4)
2019-11-08 09:02:37.857 7ff613a24700  0 log_channel(cluster) log [DBG] :
monmap e3: 5 mons at {cn1=[v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0
],cn2=[v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0],cn3=[v2:
10.50.11.43:3300/0,v1:10.50.11.43:6789/0],cn4=[v2:
10.50.11.44:3300/0,v1:10.50.11.44:6789/0],cn5=[v2:
10.50.11.45:3300/0,v1:10.50.11.45:6789/0]}



# ceph mon dump
dumped monmap epoch 3
epoch 3
fsid 9dbf207a-561c-48ba-892d-3e79b86be12f
last_changed 2019-09-03 07:53:39.031174
created 2019-08-23 18:30:55.970279
min_mon_release 14 (nautilus)
0: [v2:10.50.11.41:3300/0,v1:10.50.11.41:6789/0] mon.cn1
1: [v2:10.50.11.42:3300/0,v1:10.50.11.42:6789/0] mon.cn2
2: [v2:10.50.11.43:3300/0,v1:10.50.11.43:6789/0] mon.cn3
3: [v2:10.50.11.44:3300/0,v1:10.50.11.44:6789/0] mon.cn4
4: [v2:10.50.11.45:3300/0,v1:10.50.11.45:6789/0] mon.cn5


# ceph -s
  cluster:
id: 9dbf207a-561c-48ba-892d-3e79b86be12f
health: HEALTH_WARN
85 osds down
3 hosts (72 osds) down
1 nearfull osd(s)
1 pool(s) nearfull
Reduced data availability: 2048 pgs inactive
too few PGs per OSD (17 < min 30)
1/5 mons down, quorum cn2,cn3,cn4,cn5

  services:
mon: 5 daemons, quorum cn2,cn3,cn4,cn5 (age 57s), out of quorum: cn1
mgr: cn1(active, since 73m), standbys: cn2, cn3, cn4, cn5
osd: 120 osds: 35 up, 120 in; 909 remapped pgs

  data:
pools:   1 pools, 2048 pgs
objects: 0 objects, 0 B
usage:   176 TiB used, 260 TiB / 437 TiB avail
pgs: 100.000% pgs unknown
 2048 unknown


The osd logs show the below logs.

2019-11-08 09:05:33.332 7fd1a36eed80  0 _get_class not permitted to load kvs
2019-11-08 09:05:33.332 7fd1a36eed80  0 _get_class not permitted to load lua
2019-11-08 09:05:33.337 7fd1a36eed80  0 _get_class not permitted to load sdk
2019-11-08 09:05:33.337 7fd1a36eed80  0 osd.0 1795 crush map has features
43262930805112, adjusting msgr requires for clients
2019-11-08 09:05:33.337 7fd1a36eed80  0 osd.0 1795 crush map has features
43262930805112 was 8705, adjusting msgr requires for mons
2019-11-08 09:05:33.337 7fd1a36eed80  0 osd.0 1795 crush map has features
1009090060360105984, adjusting msgr requires for osds

Please let us know what might be the issue. There seem to be no network
issues on any of the servers' public and private interfaces.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Is deepscrub Part of PG increase?

2019-11-02 Thread ceph
Hello,

I have a Nautilus cluster (14.2.4) where I set the time for scrubs to run only
from 23:00 till 6:00.

But last time I increased my PGs from 512 PGs (with 15 bluestore OSDs - 3
nodes) to 1024 PGs (with 35 OSDs - 7 nodes) and observed deep scrubs running
once it finished some rebalances...

When I set no(deep-)scrub no scrub is running, but when I unset
no(deep-)scrub it starts again. I already observed this when I first
increased to 532 PGs as a test.

So, is scrubbing a necessary part of increasing the PGs, which has to run and
should not or cannot wait till the timespan for scrubbing defined by the
admin?

Hope you can enlighten me :)
- Mehmet
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cluster network down

2019-10-25 Thread ceph


On 1 October 2019 08:20:08 CEST, "Lars Täuber" wrote:
>Mon, 30 Sep 2019 15:21:18 +0200
>Janne Johansson  ==> Lars Täuber 
>:
>> >
>> > I don't remember where I read it, but it was told that the cluster
>is
>> > migrating its complete traffic over to the public network when the
>cluster
>> > networks goes down. So this seems not to be the case?
>> >  
>> 
>> Be careful with generalizations like "when a network acts up, it will
>be
>> completely down and noticeably unreachable for all parts", since
>networks
>> can break in thousands of not-very-obvious ways which are not
>0%-vs-100%
>> but somewhere in between.
>> 
>
>Ok. I ask my question in a new way.
>What does ceph do, when I switch off all switches of the cluster
>network?
>Does ceph handle this silently without interruption? Does the heartbeat
>systems use the public network as a failover automatically?

No, you will be in big trouble if this happens, as the cluster then does not
know the status of your OSDs and cannot serve your client requests.

There is no redundant ring used in case of a failure of your cluster network.
To avoid this you could use LACP.

hth
Mehmet

>
>Thanks
>Lars
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Increase of Ceph-mon memory usage - Luminous

2019-10-16 Thread nokia ceph
Hi Team,

We have noticed that the memory usage of the ceph-monitor processes
increased by 1 GB in 4 days.
We monitored the ceph-monitor memory usage every minute and we can see it
increase and decrease by a few hundred MB at any point, but over time the
memory usage grows. We also noticed some monitor processes use up to 8 GB.

Environment -
6 node Luminous cluster - 12.2.2
67 OSDs per node, monitor process on each node

Is this amount of memory usage expected for ceph-monitor processes ?

Thanks,
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph stats on the logs

2019-10-08 Thread nokia ceph
Hi Team,

With default log settings, the ceph stats will be logged like:
cluster [INF] pgmap v30410386: 8192 pgs: 8192 active+clean; 445 TB data,
1339 TB used, 852 TB / 2191 TB avail; 188 kB/s rd, 217 MB/s wr, 1618 op/s
Jewel: in the mon logs
Nautilus: in the mgr logs
Luminous: we are not able to view similar logs on either mon or mgr. What
log level has to be set to have these stats in the logs?

Thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] KVM userspace-rbd hung_task_timeout on 3rd disk

2019-09-29 Thread ceph
I guess this depends on your cluster setup... do you have slow requests also?

- Mehmet

On 11 September 2019 12:22:08 CEST, Ansgar Jazdzewski
wrote:
>Hi,
>
>we are running ceph version 13.2.4 and qemu 2.10, we figured out that
>on VMs with more than three disks IO fails with hung task timeout,
>whenever we do IO on disks after the 2nd one.
>
>- is this issue known to a qemu / ceph version could not find
>something in the changelogs!?
>- do you have an idea how can i provide some better logs
>- we tested it with kernelspace /dev/rbd no issues so far
>
>thanks,
>Ansgar
>_______
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Nautilus : ceph dashboard ssl not working

2019-09-24 Thread nokia ceph
Thank you Ricardo Dias

On Tue, Sep 17, 2019 at 2:13 PM Ricardo Dias  wrote:

> Hi Muthu,
>
> The command you used is only available in v14.2.3. To set the ssl
> certificate in v14.2.2 you need to use the following commands:
>
> $ ceph config-key set mgr/dashboard/crt -i dashboard.crt
> $ ceph config-key set mgr/dashboard/key -i dashboard.key
>
> The above commands will emit a deprecation warning that you can ignore.
>
> Thanks,
> Ricardo Dias
>
> ____
> From: ceph-users  on behalf of nokia
> ceph 
> Sent: Monday, September 16, 2019 10:30
> To: Ceph Users
> Subject: [ceph-users] Nautilus : ceph dashboard ssl not working
>
> Hi Team,
> In ceph 14.2.2 , ceph dashboard does not have set-ssl-certificate .
> We are trying to enable ceph dashboard and while using the ssl certificate
> and key , it is not working .
>
> cn5.chn5au1c1.cdn ~# ceph dashboard set-ssl-certificate -i dashboard.crt
> no valid command found; 10 closest matches:
> dashboard set-grafana-update-dashboards 
> dashboard reset-prometheus-api-host
> dashboard reset-ganesha-clusters-rados-pool-namespace
> dashboard set-grafana-api-username 
> dashboard get-audit-api-log-payload
> dashboard get-grafana-api-password
> dashboard get-grafana-api-username
> dashboard set-rgw-api-access-key 
> dashboard reset-rgw-api-host
> dashboard set-prometheus-api-host 
> Error EINVAL: invalid command
> cn5.chn5au1c1.cdn ~# ceph -v
> ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus
> (stable)
>
> How to set crt and key in this case.
>
> Thanks,
> Muthu
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Nautilus : ceph dashboard ssl not working

2019-09-16 Thread nokia ceph
Hi Team,
In ceph 14.2.2, ceph dashboard does not have set-ssl-certificate.
We are trying to enable the ceph dashboard, and while using the ssl
certificate and key, it is not working.

cn5.chn5au1c1.cdn ~# ceph dashboard set-ssl-certificate -i dashboard.crt
no valid command found; 10 closest matches:
dashboard set-grafana-update-dashboards 
dashboard reset-prometheus-api-host
dashboard reset-ganesha-clusters-rados-pool-namespace
dashboard set-grafana-api-username 
dashboard get-audit-api-log-payload
dashboard get-grafana-api-password
dashboard get-grafana-api-username
dashboard set-rgw-api-access-key 
dashboard reset-rgw-api-host
dashboard set-prometheus-api-host 
Error EINVAL: invalid command
cn5.chn5au1c1.cdn ~# ceph -v
ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus
(stable)

How do we set the crt and key in this case?

Thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] multiple RESETSESSION messages

2019-09-13 Thread nokia ceph
Hi,

We have a 5 node Luminous cluster on which we see multiple RESETSESSION
messages for OSDs on the last node alone.

's=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=2613 cs=1
l=0).handle_connect_reply connect got RESETSESSION'

We found the fix below for this issue, but we are not able to identify the
Luminous release in which this is or will be available.
https://github.com/ceph/ceph/pull/25343

Can someone help us with this please?

Thanks,
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mon db change from rocksdb to leveldb

2019-08-22 Thread nokia ceph
Thank you Paul.

On Wed, Aug 21, 2019 at 5:36 PM Paul Emmerich 
wrote:

> You can't downgrade from Luminous to Kraken - well, officially at least.
>
> I guess it maybe could somehow work but you'd need to re-create all
> the services. For the mon example: delete a mon, create a new old one,
> let it sync, etc.
> Still a bad idea.
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Wed, Aug 21, 2019 at 1:37 PM nokia ceph 
> wrote:
> >
> > Hi Team,
> >
> > One of our old customer had Kraken and they are going to upgrade to
> Luminous . In the process they also requesting for downgrade procedure.
> > Kraken used leveldb for ceph-mon data , from luminous it changed to
> rocksdb , upgrade works without any issues.
> >
> > When we downgrade , the ceph-mon does not start and the mon kv_backend
> not moving away from rocksdb .
> >
> > After downgrade , when kv_backend is rocksdb following error thrown by
> ceph-mon , trying to load data from rocksdb and end up in this error,
> >
> > 2019-08-21 11:22:45.200188 7f1a0406f7c0  4 rocksdb: Recovered from
> manifest file:/var/lib/ceph/mon/ceph-cn1/store.db/MANIFEST-000716
> succeeded,manifest_file_number is 716, next_file_number is 718,
> last_sequence is 311614, log_number is 0,prev_log_number is
> 0,max_column_family is 0
> >
> > 2019-08-21 11:22:45.200198 7f1a0406f7c0  4 rocksdb: Column family
> [default] (ID 0), log number is 715
> >
> > 2019-08-21 11:22:45.200247 7f1a0406f7c0  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1566386565200240, "job": 1, "event": "recovery_started",
> "log_files": [717]}
> > 2019-08-21 11:22:45.200252 7f1a0406f7c0  4 rocksdb: Recovering log #717
> mode 2
> > 2019-08-21 11:22:45.200282 7f1a0406f7c0  4 rocksdb: Creating manifest 719
> >
> > 2019-08-21 11:22:45.201222 7f1a0406f7c0  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1566386565201218, "job": 1, "event": "recovery_finished"}
> > 2019-08-21 11:22:45.202582 7f1a0406f7c0  4 rocksdb: DB pointer
> 0x55d4dacf
> > 2019-08-21 11:22:45.202726 7f1a0406f7c0 -1 ERROR: on disk data includes
> unsupported features: compat={},rocompat={},incompat={9=luminous ondisk
> layout}
> > 2019-08-21 11:22:45.202735 7f1a0406f7c0 -1 error checking features: (1)
> Operation not permitted
> >
> > We changed the kv_backend file inside /var/lib/ceph/mon/ceph-cn1 to
> leveldb and ceph-mon failed with following error,
> >
> > 2019-08-21 11:24:07.922978 7fc5a25de7c0 -1 WARNING: the following
> dangerous and experimental features are enabled: bluestore,rocksdb
> > 2019-08-21 11:24:07.922983 7fc5a25de7c0  0 set uid:gid to 167:167
> (ceph:ceph)
> > 2019-08-21 11:24:07.923009 7fc5a25de7c0  0 ceph version 11.2.0
> (f223e27eeb35991352ebc1f67423d4ebc252adb7), process ceph-mon, pid 3509050
> > 2019-08-21 11:24:07.923050 7fc5a25de7c0  0 pidfile_write: ignore empty
> --pid-file
> > 2019-08-21 11:24:07.944867 7fc5a25de7c0 -1 WARNING: the following
> dangerous and experimental features are enabled: bluestore,rocksdb
> > 2019-08-21 11:24:07.950304 7fc5a25de7c0  0 load: jerasure load: lrc
> load: isa
> > 2019-08-21 11:24:07.950563 7fc5a25de7c0 -1 error opening mon data
> directory at '/var/lib/ceph/mon/ceph-cn1': (22) Invalid argument
> >
> > Is there any possibility to toggle ceph-mon db between leveldb and
> rocksdb?
> > Tried to add mon_keyvaluedb = leveldb and filestore_omap_backend =
> leveldb in ceph.conf also not worked.
> > thanks,
> > Muthu
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] mon db change from rocksdb to leveldb

2019-08-21 Thread nokia ceph
Hi Team,

One of our old customers had Kraken and they are going to upgrade to
Luminous. As part of the process they are also requesting a downgrade
procedure. Kraken used leveldb for the ceph-mon data; from Luminous it
changed to rocksdb, and the upgrade works without any issues.

When we downgrade, the ceph-mon does not start and the mon kv_backend does
not move away from rocksdb.

After the downgrade, with kv_backend still set to rocksdb, ceph-mon throws
the following error while trying to load data from rocksdb:

2019-08-21 11:22:45.200188 7f1a0406f7c0  4 rocksdb: Recovered from manifest
file:/var/lib/ceph/mon/ceph-cn1/store.db/MANIFEST-000716
succeeded,manifest_file_number is 716, next_file_number is 718,
last_sequence is 311614, log_number is 0,prev_log_number is
0,max_column_family is 0

2019-08-21 11:22:45.200198 7f1a0406f7c0  4 rocksdb: Column family [default]
(ID 0), log number is 715

2019-08-21 11:22:45.200247 7f1a0406f7c0  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1566386565200240, "job": 1, "event": "recovery_started",
"log_files": [717]}
2019-08-21 11:22:45.200252 7f1a0406f7c0  4 rocksdb: Recovering log #717
mode 2
2019-08-21 11:22:45.200282 7f1a0406f7c0  4 rocksdb: Creating manifest 719

2019-08-21 11:22:45.201222 7f1a0406f7c0  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1566386565201218, "job": 1, "event": "recovery_finished"}
2019-08-21 11:22:45.202582 7f1a0406f7c0  4 rocksdb: DB pointer
0x55d4dacf
2019-08-21 11:22:45.202726 7f1a0406f7c0 -1 ERROR: on disk data includes
unsupported features: compat={},rocompat={},incompat={9=luminous ondisk
layout}
2019-08-21 11:22:45.202735 7f1a0406f7c0 -1 error checking features: (1)
Operation not permitted

We changed the kv_backend file inside /var/lib/ceph/mon/ceph-cn1 to leveldb
and ceph-mon failed with the following error:

2019-08-21 11:24:07.922978 7fc5a25de7c0 -1 WARNING: the following dangerous
and experimental features are enabled: bluestore,rocksdb
2019-08-21 11:24:07.922983 7fc5a25de7c0  0 set uid:gid to 167:167
(ceph:ceph)
2019-08-21 11:24:07.923009 7fc5a25de7c0  0 ceph version 11.2.0
(f223e27eeb35991352ebc1f67423d4ebc252adb7), process ceph-mon, pid 3509050
2019-08-21 11:24:07.923050 7fc5a25de7c0  0 pidfile_write: ignore empty
--pid-file
2019-08-21 11:24:07.944867 7fc5a25de7c0 -1 WARNING: the following dangerous
and experimental features are enabled: bluestore,rocksdb
2019-08-21 11:24:07.950304 7fc5a25de7c0  0 load: jerasure load: lrc load:
isa
2019-08-21 11:24:07.950563 7fc5a25de7c0 -1 error opening mon data directory
at '/var/lib/ceph/mon/ceph-cn1': (22) Invalid argument

Is there any possibility to toggle the ceph-mon db between leveldb and
rocksdb? We also tried adding mon_keyvaluedb = leveldb and
filestore_omap_backend = leveldb to ceph.conf, but that did not work either.
thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] bluestore write iops calculation

2019-08-06 Thread nokia ceph
On Mon, Aug 5, 2019 at 6:35 PM  wrote:

> > Hi  Team,
> > @vita...@yourcmc.ru , thank you for information and could you please
> > clarify on the below quires as well,
> >
> > 1. Average object size we use will be 256KB to 512KB , will there be
> > deferred write queue ?
>
> With the default settings, no (bluestore_prefer_deferred_size_hdd =
> 32KB)
>
>   Are you sure that 256-512KB operations aren't counted as multiple

> operations in your disk stats?
>

  I think they are not counted as multiple operations in the disk stats.

>
> > 2. Share the link of existing rocksdb ticket which does 2 write +
> > syncs.
>
> My PR is here https://github.com/ceph/ceph/pull/26909, you can find the
> issue tracker links inside it.
>
> > 3. Any configuration by which we can reduce/optimize the iops ?
>
> As already said part of your I/O may be caused by the metadata (rocksdb)
> reads if it doesn't fit into RAM. You can try to add more RAM in that
> case... :)
>

 I can add RAM. Is there a way to increase rocksdb caching? Can I increase
bluestore_cache_size_hdd to a higher value to cache more of rocksdb?
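
For reference, a minimal ceph.conf sketch for raising the per-OSD BlueStore
cache on HDDs (the 8 GiB value is only an illustrative assumption; on
releases without cache autotuning the rocksdb/kv share of that cache is
governed by the bluestore_cache_*_ratio options, and the OSDs need a restart
to pick the change up):

  [osd]
  bluestore_cache_size_hdd = 8589934592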

>
> You can also try to add SSDs for metadata (block.db/block.wal).
>
 We have planned to add some SSDs. How many OSDs' rocksdb can we put per
SSD? And I guess if one SSD goes down then all of the related OSDs have to
be re-installed.

>
> Is there something else?... I don't think so.
>
> --
> Vitaliy Filippov
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] bluestore write iops calculation

2019-08-05 Thread nokia ceph
Hi  Team,
@vita...@yourcmc.ru, thank you for the information. Could you please
clarify the below queries as well:

1. Average object size we use will be 256KB to 512KB , will there be
deferred write queue ?
2. Share the link of existing rocksdb ticket which does 2 write + syncs.
3. Any configuration by which we can reduce/optimize the iops ?

Thanks,
Muthu


On Fri, Aug 2, 2019 at 6:21 PM  wrote:

> > 1. For 750 object write request , data written directly into data
> > partition and since we use EC 4+1 there will be 5 iops across the
> > cluster for each obejct write . This makes 750 * 5 = 3750 iops
>
> don't forget about the metadata and the deferring of small writes.
> deferred write queue + metadata, then data for each OSD. this is either
> 2 or 3 ops per an OSD. the deferred write queue is in the same RocksDB
> so deferred write queue + metadata should be 1 op, although a slightly
> bigger one (8-12 kb for 4 kb writes). so it's either 3*5*750 or 2*5*750,
> depending on how your final statistics is collected
>
> > 2. For 750 attribute request , first it will be written into
> > rocksdb.WAL and then to rocks.db . So , 2 iops per disk for every
> > attribute request . This makes 750*2*5 = 7500 iops inside the cluster.
>
> rocksdb is LSM so it doesn't write to wal then to DB, it just writes to
> WAL then compacts it at some point and merges with L0->L1->L2->...
>
> so in theory without compaction it should be 1*5*750 iops
>
> however, there is a bug that makes bluestore do 2 writes+syncs instead
> of 1 per each journal write (not all the time though). the first write
> is the rocksdb's WAL and the second one is the bluefs's journal. this
> probably adds another 5*750 iops on top of each of (1) and (2).
>
> so 5*((2 or 3)+1+2)*750 = either 18750 or 22500. 18750/120 = 156.25,
> 22500/120 = 187.5
>
> the rest may be compaction or metadata reads if you update some objects.
> or maybe I'm missing something else. however this is already closer to
> your 200 iops :)
>
> --
> Vitaliy Filippov
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-08-02 Thread nokia ceph
Thank you Greg, it is now clear for us. The option is only available in
C++, so we need to rewrite the client code in C++.

Thanks,
Muthu

On Fri, Aug 2, 2019 at 1:05 AM Gregory Farnum  wrote:

> On Wed, Jul 31, 2019 at 10:31 PM nokia ceph 
> wrote:
> >
> > Thank you Greg,
> >
> > Another question , we need to give new destination object  , so that we
> can read them separately in parallel with src object .  This function
> resides in objector.h , seems to be like internal and can it be used in
> interface level  and can we use this in our client ? Currently we use
> librados.h in our client to communicate with ceph cluster.
>
> copy_from is an ObjectOperations and exposed via the librados C++ api
> like all the others. It may not be in the simple
> (, , ) interfaces. It may also not be
> in the C API?
>
> > Also any equivalent librados api for the command rados -p poolname  object> 
>
> It's using the copy_from command we're discussing here. You can look
> at the source as an example:
> https://github.com/ceph/ceph/blob/master/src/tools/rados/rados.cc#L497
> -Greg
>
> >
> > Thanks,
> > Muthu
> >
> > On Wed, Jul 31, 2019 at 11:13 PM Gregory Farnum 
> wrote:
> >>
> >>
> >>
> >> On Wed, Jul 31, 2019 at 1:32 AM nokia ceph 
> wrote:
> >>>
> >>> Hi Greg,
> >>>
> >>> We were trying to implement this however having issues in assigning
> the destination object name with this api.
> >>> There is a rados command "rados -p  cp  "
> , is there any librados api equivalent to this ?
> >>
> >>
> >> The copyfrom operation, like all other ops, is directed to a specific
> object. The object you run it on is the destination; it copies the
> specified “src” object into itself.
> >> -Greg
> >>
> >>>
> >>> Thanks,
> >>> Muthu
> >>>
> >>> On Fri, Jul 5, 2019 at 4:00 PM nokia ceph 
> wrote:
> >>>>
> >>>> Thank you Greg, we will try this out .
> >>>>
> >>>> Thanks,
> >>>> Muthu
> >>>>
> >>>> On Wed, Jul 3, 2019 at 11:12 PM Gregory Farnum 
> wrote:
> >>>>>
> >>>>> Well, the RADOS interface doesn't have a great deal of documentation
> >>>>> so I don't know if I can point you at much.
> >>>>>
> >>>>> But if you look at Objecter.h, you see that the ObjectOperation has
> >>>>> this function:
> >>>>> void copy_from(object_t src, snapid_t snapid, object_locator_t
> >>>>> src_oloc, version_t src_version, unsigned flags, unsigned
> >>>>> src_fadvise_flags)
> >>>>>
> >>>>> src: the object to copy from
> >>>>> snapid: if you want to copy a specific snap instead of HEAD
> >>>>> src_oloc: the object locator for the object
> >>>>> src_version: the version of the object to copy from (helps identify
> if
> >>>>> it was updated in the meantime)
> >>>>> flags: probably don't want to set these, but see
> >>>>> PrimaryLogPG::_copy_some for the choices
> >>>>> src_fadvise_flags: these are the fadvise flags we have in various
> >>>>> places that let you specify things like not to cache the data.
> >>>>> Probably leave them unset.
> >>>>>
> >>>>> -Greg
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Wed, Jul 3, 2019 at 2:47 AM nokia ceph 
> wrote:
> >>>>> >
> >>>>> > Hi Greg,
> >>>>> >
> >>>>> > Can you please share the api details  for COPY_FROM or any
> reference document?
> >>>>> >
> >>>>> > Thanks ,
> >>>>> > Muthu
> >>>>> >
> >>>>> > On Wed, Jul 3, 2019 at 4:12 AM Brad Hubbard 
> wrote:
> >>>>> >>
> >>>>> >> On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum 
> wrote:
> >>>>> >> >
> >>>>> >> > I'm not sure how or why you'd get an object class involved in
> doing
> >>>>> >> > this in the normal course of affairs.
> >>>>> >> >
> >>>>> >> > There's a copy_from op that a client can s

[ceph-users] bluestore write iops calculation

2019-08-02 Thread nokia ceph
Hi Team,

Could you please help us in understanding the write iops inside the ceph
cluster? There seems to be a mismatch between the theoretical iops and what
we see in the disk stats.

Our platform is a 5 node cluster with 120 OSDs, each node having 24 HDDs
(data, rocksdb and rocksdb.WAL all reside on the same disk).

We use EC 4+1

We do only write operations, a total average of 1500 write iops (750
objects/s and 750 attribute requests per second, a single key-value entry
for each object). In the ceph status we see a consistent 1500 write iops
from the client.

Please correct us if our assumptions are wrong.
1. For the 750 object write requests, data is written directly into the data
partition, and since we use EC 4+1 there will be 5 iops across the cluster
for each object write. This makes 750 * 5 = 3750 iops.
2. For the 750 attribute requests, each is first written into rocksdb.WAL
and then into rocksdb, so 2 iops per disk for every attribute request. This
makes 750*2*5 = 7500 iops inside the cluster.

Now the total iops inside the cluster would be 11250 iops. We have 120 OSDs,
hence each OSD should see 11250/120 = ~94 iops.

Currently we see an average of 200 iops per OSD in iostat for the same load,
whereas the theoretical calculation gives only ~94 iops.

Could you please let us know where our calculation misses the remaining iops
inside the cluster, for the 1500 client write iops?

If each object write also ends up writing one metadata entry into rocksdb,
then we need to add another 3750 to the total iops, which makes ~125 iops
per OSD; there is still a difference of ~75 iops per OSD.

Thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-07-31 Thread nokia ceph
Thank you Greg,

Another question: we need to give a new destination object name so that we
can read it separately, in parallel with the src object. This function
resides in Objecter.h, which looks internal; can it be used at the interface
level, and can we use it in our client? Currently we use librados.h in our
client to communicate with the ceph cluster.
Also, is there any librados api equivalent to the command "rados -p poolname
cp <object> <new object>"?

Thanks,
Muthu

On Wed, Jul 31, 2019 at 11:13 PM Gregory Farnum  wrote:

>
>
> On Wed, Jul 31, 2019 at 1:32 AM nokia ceph 
> wrote:
>
>> Hi Greg,
>>
>> We were trying to implement this however having issues in assigning the
>> destination object name with this api.
>> There is a rados command "rados -p  cp  " ,
>> is there any librados api equivalent to this ?
>>
>
> The copyfrom operation, like all other ops, is directed to a specific
> object. The object you run it on is the destination; it copies the
> specified “src” object into itself.
> -Greg
>
>
>> Thanks,
>> Muthu
>>
>> On Fri, Jul 5, 2019 at 4:00 PM nokia ceph 
>> wrote:
>>
>>> Thank you Greg, we will try this out .
>>>
>>> Thanks,
>>> Muthu
>>>
>>> On Wed, Jul 3, 2019 at 11:12 PM Gregory Farnum 
>>> wrote:
>>>
>>>> Well, the RADOS interface doesn't have a great deal of documentation
>>>> so I don't know if I can point you at much.
>>>>
>>>> But if you look at Objecter.h, you see that the ObjectOperation has
>>>> this function:
>>>> void copy_from(object_t src, snapid_t snapid, object_locator_t
>>>> src_oloc, version_t src_version, unsigned flags, unsigned
>>>> src_fadvise_flags)
>>>>
>>>> src: the object to copy from
>>>> snapid: if you want to copy a specific snap instead of HEAD
>>>> src_oloc: the object locator for the object
>>>> src_version: the version of the object to copy from (helps identify if
>>>> it was updated in the meantime)
>>>> flags: probably don't want to set these, but see
>>>> PrimaryLogPG::_copy_some for the choices
>>>> src_fadvise_flags: these are the fadvise flags we have in various
>>>> places that let you specify things like not to cache the data.
>>>> Probably leave them unset.
>>>>
>>>> -Greg
>>>>
>>>>
>>>>
>>>> On Wed, Jul 3, 2019 at 2:47 AM nokia ceph 
>>>> wrote:
>>>> >
>>>> > Hi Greg,
>>>> >
>>>> > Can you please share the api details  for COPY_FROM or any reference
>>>> document?
>>>> >
>>>> > Thanks ,
>>>> > Muthu
>>>> >
>>>> > On Wed, Jul 3, 2019 at 4:12 AM Brad Hubbard 
>>>> wrote:
>>>> >>
>>>> >> On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum 
>>>> wrote:
>>>> >> >
>>>> >> > I'm not sure how or why you'd get an object class involved in doing
>>>> >> > this in the normal course of affairs.
>>>> >> >
>>>> >> > There's a copy_from op that a client can send and which copies an
>>>> >> > object from another OSD into the target object. That's probably the
>>>> >> > primitive you want to build on. Note that the OSD doesn't do much
>>>> >>
>>>> >> Argh! yes, good idea. We really should document that!
>>>> >>
>>>> >> > consistency checking (it validates that the object version matches
>>>> an
>>>> >> > input, but if they don't it just returns an error) so the client
>>>> >> > application is responsible for any locking needed.
>>>> >> > -Greg
>>>> >> >
>>>> >> > On Tue, Jul 2, 2019 at 3:49 AM Brad Hubbard 
>>>> wrote:
>>>> >> > >
>>>> >> > > Yes, this should be possible using an object class which is also
>>>> a
>>>> >> > > RADOS client (via the RADOS API). You'll still have some client
>>>> >> > > traffic as the machine running the object class will still need
>>>> to
>>>> >> > > connect to the relevant primary osd and send the write
>>>> (presumably in
>>>> >> > > some situations though this will be the same machine).
>>>> >> > >
>>>> >

Re: [ceph-users] details about cloning objects using librados

2019-07-31 Thread nokia ceph
Hi Greg,

We were trying to implement this, however we are having issues in assigning
the destination object name with this api.
There is a rados command "rados -p <pool> cp <source object> <dest object>";
is there any librados api equivalent to this?

Thanks,
Muthu

On Fri, Jul 5, 2019 at 4:00 PM nokia ceph  wrote:

> Thank you Greg, we will try this out .
>
> Thanks,
> Muthu
>
> On Wed, Jul 3, 2019 at 11:12 PM Gregory Farnum  wrote:
>
>> Well, the RADOS interface doesn't have a great deal of documentation
>> so I don't know if I can point you at much.
>>
>> But if you look at Objecter.h, you see that the ObjectOperation has
>> this function:
>> void copy_from(object_t src, snapid_t snapid, object_locator_t
>> src_oloc, version_t src_version, unsigned flags, unsigned
>> src_fadvise_flags)
>>
>> src: the object to copy from
>> snapid: if you want to copy a specific snap instead of HEAD
>> src_oloc: the object locator for the object
>> src_version: the version of the object to copy from (helps identify if
>> it was updated in the meantime)
>> flags: probably don't want to set these, but see
>> PrimaryLogPG::_copy_some for the choices
>> src_fadvise_flags: these are the fadvise flags we have in various
>> places that let you specify things like not to cache the data.
>> Probably leave them unset.
>>
>> -Greg
>>
>>
>>
>> On Wed, Jul 3, 2019 at 2:47 AM nokia ceph 
>> wrote:
>> >
>> > Hi Greg,
>> >
>> > Can you please share the api details  for COPY_FROM or any reference
>> document?
>> >
>> > Thanks ,
>> > Muthu
>> >
>> > On Wed, Jul 3, 2019 at 4:12 AM Brad Hubbard 
>> wrote:
>> >>
>> >> On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum 
>> wrote:
>> >> >
>> >> > I'm not sure how or why you'd get an object class involved in doing
>> >> > this in the normal course of affairs.
>> >> >
>> >> > There's a copy_from op that a client can send and which copies an
>> >> > object from another OSD into the target object. That's probably the
>> >> > primitive you want to build on. Note that the OSD doesn't do much
>> >>
>> >> Argh! yes, good idea. We really should document that!
>> >>
>> >> > consistency checking (it validates that the object version matches an
>> >> > input, but if they don't it just returns an error) so the client
>> >> > application is responsible for any locking needed.
>> >> > -Greg
>> >> >
>> >> > On Tue, Jul 2, 2019 at 3:49 AM Brad Hubbard 
>> wrote:
>> >> > >
>> >> > > Yes, this should be possible using an object class which is also a
>> >> > > RADOS client (via the RADOS API). You'll still have some client
>> >> > > traffic as the machine running the object class will still need to
>> >> > > connect to the relevant primary osd and send the write (presumably
>> in
>> >> > > some situations though this will be the same machine).
>> >> > >
>> >> > > On Tue, Jul 2, 2019 at 4:08 PM nokia ceph <
>> nokiacephus...@gmail.com> wrote:
>> >> > > >
>> >> > > > Hi Brett,
>> >> > > >
>> >> > > > I think I was wrong here in the requirement description. It is
>> not about data replication , we need same content stored in different
>> object/name.
>> >> > > > We store video contents inside the ceph cluster. And our new
>> requirement is we need to store same content for different users , hence
>> need same content in different object name . if client sends write request
>> for object x and sets number of copies as 100, then cluster has to clone
>> 100 copies of object x and store it as object x1, objectx2,etc. Currently
>> this is done in the client side where objectx1, object x2...objectx100 are
>> cloned inside the client and write request sent for all 100 objects which
>> we want to avoid to reduce network consumption.
>> >> > > >
>> >> > > > Similar usecases are rbd snapshot , radosgw copy .
>> >> > > >
>> >> > > > Is this possible in object class ?
>> >> > > >
>> >> > > > thanks,
>> >> > > > Muthu
>> >> > > >
>> >> > > >
>> >> > > > On Mon, Jul 1, 2019 at 7:58 PM Brett Chancellor <
>>

Re: [ceph-users] Nautilus:14.2.2 Legacy BlueStore stats reporting detected

2019-07-24 Thread nokia ceph
Hi Team,

I guess this warning will not appear for clusters freshly installed with
Nautilus and only affects upgraded systems.
Please let us know whether disabling bluestore warn on legacy statfs is the
only option for upgraded clusters.

thanks,
Muthu
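
As far as I understand, the warning only flags OSDs whose on-disk usage
stats are still in the older, non per-pool format, so besides silencing it
the stats can also be brought up to date per OSD (sketch, assuming the
standard ceph-bluestore-tool repair path applies; the OSD must be stopped
first):

  systemctl stop ceph-osd@<id>
  ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<id>
  systemctl start ceph-osd@<id>

To only silence the warning cluster-wide at runtime:

  ceph config set global bluestore_warn_on_legacy_statfs false
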

On Fri, Jul 19, 2019 at 5:22 PM Paul Emmerich 
wrote:

> bluestore warn on legacy statfs = false
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
>
> On Fri, Jul 19, 2019 at 1:35 PM nokia ceph 
> wrote:
>
>> Hi Team,
>>
>> After upgrading our cluster from 14.2.1 to 14.2.2 , the cluster moved to
>> warning state with following error
>>
>> cn1.chn6m1c1ru1c1.cdn ~# ceph status
>>   cluster:
>> id: e9afb5f3-4acf-421a-8ae6-caaf328ef888
>> health: HEALTH_WARN
>> Legacy BlueStore stats reporting detected on 335 OSD(s)
>>
>>   services:
>> mon: 5 daemons, quorum cn1,cn2,cn3,cn4,cn5 (age 114m)
>> mgr: cn4(active, since 2h), standbys: cn3, cn1, cn2, cn5
>> osd: 335 osds: 335 up (since 112m), 335 in
>>
>>   data:
>> pools:   1 pools, 8192 pgs
>> objects: 129.01M objects, 849 TiB
>> usage:   1.1 PiB used, 749 TiB / 1.8 PiB avail
>> pgs: 8146 active+clean
>>  46   active+clean+scrubbing
>>
>> Checked the bug list and found that this issue is solved however still
>> exists ,
>>
>> https://github.com/ceph/ceph/pull/28563
>>
>> How to disable this warning?
>>
>> Thanks,
>> Muthu
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Nautilus:14.2.2 Legacy BlueStore stats reporting detected

2019-07-21 Thread nokia ceph
Thank you  Paul Emmerich

On Fri, Jul 19, 2019 at 5:22 PM Paul Emmerich 
wrote:

> bluestore warn on legacy statfs = false
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
>
> On Fri, Jul 19, 2019 at 1:35 PM nokia ceph 
> wrote:
>
>> Hi Team,
>>
>> After upgrading our cluster from 14.2.1 to 14.2.2 , the cluster moved to
>> warning state with following error
>>
>> cn1.chn6m1c1ru1c1.cdn ~# ceph status
>>   cluster:
>> id: e9afb5f3-4acf-421a-8ae6-caaf328ef888
>> health: HEALTH_WARN
>> Legacy BlueStore stats reporting detected on 335 OSD(s)
>>
>>   services:
>> mon: 5 daemons, quorum cn1,cn2,cn3,cn4,cn5 (age 114m)
>> mgr: cn4(active, since 2h), standbys: cn3, cn1, cn2, cn5
>> osd: 335 osds: 335 up (since 112m), 335 in
>>
>>   data:
>> pools:   1 pools, 8192 pgs
>> objects: 129.01M objects, 849 TiB
>> usage:   1.1 PiB used, 749 TiB / 1.8 PiB avail
>> pgs: 8146 active+clean
>>  46   active+clean+scrubbing
>>
>> Checked the bug list and found that this issue is solved however still
>> exists ,
>>
>> https://github.com/ceph/ceph/pull/28563
>>
>> How to disable this warning?
>>
>> Thanks,
>> Muthu
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Nautilus:14.2.2 Legacy BlueStore stats reporting detected

2019-07-19 Thread nokia ceph
Hi Team,

After upgrading our cluster from 14.2.1 to 14.2.2, the cluster moved to a
warning state with the following error:

cn1.chn6m1c1ru1c1.cdn ~# ceph status
  cluster:
id: e9afb5f3-4acf-421a-8ae6-caaf328ef888
health: HEALTH_WARN
Legacy BlueStore stats reporting detected on 335 OSD(s)

  services:
mon: 5 daemons, quorum cn1,cn2,cn3,cn4,cn5 (age 114m)
mgr: cn4(active, since 2h), standbys: cn3, cn1, cn2, cn5
osd: 335 osds: 335 up (since 112m), 335 in

  data:
pools:   1 pools, 8192 pgs
objects: 129.01M objects, 849 TiB
usage:   1.1 PiB used, 749 TiB / 1.8 PiB avail
pgs: 8146 active+clean
 46   active+clean+scrubbing

We checked the bug list and found that this issue is supposed to be solved,
however it still exists:

https://github.com/ceph/ceph/pull/28563

How to disable this warning?

Thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-07-05 Thread nokia ceph
Thank you Greg, we will try this out .

Thanks,
Muthu

On Wed, Jul 3, 2019 at 11:12 PM Gregory Farnum  wrote:

> Well, the RADOS interface doesn't have a great deal of documentation
> so I don't know if I can point you at much.
>
> But if you look at Objecter.h, you see that the ObjectOperation has
> this function:
> void copy_from(object_t src, snapid_t snapid, object_locator_t
> src_oloc, version_t src_version, unsigned flags, unsigned
> src_fadvise_flags)
>
> src: the object to copy from
> snapid: if you want to copy a specific snap instead of HEAD
> src_oloc: the object locator for the object
> src_version: the version of the object to copy from (helps identify if
> it was updated in the meantime)
> flags: probably don't want to set these, but see
> PrimaryLogPG::_copy_some for the choices
> src_fadvise_flags: these are the fadvise flags we have in various
> places that let you specify things like not to cache the data.
> Probably leave them unset.
>
> -Greg
>
>
>
> On Wed, Jul 3, 2019 at 2:47 AM nokia ceph 
> wrote:
> >
> > Hi Greg,
> >
> > Can you please share the api details  for COPY_FROM or any reference
> document?
> >
> > Thanks ,
> > Muthu
> >
> > On Wed, Jul 3, 2019 at 4:12 AM Brad Hubbard  wrote:
> >>
> >> On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum 
> wrote:
> >> >
> >> > I'm not sure how or why you'd get an object class involved in doing
> >> > this in the normal course of affairs.
> >> >
> >> > There's a copy_from op that a client can send and which copies an
> >> > object from another OSD into the target object. That's probably the
> >> > primitive you want to build on. Note that the OSD doesn't do much
> >>
> >> Argh! yes, good idea. We really should document that!
> >>
> >> > consistency checking (it validates that the object version matches an
> >> > input, but if they don't it just returns an error) so the client
> >> > application is responsible for any locking needed.
> >> > -Greg
> >> >
> >> > On Tue, Jul 2, 2019 at 3:49 AM Brad Hubbard 
> wrote:
> >> > >
> >> > > Yes, this should be possible using an object class which is also a
> >> > > RADOS client (via the RADOS API). You'll still have some client
> >> > > traffic as the machine running the object class will still need to
> >> > > connect to the relevant primary osd and send the write (presumably
> in
> >> > > some situations though this will be the same machine).
> >> > >
> >> > > On Tue, Jul 2, 2019 at 4:08 PM nokia ceph 
> wrote:
> >> > > >
> >> > > > Hi Brett,
> >> > > >
> >> > > > I think I was wrong here in the requirement description. It is
> not about data replication , we need same content stored in different
> object/name.
> >> > > > We store video contents inside the ceph cluster. And our new
> requirement is we need to store same content for different users , hence
> need same content in different object name . if client sends write request
> for object x and sets number of copies as 100, then cluster has to clone
> 100 copies of object x and store it as object x1, objectx2,etc. Currently
> this is done in the client side where objectx1, object x2...objectx100 are
> cloned inside the client and write request sent for all 100 objects which
> we want to avoid to reduce network consumption.
> >> > > >
> >> > > > Similar usecases are rbd snapshot , radosgw copy .
> >> > > >
> >> > > > Is this possible in object class ?
> >> > > >
> >> > > > thanks,
> >> > > > Muthu
> >> > > >
> >> > > >
> >> > > > On Mon, Jul 1, 2019 at 7:58 PM Brett Chancellor <
> bchancel...@salesforce.com> wrote:
> >> > > >>
> >> > > >> Ceph already does this by default. For each replicated pool, you
> can set the 'size' which is the number of copies you want Ceph to maintain.
> The accepted norm for replicas is 3, but you can set it higher if you want
> to incur the performance penalty.
> >> > > >>
> >> > > >> On Mon, Jul 1, 2019, 6:01 AM nokia ceph <
> nokiacephus...@gmail.com> wrote:
> >> > > >>>
> >> > > >>> Hi Brad,
> >> > > >>>
> >> > > >>> Thank you for your response , and we wil

Re: [ceph-users] details about cloning objects using librados

2019-07-03 Thread nokia ceph
Hi Greg,

Can you please share the api details  for COPY_FROM or any reference
document?

Thanks ,
Muthu

On Wed, Jul 3, 2019 at 4:12 AM Brad Hubbard  wrote:

> On Wed, Jul 3, 2019 at 4:25 AM Gregory Farnum  wrote:
> >
> > I'm not sure how or why you'd get an object class involved in doing
> > this in the normal course of affairs.
> >
> > There's a copy_from op that a client can send and which copies an
> > object from another OSD into the target object. That's probably the
> > primitive you want to build on. Note that the OSD doesn't do much
>
> Argh! yes, good idea. We really should document that!
>
> > consistency checking (it validates that the object version matches an
> > input, but if they don't it just returns an error) so the client
> > application is responsible for any locking needed.
> > -Greg
> >
> > On Tue, Jul 2, 2019 at 3:49 AM Brad Hubbard  wrote:
> > >
> > > Yes, this should be possible using an object class which is also a
> > > RADOS client (via the RADOS API). You'll still have some client
> > > traffic as the machine running the object class will still need to
> > > connect to the relevant primary osd and send the write (presumably in
> > > some situations though this will be the same machine).
> > >
> > > On Tue, Jul 2, 2019 at 4:08 PM nokia ceph 
> wrote:
> > > >
> > > > Hi Brett,
> > > >
> > > > I think I was wrong here in the requirement description. It is not
> about data replication , we need same content stored in different
> object/name.
> > > > We store video contents inside the ceph cluster. And our new
> requirement is we need to store same content for different users , hence
> need same content in different object name . if client sends write request
> for object x and sets number of copies as 100, then cluster has to clone
> 100 copies of object x and store it as object x1, objectx2,etc. Currently
> this is done in the client side where objectx1, object x2...objectx100 are
> cloned inside the client and write request sent for all 100 objects which
> we want to avoid to reduce network consumption.
> > > >
> > > > Similar usecases are rbd snapshot , radosgw copy .
> > > >
> > > > Is this possible in object class ?
> > > >
> > > > thanks,
> > > > Muthu
> > > >
> > > >
> > > > On Mon, Jul 1, 2019 at 7:58 PM Brett Chancellor <
> bchancel...@salesforce.com> wrote:
> > > >>
> > > >> Ceph already does this by default. For each replicated pool, you
> can set the 'size' which is the number of copies you want Ceph to maintain.
> The accepted norm for replicas is 3, but you can set it higher if you want
> to incur the performance penalty.
> > > >>
> > > >> On Mon, Jul 1, 2019, 6:01 AM nokia ceph 
> wrote:
> > > >>>
> > > >>> Hi Brad,
> > > >>>
> > > >>> Thank you for your response , and we will check this video as well.
> > > >>> Our requirement is while writing an object into the cluster , if
> we can provide number of copies to be made , the network consumption
> between client and cluster will be only for one object write. However , the
> cluster will clone/copy multiple objects and stores inside the cluster.
> > > >>>
> > > >>> Thanks,
> > > >>> Muthu
> > > >>>
> > > >>> On Fri, Jun 28, 2019 at 9:23 AM Brad Hubbard 
> wrote:
> > > >>>>
> > > >>>> On Thu, Jun 27, 2019 at 8:58 PM nokia ceph <
> nokiacephus...@gmail.com> wrote:
> > > >>>> >
> > > >>>> > Hi Team,
> > > >>>> >
> > > >>>> > We have a requirement to create multiple copies of an object
> and currently we are handling it in client side to write as separate
> objects and this causes huge network traffic between client and cluster.
> > > >>>> > Is there possibility of cloning an object to multiple copies
> using librados api?
> > > >>>> > Please share the document details if it is feasible.
> > > >>>>
> > > >>>> It may be possible to use an object class to accomplish what you
> want
> > > >>>> to achieve but the more we understand what you are trying to do,
> the
> > > >>>> better the advice we can offer (at the moment your description
> sounds
> > > >>>> like replication

Re: [ceph-users] details about cloning objects using librados

2019-07-02 Thread nokia ceph
Hi Brett,

I think I was wrong here in the requirement description. It is not about
data replication; we need the same content stored under different object
names. We store video content inside the ceph cluster, and our new
requirement is to store the same content for different users, hence the same
content under different object names. If a client sends a write request for
object x and sets the number of copies to 100, then the cluster has to clone
100 copies of object x and store them as object x1, object x2, etc.
Currently this is done on the client side, where object x1, object x2 ...
object x100 are cloned inside the client and a write request is sent for all
100 objects, which we want to avoid in order to reduce network consumption.

Similar usecases are rbd snapshot , radosgw copy .

Is this possible in object class ?

thanks,
Muthu
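
For what it's worth, until something like this is available through an
object class, the per-copy server-side clone can already be driven from the
CLI, which (per Greg's reply elsewhere in this thread) is built on the same
copy_from op; a rough sketch with hypothetical pool/object names:

  for i in $(seq 1 100); do
      rados -p videopool cp objectx objectx${i}
  done
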


On Mon, Jul 1, 2019 at 7:58 PM Brett Chancellor 
wrote:

> Ceph already does this by default. For each replicated pool, you can set
> the 'size' which is the number of copies you want Ceph to maintain. The
> accepted norm for replicas is 3, but you can set it higher if you want to
> incur the performance penalty.
>
> On Mon, Jul 1, 2019, 6:01 AM nokia ceph  wrote:
>
>> Hi Brad,
>>
>> Thank you for your response , and we will check this video as well.
>> Our requirement is while writing an object into the cluster , if we can
>> provide number of copies to be made , the network consumption between
>> client and cluster will be only for one object write. However , the cluster
>> will clone/copy multiple objects and stores inside the cluster.
>>
>> Thanks,
>> Muthu
>>
>> On Fri, Jun 28, 2019 at 9:23 AM Brad Hubbard  wrote:
>>
>>> On Thu, Jun 27, 2019 at 8:58 PM nokia ceph 
>>> wrote:
>>> >
>>> > Hi Team,
>>> >
>>> > We have a requirement to create multiple copies of an object and
>>> currently we are handling it in client side to write as separate objects
>>> and this causes huge network traffic between client and cluster.
>>> > Is there possibility of cloning an object to multiple copies using
>>> librados api?
>>> > Please share the document details if it is feasible.
>>>
>>> It may be possible to use an object class to accomplish what you want
>>> to achieve but the more we understand what you are trying to do, the
>>> better the advice we can offer (at the moment your description sounds
>>> like replication which is already part of RADOS as you know).
>>>
>>> More on object classes from Cephalocon Barcelona in May this year:
>>> https://www.youtube.com/watch?v=EVrP9MXiiuU
>>>
>>> >
>>> > Thanks,
>>> > Muthu
>>> > ___
>>> > ceph-users mailing list
>>> > ceph-users@lists.ceph.com
>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Brad
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] details about cloning objects using librados

2019-07-01 Thread nokia ceph
Hi Brad,

Thank you for your response , and we will check this video as well.
Our requirement is that, while writing an object into the cluster, if we can
provide the number of copies to be made, the network consumption between
client and cluster will only be for one object write. The cluster will then
clone/copy the object multiple times and store the copies internally.

Thanks,
Muthu

On Fri, Jun 28, 2019 at 9:23 AM Brad Hubbard  wrote:

> On Thu, Jun 27, 2019 at 8:58 PM nokia ceph 
> wrote:
> >
> > Hi Team,
> >
> > We have a requirement to create multiple copies of an object and
> currently we are handling it in client side to write as separate objects
> and this causes huge network traffic between client and cluster.
> > Is there possibility of cloning an object to multiple copies using
> librados api?
> > Please share the document details if it is feasible.
>
> It may be possible to use an object class to accomplish what you want
> to achieve but the more we understand what you are trying to do, the
> better the advice we can offer (at the moment your description sounds
> like replication which is already part of RADOS as you know).
>
> More on object classes from Cephalocon Barcelona in May this year:
> https://www.youtube.com/watch?v=EVrP9MXiiuU
>
> >
> > Thanks,
> > Muthu
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> --
> Cheers,
> Brad
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] details about cloning objects using librados

2019-06-27 Thread nokia ceph
Hi Team,

We have a requirement to create multiple copies of an object. Currently we
are handling it on the client side by writing separate objects, and this
causes huge network traffic between client and cluster.
Is there a possibility of cloning an object into multiple copies using the
librados api?
Please share the document details if it is feasible.

Thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Using Ceph Ansible to Add Nodes to Cluster at Weight 0

2019-06-23 Thread ceph
Hello,

I would advise using this script from Dan:
https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-gentle-reweight

I have used it many times and it works great - also if you want to drain the
OSDs.

Hth
Mehmet
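
If you prefer doing it by hand instead of the script, the same effect can be
approximated by raising the crush weight in small steps and waiting for the
cluster to settle between steps (sketch; the osd id, target weight, step
values and the naive wait loop are all assumptions to adapt):

  for w in 0.5 1.0 1.5 2.0 2.5 3.0 3.64; do
      ceph osd crush reweight osd.42 ${w}
      while ceph status | grep -Eq 'backfill|recover'; do sleep 60; done
  done
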

Am 30. Mai 2019 22:59:05 MESZ schrieb Michel Raabe :
>Hi Mike,
>
>On 30.05.19 02:00, Mike Cave wrote:
>> I’d like a s little friction for the cluster as possible as it is in 
>> heavy use right now.
>> 
>> I’m running mimic (13.2.5) on CentOS.
>> 
>> Any suggestions on best practices for this?
>
>You can limit the recovery for example
>
>* max backfills
>* recovery max active
>* recovery sleep
>
>It will slow down the rebalance but will not hurt the users too much.
>
>
>Michel.
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Error when I compare hashes of export-diff / import-diff

2019-06-11 Thread ceph



On 6/11/19 3:24 PM, Rafael Diaz Maurin wrote:
> 3- I create a snapshot inside the source pool
> rbd snap create ${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP}
> 
> 4- I export the snapshot from the source pool and I import the snapshot
> towards the destination pool (in the pipe)
> rbd export-diff --from-snap ${LAST-SNAP}
> ${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP} - | rbd -c ${BACKUP-CLUSTER}
> import-diff - ${POOL-DESTINATION}/${KVM-IMAGE}

Here you are wrong.
You have to export-diff (without --from-snap) to do the "first" export.
What you are doing is rebasing the image, on the backup cluster, on top of
crap (the dummy snapshot you created in step 3).

So:
1) Create a snapshot of the source image
2) Create a dest image if it does not exist
3) If the dest image was just created, export the source snapshot and import it:
  rbd export-diff --snap <snap> <source-image> - | rbd import-diff - <dest-image>
3b) If the dest image was not created (you then have a shared snapshot
between the source and the dest), export-diff using --from-snap:
  rbd export-diff --from-snap <last-snap> --snap <snap> <source-image> - | rbd import-diff - <dest-image>

You can check out Backurne's code, which does what you want:
https://github.com/JackSlateur/backurne/blob/master/ceph.py#L173

Best regards,

> 
> The problem occurs when I want to validate only the diff between the 2
> snapshots (in order to be more efficient). I note that those hashes are
> differents.
> 
> Here is how I calcultate the hashes :
> Source-hash : rbd diff --from-snap ${LAST-SNAP}
> ${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP} --format json | md5sum | cut
> -d ' ' -f 1
> => bc56663b8ff01ec388598037a20861cf
> Destination-hash : rbd -c ${BACKUP-CLUSTER} diff --from-snap
> ${LAST-SNAP} ${POOL-DESTINATION}/${KVM-IMAGE}@${TODAY-SNAP} --format
> json | md5sum | cut -d ' ' -f 1
> => 3aa35362471419abe0a41f222c113096
> 
> In an other hand, if I compare the hashes of the export (between source
> and destination), they are the same :
> 
> rbd -p ${POOL-SOURCE} export ${KVM-IMAGE}@${TODAY-SNAP} - | md5sum
> => 2c4962870fdd67ca758c154760d9df83
> rbd -c ${BACKUP-CLUSTER} -p ${POOL-DESTINATION} export
> ${KVM-IMAGE}@${TODAY-SNAP} - | md5sum
> => 2c4962870fdd67ca758c154760d9df83
> 
> 
> Can someone has an idea of what's happenning ?
> 
> Can someone has a way to succeed in comparing the export-diff
> /import-diff ?
> 
> 
> 
> 
> Thank you,
> Rafael
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph master - src/common/options.cc - size_t / uint64_t incompatibility on ARM 32bit

2019-06-03 Thread Dyweni - Ceph-Users

Hi List / James,


In the Ceph master (and also Ceph 14.2.1), file:  src/common/options.cc, 
 line # 192:


Option::size_t sz{strict_iecstrtoll(val.c_str(), error_message)};


On ARM 32-bit, compiling with Clang 7.1.0, compilation fails hard at this
line.



The reason is that strict_iecstrtoll() returns a uint64_t value, which has a
different size than size_t on this architecture.


I think, since the intention is to convert a value in string format back to
its native type, that strict_iecstrtoll() should return a size_t type, not a
uint64_t type. What does everyone else think?



Thanks!

- Dyweni
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph nautilus deep-scrub health error

2019-05-15 Thread nokia ceph
Hi Manuel.

Thanks for your response. We will consider these settings when we enable
deep-scrubbing. For now, I saw this write-up in the Nautilus release notes:

Configuration values mon_warn_not_scrubbed and
mon_warn_not_deep_scrubbed have been renamed. They are now
mon_warn_pg_not_scrubbed_ratio and mon_warn_pg_not_deep_scrubbed_ratio
respectively. This is to clarify that these warnings are related to
pg scrubbing and are a ratio of the related interval. These options
are now enabled by default.

So we set mon_warn_pg_not_deep_scrubbed_ratio = 0, and after that the
cluster no longer moves to a warning state for PGs not being deep-scrubbed.

Thanks,
Muthu
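
For anyone else hitting this, the runtime equivalent of the setting above
(assuming the Nautilus centralized config database is in use) is:

  ceph config set global mon_warn_pg_not_deep_scrubbed_ratio 0
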

On Tue, May 14, 2019 at 4:30 PM EDH - Manuel Rios Fernandez <
mrios...@easydatahost.com> wrote:

> Hi Muthu
>
>
>
> We found the same issue near 2000 pgs not deep-scrubbed in time.
>
>
>
> We’re manually force scrubbing with :
>
>
>
> ceph health detail | grep -i not | awk '{print $2}' | while read i; do
> ceph pg deep-scrub ${i}; done
>
>
>
> It launch near 20-30 pgs to be deep-scrubbed. I think you can improve
>  with a sleep of 120 secs between scrub to prevent overload your osd.
>
>
>
> For disable deep-scrub you can use “ceph osd set nodeep-scrub” , Also you
> can setup deep-scrub with threshold .
>
> #Start Scrub 22:00
>
> osd scrub begin hour = 22
>
> #Stop Scrub 8
>
> osd scrub end hour = 8
>
> #Scrub Load 0.5
>
> osd scrub load threshold = 0.5
>
>
>
> Regards,
>
>
>
> Manuel
>
>
>
>
>
>
>
>
>
> *De:* ceph-users  *En nombre de *nokia
> ceph
> *Enviado el:* martes, 14 de mayo de 2019 11:44
> *Para:* Ceph Users 
> *Asunto:* [ceph-users] ceph nautilus deep-scrub health error
>
>
>
> Hi Team,
>
>
>
> After upgrading from Luminous to Nautilus , we see 654 pgs not
> deep-scrubbed in time error in ceph status . How can we disable this flag?
> . In our setup we disable deep-scrubbing for performance issues.
>
>
>
> Thanks,
>
> Muthu
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph nautilus deep-scrub health error

2019-05-14 Thread nokia ceph
Hi Team,

After upgrading from Luminous to Nautilus, we see a '654 pgs not
deep-scrubbed in time' error in ceph status. How can we disable this
warning? In our setup we disable deep-scrubbing due to performance issues.

Thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] obj_size_info_mismatch error handling

2019-04-30 Thread ceph
Hello Reed,

I would give PG repair a try.
IIRC there should be no issue when you have size 3... it would be more
difficult if you had size 2, I guess...

Hth
Mehmet
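
A sketch of what that would look like for the two PGs in your status output
(the repair is queued on the primary OSD and can take a while on big PGs):

  ceph pg repair 17.72
  ceph pg repair 17.2b9
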

Am 29. April 2019 17:05:48 MESZ schrieb Reed Dier :
>Hi list,
>
>Woke up this morning to two PG's reporting scrub errors, in a way that
>I haven't seen before.
>> $ ceph versions
>> {
>> "mon": {
>> "ceph version 13.2.5
>(cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)": 3
>> },
>> "mgr": {
>> "ceph version 13.2.5
>(cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)": 3
>> },
>> "osd": {
>> "ceph version 13.2.4
>(b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 156
>> },
>>     "mds": {
>> "ceph version 13.2.5
>(cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)": 2
>> },
>> "overall": {
>> "ceph version 13.2.4
>(b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)": 156,
>> "ceph version 13.2.5
>(cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable)": 8
>> }
>> }
>
>
>> OSD_SCRUB_ERRORS 8 scrub errors
>> PG_DAMAGED Possible data damage: 2 pgs inconsistent
>> pg 17.72 is active+clean+inconsistent, acting [3,7,153]
>> pg 17.2b9 is active+clean+inconsistent, acting [19,7,16]
>
>Here is what $rados list-inconsistent-obj 17.2b9 --format=json-pretty
>yields:
>> {
>> "epoch": 134582,
>> "inconsistents": [
>> {
>> "object": {
>> "name": "10008536718.",
>> "nspace": "",
>> "locator": "",
>> "snap": "head",
>> "version": 0
>> },
>> "errors": [],
>> "union_shard_errors": [
>> "obj_size_info_mismatch"
>> ],
>> "shards": [
>> {
>> "osd": 7,
>> "primary": false,
>> "errors": [
>> "obj_size_info_mismatch"
>> ],
>> "size": 5883,
>> "object_info": {
>> "oid": {
>> "oid": "10008536718.",
>> "key": "",
>> "snapid": -2,
>> "hash": 1752643257,
>> "max": 0,
>> "pool": 17,
>> "namespace": ""
>> },
>> "version": "134599'448331",
>> "prior_version": "134599'448330",
>> "last_reqid": "client.1580931080.0:671854",
>> "user_version": 448331,
>> "size": 3505,
>> "mtime": "2019-04-28 15:32:20.003519",
>> "local_mtime": "2019-04-28 15:32:25.991015",
>> "lost": 0,
>> "flags": [
>> "dirty",
>> "data_digest",
>> "omap_digest"
>> ],
>> "truncate_seq": 899,
>> "truncate_size": 0,
>> "data_digest": "0xf99a3bd3",
>> "omap_digest": "0x",
>> "expected_object_size": 0,
>> "expected_write_size": 0,
>> "alloc_hint_flags": 0,
>> "manifest": {
>> "type": 0
>> },
>> "watchers": {}
>> }
>>

Re: [ceph-users] VM management setup

2019-04-24 Thread ceph
Hello,

I would also recommend Proxmox.
It is very easy to install and to manage your kvm/lxc guests, with support
for a huge number of possible storage backends.

Just my 2 Cents
Hth
- Mehmet 


Am 6. April 2019 17:48:32 MESZ schrieb Marc Roos :
>
>We have also hybrid ceph/libvirt-kvm setup, using some scripts to do 
>live migration, do you have auto failover in your setup?
>
>
>
>-Original Message-
>From: jes...@krogh.cc [mailto:jes...@krogh.cc] 
>Sent: 05 April 2019 21:34
>To: ceph-users
>Subject: [ceph-users] VM management setup
>
>Hi. Knowing this is a bit off-topic but seeking recommendations and 
>advise anyway.
>
>We're seeking a "management" solution for VM's - currently in the 40-50
>
>VM - but would like to have better access in managing them and 
>potintially migrate them across multiple hosts, setup block devices, 
>etc, etc.
>
>This is only to be used internally in a department where a bunch of 
>engineering people will manage it, no costumers and that kind of thing.
>
>Up until now we have been using virt-manager with kvm - and have been 
>quite satisfied when we were in the "few vms", but it seems like the 
>time to move on.
>
>Thus we're looking for something "simple" that can help manage a 
>ceph+kvm based setup -  the simpler and more to the point the better.
>
>Any recommendations?
>
>.. found a lot of names allready ..
>OpenStack
>CloudStack
>Proxmox
>..
>
>But recommendations are truely welcome.
>
>Thanks.
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Chasing slow ops in mimic

2019-04-13 Thread ceph
Hi Alex, slow ops often have bad disks as their root cause... perhaps run
htop while you stress your cluster and see which disks operate at their
limit...

Only a guess 
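
To see per-disk load directly during such a stress test, iostat from the
sysstat package can be run alongside (the %util column shows which disks are
at their limit), e.g.:

  iostat -x 1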

Am 12. März 2019 14:19:03 MEZ schrieb Alex Litvak 
:
>I looked further into historic slow ops (thanks to some other posts on
>the list) and I am confused a bit with the following event
>
>{
>"description": "osd_repop(client.85322.0:86478552 7.1b e502/466
>7:d8d149b7:::rbd_data.ff7e3d1b58ba.0316:head v
>502'10665506)",
> "initiated_at": "2019-03-08 07:53:23.673807",
> "age": 335669.547018,
> "duration": 13.328475,
> "type_data": {
> "flag_point": "commit sent; apply or cleanup",
> "events": [
> {
> "time": "2019-03-08 07:53:23.673807",
> "event": "initiated"
> },
> {
> "time": "2019-03-08 07:53:23.673807",
> "event": "header_read"
> },
> {
> "time": "2019-03-08 07:53:23.673808",
> "event": "throttled"
> },
> {
> "time": "2019-03-08 07:53:37.001601",
> "event": "all_read"
> },
> {
> "time": "2019-03-08 07:53:37.001643",
> "event": "dispatched"
> },
> {
> "time": "2019-03-08 07:53:37.001649",
> "event": "queued_for_pg"
> },
> {
> "time": "2019-03-08 07:53:37.001679",
> "event": "reached_pg"
> },
> {
> "time": "2019-03-08 07:53:37.001699",
> "event": "started"
> },
> {
> "time": "2019-03-08 07:53:37.002208",
> "event": "commit_sent"
> },
> {
> "time": "2019-03-08 07:53:37.002282",
> "event": "done"
> }
> ]
> }
> },
>
>It just tell me throttled, nothing else.  What does throttled mean in
>this case?
>I see some events where osd is waiting for response from its partners
>for a specific pg but while it can be attributed to a network issue,
>throttled ones are not a clear cut.
>
>Appreciate any clues,
>
>On 3/11/2019 4:26 PM, Alex Litvak wrote:
>> Hello Cephers,
>> 
>> I am trying to find the cause of multiple slow ops happened with my
>small cluster.  I have a 3 node  with 9 OSDs
>> 
>> Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
>> 128 GB RAM
>> Each OSD is SSD Intel DC-S3710 800GB
>> It runs mimic 13.2.2 in containers.
>> 
>> Cluster was operating normally for 4 month and then recently I had an
>outage with multiple VMs (RBD) showing
>> 
>> Mar  8 07:59:42 sbc12n2-chi.siptalk.com kernel: [140206.243812] INFO:
>task xfsaild/vda1:404 blocked for more than 120 seconds.
>> Mar  8 07:59:42 sbc12n2-chi.siptalk.com kernel: [140206.243957] Not
>tainted 4.19.5-1.el7.elrepo.x86_64 #1
>> Mar  8 07:59:42 sbc12n2-chi.siptalk.com kernel: [140206.244063] "echo
>0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Mar  8 07:59:42 sbc12n2-chi.siptalk.com kernel: [140206.244181]
>xfsaild/vda1    D    0   404  2 0x8000
>> 
>> After examining ceph logs, i found following entries in multiple OSDs
>> Mar  8 07:38:52 storage1n2-chi ceph-osd-run.sh[20939]: 2019-03-08
>07:38:52.299 7fe0bdb8f700 -1 osd.13 502 get_health_metrics reporting 1
>slow ops, oldest is osd_op(client.148553.0:5996289 7.fe 
>> 7:7f0ebfe2:::rbd_data.17bab2eb141f2.023d:head [stat,write
>2588672~16384] snapc 0=[] ondisk+write+known_if_redirected e502)
>> Mar  8 07:38:53 storage1n2-chi ceph-osd-run.sh[20939]: 2019-03-08
>07:38:53.347 7fe

[ceph-users] Kraken - Pool storage MAX AVAIL drops by 30TB after disk failure

2019-04-11 Thread nokia ceph
Hi,

We have a 5 node EC 4+1 cluster with 335 OSDs running Kraken Bluestore
11.2.0.
There was a disk failure on one of the OSDs and the disk was replaced.
After that we noticed a ~30TB drop in the MAX AVAIL value in the pool
storage details in the output of 'ceph df'.
Even though the disk was replaced and the OSD is now running properly, this
value did not recover back to the original; also, the disk is only a 4TB
disk, so a drop of ~30TB from MAX AVAIL doesn't seem right. Has anyone had a
similar issue before?

Thanks.
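
One thing that may be worth checking is whether the replacement OSD came
back with the expected crush weight and whether any single OSD is now much
fuller than the rest, e.g.:

  ceph osd df tree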
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] problems with pg down

2019-04-09 Thread ceph
Hi Fabio,
Did you resolve the issue?

A bit late, I know, but did you try to restart OSD 14? If 102 and 121 are
fine I would also try to crush reweight osd.14 to 0.

Greetings
Mehmet 
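
As a sketch, assuming systemd-managed OSDs and that osd.14 is the one
blocking peering:

  systemctl restart ceph-osd@14
  # only if that does not help and the data on 102/121 is known to be good:
  ceph osd crush reweight osd.14 0
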

Am 10. März 2019 19:26:57 MEZ schrieb Fabio Abreu :
>Hi Darius,
>
>Thanks for your reply !
>
>This happening after a disaster with an sata storage node, the osds 102
>and
>121 is up  .
>
>The information belllow is osd 14 log , do you recommend mark out of
>this
>cluster ?
>
>2019-03-10 17:36:17.654134 7f1991163700  0 -- 172.16.184.90:6800/589935 >> :/0 pipe(0x555be7808800 sd=516 :6800 s=0 pgs=0 cs=0 l=0 c=0x555be6720400).accept failed to getpeername (107) Transport endpoint is not connected
>2019-03-10 17:36:17.654660 7f1992d7f700  0 -- 172.16.184.90:6800/589935 >> :/0 pipe(0x555be773f400 sd=536 :6800 s=0 pgs=0 cs=0 l=0 c=0x555be6720700).accept failed to getpeername (107) Transport endpoint is not connected
>2019-03-10 17:36:17.654720 7f1993a8c700  0 -- 172.16.184.90:6800/589935 >> 172.16.184.92:6801/102 pipe(0x555be7807400 sd=542 :6800 s=0 pgs=0 cs=0 l=0 c=0x555be6720280).accept connect_seq 0 vs existing 0 state wait
>2019-03-10 17:36:17.654813 7f199095b700  0 -- 172.16.184.90:6800/589935 >> :/0 pipe(0x555be6d8e000 sd=537 :6800 s=0 pgs=0 cs=0 l=0 c=0x555be671ff80).accept failed to getpeername (107) Transport endpoint is not connected
>2019-03-10 17:36:17.654847 7f1992476700  0 -- 172.16.184.90:6800/589935 >> 172.16.184.95:6840/1537112 pipe(0x555be773e000 sd=533 :6800 s=0 pgs=0 cs=0 l=0 c=0x555be671fc80).accept connect_seq 0 vs existing 0 state wait
>2019-03-10 17:36:17.655252 7f1993486700  0 -- 172.16.184.90:6800/589935 >> 172.16.184.92:6832/1098862 pipe(0x555be779f400 sd=521 :6800 s=0 pgs=0 cs=0 l=0 c=0x555be6242d00).accept connect_seq 0 vs existing 0 state wait
>2019-03-10 17:36:17.655315 7f1993284700  0 -- 172.16.184.90:6800/589935 >> :/0 pipe(0x555be6d90800 sd=523 :6800 s=0 pgs=0 cs=0 l=0 c=0x555be6720880).accept failed to getpeername (107) Transport endpoint is not connected
>2019-03-10 17:36:17.655814 7f1992173700  0 -- 172.16.184.90:6800/589935 >> 172.16.184.91:6833/316673 pipe(0x555be7740800 sd=527 :6800 s=0 pgs=0 cs=0 l=0 c=0x555be6720580).accept connect_seq 0 vs existing 0 state wait
>
>Regards,
>Fabio Abreu
>
>On Sun, Mar 10, 2019 at 3:20 PM Darius Kasparavičius 
>wrote:
>
>> Hi,
>>
>> Check your osd.14 logs for information its currently stuck and not
>> providing io for replication. And what happened to OSD's 102 121?
>>
>> On Sun, Mar 10, 2019 at 7:44 PM Fabio Abreu
>
>> wrote:
>> >
>> > Hi Everybody .
>> >
>> > I have an pg with down+peering  state and that have requests
>blocked
>> impacting my pg query, I can't find the osd to apply the lost
>paremeter.
>> >
>> >
>>
>http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-pg/#placement-group-down-peering-failure
>> >
>> > Did someone  have  same  scenario with  state down?
>> >
>> > Storage :
>> >
>> > 100 ops are blocked > 262.144 sec on osd.14
>> >
>> > root@monitor:~# ceph pg dump_stuck inactive
>> > ok
>> > pg_stat state   up  up_primary  acting  acting_primary
>> > 5.6e0   down+remapped+peering   [102,121,14]102 [14]14
>> >
>> >
>> > root@monitor:~# ceph -s
>> > cluster xxx
>> >  health HEALTH_ERR
>> > 1 pgs are stuck inactive for more than 300 seconds
>> > 223 pgs backfill_wait
>> > 14 pgs backfilling
>> > 215 pgs degraded
>> > 1 pgs down
>> > 1 pgs peering
>> > 1 pgs recovering
>> > 53 pgs recovery_wait
>> > 199 pgs stuck degraded
>> >     1 pgs stuck inactive
>> > 278 pgs stuck unclean
>> >     162 pgs stuck undersized
>> > 162 pgs undersized
>> > 100 requests are blocked > 32 sec
>> > recovery 2767660/317878237 objects degraded (0.871%)
>> > recovery 7484106/317878237 objects misplaced (2.354%)
>> > recovery 29/105009626 unfoun
>> >
>> >
>> >
>> >
>> > --
>> > Regards,
>> > Fabio Abreu Reis
>> > http://fajlinux.com.br
>> > Tel : +55 21 98244-0161
>> > Skype : fabioabreureis
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>-- 
>Atenciosamente,
>Fabio Abreu Reis
>http://fajlinux.com.br
>*Tel : *+55 21 98244-0161
>*Skype : *fabioabreureis


Re: [ceph-users] PGs stuck in created state

2019-04-08 Thread ceph
Hello Simon,

Another idea is to increase choose_total_tries.
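Roughly like this (just a sketch; 100 is only an example value, review the decompiled map before injecting it back):

ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# edit crush.txt: raise "tunable choose_total_tries 50" to e.g. 100
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new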

Hth
Mehmet

Am 7. März 2019 09:56:17 MEZ schrieb Martin Verges :
>Hello,
>
>try restarting every osd if possible.
>Upgrade to a recent ceph version.
>
>--
>Martin Verges
>Managing director
>
>Mobile: +49 174 9335695
>E-Mail: martin.ver...@croit.io
>Chat: https://t.me/MartinVerges
>
>croit GmbH, Freseniusstr. 31h, 81247 Munich
>CEO: Martin Verges - VAT-ID: DE310638492
>Com. register: Amtsgericht Munich HRB 231263
>
>Web: https://croit.io
>YouTube: https://goo.gl/PGE1Bx
>
>
>Am Do., 7. März 2019 um 08:39 Uhr schrieb simon falicon <
>simonfali...@gmail.com>:
>
>> Hello Ceph Users,
>>
>> I have an issue with my ceph cluster, after one serious fail in four
>SSD
>> (electricaly dead) I have lost PGs (and replicats) and I have 14 Pgs
>stuck.
>>
>> So for correct it I have try to force create this PGs (with same IDs)
>but
>> now the Pgs stuck in creating state -_-" :
>>
>> ~# ceph -s
>>  health HEALTH_ERR
>> 14 pgs are stuck inactive for more than 300 seconds
>> 
>>
>> ceph pg dump | grep creating
>>
>> dumped all in format plain
>> 9.300000000creating2019-02-25
>09:32:12.3339790'00:0[20,26]20[20,11]200'0   
>2019-02-25 09:32:12.3339790'02019-02-25 09:32:12.333979
>> 3.900000000creating2019-02-25
>09:32:11.2954510'00:0[16,39]16[17,6]170'0   
>2019-02-25 09:32:11.2954510'02019-02-25 09:32:11.295451
>> ...
>>
>> I have try to create new PG dosent existe before and it work, but for
>this
>> PG stuck in creating state.
>>
>> In my monitor logs I have this message:
>>
>> 2019-02-25 11:02:46.904897 7f5a371ed700  0 mon.controller1@1(peon) e7
>handle_command mon_command({"prefix": "pg force_create_pg", "pgid":
>"4.20e"} v 0) v1
>> 2019-02-25 11:02:46.904938 7f5a371ed700  0 log_channel(audit) log
>[INF] : from='client.? 172.31.101.107:0/3101034432'
>entity='client.admin' cmd=[{"prefix": "pg force_create_pg", "pgid":
>"4.20e"}]: dispatch
>>
>> When I check map I have:
>>
>> ~# ceph pg map 4.20e
>> osdmap e428069 pg 4.20e (4.20e) -> up [27,37,36] acting [13,17]
>>
>> I have restart OSD 27,37,36,13 and 17 but no effect. (one by one)
>>
>> I have see this issue http://tracker.ceph.com/issues/18298 but I run
>on
>> ceph 10.2.11.
>>
>> So could you help me please ?
>>
>> Many thanks by advance,
>> Sfalicon.
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>


Re: [ceph-users] Blocked ops after change from filestore on HDD to bluestore on SDD

2019-03-28 Thread ceph
Hi Uwe,

Am 28. Februar 2019 11:02:09 MEZ schrieb Uwe Sauter :
>Am 28.02.19 um 10:42 schrieb Matthew H:
>> Have you made any changes to your ceph.conf? If so, would you mind
>copying them into this thread?
>
>No, I just deleted an OSD, replaced HDD with SDD and created a new OSD
>(with bluestore). Once the cluster was healty again, I
>repeated with the next OSD.
>
>
>[global]
>  auth client required = cephx
>  auth cluster required = cephx
>  auth service required = cephx
>  cluster network = 169.254.42.0/24
>  fsid = 753c9bbd-74bd-4fea-8c1e-88da775c5ad4
>  keyring = /etc/pve/priv/$cluster.$name.keyring
>  public network = 169.254.42.0/24
>
>[mon]
>  mon allow pool delete = true
>  mon data avail crit = 5
>  mon data avail warn = 15
>
>[osd]
>  keyring = /var/lib/ceph/osd/ceph-$id/keyring
>  osd journal size = 5120
>  osd pool default min size = 2
>  osd pool default size = 3
>  osd max backfills = 6
>  osd recovery max active = 12

I guess you should decrease these last two parameters to 1. This should help to 
avoid too much pressure on your drives...
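For example (untested on your cluster), inject them at runtime and also make them persistent in ceph.conf:

ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

[osd]
  osd max backfills = 1
  osd recovery max active = 1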

Hth
- Mehmet 

>
>[mon.px-golf-cluster]
>  host = px-golf-cluster
>  mon addr = 169.254.42.54:6789
>
>[mon.px-hotel-cluster]
>  host = px-hotel-cluster
>  mon addr = 169.254.42.55:6789
>
>[mon.px-india-cluster]
>  host = px-india-cluster
>  mon addr = 169.254.42.56:6789
>
>
>
>
>> 
>>
>------
>> *From:* ceph-users  on behalf of
>Vitaliy Filippov 
>> *Sent:* Wednesday, February 27, 2019 4:21 PM
>> *To:* Ceph Users
>> *Subject:* Re: [ceph-users] Blocked ops after change from filestore
>on HDD to bluestore on SDD
>>  
>> I think this should not lead to blocked ops in any case, even if the 
>> performance is low...
>> 
>> -- 
>> With best regards,
>>    Vitaliy Filippov
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems with osd creation in Ubuntu 18.04, ceph 13.2.4-1bionic

2019-03-15 Thread ceph
Hi Rainer,

Try something like

dd if=/dev/zero of=/dev/sdX bs=4096

to wipe/zap any information on the disk.
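If ceph-volume is already on the node, its zap subcommand should do the same job (a sketch, not verified on your setup):

ceph-volume lvm zap --destroy /dev/sdX
# or just clear partition tables and signatures:
sgdisk --zap-all /dev/sdX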

HTH
Mehmet

Am 14. Februar 2019 13:57:51 MEZ schrieb Rainer Krienke 
:
>Hi,
>
>I am quite new to ceph and just try to set up a ceph cluster. Initially
>I used ceph-deploy for this but when I tried to create a BlueStore osd
>ceph-deploy fails. Next I tried the direct way on one of the OSD-nodes
>using ceph-volume to create the osd, but this also fails. Below you can
>see what  ceph-volume says.
>
>I ensured that there was no left over lvm VG and LV on the disk sdg
>before I started the osd creation for this disk. The very same error
>happens also on other disks not just for /dev/sdg. All the disk have
>4TB
>in size and the linux system is Ubuntu 18.04 and finally ceph is
>installed in version 13.2.4-1bionic from this repo:
>https://download.ceph.com/debian-mimic.
>
>There is a VG and two LV's  on the system for the ubuntu system itself
>that is installed on two separate disks configured as software raid1
>and
>lvm on top of the raid. But I cannot imagine that this might do any
>harm
>to cephs osd creation.
>
>Does anyone have an idea what might be wrong?
>
>Thanks for hints
>Rainer
>
>root@ceph1:~# wipefs -fa /dev/sdg
>root@ceph1:~# ceph-volume lvm prepare --bluestore --data /dev/sdg
>Running command: /usr/bin/ceph-authtool --gen-print-key
>Running command: /usr/bin/ceph --cluster ceph --name
>client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
>-i - osd new 14d041d6-0beb-4056-8df2-3920e2febce0
>Running command: /sbin/vgcreate --force --yes
>ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b /dev/sdg
> stdout: Physical volume "/dev/sdg" successfully created.
> stdout: Volume group "ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b"
>successfully created
>Running command: /sbin/lvcreate --yes -l 100%FREE -n
>osd-block-14d041d6-0beb-4056-8df2-3920e2febce0
>ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b
>stdout: Logical volume "osd-block-14d041d6-0beb-4056-8df2-3920e2febce0"
>created.
>Running command: /usr/bin/ceph-authtool --gen-print-key
>Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
>--> Absolute path not found for executable: restorecon
>--> Ensure $PATH environment variable contains common executable
>locations
>Running command: /bin/chown -h ceph:ceph
>/dev/ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b/osd-block-14d041d6-0beb-4056-8df2-3920e2febce0
>Running command: /bin/chown -R ceph:ceph /dev/dm-8
>Running command: /bin/ln -s
>/dev/ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b/osd-block-14d041d6-0beb-4056-8df2-3920e2febce0
>/var/lib/ceph/osd/ceph-0/block
>Running command: /usr/bin/ceph --cluster ceph --name
>client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
>mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
> stderr: got monmap epoch 1
>Running command: /usr/bin/ceph-authtool
>/var/lib/ceph/osd/ceph-0/keyring
>--create-keyring --name osd.0 --add-key
>AQAAY2VcU968HxAAvYWMaJZmriUc4H9bCCp8XQ==
> stdout: creating /var/lib/ceph/osd/ceph-0/keyring
>added entity osd.0 auth auth(auid = 18446744073709551615
>key=AQAAY2VcU968HxAAvYWMaJZmriUc4H9bCCp8XQ== with 0 caps)
>Running command: /bin/chown -R ceph:ceph
>/var/lib/ceph/osd/ceph-0/keyring
>Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
>Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore
>bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap
>--keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid
>14d041d6-0beb-4056-8df2-3920e2febce0 --setuser ceph --setgroup ceph
> stderr: 2019-02-14 13:45:54.788 7f3fcecb3240 -1
>bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
> stderr: /build/ceph-13.2.4/src/os/bluestore/KernelDevice.cc: In
>function 'virtual int KernelDevice::read(uint64_t, uint64_t,
>ceph::bufferlist*, IOContext*, bool)' thread 7f3fcecb3240 time
>2019-02-14 13:45:54.841130
> stderr: /build/ceph-13.2.4/src/os/bluestore/KernelDevice.cc: 821:
>FAILED assert((uint64_t)r == len)
> stderr: ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e)
>mimic (stable)
> stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int,
>char const*)+0x102) [0x7f3fc60d33e2]
> stderr: 2: (()+0x26d5a7) [0x7f3fc60d35a7]
> stderr: 3: (KernelDevice::read(unsigned long, unsigned long,
>ceph::buffer::list*, IOContext*, bool)+0x4a7) [0x561371346817]
> stderr: 4: (BlueFS::_read(BlueFS::FileReader*,
>BlueFS::FileReaderBuffer*, unsigned long, unsigned long,
>ceph::buffer::list*, char*)+0x435) [0x5613713065c5]
> stderr: 5: (BlueFS::_replay(bool, bool)+0x214) [0x56137130c434]
> stderr: 6: (BlueFS::mount()+0x1f1) [0x561371310c81]
> stderr: 7:

Re: [ceph-users] Upgrade Luminous to mimic on Ubuntu 18.04

2019-02-18 Thread ceph
Hello people,

Am 11. Februar 2019 12:47:36 MEZ schrieb c...@elchaka.de:
>Hello Ashley,
>
>Am 9. Februar 2019 17:30:31 MEZ schrieb Ashley Merrick
>:
>>What does the output of apt-get update look like on one of the nodes?
>>
>>You can just list the lines that mention CEPH
>>
>
>... .. .
>Get:6 Https://Download.ceph.com/debian-luminous bionic InRelease [8393
>B]
>... .. .
>
>The Last available is 12.2.8.

Any advice or recommendation on how to proceed to be able to update to 
Mimic (or Nautilus)?

- Mehmet
>
>- Mehmet
>
>>Thanks
>>
>>On Sun, 10 Feb 2019 at 12:28 AM,  wrote:
>>
>>> Hello Ashley,
>>>
>>> Thank you for this fast response.
>>>
>>> I cannt prove this jet but i am using already cephs own repo for
>>Ubuntu
>>> 18.04 and this 12.2.7/8 is the latest available there...
>>>
>>> - Mehmet
>>>
>>> Am 9. Februar 2019 17:21:32 MEZ schrieb Ashley Merrick <
>>> singap...@amerrick.co.uk>:
>>> >Around available versions, are you using the Ubuntu repo’s or the
>>CEPH
>>> >18.04 repo.
>>> >
>>> >The updates will always be slower to reach you if your waiting for
>>it
>>> >to
>>> >hit the Ubuntu repo vs adding CEPH’s own.
>>> >
>>> >
>>> >On Sun, 10 Feb 2019 at 12:19 AM,  wrote:
>>> >
>>> >> Hello m8s,
>>> >>
>>> >> Im curious how we should do an Upgrade of our ceph Cluster on
>>Ubuntu
>>> >> 16/18.04. As (At least on our 18.04 nodes) we only have 12.2.7
>(or
>>> >.8?)
>>> >>
>>> >> For an Upgrade to mimic we should First Update to Last version,
>>> >actualy
>>> >> 12.2.11 (iirc).
>>> >> Which is not possible on 18.04.
>>> >>
>>> >> Is there a Update path from 12.2.7/8 to actual mimic release or
>>> >better the
>>> >> upcoming nautilus?
>>> >>
>>> >> Any advice?
>>> >>
>>> >> - Mehmet___
>>> >> ceph-users mailing list
>>> >> ceph-users@lists.ceph.com
>>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> >>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade Luminous to mimic on Ubuntu 18.04

2019-02-11 Thread ceph
Hello Ashley,

Am 9. Februar 2019 17:30:31 MEZ schrieb Ashley Merrick 
:
>What does the output of apt-get update look like on one of the nodes?
>
>You can just list the lines that mention CEPH
>

... .. .
Get:6 Https://Download.ceph.com/debian-luminous bionic InRelease [8393 B]
... .. .

The Last available is 12.2.8.

- Mehmet

>Thanks
>
>On Sun, 10 Feb 2019 at 12:28 AM,  wrote:
>
>> Hello Ashley,
>>
>> Thank you for this fast response.
>>
>> I cannt prove this jet but i am using already cephs own repo for
>Ubuntu
>> 18.04 and this 12.2.7/8 is the latest available there...
>>
>> - Mehmet
>>
>> Am 9. Februar 2019 17:21:32 MEZ schrieb Ashley Merrick <
>> singap...@amerrick.co.uk>:
>> >Around available versions, are you using the Ubuntu repo’s or the
>CEPH
>> >18.04 repo.
>> >
>> >The updates will always be slower to reach you if your waiting for
>it
>> >to
>> >hit the Ubuntu repo vs adding CEPH’s own.
>> >
>> >
>> >On Sun, 10 Feb 2019 at 12:19 AM,  wrote:
>> >
>> >> Hello m8s,
>> >>
>> >> Im curious how we should do an Upgrade of our ceph Cluster on
>Ubuntu
>> >> 16/18.04. As (At least on our 18.04 nodes) we only have 12.2.7 (or
>> >.8?)
>> >>
>> >> For an Upgrade to mimic we should First Update to Last version,
>> >actualy
>> >> 12.2.11 (iirc).
>> >> Which is not possible on 18.04.
>> >>
>> >> Is there a Update path from 12.2.7/8 to actual mimic release or
>> >better the
>> >> upcoming nautilus?
>> >>
>> >> Any advice?
>> >>
>> >> - Mehmet___
>> >> ceph-users mailing list
>> >> ceph-users@lists.ceph.com
>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>


Re: [ceph-users] Upgrade Luminous to mimic on Ubuntu 18.04

2019-02-09 Thread ceph
Hello Ashley,

Thank you for this fast response.

I can't prove this yet, but I am already using Ceph's own repo for Ubuntu 18.04 
and 12.2.7/8 is the latest available there...

- Mehmet

Am 9. Februar 2019 17:21:32 MEZ schrieb Ashley Merrick 
:
>Around available versions, are you using the Ubuntu repo’s or the CEPH
>18.04 repo.
>
>The updates will always be slower to reach you if your waiting for it
>to
>hit the Ubuntu repo vs adding CEPH’s own.
>
>
>On Sun, 10 Feb 2019 at 12:19 AM,  wrote:
>
>> Hello m8s,
>>
>> Im curious how we should do an Upgrade of our ceph Cluster on Ubuntu
>> 16/18.04. As (At least on our 18.04 nodes) we only have 12.2.7 (or
>.8?)
>>
>> For an Upgrade to mimic we should First Update to Last version,
>actualy
>> 12.2.11 (iirc).
>> Which is not possible on 18.04.
>>
>> Is there a Update path from 12.2.7/8 to actual mimic release or
>better the
>> upcoming nautilus?
>>
>> Any advice?
>>
>> - Mehmet___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>


[ceph-users] Upgrade Luminous to mimic on Ubuntu 18.04

2019-02-09 Thread ceph
Hello m8s,

I'm curious how we should do an upgrade of our Ceph cluster on Ubuntu 16.04/18.04,
as (at least on our 18.04 nodes) we only have 12.2.7 (or .8?).

For an upgrade to Mimic we should first update to the latest Luminous version,
currently 12.2.11 (IIRC), which is not possible on 18.04.

Is there an update path from 12.2.7/8 to the current Mimic release or, better, the
upcoming Nautilus?

Any advice?

- Mehmet


Re: [ceph-users] Questions about using existing HW for PoC cluster

2019-02-09 Thread ceph
Hi 

Am 27. Januar 2019 18:20:24 MEZ schrieb Will Dennis :
>Been reading "Learning Ceph - Second Edition"
>(https://learning.oreilly.com/library/view/learning-ceph-/9781787127913/8f98bac7-44d4-45dc-b672-447d162ea604.xhtml)
>and in Ch. 4 I read this:
>
>"We've noted that Ceph OSDs built with the new BlueStore back end do
>not require journals. One might reason that additional cost savings can
>be had by not having to deploy journal devices, and this can be quite
>true. However, BlueStore does still benefit from provisioning certain
>data components on faster storage, especially when OSDs are deployed on
>relatively slow HDDs. Today's investment in fast FileStore journal
>devices for HDD OSDs is not wasted when migrating to BlueStore. When
>repaving OSDs as BlueStore devices the former journal devices can be
>readily re purposed for BlueStore's RocksDB and WAL data. When using
>SSD-based OSDs, this BlueStore accessory data can reasonably be
>colocated with the OSD data store. For even better performance they can
>employ faster yet NVMe or other technloogies for WAL and RocksDB. This
>approach is not unknown for traditional FileStore journals as well,
>though it is not inexpensive.Ceph clusters that are fortunate to
>exploit SSDs as primary OSD dri
>ves usually do not require discrete journal devices, though use cases
>that require every last bit of performance may justify NVMe journals.
>SSD clusters with NVMe journals are as uncommon as they are expensive,
>but they are not unknown."
>
>So can I get by with using a single SATA SSD (size?) per server for
>RocksDB / WAL if I'm using Bluestore?

IIRC there is a rule of thumb where the size of the DB partition should be 4% of 
the OSD size.

I.e. a 4TB OSD should have a DB partition of at least 160GB.
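For example (the 4% figure is only the rule of thumb above; the volume group and LV names are made up):

# 4TB OSD: 0.04 * 4000 GB = 160 GB block.db
# 8TB OSD: 0.04 * 8000 GB = 320 GB block.db
lvcreate -L 160G -n db-osd0 ssd_vg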
 
Hth
- Mehmet

>
>
>> - Is putting the journal on a partition of the SATA drives a real I/O
>killer? (this is how my Proxmox boxes are set up)
>> - If YES to the above, then is a SATA SSD acceptable for journal
>device, or should I definitely consider PCIe SSD? (I'd have to limit to
>one per server, which I know isn't optimal, but price prevents
>otherwise...)
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] block.db on a LV? (Re: Mixed SSD+HDD OSD setup recommendation)

2019-02-01 Thread ceph
Hello @all,

Am 18. Januar 2019 14:29:42 MEZ schrieb Alfredo Deza :
>On Fri, Jan 18, 2019 at 7:21 AM Jan Kasprzak  wrote:
>>
>> Eugen Block wrote:
>> : Hi Jan,
>> :
>> : I think you're running into an issue reported a couple of times.
>> : For the use of LVM you have to specify the name of the Volume Group
>> : and the respective Logical Volume instead of the path, e.g.
>> :
>> : ceph-volume lvm prepare --bluestore --block.db ssd_vg/ssd00 --data
>/dev/sda
>>
>> Eugen,
>>
>> thanks, I will try it. In the meantime, I have discovered another way
>> how to get around it: convert my SSDs from MBR to GPT partition
>table,
>> and then create 15 additional GPT partitions for the respective
>block.dbs
>> instead of 2x15 LVs.
>
>This is because ceph-volume can accept both LVs or GPT partitions for
>block.db
>
>Another way around this, that doesn't require you to create the LVs is
>to use the `batch` sub-command, that will automatically
>detect your HDD and put data on it, and detect the SSD and create the
>block.db LVs. The command could look something like:
>
>
>ceph-volume lvm batch --bluestore /dev/sda /dev/sdb /dev/sdc /dev/sdd
>/dev/nvme0n1
>
>Would create 4 OSDs, place data on: sda, sdb, sdc, and sdd. And create
>4 block.db LVs on nvme0n1
>

How would you replace, let's say, sdc (osd.2) in this case?

Could you please give a short step-by-step howto?

Thanks in advance for your great job on ceph-volume, @alfredo

- Mehmet 

>
>
>>
>> -Yenya
>>
>> --
>> | Jan "Yenya" Kasprzak private}> |
>> | http://www.fi.muni.cz/~kas/ GPG:
>4096R/A45477D5 |
>>  This is the world we live in: the way to deal with computers is to
>google
>>  the symptoms, and hope that you don't have to watch a video. --P.
>Zaitcev
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bionic Upgrade 12.2.10

2019-01-30 Thread ceph
Hello Scott,

I've seen a solution from the croit guys. Perhaps this is related?
https://croit.io/2018/09/23/2018-09-23-debian-mirror

Greetz
Mehmet

Am 14. Januar 2019 20:33:59 MEZ schrieb Scottix :
>Wow OK.
>I wish there was some official stance on this.
>
>Now I got to remove those OSDs, downgrade to 16.04 and re-add them,
>this is going to take a while.
>
>--Scott
>
>On Mon, Jan 14, 2019 at 10:53 AM Reed Dier 
>wrote:
>>
>> This is because Luminous is not being built for Bionic for whatever
>reason.
>> There are some other mailing list entries detailing this.
>>
>> Right now you have ceph installed from the Ubuntu bionic-updates
>repo, which has 12.2.8, but does not get regular release updates.
>>
>> This is what I ended up having to do for my ceph nodes that were
>upgraded from Xenial to Bionic, as well as new ceph nodes that
>installed straight to Bionic, due to the repo issues. Even if you try
>to use the xenial packages, you will run into issues with libcurl4 and
>libcurl3 I imagine.
>>
>> Reed
>>
>> On Jan 14, 2019, at 12:21 PM, Scottix  wrote:
>>
>> https://download.ceph.com/debian-luminous/
>>
>>
>
>
>-- 
>T: @Thaumion
>IG: Thaumion
>scot...@gmail.com
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] repair do not work for inconsistent pg which three replica are the same

2019-01-26 Thread ceph


Am 10. Januar 2019 08:43:30 MEZ schrieb Wido den Hollander :
>
>
>On 1/10/19 8:36 AM, hnuzhoulin2 wrote:
>> 
>> Hi,cephers
>> 
>> I have two inconsistent pg.I try list inconsistent obj,got nothing.
>> 
>> rados list-inconsistent-obj 388.c29
>> No scrub information available for pg 388.c29
>> error 2: (2) No such file or directory
>> 
>
>
>Have you tried to run a deep-scrub on this PG and see what that does?
>
>Wido
>
>> so I search the log to find the obj name, and I search this name in
>> three replica. Yes, three replica all the same(md5 is the same).
>> error log is: 388.c29 shard 295: soid
>>
>388:9430fef2:::c2e226a9-b855-45c5-a17f-b1c697755072.1813469.4__multipart_dumbo%2f180888654%2f20181221%2fxtrabackup_full_x19_30044_20181221025000%2fx19.xbstream.2~ntwW9vwutbmOJ4bDZYehERT2AokbtAi.3595:head
>> candidate had a readerror

In addition, I would check the underlying disk... perhaps there is something in dmesg?
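For example (replace /dev/sdX with the device behind that OSD):

dmesg -T | grep -iE 'i/o error|sector|ata'
smartctl -a /dev/sdX   # look at reallocated/pending/uncorrectable sector counts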

- Mehmet  
>> 
>> obj name is:
>>
>DIR_9/DIR_2/DIR_C/DIR_0/DIR_F/c2e226a9-b855-45c5-a17f-b1c697755072.1813469.4\\u\\umultipart\\udumbo\\s180888654\\s20181221\\sxtrabackup\\ufull\\ux19\\u30044\\u20181221025000\\sx19.xbstream.2~ntwW9vwutbmOJ4bDZYehERT2AokbtAi.3595__head_4F7F0C29__184
>> all md5 is : 73281ed56c92a56da078b1ae52e888e0  
>> 
>> stat info is:
>> root@cld-osd3-48:/home/ceph/var/lib/osd/ceph-33/current/388.c29_head#
>> stat
>>
>DIR_9/DIR_2/DIR_C/DIR_0/DIR_F/c2e226a9-b855-45c5-a17f-b1c697755072.1813469.4\\u\\umultipart\\udumbo\\s180888654\\s20181221\\sxtrabackup\\ufull\\ux19\\u30044\\u20181221025000\\sx19.xbstream.2~ntwW9vwutbmOJ4bDZYehERT2AokbtAi.3595__head_4F7F0C29__184
>>   Size: 4194304   Blocks: 8200       IO Block: 4096   regular file
>> Device: 891h/2193dInode: 4300403471  Links: 1
>> Access: (0644/-rw-r--r--)  Uid: (  999/    ceph)   Gid: (  999/  
> ceph)
>> Access: 2018-12-21 14:17:12.945132144 +0800
>> Modify: 2018-12-21 14:17:12.965132073 +0800
>> Change: 2018-12-21 14:17:13.761129235 +0800
>>  Birth: -
>> 
>>
>root@cld-osd24-48:/home/ceph/var/lib/osd/ceph-279/current/388.c29_head#
>> stat
>>
>DIR_9/DIR_2/DIR_C/DIR_0/DIR_F/c2e226a9-b855-45c5-a17f-b1c697755072.1813469.4\\u\\umultipart\\udumbo\\s180888654\\s20181221\\sxtrabackup\\ufull\\ux19\\u30044\\u20181221025000\\sx19.xbstream.2~ntwW9vwutbmOJ4bDZYehERT2AokbtAi.3595__head_4F7F0C29__184
>>   Size: 4194304   Blocks: 8200       IO Block: 4096   regular file
>> Device: 831h/2097dInode: 8646464869  Links: 1
>> Access: (0644/-rw-r--r--)  Uid: (  999/    ceph)   Gid: (  999/  
> ceph)
>> Access: 2019-01-07 10:54:23.010293026 +0800
>> Modify: 2019-01-07 10:54:23.010293026 +0800
>> Change: 2019-01-07 10:54:23.014293004 +0800
>>  Birth: -
>> 
>>
>root@cld-osd31-48:/home/ceph/var/lib/osd/ceph-363/current/388.c29_head#
>> stat
>>
>DIR_9/DIR_2/DIR_C/DIR_0/DIR_F/c2e226a9-b855-45c5-a17f-b1c697755072.1813469.4\\u\\umultipart\\udumbo\\s180888654\\s20181221\\sxtrabackup\\ufull\\ux19\\u30044\\u20181221025000\\sx19.xbstream.2~ntwW9vwutbmOJ4bDZYehERT2AokbtAi.3595__head_4F7F0C29__184
>>   Size: 4194304   Blocks: 8200       IO Block: 4096   regular file
>> Device: 831h/2097dInode: 13141445890  Links: 1
>> Access: (0644/-rw-r--r--)  Uid: (  999/    ceph)   Gid: (  999/  
> ceph)
>> Access: 2018-12-21 14:17:12.946862160 +0800
>> Modify: 2018-12-21 14:17:12.966862262 +0800
>> Change: 2018-12-21 14:17:13.762866312 +0800
>>  Birth: -
>> 
>> 
>> another pg os the same.I try run deep-scrub and repair. do not work.
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Configure libvirt to 'see' already created snapshots of a vm rbd image

2019-01-24 Thread ceph
Hmmm... if I am not wrong, this information has to be put into the config 
files by you... there isn't a mechanism which extracts it via 'rbd snap ls'...

Am 7. Januar 2019 13:16:36 MEZ schrieb Marc Roos :
>
>
>How do you configure libvirt so it sees the snapshots already created
>on 
>the rbd image it is using for the vm?
>
>I have already a vm running connected to the rbd pool via 
>protocol='rbd', and rbd snap ls is showing for snapshots.
>
>
>
>
>
>_______
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Using Ceph central backup storage - Best practice creating pools

2019-01-22 Thread ceph
Hi,

Ceph's pools are meant to let you define specific engineering rules
and/or applications (rbd, cephfs, rgw).
They are not designed to be created in a massive fashion (see PGs etc.).
So, create a pool for each engineering ruleset and store your data in them.
For what is left of your project, I believe you have to implement that
on top of Ceph.

For instance, let's say you simply create a pool with an RBD volume in it.
You then create a filesystem on that volume and map it on some server.
Finally, you can push your files onto that mountpoint, using various
Linux users, ACLs or whatever: beyond that point, there is nothing more
specific to Ceph, it is "just" a mounted filesystem.
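A minimal sketch of that flow (pool name, image name, sizes and PG count are made up and need to fit your cluster):

ceph osd pool create backup 128 128 replicated
ceph osd pool application enable backup rbd   # Luminous or newer
rbd create backup/db-backups --size 10T
rbd map backup/db-backups                     # gives e.g. /dev/rbd0; krbd may need some image features disabled
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /mnt/backups
# from here on it is plain Linux: one directory per database owner, users, ACLs, quotas, ...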

Regards,

On 01/22/2019 02:16 PM, cmonty14 wrote:
> Hi,
> 
> my use case for Ceph is providing a central backup storage.
> This means I will backup multiple databases in Ceph storage cluster.
> 
> This is my question:
> What is the best practice for creating pools & images?
> Should I create multiple pools, means one pool per database?
> Or should I create a single pool "backup" and use namespace when writing
> data in the pool?
> 
> This is the security demand that should be considered:
> DB-owner A can only modify the files that belong to A; other files
> (owned by B, C or D) are accessible for A.
> 
> And there's another issue:
> How can I identify a backup created by client A that I want to restore
> on another client Z?
> I mean typically client A would write a backup file identified by the
> filename.
> Would it be possible on client Z to identify this backup file by
> filename? If yes, how?
> 
> 
> THX
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


Re: [ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-16 Thread ceph
Hi,

My 2 cents:
- do drop python2 support
- do not drop python2 support unexpectedly, aka do a deprecation phase

People should already know that Python 2 is dead.
That is not enough, though, to remove it "by surprise".

Regards,

On 01/16/2019 04:45 PM, Sage Weil wrote:
> Hi everyone,
> 
> This has come up several times before, but we need to make a final 
> decision.  Alfredo has a PR prepared that drops Python 2 support entirely 
> in master, which will mean nautilus is Python 3 only.
> 
> All of our distro targets (el7, bionic, xenial) include python 3, so that 
> isn't an issue.  However, it also means that users of python-rados, 
> python-rbd, and python-cephfs will need to be using python 3.
> 
> Python 2 is on its way out, and has been for years.  See
> 
>   https://pythonclock.org/
> 
> If it don't kill it in Nautilus, we'll be doing it for Octopus.
> 
> Are there major python-{rbd,cephfs,rgw,rados} users that are still Python 
> 2 that we need to be worried about?  (OpenStack?)
> 
> sage
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


Re: [ceph-users] Ceph community - how to make it even stronger

2019-01-05 Thread ceph . novice
Hi.

What makes us struggle / wonder again and again is the absence of CEPH __man 
pages__. On *NIX systems man pages are always the first way to go for help, 
right? Or is this considered "old school" from the CEPH makers / community? :O

And as many people complain again and again, the same here as well... the 
CEPH documentation on docs.ceph.com lacks a lot of useful / needed things. If you 
really want to work with CEPH, you need to read and track many different 
sources all the time, like the community news posts, docs.ceph.com, the RedHat 
Storage stuff and sometimes even the GitHub source code... all together very 
time consuming and error prone... from my point of view this is the biggest 
drawback of the whole (and overall GREAT!) "storage solution"! 

As we are on that topic... THANKS for all the great help and posts to YOU / the 
CEPH community! You guys are great and really "make the difference"!

 
-

Hi All.

I was reading up, and especially the thread on upgrading to Mimic and
stable releases caused me to reflect a bit on our Ceph journey so far.

We started approximately 6 months ago - with CephFS as the dominant
use case in our HPC setup - starting at 400TB useable capacity and
as is matures going towards 1PB - mixed slow and SSD.

Some of the first confusions were:
bluestore vs. filestore - what was the recommendation actually?
Figuring out what kernel clients are usable with CephFS - and what
kernels to use on the other end?
Tuning of the MDS?
Imbalance of OSD nodes rendering the cluster down - how to balance?
Triggering kernel bugs in the kernel client during OSD_FULL?

This mailing list has been very responsive to the questions, thanks for
that.

But - compared to other open source projects we're lacking a bit of
infrastructure and guidance here.

I did check:
- http://tracker.ceph.com/projects/ceph/wiki/Wiki => Which does not seem
to be operational.
- 
http://docs.ceph.com/docs/mimic/start/get-involved/[http://docs.ceph.com/docs/mimic/start/get-involved/]
Gmane is probably not coming back - waiting 2 years now, can we easily get
the mailinglist archives indexed otherwise.

I feel that the wealth of knowledge being build up around operating ceph
is not really captured to make the next users journey - better and easier.

I would love to help out - hey - I end up spending the time anyway, but
some guidance on how to do it may help.

I would suggest:

1) Dump a 1-3 monthly status email on the project to the respective
mailing lists => Major releases, Conferences, etc
2) Get the wiki active - one of the main things I want to know about when
messing with the storage is - What is working for other people - just a
page where people can dump an aggregated output of their ceph cluster and
write 2-5 lines about the use-case for it.
3) Either get community more active on the documentation - advocate for it
- or start up more documentation on the wiki => A FAQ would be a nice
first place to start.

There may be an awful lot of things I've missed on the write up - but
please follow up.

If some of the core Ceph people already have thoughts / ideas / guidance,
please share so we can collaboratively make it better.

Lastly - thanks for the great support on the mailing list - so far - the
intent is only to try to make ceph even better.


Re: [ceph-users] Strange Data Issue - Unexpected client hang on OSD I/O Error

2018-12-26 Thread Dyweni - Ceph-Users

Good Morning,

I re-ran the verification and it matches exactly the original data that
was backed up (approx 300GB).  There were no further messages issued on
the client or any OSD originally involved (2,9,18).  I believe the data
to be ok.  The cluster is currently healthy (all PGs 'active+clean').

I would still like to understand the following:

1. What does the original message on the client mean?
   (libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 
131072, skipping)


2. What actions does the OSD take after reporting '4.35 missing primary 
copy'?


3. Why did the client read operation hang?  I have never had a client
   hang due to an OSD disk I/O error.

Further, I'm wondering about:

PG '4.35' was automatically deep-scrubbed with no errors on
'2018-12-24 18:45'.  The error 'missing primary copy' occured at
'2018-12-25 20:26'.  It was then manually deep-scrubbed, also with no
errors, at '2018-12-25 21:47'.

In the past, when a read error occurs, the PG goes inconsistent and the
admin has to repair it.  The client operations are unaffected, because
the data from the remaining 2 OSDs is available.

In this case, there was data missing, Ceph detected it, but the PG did
not go inconsistent.  Rather, the client operation was impacted,
forcing a hard reboot of the client to recover.

1. Why did the PG not go inconsistent?

2. Did Ceph automatically correct the data error?  I believe so, but
   there was no message in the logs that such a repair had completed.

3. I believe the client operation should have been unaffected by this,
   regardless of what kind of error it was on the OSD side, since two
   copies of this data were existing on other OSDs.  Thus, I would
   recommend that issue of the client hanging (stuck in 'D+' state) and
   requiring a hard reboot to recover should be treated as a bug and
   investigated.  I saw a similar issue (similar client kernel message)
   in the 4.9.x kernels regarding CephFS, but this is RBD.

Thank you,
Dyweni




On 2018-12-25 22:55, Dyweni - Ceph-Users wrote:

Hi again!

Prior to rebooting the client, I found this file (and it's contents):

# cat
/sys/kernel/debug/ceph/8abf116d-a710-4245-811d-c08473cb9fb4.client7412370/osdc
REQUESTS 1 homeless 0
1459933 osd24.3120c635  [2,18,9]/2  [2,18,9]/2  
rbd_data.6b60e8643c9869.157f0x4000111   0'0 
read
LINGER REQUESTS
18446462598732840963osd84.1662b47d  [8,18,3]/8  [8,18,3]/8  
rbd_header.711beb643c9869   0x240   WC/0
18446462598732840961osd10   4.2a8391bc  [10,18,4]/10[10,18,4]/10
rbd_header.357247238e1f29   0x240   WC/0
18446462598732840962osd12   4.906668e1  [12,2,18]/12[12,2,18]/12
rbd_header.6b60e8643c9869   0x240   WC/0


When I search for '6b60e8643c9869' in the Ceph logs across all OSDs, I
only find the two error lines listed previously.


Thanks,
Dyweni



On 2018-12-25 22:38, Dyweni - Ceph-Users wrote:

Hi Everyone/Devs,

Would someone please help me troubleshoot a strange data issue
(unexpected client hang on OSD I/O Error)?

On the client, I had a process reading a large amount of data from a
mapped RBD image.  I noticed tonight that it had stalled for a long
period of time (which never happens).  The process is currently in an
'uninterruptible sleep' state ('D+').  When I checked the kernel logs
(dmesg), I found this entry:

libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 
131072, skipping


I checked the kernel log on the referenced OSD (osd2) and noted the
usual kernel output regarding I/O errors on the disk.  These errors
occured 1 second prior to the message being issued on the client.  
This

OSD has a drive that is developing bad sectors. This is known and
tollerated.  The data sits in a pool with 3 replicas.

Normally, when I/O errors occur, Ceph reports the PG as
active+clean+inconsistent and 'rados list-inconsistent-obj' lists
'read error' for that OSD.  The clients proceed onward oblivious to 
the

issue, I review the 'ceph health detail' and
'rados list-inconsistent-obj' outputs, and issue the 'ceph pg repair'
commands.  Everything turns out ok.

However, tonight I received no such inconsistent messages from Ceph.
When I looked at the ceph logs on that OSD, I found these and only 
these

lines, regarding the corruption.

2018-12-25 20:26:19.665945 b0b540c0 -1 bdev(0x5db2b00
/var/lib/ceph/osd/ceph-2/block) _aio_thread got (5) Input/output error
2018-12-25 20:26:19.676377 a1b250c0 -1 log_channel(cluster) log [ERR]
: 4.35 missing primary copy of
4:ac63048c:::rbd_data.6b60e8643c9869.157f:head, will try
copies on 9,18

To be proactive, I then issued 'ceph pg deep-scrub 4.35'.  This
completed normally and all PGs still show 'active+clean' (as was prior
to issuing the deep-scrub command.

So I have my requests:

1. What does the message on the client mean?
   (libceph: get_reply osd2 tid 1459933 data 3248128 > pre

Re: [ceph-users] Strange Data Issue - Unexpected client hang on OSD I/O Error

2018-12-25 Thread Dyweni - Ceph-Users

Hi again!

Prior to rebooting the client, I found this file (and its contents):

# cat 
/sys/kernel/debug/ceph/8abf116d-a710-4245-811d-c08473cb9fb4.client7412370/osdc

REQUESTS 1 homeless 0
1459933 osd24.3120c635  [2,18,9]/2  [2,18,9]/2  
rbd_data.6b60e8643c9869.157f0x4000111   0'0 
read
LINGER REQUESTS
18446462598732840963osd84.1662b47d  [8,18,3]/8  [8,18,3]/8  
rbd_header.711beb643c9869   0x240   WC/0
18446462598732840961osd10   4.2a8391bc  [10,18,4]/10[10,18,4]/10
rbd_header.357247238e1f29   0x240   WC/0
18446462598732840962osd12   4.906668e1  [12,2,18]/12[12,2,18]/12
rbd_header.6b60e8643c9869   0x240   WC/0


When I search for '6b60e8643c9869' in the Ceph logs across all OSDs, I 
only find the two error lines listed previously.



Thanks,
Dyweni



On 2018-12-25 22:38, Dyweni - Ceph-Users wrote:

Hi Everyone/Devs,

Would someone please help me troubleshoot a strange data issue
(unexpected client hang on OSD I/O Error)?

On the client, I had a process reading a large amount of data from a
mapped RBD image.  I noticed tonight that it had stalled for a long
period of time (which never happens).  The process is currently in an
'uninterruptible sleep' state ('D+').  When I checked the kernel logs
(dmesg), I found this entry:

libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, 
skipping


I checked the kernel log on the referenced OSD (osd2) and noted the
usual kernel output regarding I/O errors on the disk.  These errors
occured 1 second prior to the message being issued on the client.  This
OSD has a drive that is developing bad sectors. This is known and
tollerated.  The data sits in a pool with 3 replicas.

Normally, when I/O errors occur, Ceph reports the PG as
active+clean+inconsistent and 'rados list-inconsistent-obj' lists
'read error' for that OSD.  The clients proceed onward oblivious to the
issue, I review the 'ceph health detail' and
'rados list-inconsistent-obj' outputs, and issue the 'ceph pg repair'
commands.  Everything turns out ok.

However, tonight I received no such inconsistent messages from Ceph.
When I looked at the ceph logs on that OSD, I found these and only 
these

lines, regarding the corruption.

2018-12-25 20:26:19.665945 b0b540c0 -1 bdev(0x5db2b00
/var/lib/ceph/osd/ceph-2/block) _aio_thread got (5) Input/output error
2018-12-25 20:26:19.676377 a1b250c0 -1 log_channel(cluster) log [ERR]
: 4.35 missing primary copy of
4:ac63048c:::rbd_data.6b60e8643c9869.157f:head, will try
copies on 9,18

To be proactive, I then issued 'ceph pg deep-scrub 4.35'.  This
completed normally and all PGs still show 'active+clean' (as was prior
to issuing the deep-scrub command.

So I have my requests:

1. What does the message on the client mean?
   (libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated
131072, skipping)

2. What occured on the OSD (osd2), after reporting '4.35 missing 
primary copy'?


3. Why did the client read operation hang?  I have never had a client
   hang due to an OSD disk I/O error.

4. How do I verify that the data in the cluster is OK (i.e. beyond
   forcing deep-scrub on all PGs)?

Note about 4:
   The client operation in question was performing a verification of
   a disk image backup (i.e. make MD5sums of the original disk and the
   backup image, and verify both match).  I will restart the client
   machine and repeat the verification step of that backup.  This will
   tell me if this small section of cluster data is OK (and indirectly
   test the Ceph execution/data paths that originally failed).


For reference:

All Ceph versions are 12.2.5.
Client kernel version is 4.9.95.


Thank you,
Dyweni




[ceph-users] Strange Data Issue - Unexpected client hang on OSD I/O Error

2018-12-25 Thread Dyweni - Ceph-Users

Hi Everyone/Devs,

Would someone please help me troubleshoot a strange data issue
(unexpected client hang on OSD I/O Error)?

On the client, I had a process reading a large amount of data from a
mapped RBD image.  I noticed tonight that it had stalled for a long
period of time (which never happens).  The process is currently in an
'uninterruptible sleep' state ('D+').  When I checked the kernel logs
(dmesg), I found this entry:

libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, 
skipping


I checked the kernel log on the referenced OSD (osd2) and noted the
usual kernel output regarding I/O errors on the disk.  These errors
occurred 1 second prior to the message being issued on the client.  This
OSD has a drive that is developing bad sectors. This is known and
tolerated.  The data sits in a pool with 3 replicas.

Normally, when I/O errors occur, Ceph reports the PG as
active+clean+inconsistent and 'rados list-inconsistent-obj' lists
'read error' for that OSD.  The clients proceed onward oblivious to the
issue, I review the 'ceph health detail' and
'rados list-inconsistent-obj' outputs, and issue the 'ceph pg repair'
commands.  Everything turns out ok.

However, tonight I received no such inconsistent messages from Ceph.
When I looked at the ceph logs on that OSD, I found these and only these
lines, regarding the corruption.

2018-12-25 20:26:19.665945 b0b540c0 -1 bdev(0x5db2b00 
/var/lib/ceph/osd/ceph-2/block) _aio_thread got (5) Input/output error
2018-12-25 20:26:19.676377 a1b250c0 -1 log_channel(cluster) log [ERR] : 
4.35 missing primary copy of 
4:ac63048c:::rbd_data.6b60e8643c9869.157f:head, will try 
copies on 9,18


To be proactive, I then issued 'ceph pg deep-scrub 4.35'.  This
completed normally and all PGs still show 'active+clean' (as was prior
to issuing the deep-scrub command.

So I have my requests:

1. What does the message on the client mean?
   (libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 
131072, skipping)


2. What occurred on the OSD (osd2) after reporting '4.35 missing primary 
copy'?


3. Why did the client read operation hang?  I have never had a client
   hang due to an OSD disk I/O error.

4. How do I verify that the data in the cluster is OK (i.e. beyond
   forcing deep-scrub on all PGs)?

Note about 4:
   The client operation in question was performing a verification of
   a disk image backup (i.e. make MD5sums of the original disk and the
   backup image, and verify both match).  I will restart the client
   machine and repeat the verification step of that backup.  This will
   tell me if this small section of cluster data is OK (and indirectly
   test the Ceph execution/data paths that originally failed).


For reference:

All Ceph versions are 12.2.5.
Client kernel version is 4.9.95.


Thank you,
Dyweni




Re: [ceph-users] Ceph OOM Killer Luminous

2018-12-21 Thread Dyweni - Ceph-Users
Hi, 

You could be running out of memory due to the default Bluestore cache
sizes. 

How many disks/OSDs in the R730xd versus the R740xd?  How much memory in
each server type?  How many are HDD versus SSD?  Are you running
Bluestore? 

OSD's in Luminous, which run Bluestore, allocate memory to use as a
"cache", since the kernel-provided page-cache is not available to
Bluestore.  Bluestore, by default, will use 1GB of memory for each HDD,
and 3GB of memory for each SSD.  OSD's do not allocate all that memory
up front, but grow into it as it is used.  This cache is in addition to
any other memory the OSD uses. 

Check out the bluestore_cache_* values (these are specified in bytes) in
the manual cache sizing section of the docs
(http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/).
  Note that the automatic cache sizing feature wasn't added until
12.2.9. 

As an example, I have OSD's running on 32bit/armhf nodes.  These nodes
have 2GB of memory.  I run 1 Bluestore OSD on each node.  In my
ceph.conf file, I have 'bluestore cache size = 536870912' and 'bluestore
cache kv max = 268435456'.  I see approx 1.35-1.4 GB used by each OSD. 
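In ceph.conf form that looks like this (the values are from my small armhf nodes, not a general recommendation):

[osd]
  bluestore cache size = 536870912     # 512 MB total cache per OSD
  bluestore cache kv max = 268435456   # cap the RocksDB/KV share at 256 MB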

On 2018-12-21 15:19, Pardhiv Karri wrote:

> Hi, 
> 
> We have a luminous cluster which was upgraded from Hammer --> Jewel --> 
> Luminous 12.2.8 recently. Post upgrade we are seeing issue with a few nodes 
> where they are running out of memory and dying. In the logs we are seeing OOM 
> killer. We don't have this issue before upgrade. The only difference is the 
> nodes without any issue are R730xd and the ones with the memory leak are 
> R740xd. The hardware vendor don't see anything wrong with the hardware. From 
> Ceph end we are not seeing any issue when it comes to running the cluster, 
> only issue is with memory leak. Right now we are actively rebooting the nodes 
> in timely manner to avoid crashes. One R740xd node we set all the OSDs to 0.0 
> and there is no memory leak there. Any pointers to fix the issue would be 
> helpful. 
> 
> Thanks, 
> PARDHIV KARRI 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Cluster to OSD Utilization not in Sync

2018-12-21 Thread Dyweni - Ceph-Users
Hi, 

If you are running Ceph Luminous or later, use the Ceph Manager Daemon's
Balancer module.  (http://docs.ceph.com/docs/luminous/mgr/balancer/). 
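Enabling it looks roughly like this (a sketch; upmap mode requires all clients to be Luminous or newer, otherwise use crush-compat):

ceph osd set-require-min-compat-client luminous
ceph mgr module enable balancer
ceph balancer mode upmap
ceph balancer on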

Otherwise, tweak the OSD weights (not the OSD CRUSH weights) until you
achieve uniformity.  (You should be able to get under 1 STDDEV).  I
would adjust in small amounts to not overload your cluster. 

Example: 

ceph osd reweight osd.X  y.yyy 

On 2018-12-21 14:56, Pardhiv Karri wrote:

> Hi, 
> 
> We have Ceph clusters which are greater than 1PB. We are using tree 
> algorithm. The issue is with the data placement. If the cluster utilization 
> percentage is at 65% then some of the OSDs are already above 87%. We had to 
> change the near_full ratio to 0.90 to circumvent warnings and to get back the 
> Health to OK state. 
> 
> How can we keep the OSDs utilization to be in sync with cluster utilization 
> (both percentages to be close enough) as we want to utilize the cluster to 
> the max (above 80%) without unnecessarily adding too many nodes/osd's. Right 
> now we are losing close to 400TB of the disk space unused as some OSDs are 
> above 87% and some are below 50%. If the above 87% OSDs reach 95% then the 
> cluster will have issues. What is the best way to mitigate this issue? 
> 
> Thanks, 
> Pardhiv Karri
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Your email to ceph-uses mailing list: Signature check failures.

2018-12-21 Thread Dyweni - Ceph-Users

Hi Cary,

I ran across your email on the ceph-users mailing list 'Signature check 
failures.'.


I've just run across the same issue on my end.  Also Gentoo user here.

Running Ceph 12.2.5... 32bit/armhf and 64bit/x86_64.


Was your environment mixed or strictly just x86_64?



What is interesting, is that my 32bit/armhf (built with USE="ssl -nss") 
OSDs have no problem talking to themselves, or my 64bit/x86_64 (built 
with USE="-ssl nss") OSDs or my 64bit/x86_64 (built with USE="-ssl nss") 
clients.



Trying to build new 64bit/x86_64 (built with USE="ssl -nss") OSDs and 
getting this same error with a simple 'rbd ls -l'.



OpenSSL version is 1.0.2p.   Do you remember which version of OpenSSL 
you were building against?  'genlop -e openssl' will show you.



The locally calculated signature most times looks really short, so I'm 
wondering if we're hitting some kind of variable size issue... maybe 
overflow too?



Would appreciate any insight you could give.

Thanks!

Dyweni





Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-18 Thread ceph
For what it worth, we are using snapshots on a daily basis for a couple
of thousands rbd volume for some times

So far so good, we have not catched any issue

On 12/18/2018 10:28 AM, Oliver Freyermuth wrote:
> Dear Hector,
> 
> we are using the very same approach on CentOS 7 (freeze + thaw), but
> preceeded by an fstrim. With virtio-scsi, using fstrim propagates the
> discards from within the VM to Ceph RBD (if qemu is configured
> accordingly),
> and a lot of space is saved.
> 
> We have yet to observe these hangs, we are running this with ~5 VMs with
> ~10 disks for about half a year now with daily snapshots. But all of
> these VMs have very "low" I/O,
> since we put anything I/O intensive on bare metal (but with automated
> provisioning of course).
> 
> So I'll chime in on your question, especially since there might be VMs
> on our cluster in the future where the inner OS may not be running an
> agent.
> Since we did not observe this yet, I'll also add: What's your "scale",
> is it hundreds of VMs / disks? Hourly snapshots? I/O intensive VMs?
> 
> Cheers,
> Oliver
> 
> Am 18.12.18 um 10:10 schrieb Hector Martin:
>> Hi list,
>>
>> I'm running libvirt qemu guests on RBD, and currently taking backups
>> by issuing a domfsfreeze, taking a snapshot, and then issuing a
>> domfsthaw. This seems to be a common approach.
>>
>> This is safe, but it's impactful: the guest has frozen I/O for the
>> duration of the snapshot. This is usually only a few seconds.
>> Unfortunately, the freeze action doesn't seem to be very reliable.
>> Sometimes it times out, leaving the guest in a messy situation with
>> frozen I/O (thaw times out too when this happens, or returns success
>> but FSes end up frozen anyway). This is clearly a bug somewhere, but I
>> wonder whether the freeze is a hard requirement or not.
>>
>> Are there any atomicity guarantees for RBD snapshots taken *without*
>> freezing the filesystem? Obviously the filesystem will be dirty and
>> will require journal recovery, but that is okay; it's equivalent to a
>> hard shutdown/crash. But is there any chance of corruption related to
>> the snapshot being taken in a non-atomic fashion? Filesystems and
>> applications these days should have no trouble with hard shutdowns, as
>> long as storage writes follow ordering guarantees (no writes getting
>> reordered across a barrier and such).
>>
>> Put another way: do RBD snapshots have ~identical atomicity guarantees
>> to e.g. LVM snapshots?
>>
>> If we can get away without the freeze, honestly I'd rather go that
>> route. If I really need to pause I/O during the snapshot creation, I
>> might end up resorting to pausing the whole VM (suspend/resume), which
>> has higher impact but also probably a much lower chance of messing up
>> (or having excess latency), since it doesn't involve the guest OS or
>> the qemu agent at all...
>>
> 
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


Re: [ceph-users] Ceph 10.2.11 - Status not working

2018-12-17 Thread Dyweni - Ceph-Users



On 2018-12-17 20:16, Brad Hubbard wrote:

On Tue, Dec 18, 2018 at 10:23 AM Mike O'Connor  wrote:


Hi All

I have a ceph cluster which has been working without issues for about 2
years now; it was upgraded about 6 months ago to 10.2.11

root@blade3:/var/lib/ceph/mon# ceph status
2018-12-18 10:42:39.242217 7ff770471700  0 -- 10.1.5.203:0/1608630285 >> 10.1.5.207:6789/0 pipe(0x7ff768000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff768001f90).fault
2018-12-18 10:42:45.242745 7ff770471700  0 -- 10.1.5.203:0/1608630285 >> 10.1.5.207:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff768002410).fault
2018-12-18 10:42:51.243230 7ff770471700  0 -- 10.1.5.203:0/1608630285 >> 10.1.5.207:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff768002f40).fault
2018-12-18 10:42:54.243452 7ff770572700  0 -- 10.1.5.203:0/1608630285 >> 10.1.5.205:6789/0 pipe(0x7ff768000c80 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff768008060).fault
2018-12-18 10:42:57.243715 7ff770471700  0 -- 10.1.5.203:0/1608630285 >> 10.1.5.207:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff768003580).fault
2018-12-18 10:43:03.244280 7ff7781b9700  0 -- 10.1.5.203:0/1608630285 >> 10.1.5.205:6789/0 pipe(0x7ff7680051e0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff768003670).fault

All systems can ping each other. I simply cannot see why it's failing.


ceph.conf

[global]
 auth client required = cephx
 auth cluster required = cephx
 auth service required = cephx
 cluster network = 10.1.5.0/24
 filestore xattr use omap = true
 fsid = 42a0f015-76da-4f47-b506-da5cdacd030f
 keyring = /etc/pve/priv/$cluster.$name.keyring
 osd journal size = 5120
 osd pool default min size = 1
 public network = 10.1.5.0/24
 mon_pg_warn_max_per_osd = 0

[client]
 rbd cache = true
[osd]
 keyring = /var/lib/ceph/osd/ceph-$id/keyring
 osd max backfills = 1
 osd recovery max active = 1
 osd_disk_threads = 1
 osd_disk_thread_ioprio_class = idle
 osd_disk_thread_ioprio_priority = 7
[mon.2]
 host = blade5
 mon addr = 10.1.5.205:6789
[mon.1]
 host = blade3
 mon addr = 10.1.5.203:6789
[mon.3]
 host = blade7
 mon addr = 10.1.5.207:6789
[mon.0]
 host = blade1
 mon addr = 10.1.5.201:6789
[mds]
 mds data = /var/lib/ceph/mds/mds.$id
 keyring = /var/lib/ceph/mds/mds.$id/mds.$id.keyring
[mds.0]
 host = blade1
[mds.1]
 host = blade3
[mds.2]
 host = blade5
[mds.3]
 host = blade7


Any ideas ? more information ?


The system on which you are running the "ceph" client, blade3
(10.1.5.203) is trying to contact monitors on 10.1.5.207 (blade7) port
6789 and 10.1.5.205 (blade5) port 6789. You need to check the ceph-mon
binary is running on blade7 and blade5 and that they are listening on
port 6789 and that that port is accessible from blade3. The simplest
explanation is that the MONs are not running. The next simplest is that there is
a firewall interfering with blade3's ability to connect to port 6789
on those machines. Check the above and see what you find.



Hi,

After what Brad wrote, as for what would cause your MONs to not be 
running...


Check kernel logs / dmesg... bad blocks?  (Unlikely to knock out both 
MONs)
Check disk space on /var/lib/ceph/mon/...  Did it fill up?  (check both 
blocks and inodes)
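
(A few concrete commands for the checks above - assuming systemd units and the
default paths, so adjust names to your setup:)

systemctl status ceph-mon@blade5          # on blade5/blade7: is the mon actually running?
ss -tlnp | grep 6789                      # is it listening on the mon port?
nc -zv 10.1.5.205 6789                    # from blade3: is the port reachable?
df -h /var/lib/ceph/mon ; df -i /var/lib/ceph/mon   # free space and free inodes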


You said it was running without issues... just to double check... were 
ALL your PGs healthy?  (i.e.  active+clean)?  MONs will not trim their 
logs if any PG is not healthy.  Newer versions of Ceph do grow their 
logs as fast as the older versions.


Good luck!
Dyweni

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Decommissioning cluster - rebalance questions

2018-12-12 Thread Dyweni - Ceph-Users

Safest to just 'osd crush reweight osd.X 0' and let rebalancing finish.

Then 'osd out X' and shutdown/remove osd drive.
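
(The whole sequence for one OSD then looks roughly like this - osd.12 is just an
example id; wait for all PGs to be active+clean between steps:)

ceph osd crush reweight osd.12 0
# ... wait for rebalancing to finish ...
ceph osd out 12
systemctl stop ceph-osd@12        # on the OSD host (or whatever init your distro uses)
ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm 12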



On 2018-12-04 03:15, Jarek wrote:

On Mon, 03 Dec 2018 16:41:36 +0100
si...@turka.nl wrote:


Hi,

Currently I am decommissioning an old cluster.

For example, I want to remove OSD Server X with all its OSD's.

I am following these steps for all OSD's of Server X:
- ceph osd out 
- Wait for rebalance (active+clean)
- On OSD: service ceph stop osd.

Once the steps above are performed, the following steps should be
performed:
- ceph osd crush remove osd.
- ceph auth del osd.
- ceph osd rm 


What I don't get is, when I perform 'ceph osd out ' the cluster
is rebalancing, but when I perform 'ceph osd crush remove osd.'
it again starts to rebalance. Why does this happen? The cluster
should be already balanced after out'ed the osd. I didn't expect
another rebalance with removing the OSD from the CRUSH map.


'ceph osd out' doesn't change the host weight in crush map, 'ceph
osd crush remove' does.
Instead of 'ceph osd out' use 'ceph osd crush reweight'.

--
Pozdrawiam
Jarosław Mociak - Nettelekom GK Sp. z o.o.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] [cephfs] Kernel outage / timeout

2018-12-04 Thread ceph
Hi,

I have some wild freezes using cephfs with the kernel driver.
For instance:
[Tue Dec  4 10:57:48 2018] libceph: mon1 10.5.0.88:6789 session lost,
hunting for new mon
[Tue Dec  4 10:57:48 2018] libceph: mon2 10.5.0.89:6789 session established
[Tue Dec  4 10:58:20 2018] ceph: mds0 caps stale
[..] server is now frozen, filesystem accesses are stuck
[Tue Dec  4 11:13:02 2018] libceph: mds0 10.5.0.88:6804 socket closed
(con state OPEN)
[Tue Dec  4 11:13:03 2018] libceph: mds0 10.5.0.88:6804 connection reset
[Tue Dec  4 11:13:03 2018] libceph: reset on mds0
[Tue Dec  4 11:13:03 2018] ceph: mds0 closed our session
[Tue Dec  4 11:13:03 2018] ceph: mds0 reconnect start
[Tue Dec  4 11:13:04 2018] ceph: mds0 reconnect denied
[Tue Dec  4 11:13:04 2018] ceph:  dropping dirty+flushing Fw state for
3f1ae609 1099692263746
[Tue Dec  4 11:13:04 2018] ceph:  dropping dirty+flushing Fw state for
ccd58b71 1099692263749
[Tue Dec  4 11:13:04 2018] ceph:  dropping dirty+flushing Fw state for
da5acf8f 1099692263750
[Tue Dec  4 11:13:04 2018] ceph:  dropping dirty+flushing Fw state for
5ddc2fcf 1099692263751
[Tue Dec  4 11:13:04 2018] ceph:  dropping dirty+flushing Fw state for
469a70f4 1099692263754
[Tue Dec  4 11:13:04 2018] ceph:  dropping dirty+flushing Fw state for
5c0038f9 1099692263757
[Tue Dec  4 11:13:04 2018] ceph:  dropping dirty+flushing Fw state for
e7288aa2 1099692263758
[Tue Dec  4 11:13:04 2018] ceph:  dropping dirty+flushing Fw state for
b431209a 1099692263759
[Tue Dec  4 11:13:04 2018] libceph: mds0 10.5.0.88:6804 socket closed
(con state NEGOTIATING)
[Tue Dec  4 11:13:31 2018] libceph: osd12 10.5.0.89:6805 socket closed
(con state OPEN)
[Tue Dec  4 11:13:35 2018] libceph: osd17 10.5.0.89:6800 socket closed
(con state OPEN)
[Tue Dec  4 11:13:35 2018] libceph: osd9 10.5.0.88:6813 socket closed
(con state OPEN)
[Tue Dec  4 11:13:41 2018] libceph: osd0 10.5.0.87:6800 socket closed
(con state OPEN)

Kernel 4.17 is used, we got the same issue with 4.18
Ceph 13.2.1 is used
From what I understand, the kernel hangs itself for some reason (perhaps
it simply cannot handle some wild event)

Is there a fix for that ?

Secondly, it seems that the kernel reconnects itself after 15 minutes
every time.
Where is that tunable? Could I lower that value, so that such hangs have
less impact?


On ceph.log, I get Health check failed: 1 MDSs report slow requests
(MDS_SLOW_REQUEST), but this is probably the consequence, not the cause

Any tip ?

Best regards,
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph.conf mon_max_pg_per_osd not recognized / set

2018-10-31 Thread ceph
Isn't this a mgr variable ?
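
(If so, something like this should show where the value is actually picked up -
the daemon names are just examples, run them on the host where the mon/mgr admin
socket lives:)

ceph daemon mon.$(hostname -s) config show | grep mon_max_pg_per_osd
ceph daemon mgr.$(hostname -s) config show | grep mon_max_pg_per_osd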

On 10/31/2018 02:49 PM, Steven Vacaroaia wrote:
> Hi,
> 
> Any idea why different value for  mon_max_pg_per_osd is not "recognized" ?
> I am using mimic 13.2.2
> 
> Here is what I have in /etc/ceph/ceph.conf
> 
> 
> [mon]
> mon_allow_pool_delete = true
> mon_osd_min_down_reporters = 1
> mon_max_pg_per_osd = 400
> 
> checking the value with
> ceph daemon osd.6 config show| grep mon_max_pg_per_osd still shows the
> default ( 250)
> 
> 
> Injecting a different value appears to works
> ceph tell osd.* injectargs '--mon_max_pg_per_osd 500'
> 
> ceph daemon osd.6 config show| grep mon_max_pg_per_osd
> "mon_max_pg_per_osd": "500",
> 
> BUT
> 
> cluster is still complaining TOO_MANY_PGS too many PGs per OSD (262 >
> max 250)
> 
> I have restarted ceph.target services on monitor/manager server
> What else has to be done to have the cluster using the new value ?
> 
> Steven
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Verifying the location of the wal

2018-10-28 Thread ceph
IIRC there is a command like

ceph osd metadata

where you should be able to find information like this.
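
(For example - the grep pattern is only a guess at the relevant fields:)

ceph osd metadata 6 | grep -E 'bluefs|bdev|devices'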

Hth
- Mehmet 

Am 21. Oktober 2018 19:39:58 MESZ schrieb Robert Stanford 
:
> I did exactly this when creating my osds, and found that my total
>utilization is about the same as the sum of the utilization of the
>pools,
>plus (wal size * number osds).  So it looks like my wals are actually
>sharing OSDs.  But I'd like to be 100% sure... so I am seeking a way to
>find out
>
>On Sun, Oct 21, 2018 at 11:13 AM Serkan Çoban 
>wrote:
>
>> wal and db device will be same if you use just db path during osd
>> creation. i do not know how to verify this with ceph commands.
>> On Sun, Oct 21, 2018 at 4:17 PM Robert Stanford
>
>> wrote:
>> >
>> >
>> >  Thanks Serkan.  I am using --path instead of --dev (dev won't work
>> because I'm using VGs/LVs).  The output shows block and block.db, but
>> nothing about wal.db.  How can I learn where my wal lives?
>> >
>> >
>> >
>> >
>> > On Sun, Oct 21, 2018 at 12:43 AM Serkan Çoban
>
>> wrote:
>> >>
>> >> ceph-bluestore-tool can show you the disk labels.
>> >> ceph-bluestore-tool show-label --dev /dev/sda1
>> >> On Sun, Oct 21, 2018 at 1:29 AM Robert Stanford <
>> rstanford8...@gmail.com> wrote:
>> >> >
>> >> >
>> >> >  An email from this list stated that the wal would be created in
>the
>> same place as the db, if the db were specified when running
>ceph-volume lvm
>> create, and the db were specified on that command line.  I followed
>those
>> instructions and like the other person writing to this list today, I
>was
>> surprised to find that my cluster usage was higher than the total of
>pools
>> (higher by an amount the same as all my wal sizes on each node
>combined).
>> This leads me to think my wal actually is on the data disk and not
>the ssd
>> I specified the db should go to.
>> >> >
>> >> >  How can I verify which disk the wal is on, from the command
>line?
>> I've searched the net and not come up with anything.
>> >> >
>> >> >  Thanks and regards
>> >> >  R
>> >> >
>> >> > ___
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Error while installing ceph

2018-10-13 Thread ceph ceph
some packages could not be installed. this may mean that you have
[monitor1][debug ] requested an impossible situation or if you are using
the unstable [monitor1][debug ] distribution that some required packages
have not yet been created or been moved out of incoming.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] deep scrub error caused by missing object

2018-10-05 Thread ceph
Hello Roman,

I am not sure if I can be of help, but perhaps these commands can help to find 
the objects in question...

ceph health detail
rados list-inconsistent-pg rbd
rados list-inconsistent-obj 2.10d

I guess it is also interesting to know whether you use bluestore or filestore...

Hth
- Mehmet 

Am 4. Oktober 2018 14:06:07 MESZ schrieb Roman Steinhart :
>Hi all,
>
>since some weeks we have a small problem with one of the PG's on our
>ceph cluster.
>Every time the pg 2.10d is deep scrubbing it fails because of this:
>2018-08-06 19:36:28.080707 osd.14 osd.14 *.*.*.110:6809/3935 133 :
>cluster [ERR] 2.10d scrub stat mismatch, got 397/398 objects, 0/0
>clones, 397/398 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0
>whiteouts, 2609281919/2609293215 bytes, 0/0 hit_set_archive bytes.
>2018-08-06 19:36:28.080905 osd.14 osd.14 *.*.*.110:6809/3935 134 :
>cluster [ERR] 2.10d scrub 1 errors
>As far as I understand ceph is missing an object on that osd.14 which
>should be stored on this osd. A small ceph pg repair 2.10d fixes the
>problem but as soon as a deep scrubbing job for that pg is running
>again(manual or automatically) the problem is back again.
>I tried to find out which object is missing, but a small search leads
>me to the result that there is no real way to find out which objects
>are stored in this PG or which object exactly is missing.
>That's why I've gone for some "unconventional" methods.
>I completely removed OSD.14 from the cluster. I waited until everything
>was balanced and then added the OSD again.
>Unfortunately the problem is still there.
>
>Some weeks later we've added a huge amount of OSD's to our cluster
>which had a big impact on the crush map.
>Since then the PG 2.10d was running on two other OSD's -> [119,93] (We
>have a replica of 2)
>Still the same error message, but another OSD:
>2018-10-03 03:39:22.776521 7f12d9979700 -1 log_channel(cluster) log
>[ERR] : 2.10d scrub stat mismatch, got 728/729 objects, 0/0 clones,
>728/729 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0
>whiteouts, 7281369687/7281381269 bytes, 0/0 hit_set_archive bytes.
>
>As a first step it would be enough for me to find out which the
>problematic object is. Then I am able to check if the object is
>critical, if any recovery is required or if I am able to just drop that
>object(That would be 90% of the case)
>I hope anyone is able to help me to get rid of this.
>It's not really a problem for us. Ceph runs despite this message
>without further problems.
>It's just a bit annoying that every time the error occurs our
>monitoring triggers a big alarm because Ceph is in ERROR status. :)
>
>Thanks in advance,
>Roman
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Some questions concerning filestore --> bluestore migration

2018-10-04 Thread ceph
Hello

Am 4. Oktober 2018 02:38:35 MESZ schrieb solarflow99 :
>I use the same configuration you have, and I plan on using bluestore. 
>My
>SSDs are only 240GB and it worked with filestore all this time, I
>suspect
>bluestore should be fine too.
>
>
>On Wed, Oct 3, 2018 at 4:25 AM Massimo Sgaravatto <
>massimo.sgarava...@gmail.com> wrote:
>
>> Hi
>>
>> I have a ceph cluster, running luminous, composed of 5 OSD nodes,
>which is
>> using filestore.
>> Each OSD node has 2 E5-2620 v4 processors, 64 GB of RAM, 10x6TB SATA
>disk
>> + 2x200GB SSD disk (then I have 2 other disks in RAID for the OS), 10
>Gbps.
>> So each SSD disk is used for the journal for 5 OSDs. With this
>> configuration everything is running smoothly ...
>>
>>
>> We are now buying some new storage nodes, and I am trying to buy
>something
>> which is bluestore compliant. So the idea is to consider a
>configuration
>> something like:
>>
>> - 10 SATA disks (8TB / 10TB / 12TB each. TBD)
>> - 2 processor (~ 10 core each)
>> - 64 GB of RAM
>> - 2 SSD to be used for WAL+DB
>> - 10 Gbps
>>
>> For what concerns the size of the SSD disks I read in this mailing
>list
>> that it is suggested to have at least 10GB of SSD disk/10TB of SATA
>disk.
>>
>>
>> So, the questions:
>>
>> 1) Does this hardware configuration seem reasonable ?
>>
>> 2) Are there problems to live (forever, or until filestore
>deprecation)
>> with some OSDs using filestore (the old ones) and some OSDs using
>bluestore
>> (the old ones) ?
>>
>> 3) Would you suggest to update to bluestore also the old OSDs, even
>if the
>> available SSDs are too small (they don't satisfy the "10GB of SSD
>disk/10TB
>> of SATA disk" rule) ?

AFAIR the db size should be 4% of the OSD in question.

So,

for example, if the block size is 1 TB, then block.db shouldn't be less than 40 GB;
for a 10 TB or 12 TB SATA disk that would be roughly 400-480 GB.

See: http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/

Hth
- Mehmet 

>>
>> Thanks, Massimo
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD Mirror Question

2018-10-04 Thread ceph
Hello Vikas,

Could you please tell us which commands you used to set up rbd-mirror?

It would be great if you could provide a short howto :)

Thanks in advance
 - Mehmet 

Am 2. Oktober 2018 22:47:08 MESZ schrieb Vikas Rana :
>Hi,
>
>We have a CEPH 3 node cluster at primary site. We created a RBD image
>and
>the image has about 100TB of data.
>
>Now we installed another 3 node cluster on secondary site. We want to
>replicate the image at primary site to this new cluster on secondary
>site.
>
>As per documentation, we enabled journaling on primary site. We
>followed
>all the procedure and peering looks good but the image is not copying.
>The status is always showing down.
>
>
>So my question is, is it possible to replicate a image which already
>have
>some data before enabling journalling?
>
>We are using the image mirroring instead of pool mirroring. Do we need
>to
>create the RBD image on secondary site? As per documentation, its not
>required.
>
>Is there any other option to copy the image to the remote site?
>
>Thanks,
>-Vikas
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [CEPH]-[RADOS] Deduplication feature status

2018-09-27 Thread ceph
As of today, there is no such feature in Ceph

Best regards,


On 09/27/2018 04:34 PM, Gaël THEROND wrote:
> Hi folks!
> 
> As I’ll soon start to work on a new really large an distributed CEPH
> project for cold data storage, I’m checking out a few features availability
> and status, with the need for deduplication among them.
> 
> I found out an interesting video about that from Cephalocon APAC 2018 and a
> seven years old bugtrack (
> https://tracker.ceph.com/issues/1576), but that doesn’t really answered my
> questions.
> 
> Suppose that I want to base this project on mimic, is deduplication kinda
> supported, if so at which level and to which extend ?
> 
> Thanks a lot for your information !
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] backup ceph

2018-09-19 Thread ceph
For cephfs & rgw, it all depends on your needs, as with rbd:
you may want to trust Ceph blindly,
or you may back up all your data, just in case (better safe than sorry,
as he said).

To my knowledge, there is little or no impact from keeping a large number
of snapshots on a cluster.

With rbd, you can indeed "map" an rbd volume (or snapshot): this will get
you a block device, whose filesystem can be mounted freely:
root@backup1:~# rbd map 'my-image' --snap 'super-snapshot'
/dev/rbd1
root@backup1:~# mkdir /tmp/snapshot
root@backup1:~# mount /dev/rbd1 /tmp/snapshot
# here, you can access your file
root@backup1:~# umount /dev/rbd1
root@backup1:~# rbd unmap 'my-image' --snap 'super-snapshot'

(note that this works because the filesystem directly uses the block
device: there is no partition table. If there is one, you must use kpartx
between the 'map' and the 'mount', to map the partitions too)
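
(For that partitioned case, a minimal sketch - the exact mapper device name may
differ on your system:)

root@backup1:~# rbd map 'my-image' --snap 'super-snapshot'
/dev/rbd1
root@backup1:~# kpartx -av /dev/rbd1
root@backup1:~# mount /dev/mapper/rbd1p1 /tmp/snapshot
root@backup1:~# umount /tmp/snapshot
root@backup1:~# kpartx -dv /dev/rbd1
root@backup1:~# rbd unmap /dev/rbd1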

FYI, at job, we are using this tool¹ to backup our Proxmox VMs:
https://github.com/JackSlateur/backurne

-> Snapshots are exported remotely, thus they are really backups
-> One or more snapshots are kept on the live cluster, for faster
recovery: if a user broke his disk, you can restore it really fast
-> Backups can be inspected on the backup cluster

Using rbd, you can also do a "duplicate-and-restore" kind of stuff
Let's say, for instance, that you have a VM with a single disk.
The user removes a lot of files by mistake, and wants them back.
But he does not want to fully restore the disk, because some changes
must be kept.
And, even more, he does not know exactly which files have been removed.
In such a scenario, you can add a new disk to that VM, where that disk is the
backup of the first disk. You can then mount that disk to, say, /backup,
and allow the user to inspect it freely
(just for you to understand what can be done using rbd)

Regards,

[¹] I made dis

On 09/19/2018 03:40 AM, ST Wong (ITSC) wrote:
> Hi,
> 
> Thanks for your help.
> 
>> I assume that you are speaking of rbd only
> Yes, as we just started studying Ceph, we only aware of backup of RBD.   Will 
> there be other areas that need backup?   Sorry for my ignorance.
> 
>> Taking snapshot of rbd volumes and keeping all of them on the cluster is 
>> fine However, this is no backup A snapshot is only a backup if it is 
>> exported off-site
> Will this scheme (e.g. keeping 30 daily snapshots) impact performance?  
> Besides, can we somehow "mount" snapshot of nth day to get the backup of 
> particular file ?  sorry that we're still based on traditional SAN snapshot 
> concepts.
> 
> Sorry to bother, and thanks a lot.
> 
> Rgds,
> /st wong
> 
> -Original Message-
> From: ceph-users  On Behalf Of 
> c...@jack.fr.eu.org
> Sent: Tuesday, September 18, 2018 8:04 PM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] backup ceph
> 
> Hi,
> 
> I assume that you are speaking of rbd only
> 
> Taking snapshot of rbd volumes and keeping all of them on the cluster is fine 
> However, this is no backup A snapshot is only a backup if it is exported 
> off-site
> 
> On 09/18/2018 11:54 AM, ST Wong (ITSC) wrote:
>> Hi,
>>
>> We're newbie to Ceph.  Besides using incremental snapshots with RDB to 
>> backup data on one Ceph cluster to another running Ceph cluster, or using 
>> backup tools like backy2, will there be any recommended way to backup Ceph 
>> data  ?   Someone here suggested taking snapshot of RDB daily and keeps 30 
>> days to replace backup.  I wonder if this is practical and if performance 
>> will be impact...
>>
>> Thanks a lot.
>> Regards
>> /st wong
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] backup ceph

2018-09-18 Thread ceph
Hi,

I assume that you are speaking of rbd only

Taking snapshot of rbd volumes and keeping all of them on the cluster is
fine
However, this is no backup
A snapshot is only a backup if it is exported off-site

On 09/18/2018 11:54 AM, ST Wong (ITSC) wrote:
> Hi,
> 
> We're newbie to Ceph.  Besides using incremental snapshots with RDB to backup 
> data on one Ceph cluster to another running Ceph cluster, or using backup 
> tools like backy2, will there be any recommended way to backup Ceph data  ?   
> Someone here suggested taking snapshot of RDB daily and keeps 30 days to 
> replace backup.  I wonder if this is practical and if performance will be 
> impact...
> 
> Thanks a lot.
> Regards
> /st wong
> 
> 
> 
> _______
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to setup Ceph OSD auto boot up on node reboot

2018-09-07 Thread ceph
Hi karri,

Am 4. September 2018 23:30:01 MESZ schrieb Pardhiv Karri 
:
>Hi,
>
>I created a ceph cluster  manually (not using ceph-deploy). When I
>reboot
>the node the osd's doesn't come backup because the OS doesn't know that
>it
>need to bring up the OSD. I am running this on Ubuntu 1604. Is there a
>standardized way to initiate ceph osd start on node reboot?
>
>"sudo start ceph-osd-all" isn't working well and doesn't like the idea
>of "sudo start ceph-osd id=1" for each OSD in rc file.
>

Perhaps something like

systemctl enable ceph-osd@1

will help?
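
(Untested sketch, assuming systemd-managed OSDs with the usual unit names - to
enable every OSD that has a data directory on the node, plus the ceph target:)

for id in $(ls /var/lib/ceph/osd/ | sed 's/ceph-//'); do
    systemctl enable ceph-osd@"$id"
done
systemctl enable ceph.target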

- Mehmet 

>Need to do it for both Hammer (Ubuntu 1404) and Luminous (Ubuntu 1604).
>
>--
>Thanks,
>Pardhiv Karri
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Understanding the output of dump_historic_ops

2018-09-02 Thread ceph
Hi Ronni,

Am 2. September 2018 13:32:05 MESZ schrieb Ronnie Lazar 
:
>Hello,
>
>I'm trying to understand the output of the dump_historic_ops admin sock
>command.
>I can't find information on what are the meaning of the different
>states
>that an OP can be in.
>For example, in the following excerpt:
>{
>
>
>
>"description": "MOSDPGPush(1.a5 421/239
>[PushOp(1:a534ca1b:::rbd_data.cb0f2fd796ae.317e:head,
>version:
>230'121959, data_included:
>[20480~4096,40960~4096,102400~4096,110592~4096,122880~4096,147456~4096,266240~4096,274432~4096,434176~4096,593920~4096,65
>5360~4096,774144~4096,1019904~4096,1114112~4096,1134592~4096,1142784~4096,1204224~4096,1323008~4096,1339392~4096,1359872~4096,1445888~4096,1454080~4096,1617920~4096,1712128~8192,1757184~4096,1953792~4096,1978368~4096,2134016~4096,2314240~4096,2650112~4096,2662400~4096,267878
>4~4096,2686976~4096,2744320~4096,2760704~4096,2875392~4096,2945024~4096,3330048~4096,3444736~8192,3493888~4096,3502080~4096,3522560~4096,3608576~4096,3743744~4096,3805184~4096,3915776~4096,4079616~4096,4096000~4096,4112384~4096,4128768~4096],
>data_size: 212992, omap_header_s
>ize: 0, omap_entries_size: 0, attrset_size: 2, recovery_info:
>ObjectRecoveryInfo(1:a534ca1b:::rbd_data.cb0f2fd796ae.317e:head@230'121959,
>size: 4132864, copy_subset: [0~4132864], clone_subset: {}, snapset:
>0=[]:[]), after_progress: ObjectRecoveryProgress(!first,
>data_recovered_to:4132864, data_complete:true, omap_recovered_to:,
>omap_complete:true, error:false), before_progress:
>ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false,
>omap_recovered_to:, omap_complete:false, error:false))])",
>"initiated_at": "2018-09-02 11:20:32.486670",
>"age": 594.163684,
>"duration": 2.162485,
>"type_data": {
>"flag_point": "started",
>"events": [
>{
>"time": "2018-09-02 11:20:32.486670",
>"event": "initiated"
>},
>{
>"time": "2018-09-02 11:20:32.487195",
>"event": "queued_for_pg"
>},

I guess in this case this is where you should have a look.

A job was queued but it took nearly 2 seconds until it could be handled by the 
PG. Why? ... Hmm... perhaps a busy disk or CPU? 

>{
>"time": "2018-09-02 11:20:34.648092",
>"event": "reached_pg"
>},

Hth
- Mehmet

>{
>"time": "2018-09-02 11:20:34.648095",
>"event": "started"
>},
>{
>"time": "2018-09-02 11:20:34.649155",
>"event": "done"
>}
>]
>}
>},
>
>Seems like I have an operation that was delayed over 2 seconds in
>queued_for_pg state.
>What does that mean? What was it waiting for?
>
>Regards,
>*Ronnie Lazar*
>*R*
>
>T: +972 77 556-1727
>E: ron...@stratoscale.com
>
>
>Web <http://www.stratoscale.com/> | Blog
><http://www.stratoscale.com/blog/>
> | Twitter <https://twitter.com/Stratoscale> | Google+
><https://plus.google.com/u/1/b/108421603458396133912/108421603458396133912/posts>
> | Linkedin <https://www.linkedin.com/company/stratoscale>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New Ceph community manager: Mike Perez

2018-08-28 Thread ceph
Great! Welcome Mike! 

Am 29. August 2018 05:36:20 MESZ schrieb Alvaro Soto :
>Welcome Mike!
>
>On Tue, Aug 28, 2018 at 10:19 PM, linghucongsong
>
>wrote:
>
>>
>>
>>
>>
>> Welcome!
>>
>>
>>
>> At 2018-08-29 09:13:24, "Sage Weil"  wrote:
>> >Hi everyone,
>> >
>> >Please help me welcome Mike Perez, the new Ceph community manager!
>> >
>> >Mike has a long history with Ceph: he started at DreamHost working
>on
>> >OpenStack and Ceph back in the early days, including work on the
>original
>> >RBD integration.  He went on to work in several roles in the
>OpenStack
>> >project, doing a mix of infrastructure, cross-project and community
>> >related initiatives, including serving as the Project Technical Lead
>for
>> >Cinder.
>> >
>> >Mike lives in Pasadena, CA, and can be reached at mpe...@redhat.com,
>on
>> >IRC as thingee, or twitter as @thingee.
>> >
>> >I am very excited to welcome Mike back to Ceph, and look forward to
>> >working together on building the Ceph developer and user
>communities!
>> >
>> >sage
>> >___
>> >ceph-users mailing list
>> >ceph-users@lists.ceph.com
>> >http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
>-- 
>
>ATTE. Alvaro Soto Escobar
>
>--
>Great people talk about ideas,
>average people talk about things,
>small people talk ... about other people.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Still risky to remove RBD-Images?

2018-08-21 Thread ceph



Am 20. August 2018 17:22:35 MESZ schrieb Mehmet :
>Hello,

Hello me,

>
>AFAIK removing of big RBD-Images would lead ceph to produce blocked 
>requests - I dont mean caused by poor disks.
>
>Is this still the case with "Luminous (12.2.4)"?
>

To answer my own question :)
There is no problem. I deleted the 2 TB image first and did not see any 
blocked requests.

The space is being freed over a few minutes.

- Mehmet 

>I have a a few images with
>
>- 2 Terrabyte
>- 5 Terrabyte
>and
>- 20 Terrabyte
>
>in size and have to delete the images.
>
>Would be nice if you could enlightne me :)
>
>- Mehmet
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Least impact when adding PG's

2018-08-13 Thread ceph



Am 7. August 2018 18:08:05 MESZ schrieb John Petrini :
>Hi All,

Hi John, 

>
>Any advice?
>

I am not sure, but what I would do is increase the PG count step by step, always 
with a "power of two" value, i.e. 256.

Also have a look at pg_num/pgp_num. One of them should be increased first - 
not sure which one, but the docs and mailing list history should be helpful.
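
(A rough sketch with a made-up pool name and target value - pgp_num cannot
exceed pg_num, so pg_num has to go up first:)

ceph osd pool set mypool pg_num 256
# wait for the new PGs to be created and peered, then:
ceph osd pool set mypool pgp_num 256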

Hope I could give a few useful hints.
 - Mehmet 

>Thanks,
>
>John
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optane 900P device class automatically set to SSD not NVME

2018-08-12 Thread ceph



Am 1. August 2018 10:33:26 MESZ schrieb Jake Grimmett :
>Dear All,

Hello Jake,

>
>Not sure if this is a bug, but when I add Intel Optane 900P drives,
>their device class is automatically set to SSD rather than NVME.
>

AFAIK ceph actually only differentiates between hdd and ssd automatically; nvme 
would be handled the same as ssd.

Hth 
- Mehmet 
 
>This happens under Mimic 13.2.0 and 13.2.1
>
>[root@ceph2 ~]# ceph-volume lvm prepare --bluestore --data /dev/nvme0n1
>
>(SNIP see http://p.ip.fi/eopR for output)
>
>Check...
>[root@ceph2 ~]# ceph osd tree | grep "osd.1 "
>  1   ssd0.25470 osd.1   up  1.0 1.0
>
>Fix is easy
>[root@ceph2 ~]# ceph osd crush rm-device-class osd.1
>done removing class of osd(s): 1
>
>[root@ceph2 ~]# ceph osd crush set-device-class nvme osd.1
>set osd(s) 1 to class 'nvme'
>
>Check...
>[root@ceph2 ~]# ceph osd tree | grep "osd.1 "
>  1  nvme0.25470 osd.1   up  1.0 1.0
>
>
>Thanks,
>
>Jake
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Running 12.2.5 without problems, should I upgrade to 12.2.7 or wait for 12.2.8?

2018-08-10 Thread ceph



Am 30. Juli 2018 09:51:23 MESZ schrieb Micha Krause :
>Hi,

Hi Micha,

>
>I'm Running 12.2.5 and I have no Problems at the moment.
>
>However my servers reporting daily that they want to upgrade to 12.2.7,
>is this save or should I wait for 12.2.8?
>
I guess you should upgrade to 12.2.7 as soon as you can, especially given the following.

Quote:
The v12.2.5 release has a potential data corruption issue with erasure coded 
pools. If you ran v12.2.5 with erasure coding, please see below.

See: https://ceph.com/releases/12-2-7-luminous-released/

Hth
- Mehmet 
>Are there any predictions when the 12.2.8 release will be available?
>
>
>Micha Krause
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] is there any filesystem like wrapper that dont need to map and mount rbd ?

2018-08-01 Thread ceph
Sounds like CephFS to me


On 08/01/2018 09:33 AM, Will Zhao wrote:
> Hi:
>I want to use ceph rbd, because it shows better performance. But I don't
> like the kernel module and iSCSI target process. So here are my requirements:
>I don't want to map it and mount it, but I still want to use some
> filesystem-like api, or at least be able to write multiple files to the rbd
> volume and read them back later. This means I just need the librbd api to
> operate the volume over the network.
>I wonder if there is anyone who has done this, wrapping the librbd api to
> provide a very simple filesystem-like api? Or can I do this some
> other way? Thanks for giving me any advice.
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")

2018-07-30 Thread ceph . novice
Hey Nathan.

No blaming here. I'm very thankful for this great piece (ok, sometimes more of a 
beast ;) ) of open-source SDS and all the great work around it, incl. community 
and users... and happy the problem is identified and can be fixed for 
others/the future as well :)
 
Well, yes, I can confirm the "error" you found here as well:

[root@sds20 ~]# ceph-detect-init
Traceback (most recent call last):
  File "/usr/bin/ceph-detect-init", line 9, in 
    load_entry_point('ceph-detect-init==1.0.1', 'console_scripts', 
'ceph-detect-init')()
  File "/usr/lib/python2.7/site-packages/ceph_detect_init/main.py", line 56, in 
run
print(ceph_detect_init.get(args.use_rhceph).init)
  File "/usr/lib/python2.7/site-packages/ceph_detect_init/__init__.py", line 
42, in get
release=release)
ceph_detect_init.exc.UnsupportedPlatform: Platform is not supported.: rhel  7.5


Gesendet: Sonntag, 29. Juli 2018 um 20:33 Uhr
Von: "Nathan Cutler" 
An: ceph.nov...@habmalnefrage.de, "Vasu Kulkarni" 
Cc: ceph-users , "Ceph Development" 

Betreff: Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")
> Strange...
> - wouldn't swear, but pretty sure v13.2.0 was working ok before
> - so what do others say/see?
> - no one on v13.2.1 so far (hard to believe) OR
> - just don't have this "systemctl ceph-osd.target" problem and all just works?
>
> If you also __MIGRATED__ from Luminous (say ~ v12.2.5 or older) to Mimic (say 
> v13.2.0 -> v13.2.1) and __DO NOT__ see the same systemctl problems, what's 
> your Linux OS and version (I'm on RHEL 7.5 here)? :O

Best regards
 Anton



Hi ceph.novice:

I'm the one to blame for this regrettable incident. Today I have
reproduced the issue in teuthology:

2018-07-29T18:20:07.288 INFO:teuthology.orchestra.run.ovh093:Running: 'sudo TESTDIR=/home/ubuntu/cephtest bash -c ceph-detect-init'
2018-07-29T18:20:07.796 INFO:teuthology.orchestra.run.ovh093.stderr:Traceback (most recent call last):
2018-07-29T18:20:07.797 INFO:teuthology.orchestra.run.ovh093.stderr:  File "/bin/ceph-detect-init", line 9, in 
2018-07-29T18:20:07.797 INFO:teuthology.orchestra.run.ovh093.stderr:    load_entry_point('ceph-detect-init==1.0.1', 'console_scripts', 'ceph-detect-init')()
2018-07-29T18:20:07.797 INFO:teuthology.orchestra.run.ovh093.stderr:  File "/usr/lib/python2.7/site-packages/ceph_detect_init/main.py", line 56, in run
2018-07-29T18:20:07.797 INFO:teuthology.orchestra.run.ovh093.stderr:    print(ceph_detect_init.get(args.use_rhceph).init)
2018-07-29T18:20:07.797 INFO:teuthology.orchestra.run.ovh093.stderr:  File "/usr/lib/python2.7/site-packages/ceph_detect_init/__init__.py", line 42, in get
2018-07-29T18:20:07.797 INFO:teuthology.orchestra.run.ovh093.stderr:    release=release)
2018-07-29T18:20:07.797 INFO:teuthology.orchestra.run.ovh093.stderr:ceph_detect_init.exc.UnsupportedPlatform: Platform is not supported.: rhel 7.5
Just to be sure, can you confirm? (I.e. issue the command
"ceph-detect-init" on your RHEL 7.5 system. Instead of saying "systemd"
it gives an error like above?)

I'm working on a fix now at https://github.com/ceph/ceph/pull/23303

Nathan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")

2018-07-29 Thread ceph . novice

Strange...
- wouldn't swear, but pretty sure v13.2.0 was working ok before
- so what do others say/see?
 - no one on v13.2.1 so far (hard to believe) OR
 - just don't have this "systemctl ceph-osd.target" problem and all just works?

If you also __MIGRATED__ from Luminous (say ~ v12.2.5 or older) to Mimic (say 
v13.2.0 -> v13.2.1) and __DO NOT__ see the same systemctl problems, what's your 
Linux OS and version (I'm on RHEL 7.5 here)? :O

 

Gesendet: Sonntag, 29. Juli 2018 um 03:15 Uhr
Von: "Vasu Kulkarni" 
An: ceph.nov...@habmalnefrage.de
Cc: "Sage Weil" , ceph-users , 
"Ceph Development" 
Betreff: Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")
On Sat, Jul 28, 2018 at 6:02 PM,  wrote:
> Have you guys changed something with the systemctl startup of the OSDs?

I think there is some kind of systemd issue hidden in mimic,
https://tracker.ceph.com/issues/25004
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")

2018-07-28 Thread ceph . novice
Have you guys changed something with the systemctl startup of the OSDs?

I've stopped and disabled all the OSDs on all my hosts via "systemctl 
stop|disable ceph-osd.target" and rebooted all the nodes. Everything looks just 
the same.
Then I started all the OSD daemons one after the other via the CLI with 
"/usr/bin/ceph-osd -f --cluster ceph --id $NR --setuser ceph --setgroup ceph > 
/tmp/osd.${NR}.log 2>&1 & " and now everything (ok, besides the ZABBIX mgr 
module?!?) seems to work :|


  cluster:
id: 2a919338-4e44-454f-bf45-e94a01c2a5e6
health: HEALTH_WARN
Failed to send data to Zabbix

  services:
mon: 3 daemons, quorum sds20,sds21,sds22
mgr: sds22(active), standbys: sds20, sds21
osd: 18 osds: 18 up, 18 in
rgw: 4 daemons active

  data:
pools:   25 pools, 1390 pgs
objects: 2.55 k objects, 3.4 GiB
usage:   26 GiB used, 8.8 TiB / 8.8 TiB avail
pgs: 1390 active+clean

  io:
client:   11 KiB/s rd, 10 op/s rd, 0 op/s wr

Any hints?

--
 

Gesendet: Samstag, 28. Juli 2018 um 23:35 Uhr
Von: ceph.nov...@habmalnefrage.de
An: "Sage Weil" 
Cc: ceph-users@lists.ceph.com, ceph-de...@vger.kernel.org
Betreff: Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")
Hi Sage.

Sure. Any specific OSD(s) log(s)? Or just any?

Gesendet: Samstag, 28. Juli 2018 um 16:49 Uhr
Von: "Sage Weil" 
An: ceph.nov...@habmalnefrage.de, ceph-users@lists.ceph.com, 
ceph-de...@vger.kernel.org
Betreff: Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")

Can you include more or your osd log file?
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")

2018-07-28 Thread ceph . novice
Hi Sage.

Sure. Any specific OSD(s) log(s)? Or just any?

Gesendet: Samstag, 28. Juli 2018 um 16:49 Uhr
Von: "Sage Weil" 
An: ceph.nov...@habmalnefrage.de, ceph-users@lists.ceph.com, 
ceph-de...@vger.kernel.org
Betreff: Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")

Can you include more or your osd log file?
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")

2018-07-28 Thread ceph . novice
Dear users and developers.
 
I've updated our dev-cluster from v13.2.0 to v13.2.1 yesterday and since then 
everything is badly broken.
I've restarted all Ceph components via "systemctl" and also rebooted the servers 
SDS21 and SDS24, but nothing changes.

This cluster started as Kraken, was updated to Luminous (up to v12.2.5) and 
then to Mimic.

Here are some system related infos, see 
https://semestriel.framapad.org/p/DTkBspmnfU

Somehow I guess this may have to do with the various "ceph-disk", 
"ceph-volume", "ceph-lvm" changes in the last months?!?

Thanks & regards
 Anton

--

 

Gesendet: Samstag, 28. Juli 2018 um 00:22 Uhr
Von: "Bryan Stillwell" 
An: "ceph-users@lists.ceph.com" 
Betreff: Re: [ceph-users] v13.2.1 Mimic released

I decided to upgrade my home cluster from Luminous (v12.2.7) to Mimic (v13.2.1) 
today and ran into a couple issues:
 
1. When restarting the OSDs during the upgrade it seems to forget my upmap 
settings.  I had to manually return them to the way they were with commands 
like:
 
ceph osd pg-upmap-items 5.1 11 18 8 6 9 0
ceph osd pg-upmap-items 5.1f 11 17
 
I also saw this when upgrading from v12.2.5 to v12.2.7.
 
2. Also after restarting the first OSD during the upgrade I saw 21 messages 
like these in ceph.log:
 
2018-07-27 15:53:49.868552 osd.1 osd.1 10.0.0.207:6806/4029643 97 : cluster 
[WRN] failed to encode map e100467 with expected crc
2018-07-27 15:53:49.922365 osd.6 osd.6 10.0.0.16:6804/90400 25 : cluster [WRN] 
failed to encode map e100467 with expected crc
2018-07-27 15:53:49.925585 osd.6 osd.6 10.0.0.16:6804/90400 26 : cluster [WRN] 
failed to encode map e100467 with expected crc
2018-07-27 15:53:49.944414 osd.18 osd.18 10.0.0.15:6808/120845 8 : cluster 
[WRN] failed to encode map e100467 with expected crc
2018-07-27 15:53:49.944756 osd.17 osd.17 10.0.0.15:6800/120749 13 : cluster 
[WRN] failed to encode map e100467 with expected crc
 
Is this a sign that full OSD maps were sent out by the mons to every OSD like 
back in the hammer days?  I seem to remember that OSD maps should be a lot 
smaller now, so maybe this isn't as big of a problem as it was back then?
 
Thanks,
Bryan
 

From: ceph-users  on behalf of Sage Weil 

Date: Friday, July 27, 2018 at 1:25 PM
To: "ceph-annou...@lists.ceph.com" , 
"ceph-users@lists.ceph.com" , 
"ceph-maintain...@lists.ceph.com" , 
"ceph-de...@vger.kernel.org" 
Subject: [ceph-users] v13.2.1 Mimic released

 

This is the first bugfix release of the Mimic v13.2.x long term stable release

series. This release contains many fixes across all components of Ceph,

including a few security fixes. We recommend that all users upgrade.

 

Notable Changes

--

 

* CVE 2018-1128: auth: cephx authorizer subject to replay attack (issue#24836 
http://tracker.ceph.com/issues/24836, Sage Weil)

* CVE 2018-1129: auth: cephx signature check is weak (issue#24837 
http://tracker.ceph.com/issues/24837, Sage Weil)

* CVE 2018-10861: mon: auth checks not correct for pool ops (issue#24838 
http://tracker.ceph.com/issues/24838, Jason Dillaman)

 

For more details and links to various issues and pull requests, please

refer to the ceph release blog at 
https://ceph.com/releases/13-2-1-mimic-released

 

Changelog

-

* bluestore:  common/hobject: improved hash calculation for hobject_t etc 
(pr#22777, Adam Kupczyk, Sage Weil)

* bluestore,core: mimic: os/bluestore: don't store/use path_block.{db,wal} from 
meta (pr#22477, Sage Weil, Alfredo Deza)

* bluestore: os/bluestore: backport 24319 and 24550 (issue#24550, issue#24502, 
issue#24319, issue#24581, pr#22649, Sage Weil)

* bluestore: os/bluestore: fix incomplete faulty range marking when doing 
compression (pr#22910, Igor Fedotov)

* bluestore: spdk: fix ceph-osd crash when activate SPDK (issue#24472, 
issue#24371, pr#22684, tone-zhang)

* build/ops: build/ops: ceph.git has two different versions of dpdk in the 
source tree (issue#24942, issue#24032, pr#23070, Kefu Chai)

* build/ops: build/ops: install-deps.sh fails on newest openSUSE Leap 
(issue#25065, pr#23178, Kyr Shatskyy)

* build/ops: build/ops: Mimic build fails with -DWITH_RADOSGW=0 (issue#24766, 
pr#22851, Dan Mick)

* build/ops: cmake: enable RTTI for both debug and release RocksDB builds 
(pr#22299, Igor Fedotov)

* build/ops: deb/rpm: add python-six as build-time and run-time dependency 
(issue#24885, pr#22948, Nathan Cutler, Kefu Chai)

* build/ops: deb,rpm: fix block.db symlink ownership (pr#23246, Sage Weil)

* build/ops: include: fix build with older clang (OSX target) (pr#23049, 
Christopher Blum)

* build/ops: include: fix build with older clang (pr#23034, Kefu Chai)

* build/ops,rbd: build/ops: order rbdmap.service before remote-fs-pr

[ceph-users] ceph bluestore data cache on osd

2018-07-23 Thread nokia ceph
 Hi Team,

We need a mechanism to have some data cache on OSDs built on bluestore. Is
there an option available to enable the data cache?

With the default configuration, the OSD logs state that the data cache is
disabled by default:

 bluestore(/var/lib/ceph/osd/ceph-66) _set_cache_sizes cache_size 1073741824
 *meta 0.5 kv 0.5* *data 0*

We tried to change the config to give 49% to data and the OSD logs reflect it
as follows; however, we don't see any improvement in IOPS.

bluestore(/var/lib/ceph/osd/ceph-66) _set_cache_sizes cache_size
1073741824 *meta
0.01 kv 0.5 data 0.49*
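
(The change above corresponds roughly to the following ceph.conf snippet - option
names as we understand them for Luminous/Mimic bluestore; the data share is simply
whatever is left after the meta and kv ratios:)

[osd]
bluestore_cache_size = 1073741824
bluestore_cache_meta_ratio = 0.01
bluestore_cache_kv_ratio = 0.50
# the remaining ~0.49 of the cache is then used for object data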

Thanks,
Muthu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mimic (13.2.0) and "Failed to send data to Zabbix"

2018-07-12 Thread ceph . novice
There was no change in the ZABBIX environment... I got this warning some 
minutes after the Linux and Luminous->Mimic update via YUM and a reboot of all 
the Ceph servers...

Is there anyone who also had the ZABBIX module enabled under Luminous AND then 
migrated to Mimic? If yes, does it work "ok" for you? If yes, which Linux 
OS/version are you running?
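
(For comparing setups: as far as I can tell the mgr zabbix module offers these
commands in Mimic, which make it easy to check the config and trigger a send by
hand while watching the mgr log:)

ceph zabbix config-show
ceph zabbix send
grep -i zabbix /var/log/ceph/ceph-mgr.*.log | tail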

-

Ok, but the reason the Module is issuing the warning is that
zabbix_sender does not exit with status 0.

You might want to check why this is. Was there a version change of
Zabbix? If so, try to trace what might have changed that causes
zabbix_sender to exit non-zero.

Wido

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mimic (13.2.0) and "Failed to send data to Zabbix"

2018-07-11 Thread ceph . novice
at about the same time we also updated the Linux OS via "YUM" to:

# more /etc/redhat-release
Red Hat Enterprise Linux Server release 7.5 (Maipo)



from the given error message, it seems like there are 32 "measure points" 
to be sent, but 3 of them are somehow failing:

>>>  "response":"success","info":"processed: 29; failed: 3; total: 32; seconds 
>>> spent: 0.000605" <<<

and the funny thing is, our monitoring team, who run the ZABBIX service/infra 
here, still receive "all the stuff"





This is the problem, the zabbix_sender process is exiting with a
non-zero status.

You didn't change anything? You just upgraded from Luminous to Mimic and
this came along?

Wido

> ---
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] mimic (13.2.0) and "Failed to send data to Zabbix"

2018-07-11 Thread ceph . novice
anyone with "mgr Zabbix enabled" migrated from Luminous (12.2.5 or 5) and has 
the same problem in Mimic now?
if I disable and re-enable the "zabbix" module, the status is "HEALTH_OK" for 
a few seconds and then changes to "HEALTH_WARN" again...

---

# ceph -s
  cluster:
id: 
health: HEALTH_WARN
Failed to send data to Zabbix

  services:
mon: 3 daemons, quorum ceph20,ceph21,ceph22
mgr: ceph21(active), standbys: ceph20, ceph22
osd: 18 osds: 18 up, 18 in
rgw: 4 daemons active

  data:
pools:   25 pools, 1390 pgs
objects: 2.55 k objects, 3.4 GiB
usage:   26 GiB used, 8.8 TiB / 8.8 TiB avail
pgs: 1390 active+clean

  io:
client:   8.6 KiB/s rd, 9 op/s rd, 0 op/s wr

# ceph version
ceph version 13.2.0 () mimic (stable)

# grep -i zabbix /var/log/ceph/ceph-mgr.ceph21.log | tail -2
2018-07-11 09:50:10.191 7f2223582700  0 mgr[zabbix] Exception when sending: 
/usr/bin/zabbix_sender exited non-zero: zabbix_sender [18450]: DEBUG: answer 
[{"response":"success","info":"processed: 29; failed: 3; total: 32; seconds 
spent: 0.000605"}]
2018-07-11 09:51:10.222 7f2223582700  0 mgr[zabbix] Exception when sending: 
/usr/bin/zabbix_sender exited non-zero: zabbix_sender [18459]: DEBUG: answer 
[{"response":"success","info":"processed: 29; failed: 3; total: 32; seconds 
spent: 0.000692"}]

---
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

