[ceph-users] Is rados_write_op_* any more efficient than issuing the commands individually?

2016-09-06 Thread Dan Jakubiec
Hello, I need to issue the following commands on millions of objects: rados_write_full(oid1, ...) rados_setxattr(oid1, "attr1", ...) rados_setxattr(oid1, "attr2", ...) Would it make it any faster if I combined all 3 of these into a single rados_write_op and issued them "together" as a single
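A minimal sketch of the combined form, assuming the standard librados C write-op API (the buffer, lengths and xattr values are placeholders). The point of the compound op is that the three updates travel to the OSD as a single operation instead of three round trips, and are applied atomically:

  #include <rados/librados.h>

  /* write_full plus two setxattrs batched into a single write op */
  static int write_with_attrs(rados_ioctx_t ioctx, const char *oid,
                              const char *buf, size_t len,
                              const char *a1, size_t a1len,
                              const char *a2, size_t a2len)
  {
      rados_write_op_t op = rados_create_write_op();
      rados_write_op_write_full(op, buf, len);
      rados_write_op_setxattr(op, "attr1", a1, a1len);
      rados_write_op_setxattr(op, "attr2", a2, a2len);
      int ret = rados_write_op_operate(op, ioctx, oid, NULL, 0);  /* one request */
      rados_release_write_op(op);
      return ret;
  }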

Re: [ceph-users] 2 osd failures

2016-09-06 Thread Christian Balzer
Hello, Too late I see, but still... On Tue, 6 Sep 2016 22:17:05 -0400 Shain Miley wrote: > Hello, > > It looks like we had 2 OSDs fail at some point earlier today, here is > the current status of the cluster: > You will really want to find out how and why that happened, because while not

Re: [ceph-users] 2 osd failures

2016-09-06 Thread Shain Miley
I restarted both OSD daemons and things are back to normal. I'm not sure why they failed in the first place, but I'll keep looking. Thanks! Shain Sent from my iPhone > On Sep 6, 2016, at 10:39 PM, lyt_yudi wrote: > > hi, > >> On Sep 7, 2016, at 10:17 AM, Shain Miley

Re: [ceph-users] 2 osd failures

2016-09-06 Thread lyt_yudi
hi, > On Sep 7, 2016, at 10:17 AM, Shain Miley wrote: > > Hello, > > It looks like we had 2 OSDs fail at some point earlier today, here is the > current status of the cluster: > > root@rbd1:~# ceph -s > cluster 504b5794-34bd-44e7-a8c3-0494cf800c23 > health HEALTH_WARN >

[ceph-users] 2 osd failures

2016-09-06 Thread Shain Miley
Hello, It looks like we had 2 OSDs fail at some point earlier today, here is the current status of the cluster: root@rbd1:~# ceph -s cluster 504b5794-34bd-44e7-a8c3-0494cf800c23 health HEALTH_WARN 2 pgs backfill 5 pgs backfill_toofull 69 pgs
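As a general triage sketch for a situation like this (illustrative only, not specific to this cluster; the OSD id is a placeholder):

  ceph health detail                            # which PGs are degraded and which OSDs are involved
  ceph osd tree | grep -i down                  # which OSDs are currently down
  tail -n 200 /var/log/ceph/ceph-osd.<id>.log   # on the affected host: why the daemon stopped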

Re: [ceph-users] rados bench output question

2016-09-06 Thread Christian Balzer
Hello, On Tue, 6 Sep 2016 20:30:30 -0500 Brady Deetz wrote: > On the topic of understanding the general ebb and flow of a cluster's > throughput: > -Is there a way to monitor/observe how full a journal partition becomes > before it is flushed? > > I've been interested in increasing max sync
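For reference, a rough way to peek at the journal and flush behaviour on a single OSD, assuming the admin socket is available (a sketch; counter and option names vary somewhat between releases):

  ceph daemon osd.0 config show | grep filestore | grep sync   # current sync interval tunables
  ceph daemon osd.0 perf dump                                  # journal / filestore queue counters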

Re: [ceph-users] Changing Replication count

2016-09-06 Thread LOPEZ Jean-Charles
Hi, the stray replicas will be automatically removed in the background. JC > On Sep 6, 2016, at 17:58, Vlad Blando wrote: > > Sorry about that > > It's all set now, I thought that was the replica count as it is also 4 and 5 :) > > I can see the changes now > >

Re: [ceph-users] rados bench output question

2016-09-06 Thread Christian Balzer
Hello, On Tue, 6 Sep 2016 13:38:45 +0200 lists wrote: > Hi Christian, > > Thanks for your reply. > > > What SSD model (be precise)? > Samsung 480GB PM863 SSD > So that's not your culprit then (they are supposed to handle sync writes at full speed). > > Only one SSD? > Yes. With a 5GB

Re: [ceph-users] Changing Replication count

2016-09-06 Thread Vlad Blando
Sorry about that. It's all set now, I thought that was the replica count as it is also 4 and 5 :) I can see the changes now: [root@controller-node ~]# ceph osd dump | grep 'replicated size' pool 4 'images' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024

Re: [ceph-users] Changing Replication count

2016-09-06 Thread Jeff Bailey
On 9/6/2016 8:41 PM, Vlad Blando wrote: Hi, My replication count now is this [root@controller-node ~]# ceph osd lspools 4 images,5 volumes, Those aren't replica counts, they're pool IDs. [root@controller-node ~]# and I made an adjustment and set it to 3 for images and from 2 to 3 for volumes,
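To make the distinction concrete (illustrative commands):

  ceph osd lspools               # prints "id name" pairs: 4 images, 5 volumes
  ceph osd pool get images size  # this is the pool's actual replica count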

[ceph-users] Replacing a defective OSD

2016-09-06 Thread Vlad Blando
Hi, I replaced a failed OSD and was trying to add it back to the cluster; my problem is that the physical disk is not being detected. It looks like I need to initialize it via the hardware RAID before I can see it in the OS. If I'm going to restart that server so I can work on the RAID config
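Before rebooting an OSD host for something like a RAID reconfiguration, the usual precaution is to stop the cluster from rebalancing while the node is away (a generic sketch, not specific to this setup):

  ceph osd set noout     # keep the host's down OSDs from being marked out during the reboot
  # reboot, create the RAID volume for the new disk, bring the host back up
  ceph osd unset noout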

[ceph-users] Changing Replication count

2016-09-06 Thread Vlad Blando
Hi, My replication count now is this: [root@controller-node ~]# ceph osd lspools 4 images,5 volumes, [root@controller-node ~]# and I made an adjustment and set it to 3 for images and from 2 to 3 for volumes. It's been 30 mins now and the values did not change; how do I know if it was really changed?
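For reference, setting and then verifying the replica count looks roughly like this (pool names taken from the post):

  ceph osd pool set images size 3
  ceph osd pool set volumes size 3
  ceph osd pool get images size            # should now report: size: 3
  ceph osd dump | grep 'replicated size'   # same information for every pool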

Re: [ceph-users] Upgrade steps from Infernalis to Jewel

2016-09-06 Thread Goncalo Borges
Hi Simion. The simple answer is that you can upgrade directly to 10.2.2. We did it from 9.2.0. In cases where you have to pass through an intermediate release, the release notes should say so clearly. Cheers Goncalo From: ceph-users
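For completeness, the usual rolling order (a hedged sketch; always check the Jewel release notes for your exact path) is monitors first, then OSDs, then MDS/RGW:

  ceph osd set noout                 # optional: avoid rebalancing during OSD restarts
  # on each mon host: upgrade packages, restart ceph-mon
  # on each OSD host: upgrade packages, restart the ceph-osd daemons
  ceph tell osd.* version            # confirm every OSD now reports 10.2.2
  ceph osd unset noout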

[ceph-users] PG down, primary OSD no longer exists

2016-09-06 Thread Michael Sudnick
I was wondering if someone could help me recover a PG. A few days ago I had a bunch of disks die in a small home-lab cluster. I removed the disks from their hosts and rm'ed the OSDs. Now I have a PG stuck down that will not peer, whose acting set (including the primary) points at one of the OSDs I had rm'ed
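If the data on the removed OSDs is truly gone, the usual way to let the PG give up on them looks roughly like this. The pgid and OSD id are placeholders, and marking an OSD lost can discard data, so treat this strictly as a sketch:

  ceph pg <pgid> query                              # shows which down OSDs block peering
  ceph osd lost <osd-id> --yes-i-really-mean-it     # declare the removed OSD permanently gone
  ceph pg <pgid> mark_unfound_lost revert           # or 'delete', if unfound objects remain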

[ceph-users] Upgrade steps from Infernalis to Jewel

2016-09-06 Thread Simion Marius Rad
Hi all, This may be a stupid question, but I'll ask it anyway: when upgrading a live cluster from Infernalis 9.2.1 to Jewel, must I upgrade to 10.2.0 first, or can I go directly to 10.2.2? Thanks, Simion Rad ___ ceph-users mailing list

Re: [ceph-users] Single Threaded performance for Ceph MDS

2016-09-06 Thread Wido den Hollander
> On 6 September 2016 at 14:43, John Spray wrote: > > > On Tue, Sep 6, 2016 at 1:12 PM, Wido den Hollander wrote: > > Hi, > > > > Recent threads on the ML revealed that the Ceph MDS can benefit from > > using a fast single-threaded CPU. > > > > Some

Re: [ceph-users] objects unfound after repair (issue 15002) in 0.94.8?

2016-09-06 Thread lyt_yudi
hi, osd: objects unfound after repair (fixed by repeering the pg) (issue#15006, pr#7961, Jianpeng Ma, Loic Dachary, Kefu Chai) > On Sep 7, 2016, at 12:23 AM, Graham Allan wrote: > > Does anyone know if this

Re: [ceph-users] osd dies with m_filestore_fail_eio without dmesg error

2016-09-06 Thread Ronny Aasen
On 06.09.2016 14:45, Ronny Aasen wrote: On 06. sep. 2016 00:58, Brad Hubbard wrote: On Mon, Sep 05, 2016 at 12:54:40PM +0200, Ronny Aasen wrote: > Hello > > I have an OSD that regularly dies on IO, especially scrubbing. > Normally I would assume a bad disk and replace it, but then I normally
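When dmesg stays quiet, it can still be worth ruling out the drive and the controller underneath it; an illustrative check (the device name is a placeholder):

  smartctl -a /dev/sdX          # reallocated/pending sectors and the drive's error log
  smartctl -l scterc /dev/sdX   # error-recovery timeout settings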

[ceph-users] objects unfound after repair (issue 15002) in 0.94.8?

2016-09-06 Thread Graham Allan
Does anyone know if this issue was corrected in Hammer 0.94.8? http://tracker.ceph.com/issues/15002 It's marked as resolved but I don't see it listed in the release notes. G. -- Graham Allan Systems Researcher - Minnesota Supercomputing Institute - g...@umn.edu

Re: [ceph-users] radosgw error in its log rgw_bucket_sync_user_stats()

2016-09-06 Thread Arvydas Opulskis
It is not over yet. Now if the user recreates the problematic bucket, it appears, but with the same "Access denied" error. It looks like there is still some corrupted data left for this bucket in Ceph. There are no problems if the user creates a new bucket with a very similar name. No errors in the rgw log on bucket creation were
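To see what metadata the gateway still holds for the bucket, something like the following is a reasonable starting point (bucket name and owner are placeholders):

  radosgw-admin bucket stats --bucket=<name>
  radosgw-admin metadata get bucket:<name>
  radosgw-admin bucket check --bucket=<name> --fix          # check/repair the bucket index
  radosgw-admin bucket link --bucket=<name> --uid=<owner>   # re-link the bucket to its owner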

Re: [ceph-users] RBD Watch Notify for snapshots

2016-09-06 Thread Nick Fisk
Thanks for the hint, I will update my code. > -Original Message- > From: Jason Dillaman [mailto:jdill...@redhat.com] > Sent: 06 September 2016 14:44 > To: Nick Fisk > Cc: ceph-users > Subject: Re: [ceph-users] RBD Watch Notify for snapshots >

Re: [ceph-users] RBD Watch Notify for snapshots

2016-09-06 Thread Jason Dillaman
If you receive a callback to your "watch_notify2_test_errcb" function, you would need to unwatch and rewatch the object. On Tue, Sep 6, 2016 at 8:56 AM, Nick Fisk wrote: > Thanks Jason, > > I've noticed that some of the objects are no longer being watched even though > my
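A sketch of that re-establish logic, assuming the librados C watch/notify2 API; the context struct and the notify callback name are hypothetical stand-ins for whatever the existing code already uses:

  #include <rados/librados.h>

  struct watch_ctx {            /* hypothetical per-object state */
      rados_ioctx_t ioctx;
      const char   *oid;
      uint64_t      cookie;
  };

  /* The normal notify callback, defined elsewhere in the existing code. */
  static void watch_notifycb(void *arg, uint64_t notify_id, uint64_t handle,
                             uint64_t notifier_id, void *data, size_t data_len);
  static void watch_errcb(void *arg, uint64_t cookie, int err);

  /* librados calls this when the watch breaks; drop it and register a new one
     (real code may want to defer the re-watch to another thread). */
  static void watch_errcb(void *arg, uint64_t cookie, int err)
  {
      struct watch_ctx *ctx = (struct watch_ctx *)arg;
      (void)err;
      rados_unwatch2(ctx->ioctx, cookie);
      rados_watch2(ctx->ioctx, ctx->oid, &ctx->cookie,
                   watch_notifycb, watch_errcb, ctx);
  }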

Re: [ceph-users] Single Threaded performance for Ceph MDS

2016-09-06 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of John > Spray > Sent: 06 September 2016 13:44 > To: Wido den Hollander > Cc: ceph-users > Subject: Re: [ceph-users] Single Threaded performance for Ceph MDS

Re: [ceph-users] RBD Watch Notify for snapshots

2016-09-06 Thread Nick Fisk
Thanks Jason, I've noticed that some of the objects are no longer being watched even though my process is still running (as seen by listwatchers). I'm guessing they have dropped off somehow and that the watch is not automatically re-established? I have seen there is a rados_watch_check
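For reference, rados_watch_check does exist in the C API and can be polled to detect exactly this case (a sketch; the cookie is the one returned by the earlier rados_watch2 call):

  #include <rados/librados.h>

  /* rados_watch_check returns the time since the watch was last confirmed (in ms),
     or a negative error such as -ENOTCONN once the watch has been lost. */
  static int watch_still_alive(rados_ioctx_t ioctx, uint64_t cookie)
  {
      return rados_watch_check(ioctx, cookie) >= 0;
  }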

Re: [ceph-users] osd dies with m_filestore_fail_eio without dmesg error

2016-09-06 Thread Ronny Aasen
On 06. sep. 2016 00:58, Brad Hubbard wrote: On Mon, Sep 05, 2016 at 12:54:40PM +0200, Ronny Aasen wrote: > Hello > > I have an OSD that regularly dies on IO, especially scrubbing. > Normally I would assume a bad disk and replace it, but then I normally see > messages in dmesg about the device and

[ceph-users] Single Threaded performance for Ceph MDS

2016-09-06 Thread Wido den Hollander
Hi, Recent threads on the ML revealed that the Ceph MDS can benefit from using a fast single-threaded CPU. Some tasks inside the MDS are still single-threaded operations, so the faster the code executes the better. Keeping that in mind, I started to look at some benchmarks:

Re: [ceph-users] rados bench output question

2016-09-06 Thread lists
Hi Christian, Thanks for your reply. What SSD model (be precise)? Samsung 480GB PM863 SSD Only one SSD? Yes, with a 5GB partition-based journal for each OSD. During the 0 MB/sec periods, there is NO increased CPU usage: it is usually around 15-20% for the four ceph-osd processes. Watch your

Re: [ceph-users] rados bench output question

2016-09-06 Thread Christian Balzer
Hello, On Tue, 6 Sep 2016 12:57:55 +0200 lists wrote: > Hi all, > > We're pretty new to Ceph, but loving it so far. > > We have a three-node cluster, four 4TB OSDs per node, journal (5GB) on > SSD, 10G Ethernet cluster network, 64GB RAM on the nodes, total 12 OSDs. > What SSD model (be

[ceph-users] rados bench output question

2016-09-06 Thread lists
Hi all, We're pretty new to Ceph, but loving it so far. We have a three-node cluster, four 4TB OSDs per node, journal (5GB) on SSD, 10G Ethernet cluster network, 64GB RAM on the nodes, 12 OSDs total. We noticed the following output when using rados bench: root@ceph1:~# rados bench -p
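To tie the 0 MB/s intervals to a specific device, it helps to watch disk utilisation on the OSD nodes while the bench runs (illustrative; the pool name is a placeholder, and the journal SSD's util/await columns are the interesting part):

  rados bench -p <pool> 60 write --no-cleanup   # in one terminal
  iostat -xm 2                                  # in another, on each OSD node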

[ceph-users] ceph-mon checksum mismatch after restart of servers

2016-09-06 Thread Hüning, Christian
Hi, I have a ceph-mon problem. My cluster stopped working after I had to restart the machines it is installed on. My setup includes 3 hosts, one of which is currently down, but the cluster remained healthy since the other two could form a quorum just fine. Now I restarted the hosts,
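If only this one monitor's store is corrupted and the other monitors still form a quorum, the usual recovery is to remove it and rebuild it from the surviving quorum. A rough sketch (the mon id and paths are placeholders; back up the old store first):

  ceph mon remove <mon-id>
  mv /var/lib/ceph/mon/ceph-<mon-id> /var/lib/ceph/mon/ceph-<mon-id>.bak
  ceph mon getmap -o /tmp/monmap
  ceph auth get mon. -o /tmp/mon.keyring
  ceph-mon -i <mon-id> --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
  # then start ceph-mon on that host again and let it sync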

Re: [ceph-users] RadosGW Error : Error updating periodmap, multiple master zonegroups configured

2016-09-06 Thread Yoann Moulin
On 06/09/2016 at 11:13, Orit Wasserman wrote: > you can try: > radosgw-admin zonegroup modify --zonegroup-id --master=false I tried, but I don't have any zonegroup with this ID listed; the zonegroup with this ID appears only in the zonegroup map. Anyway, I can do a zonegroup get --zonegroup-id

Re: [ceph-users] RadosGW Error : Error updating periodmap, multiple master zonegroups configured

2016-09-06 Thread Orit Wasserman
you can try: radosgw-admin zonegroup modify --zonegroup-id --master=false On Tue, Sep 6, 2016 at 11:08 AM, Yoann Moulin wrote: > Hello Orit, > >> you have two (or more) zonegroups that are set as master. > > Yes I know, but I don't know how to fix this > >> First detect

[ceph-users] Ceph hammer with mitaka integration

2016-09-06 Thread Niv Azriel
Hello fellow cephers, I was wondering if there is a better step-by-step guide for integrating Ceph with OpenStack Mitaka (especially Swift). Since Mitaka's Keystone has changed a lot with the Fernet tokens and domains, it has been rough trying to integrate our lovely Ceph into Mitaka. I would be very
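Not a full guide, but the Keystone section of the RGW's ceph.conf generally looks something like the sketch below. These are Hammer-era option names; pointing at the v2.0 admin endpoint with a shared admin token is an assumption here, and newer token formats and domain scoping may need a newer RGW release:

  [client.radosgw.gateway]
  rgw keystone url = http://keystone-host:35357
  rgw keystone admin token = <shared admin token>
  rgw keystone accepted roles = admin, _member_, Member
  rgw keystone token cache size = 500
  rgw s3 auth use keystone = true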

Re: [ceph-users] RadosGW Error : Error updating periodmap, multiple master zonegroups configured

2016-09-06 Thread Yoann Moulin
Hello Orit, > you have two (or more) zonegroups that are set as master. Yes, I know, but I don't know how to fix this. > First detect which zonegroups are problematic > by getting the zonegroup list: radosgw-admin zonegroup list I only see one zonegroup: $ radosgw-admin zonegroup list

Re: [ceph-users] RadosGW Error : Error updating periodmap, multiple master zonegroups configured

2016-09-06 Thread Orit Wasserman
Hi Yoann, you have two (or more) zonegroups that are set as master. First detect which zonegroups are problematic. Get the zonegroup list by running: radosgw-admin zonegroup list then on each zonegroup run: radosgw-admin zonegroup get --rgw-zonegroup and see in which ones is_master is true. Now you need to
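Put together, the sequence looks roughly like this (zonegroup names are placeholders; the period commit at the end makes the change take effect):

  radosgw-admin zonegroup list
  radosgw-admin zonegroup get --rgw-zonegroup=<name>        # check the is_master field
  radosgw-admin zonegroup modify --rgw-zonegroup=<name> --master=false
  radosgw-admin period update --commit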

[ceph-users] RadosGW Error : Error updating periodmap, multiple master zonegroups configured

2016-09-06 Thread Yoann Moulin
Dear List, I have an issue with my RadosGW. ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374) Linux cluster002 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Ubuntu 16.04 LTS > $ ceph -s > cluster