Re: [ceph-users] Sudden RADOS Gateway issues caused by missing xattrs

2014-02-17 Thread Wido den Hollander

On 02/16/2014 09:22 PM, Sage Weil wrote:

Hi Wido,

On Sun, 16 Feb 2014, Wido den Hollander wrote:

On 02/16/2014 06:49 PM, Gregory Farnum wrote:

Did you maybe upgrade that box to v0.67.6? This sounds like one of the
bugs Sage mentioned in it.


No, I checked it again. Version is: ceph version 0.67.5
(a60ac9194718083a4b6a225fc17cad6096c69bd1)

All machines in the cluster are on that version.


Are you sure none of the running ceph-osd processes are 0.67.6?  Maybe
check 'ceph daemon osd.NNN version'...



I double-verified it again, but they are all running 0.67.5

Since for example osd.25 is down right now I can't run 'ceph daemon', 
but the md5sum of /usr/bin/ceph-osd is the same as on the other machines 
which are all on 0.67.5


Auto updates with Apt are not enabled, so there is no way these machines 
could be running 0.67.6


So I'm still confused.

Wido


sage




Wido


-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Sun, Feb 16, 2014 at 4:23 AM, Wido den Hollander w...@42on.com wrote:

Hi,

Yesterday I got a notification that a RGW setup was having issues with
objects suddenly giving errors (403 and 404) when trying to access them.

I started digging, and after cranking up the logs with 'debug rados' and 
'debug rgw' set to 20 I found what caused RGW to throw an error:

librados: Objecter returned from getxattrs r=-2

Using 'ceph osd map .rgw.buckets <object>' I found which OSDs were primary
for that object's PG, and they all came from one machine which had had a
clean shutdown and restart just 24 hours before that.
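
For reference, the lookup itself is just (the object name here is only an example):

ceph osd map .rgw.buckets <bucket-prefix>_<object>

which prints the PG and the acting OSD set. To check the on-disk xattrs directly on the primary, something like the following should work with the default filestore layout (the path is an example for osd.25):

getfattr -d -m '.*' /var/lib/ceph/osd/ceph-25/current/<pgid>_head/<object file>

On a healthy object that should list the user.ceph.* attributes.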

After taking that machine out of production the other OSDs took over and
RGW
started serving the objects again, but I'm confused.

The underlying filesystem is XFS and all 6 filesystems were clean and
healthy. Like I said, the machine only got a clean shutdown 24 hours
before
that due to a physical migration, but that's all.

Did anybody see this before? Suddenly the xattrs for those objects were
gone.

This was with Ceph 0.67.5

--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Block Devices and OpenStack

2014-02-17 Thread Sebastien Han
Hi,

Can I see your ceph.conf?
I suspect that [client.cinder] and [client.glance] sections are missing.
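
A quick way to test the cinder key outside of OpenStack (keyring path as in the /etc/ceph listing quoted below) would be something like:

rados -n client.cinder --keyring /etc/ceph/ceph.client.cinder.keyring -p volumes ls

If that errors out, the problem is on the Ceph/keyring side; if it works, look at the cinder.conf side.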

Cheers.
 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 16 Feb 2014, at 06:55, Ashish Chandra mail.ashishchan...@gmail.com wrote:

 Hi Jean,
 
 Here is the output for ceph auth list for client.cinder
 
 client.cinder
 key: AQCKaP9ScNgiMBAAwWjFnyL69rBfMzQRSHOfoQ==
 caps: [mon] allow r
 caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
 pool=volumes, allow rx pool=images
 
 
 Here is the output of ceph -s:
 
 ashish@ceph-client:~$ ceph -s
 cluster afa13fcd-f662-4778-8389-85047645d034
  health HEALTH_OK
  monmap e1: 1 mons at {ceph-node1=10.0.1.11:6789/0}, election epoch 1, 
 quorum 0 ceph-node1
  osdmap e37: 3 osds: 3 up, 3 in
   pgmap v84: 576 pgs, 6 pools, 0 bytes data, 0 objects
 106 MB used, 9076 MB / 9182 MB avail
  576 active+clean
 
 I created all the keyrings and copied as suggested by the guide.
 
 
 
 
 
 
 On Sun, Feb 16, 2014 at 3:08 AM, Jean-Charles LOPEZ jc.lo...@inktank.com 
 wrote:
 Hi,
 
 what do you get when you run a 'ceph auth list' command for the user name 
 (client.cinder) you created for cinder? Are the caps and the key for this 
 user correct? No typo in the hostname in the cinder.conf file (host=)? Did 
 you copy the keyring to the node running cinder (can't really say from your 
 output, and there is no 'ceph -s' output to check the monitor names against)?
 
 It could just be a typo in the ceph auth get-or-create command that’s causing 
 it.
 
 Rgds
 JC
 
 
 
 On Feb 15, 2014, at 10:35, Ashish Chandra mail.ashishchan...@gmail.com 
 wrote:
 
 Hi Cephers,
 
 I am trying to configure ceph rbd as backend for cinder and glance by 
 following the steps mentioned in:
 
 http://ceph.com/docs/master/rbd/rbd-openstack/
 
 Before I start all openstack services are running normally and ceph cluster 
 health shows HEALTH_OK
 
 But once I am done with all steps and restart openstack services, 
 cinder-volume fails to start and throws an error.
 
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Traceback (most 
 recent call last):
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 262, in 
 check_for_setup_error
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd with 
 RADOSClient(self):
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 234, in __init__
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd self.cluster, 
 self.ioctx = driver._connect_to_rados(pool)
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 282, in 
 _connect_to_rados
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd client.connect()
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
 /usr/lib/python2.7/dist-packages/rados.py, line 185, in connect
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd raise 
 make_ex(ret, error calling connect)
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Error: error calling 
 connect: error code 95
 2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd
 2014-02-16 00:01:42.591 ERROR cinder.volume.manager 
 [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Error encountered 
 during initialization of driver: RBDDriver
 2014-02-16 00:01:42.592 ERROR cinder.volume.manager 
 [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Bad or unexpected 
 response from the storage volume backend API: error connecting to ceph 
 cluster
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager Traceback (most recent 
 call last):
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager   File 
 /opt/stack/cinder/cinder/volume/manager.py, line 190, in init_host
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager 
 self.driver.check_for_setup_error()
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager   File 
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 267, in 
 check_for_setup_error
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager raise 
 exception.VolumeBackendAPIException(data=msg)
 2014-02-16 00:01:42.592 TRACE cinder.volume.manager 
 VolumeBackendAPIException: Bad or unexpected response from the storage 
 volume backend API: error connecting to ceph cluster
 
 
 Here is the content of my /etc/ceph in openstack node: 
 
 ashish@ubuntu:/etc/ceph$ ls -lrt
 total 16
 -rw-r--r-- 1 cinder cinder 229 Feb 15 23:45 ceph.conf
 -rw-r--r-- 1 glance glance  65 Feb 15 23:46 ceph.client.glance.keyring
 -rw-r--r-- 1 cinder cinder  65 Feb 15 23:47 ceph.client.cinder.keyring
 -rw-r--r-- 1 cinder cinder  72 Feb 15 23:47 

[ceph-users] Journal thoughts

2014-02-17 Thread Alex Pearson
Hi All,
I've been looking, but haven't been able to find any detailed documentation 
about the journal usage on OSDs.  Does anyone have any detailed docs they could 
share?  My initial questions are:

Is the journal always write-only? (except under recovery)
I'm using BTRFS, in the default layout, which I'm thinking is very inefficient 
as it basically forces the discs to seek all the time.  (journal partition at 
start of disc)
Is there a documented process to relocate the journal, without re-creating the 
OSD?
What have other people done to optimize the journal without purchasing SSDs?
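
For what it's worth, the closest to a relocation procedure I've found so far looks roughly like this (osd.N as a placeholder) - confirmation that it's actually safe would be welcome:

stop the OSD (however your init system does that)
ceph-osd -i N --flush-journal
point 'osd journal' in ceph.conf (or the journal symlink) at the new location
ceph-osd -i N --mkjournal
start the OSD again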


On another point, I'm running on HP Microservers (slow CPU - two cores) with 5 
discs - 1x OS, 4x OSD. I've currently got separate OSDs, but have high 
load due to having more OSDs than cores in the box. I'm thinking of JBOD'ing 
the OSD discs into pairs using LVM (different sized disks) so I have only two 
OSDs. Does anyone have any opinions on the merits of this?

Also has anyone seen any CPU usage comparisons of XFS vs EXT4 vs BTRFS?

Obviously I know I'm running an enterprise system on a shoestring, but I'm 
keen to use this as a test bed to get comfortable with Ceph before recommending 
it in a real production environment, and I think optimizing and understanding 
it here could have great benefits when I scale out.

Lots of questions, and as ever any insight on any of the points would be 
appreciated!

Regards

Alex


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Error Adding Keyring Entries

2014-02-17 Thread Georgios Dimitrakakis
Could someone help me with the following error when I try to add 
keyring entries:


# ceph -k /etc/ceph/ceph.client.admin.keyring auth add 
client.radosgw.gateway -i /etc/ceph/keyring.radosgw.gateway
Error EINVAL: entity client.radosgw.gateway exists but key does not 
match

#

Best,

G.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ReAsk: how to tell ceph-mon to listen on a specific address only

2014-02-17 Thread Kurt Bauer
Hi,

that's by design. The monitor always listens on the public side only, if
a public network is defined. If you want everything in the cluster
network, just don't specify a separate public/cluster network. But
that's all documented in great detail at
http://ceph.com/docs/master/rados/configuration/network-config-ref/
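
In the config you quoted that means either dropping the cluster/public network lines altogether, or pointing the mon entries at the address the monitor actually binds to, roughly:

mon_host = 172.24.12.91
[mon.a]
host = cm
mon addr = 172.24.12.91:6789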

Best regards,
Kurt

 Ron Gage mailto:r...@rongage.org
 16. Februar 2014 19:57
 Hi everyone:

 I am still trying unsuccessfully to implement a test array for a POC. 
 It is still failing to set up - specifically, the admin keyring is not
 getting set up.

 Setup is 4 x OSD, 1 x Mon/Mgr.  The Mon machine is the only one that
 is multi-homed - eth0 on a private subnet for internal Ceph
 communications and eth1 on a so-called public interface.  The
 problem is that ceph-mon is creating the listener on the public
 interface and ceph-deploy is trying to talk to it on the private
 interface.

 [ceph@cm my-cluster]$ sudo netstat -ln
 Active Internet connections (only servers)
 Proto Recv-Q Send-Q Local Address   Foreign
 Address State
 tcp0  0 0.0.0.0:22 0.0.0.0:*   LISTEN
 tcp0  0 172.24.12.91:6789 0.0.0.0:*   LISTEN
 tcp0  0 :::22 :::*LISTEN
 udp0  0 0.0.0.0:68 0.0.0.0:*
 Active UNIX domain sockets (only servers)
 Proto RefCnt Flags   Type   State I-Node Path
 unix  2  [ ACC ] STREAM LISTENING 6706
 @/com/ubuntu/upstart
 unix  2  [ ACC ] STREAM LISTENING 20490
 /var/run/ceph/ceph-mon.cm.asok
 [ceph@cm my-cluster]$ cat ceph.conf
 [global]
 auth_service_required = cephx
 filestore_xattr_use_omap = true
 auth_client_required = cephx
 auth_cluster_required = cephx
 mon_host = 10.0.0.6
 mon_initial_members = cm
 fsid = a7e0fd33-1f75-46f0-be00-152601c8fbf2
 mon host = 10.0.0.6
 cluster network = 10.0.0.0/24
 public network = 172.24.0.0/16
 debug mon = 10
 debug ms = 1

 [mon.a]
 host = cm
 mon addr = 10.0.0.6:6789


 [ceph@cm my-cluster]$


 How can I fix this?

 Thanks!

 Ron

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


smime.p7s
Description: S/MIME Cryptographic Signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Error Adding Keyring Entries

2014-02-17 Thread Georgios Dimitrakakis
I managed to solve my problem by deleting the key from the list and 
re-adding it!
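
For the record, that was something along the lines of:

ceph auth del client.radosgw.gateway
ceph auth add client.radosgw.gateway -i /etc/ceph/keyring.radosgw.gateway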


Best,

G.

On Mon, 17 Feb 2014 10:46:36 +0200, Georgios Dimitrakakis wrote:

Could someone help me with the following error when I try to add
keyring entries:

# ceph -k /etc/ceph/ceph.client.admin.keyring auth add
client.radosgw.gateway -i /etc/ceph/keyring.radosgw.gateway
Error EINVAL: entity client.radosgw.gateway exists but key does not 
match

#

Best,

G.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


--
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem starting RADOS Gateway

2014-02-17 Thread Georgios Dimitrakakis

Could someone check this: http://pastebin.com/DsCh5YPm

and let me know what I am doing wrong?


Best,

G.

On Sat, 15 Feb 2014 20:27:16 +0200, Georgios Dimitrakakis wrote:

1) ceph -s is working as expected

# ceph -s
cluster c465bdb2-e0a5-49c8-8305-efb4234ac88a
 health HEALTH_OK
 monmap e1: 1 mons at {master=192.168.0.10:6789/0}, election
epoch 1, quorum 0 master
 mdsmap e111: 1/1/1 up {0=master=up:active}
 osdmap e114: 2 osds: 2 up, 2 in
  pgmap v414: 1200 pgs, 14 pools, 10596 bytes data, 67 objects
500 GB used, 1134 GB / 1722 GB avail
1200 active+clean


2) In /etc/ceph I have the following files

# ls -l
total 20
-rw-r--r-- 1 root root  64 Feb 14 17:10 ceph.client.admin.keyring
-rw-r--r-- 1 root root 401 Feb 15 16:57 ceph.conf
-rw-r--r-- 1 root root 196 Feb 14 20:26 ceph.log
-rw-r--r-- 1 root root 120 Feb 15 11:08 keyring.radosgw.gateway
-rwxr-xr-x 1 root root  92 Dec 21 00:47 rbdmap

3) ceph.conf content is the following

# cat ceph.conf
[global]
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 192.168.0.10
mon_initial_members = master
fsid = c465bdb2-e0a5-49c8-8305-efb4234ac88a

[client.radosgw.gateway]
host = master
keyring = /etc/ceph/keyring.radosgw.gateway
rgw socket path = /tmp/radosgw.sock
log file = /var/log/ceph/radosgw.log


4) And all the keys that exist are the following:

# ceph auth list
installed auth entries:

mds.master
key: xx==
caps: [mds] allow
caps: [mon] allow profile mds
caps: [osd] allow rwx
osd.0
key: xx==
caps: [mon] allow profile osd
caps: [osd] allow *
osd.1
key: xx==
caps: [mon] allow profile osd
caps: [osd] allow *
client.admin
key: xx==
caps: [mds] allow
caps: [mon] allow *
caps: [osd] allow *
client.bootstrap-mds
key: xx==
caps: [mon] allow profile bootstrap-mds
client.bootstrap-osd
key: AQBWLf5SGBAyBRAAzLwi5OXsAuR5vdo8hs+2zw==
caps: [mon] allow profile bootstrap-osd
client.radosgw.gateway
key: xx==
caps: [mon] allow rw
caps: [osd] allow rwx



I still don't get what is wrong...

G.

On Sat, 15 Feb 2014 16:27:41 +0100, Udo Lembke wrote:

Hi,
does ceph -s also get stuck on a missing keyring?

Do you have an keyring like:
cat /etc/ceph/keyring
[client.admin]
key = AQCdkHZR2NBYMBAATe/rqIwCI96LTuyS3gmMXp==

Or do you have another keyring defined in ceph.conf?
global-section - keyring = /etc/ceph/keyring

The key is in ceph - see
ceph auth get-key client.admin
AQCdkHZR2NBYMBAATe/rqIwCI96LTuyS3gmMXp==

or ceph auth list for all keys.
Key generation is done with get-or-create-key, like this (but in this case
for bootstrap-osd):
ceph auth get-or-create-key client.bootstrap-osd mon 'allow profile bootstrap-osd'
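
For the gateway user specifically, the guide you are following does roughly this (from memory, so double-check against the doc):

ceph-authtool --create-keyring /etc/ceph/keyring.radosgw.gateway
chmod +r /etc/ceph/keyring.radosgw.gateway
ceph-authtool /etc/ceph/keyring.radosgw.gateway -n client.radosgw.gateway --gen-key
ceph-authtool -n client.radosgw.gateway --cap osd 'allow rwx' --cap mon 'allow rw' /etc/ceph/keyring.radosgw.gateway
ceph auth add client.radosgw.gateway -i /etc/ceph/keyring.radosgw.gateway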

Udo

On 15.02.2014 15:35, Georgios Dimitrakakis wrote:

Dear all,

I am following this guide 
http://ceph.com/docs/master/radosgw/config/

to setup Object Storage on CentOS 6.5.

My problem is that when I try to start the service as indicated 
here:


http://ceph.com/docs/master/radosgw/config/#restart-services-and-start-the-gateway


I get nothing

# service ceph-radosgw start
Starting radosgw instance(s)...

and if I check if the service is running obviously it is not!

# service ceph-radosgw status
/usr/bin/radosgw is not running.


If I try to start it manually without using the service command I 
get

the following:

# /usr/bin/radosgw -d -c /etc/ceph/ceph.conf --debug_ms 10
2014-02-15 16:03:38.709235 7fb65ba64820  0 ceph version 0.72.2
(a913ded2ff138aefb8cb84d347d72164099cfd60), process radosgw, pid 
24619

2014-02-15 16:03:38.709249 7fb65ba64820 -1 WARNING: libcurl doesn't
support curl_multi_wait()
2014-02-15 16:03:38.709252 7fb65ba64820 -1 WARNING: cross zone /
region transfer performance may be affected
2014-02-15 16:03:38.713898 7fb65ba64820 10 -- :/0 ready :/0
2014-02-15 16:03:38.714323 7fb65ba64820  1 -- :/0 messenger.start
2014-02-15 16:03:38.714434 7fb65ba64820 -1 monclient(hunting): 
ERROR:

missing keyring, cannot use cephx for authentication
2014-02-15 16:03:38.714440 7fb65ba64820  0 librados: client.admin
initialization error (2) No such file or directory
2014-02-15 16:03:38.714463 7fb65ba64820 10 -- :/1024619 shutdown
:/1024619
2014-02-15 16:03:38.714468 7fb65ba64820  1 -- :/1024619 
mark_down_all
2014-02-15 16:03:38.714477 7fb65ba64820 10 -- :/1024619 wait: 
waiting

for dispatch queue
2014-02-15 16:03:38.714406 7fb64b5fe700 10 -- :/1024619 
reaper_entry

start
2014-02-15 16:03:38.714506 7fb64b5fe700 10 -- :/1024619 reaper
2014-02-15 16:03:38.714522 7fb64b5fe700 10 -- :/1024619 reaper done
2014-02-15 16:03:38.714764 7fb65ba64820 10 -- 

[ceph-users] Important note for sender / Important message for the sender (was: ceph-users Digest, Vol 13, Issue 14)

2014-02-17 Thread kudryavtsev_ia

  
  
Dear sender, if you want me to read and respond to this e-mail for sure,
please build the subject like

KUDRYAVTSEV/Who wrote/Subject.

for example, 

KUDRYAVTSEV/Bitworks/Some subject there...

Best wishes, Ivan Kudryavtsev
  

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] eu.ceph.com rsync issues resolved

2014-02-17 Thread Wido den Hollander

Hi all,

I just noticed that eu.ceph.com had some stale data since rsync wasn't 
running with the --delete option.


I've just added it to the sync script and it's syncing right now, 
shouldn't take that much time and should finish within the hour.


Btw, nice to see that ceph.com now also has an AAAA record, which means 
that both locations are available over IPv6 :-)


--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] slow requests from rados bench with small writes

2014-02-17 Thread Wido den Hollander

On 02/16/2014 05:18 PM, Sage Weil wrote:

Good catch!

It sounds like what is needed here is for the deb and rpm packages to add
/var/lib/ceph to the PRUNEPATHS in /etc/updatedb.conf.  Unfortunately
there isn't a /etc/updatedb.conf.d type file, so that promises to be
annoying.

Has anyone done this before?



No, I haven't, but I've seen this before. With Puppet I also overwrite 
this file.


Btw, I suggest we also contact Canonical to add 'ceph' to PRUNEFS, 
otherwise clients will start indexing CephFS filesystems later.
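
A sketch of what the overridden file would contain (the defaults differ per distro, so the '...' stands for whatever is already there):

PRUNEFS="... ceph"
PRUNEPATHS="... /var/lib/ceph"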


Wido


sage


On Sun, 16 Feb 2014, Dan van der Ster wrote:


After some further digging I realized that updatedb was running over
the pgs, indexing all the objects. (According to iostat, updatedb was
keeping the indexed disk 100% busy!) Oops!
Since the disks are using the deadline elevator (which by default
prioritizes reads over writes, and gives writes a deadline of 5
seconds!), it is perhaps conceivable (yet still surprising) that the
queues on a few disks were so full of reads that the writes were
starved for many 10s of seconds.

I've killed updatedb everywhere and now the rados bench below isn't
triggering slow requests.
So now I'm planning to tune deadline so it doesn't prioritize reads so
much, namely by decreasing write_expire to equal read_expire at 500ms,
and setting writes_starved to 1. Initial tests are showing that this
further decreases latency a bit -- but my hope is that this will
eliminate the possibility of a very long tail of writes. I hope that
someone will chip in if they've already been down this path and has
advice/warnings.
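
For reference, that tuning is just, per OSD data disk (sdX as a placeholder):

echo 500 > /sys/block/sdX/queue/iosched/write_expire
echo 1 > /sys/block/sdX/queue/iosched/writes_starved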

Cheers,
dan

-- Dan van der Ster || Data & Storage Services || CERN IT Department --

On Sat, Feb 15, 2014 at 11:48 PM, Dan van der Ster
daniel.vanders...@cern.ch wrote:

Dear Ceph experts,

We've found that a single client running rados bench can drive other
users, ex. RBD users, into slow requests.

Starting with a cluster that is not particularly busy, e.g. :

2014-02-15 23:14:33.714085 mon.0 xx:6789/0 725224 : [INF] pgmap
v6561996: 27952 pgs: 27952 active+clean; 66303 GB data, 224 TB used,
2850 TB / 3075 TB avail; 4880KB/s rd, 28632KB/s wr, 271op/s

We then start a rados bench writing many small objects:
rados bench -p test 60 write -t 500 -b 1024 --no-cleanup

which gives these results (note the 60s max latency!!):

Total time run: 86.351424
Total writes made: 91425
Write size: 1024
Bandwidth (MB/sec): 1.034
Stddev Bandwidth: 1.26486
Max bandwidth (MB/sec): 7.14941
Min bandwidth (MB/sec): 0
Average Latency: 0.464847
Stddev Latency: 3.04961
Max latency: 66.4363
Min latency: 0.003188

30 seconds into this bench we start seeing slow requests, not only
from bench writes but also some poor RBD clients, e.g.:

2014-02-15 23:16:02.820507 osd.483 xx:6804/46799 2201 : [WRN] slow
request 30.195634 seconds old, received at 2014-02-15 23:15:32.624641:
osd_sub_op(client.18535427.0:3922272 4.d42
4eb00d42/rbd_data.11371325138b774.6577/head//4 [] v
42083'71453 snapset=0=[]:[] snapc=0=[]) v7 currently commit sent

During a longer, many-hour instance of this small write test, some of
these RBD slow writes became very user visible, with disk flushes
being blocked long enough (120s) for the VM kernels to start
complaining.

A rados bench from a 10Gig-e client writing 4MB objects doesn't have
the same long tail of latency, namely:

# rados bench -p test 60 write -t 500 --no-cleanup
...
Total time run: 62.811466
Total writes made: 8553
Write size: 4194304
Bandwidth (MB/sec): 544.678

Stddev Bandwidth: 173.163
Max bandwidth (MB/sec): 1000
Min bandwidth (MB/sec): 0
Average Latency: 3.50719
Stddev Latency: 0.309876
Max latency: 8.04493
Min latency: 0.166138

and there are zero slow requests, at least during this 60s duration.

While the vast majority of small writes are completing with a
reasonable sub-second latency, what is causing the very long tail seen
by a few writes?? -- 60-120s!! Can someone advise us where to look in
the perf dump, etc... to find which resource/queue is being exhausted
during these tests?

Oh yeah, we're running latest dumpling stable, 0.67.5, on the servers.

Best Regards, Thanks in advance!
Dan

-- Dan van der Ster || Data & Storage Services || CERN IT Department --

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Unable to add monitor nodes

2014-02-17 Thread yalla.gnan.kumar
Hi,

I am a new user of Ceph. I have installed a three-node cluster following the Ceph 
documentation. I have added OSDs and the initial monitor.
But while adding additional monitors, I am receiving this error as shown below.


user1@cephadmin:~/my-cluster$ ceph-deploy mon create   cephnode2
[ceph_deploy.cli][INFO  ] Invoked (1.3.5): /usr/bin/ceph-deploy mon create 
cephnode2
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts cephnode2
[ceph_deploy.mon][DEBUG ] detecting platform for host cephnode2 ...
[cephnode2][DEBUG ] connected to host: cephnode2
[cephnode2][DEBUG ] detect platform information from remote host
[cephnode2][DEBUG ] detect machine type
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 12.04 precise
[cephnode2][DEBUG ] determining if provided host has same hostname in remote
[cephnode2][DEBUG ] get remote short hostname
[cephnode2][DEBUG ] deploying mon to cephnode2
[cephnode2][DEBUG ] get remote short hostname
[cephnode2][DEBUG ] remote hostname: cephnode2
[cephnode2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[cephnode2][DEBUG ] create the mon path if it does not exist
[cephnode2][DEBUG ] checking for done path: 
/var/lib/ceph/mon/ceph-cephnode2/done
[cephnode2][DEBUG ] done path does not exist: 
/var/lib/ceph/mon/ceph-cephnode2/done
[cephnode2][INFO  ] creating keyring file: 
/var/lib/ceph/tmp/ceph-cephnode2.mon.keyring
[cephnode2][DEBUG ] create the monitor keyring file
[cephnode2][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i 
cephnode2 --keyring /var/lib/ceph/tmp/ceph-cephnode2.mon.keyring
[cephnode2][DEBUG ] ceph-mon: set fsid to b3d4e423-25a2-4380-8595-8b3fae4f8806
[cephnode2][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-cephnode2 
for mon.cephnode2
[cephnode2][INFO  ] unlinking keyring file 
/var/lib/ceph/tmp/ceph-cephnode2.mon.keyring
[cephnode2][DEBUG ] create a done file to avoid re-doing the mon deployment
[cephnode2][DEBUG ] create the init path if it does not exist
[cephnode2][DEBUG ] locating the `service` executable...
[cephnode2][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph 
id=cephnode2
[cephnode2][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.cephnode2.asok mon_status
[cephnode2][ERROR ] admin_socket: exception getting command descriptions: 
[Errno 2] No such file or directory
[cephnode2][WARNIN] monitor: mon.cephnode2, might not be running yet
[cephnode2][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.cephnode2.asok mon_status
[cephnode2][ERROR ] admin_socket: exception getting command descriptions: 
[Errno 2] No such file or directory
[cephnode2][WARNIN] cephnode2 is not defined in `mon initial members`
[cephnode2][WARNIN] monitor cephnode2 does not exist in monmap
[cephnode2][WARNIN] neither `public_addr` nor `public_network` keys are defined 
for monitors
[cephnode2][WARNIN] monitors may not be able to form quorum
-

What is the error about?


Thanks
Kumar
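
PS: the last warnings seem to point at the cause: no public_network/public_addr is defined for the monitors, and cephnode2 is not listed in mon initial members. I assume the fix is along the lines of adding the following to ceph.conf (the subnet is only an example):

public_network = 10.0.0.0/24

and then re-pushing the config and retrying, e.g. with 'ceph-deploy --overwrite-conf mon create cephnode2'.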



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Unable to add monitor nodes

2014-02-17 Thread German Anders
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Block Devices and OpenStack

2014-02-17 Thread Ashish Chandra
Hi Sebastian, Jean;

This is what my ceph.conf looks like. It was auto generated using ceph-deploy.

[global]
fsid = afa13fcd-f662-4778-8389-85047645d034
mon_initial_members = ceph-node1
mon_host = 10.0.1.11
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true

If I provide the admin keyring to the OpenStack node (in /etc/ceph) it works
fine and the issue is gone.

Thanks

Ashish


On Mon, Feb 17, 2014 at 2:03 PM, Sebastien Han
sebastien@enovance.comwrote:

 Hi,

 Can I see your ceph.conf?
 I suspect that [client.cinder] and [client.glance] sections are missing.

 Cheers.
 
 Sébastien Han
 Cloud Engineer

 Always give 100%. Unless you're giving blood.

 Phone: +33 (0)1 49 70 99 72
 Mail: sebastien@enovance.com
 Address : 10, rue de la Victoire - 75009 Paris
 Web : www.enovance.com - Twitter : @enovance

 On 16 Feb 2014, at 06:55, Ashish Chandra mail.ashishchan...@gmail.com
 wrote:

  Hi Jean,
 
  Here is the output for ceph auth list for client.cinder
 
  client.cinder
  key: AQCKaP9ScNgiMBAAwWjFnyL69rBfMzQRSHOfoQ==
  caps: [mon] allow r
  caps: [osd] allow class-read object_prefix rbd_children, allow
 rwx pool=volumes, allow rx pool=images
 
 
  Here is the output of ceph -s:
 
  ashish@ceph-client:~$ ceph -s
  cluster afa13fcd-f662-4778-8389-85047645d034
   health HEALTH_OK
   monmap e1: 1 mons at {ceph-node1=10.0.1.11:6789/0}, election epoch
 1, quorum 0 ceph-node1
   osdmap e37: 3 osds: 3 up, 3 in
pgmap v84: 576 pgs, 6 pools, 0 bytes data, 0 objects
  106 MB used, 9076 MB / 9182 MB avail
   576 active+clean
 
  I created all the keyrings and copied as suggested by the guide.
 
 
 
 
 
 
  On Sun, Feb 16, 2014 at 3:08 AM, Jean-Charles LOPEZ 
 jc.lo...@inktank.com wrote:
  Hi,
 
  what do you get when you run a 'ceph auth list' command for the user
 name (client.cinder) you created for cinder? Are the caps and the key for
 this user correct? No typo in the hostname in the cinder.conf file (host=)
 ? Did you copy the keyring to the cinder running cinder (can't really say
 from your output and there is no ceph-s command to check the monitor names)?
 
  It could just be a typo in the ceph auth get-or-create command that's
 causing it.
 
  Rgds
  JC
 
 
 
  On Feb 15, 2014, at 10:35, Ashish Chandra mail.ashishchan...@gmail.com
 wrote:
 
  Hi Cephers,
 
  I am trying to configure ceph rbd as backend for cinder and glance by
 following the steps mentioned in:
 
  http://ceph.com/docs/master/rbd/rbd-openstack/
 
  Before I start all openstack services are running normally and ceph
 cluster health shows HEALTH_OK
 
  But once I am done with all steps and restart openstack services,
 cinder-volume fails to start and throws an error.
 
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Traceback (most
 recent call last):
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 262, in
 check_for_setup_error
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd with
 RADOSClient(self):
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 234, in __init__
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd
 self.cluster, self.ioctx = driver._connect_to_rados(pool)
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 282, in
 _connect_to_rados
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd
 client.connect()
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File
 /usr/lib/python2.7/dist-packages/rados.py, line 185, in connect
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd raise
 make_ex(ret, error calling connect)
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Error: error
 calling connect: error code 95
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd
  2014-02-16 00:01:42.591 ERROR cinder.volume.manager
 [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Error encountered
 during initialization of driver: RBDDriver
  2014-02-16 00:01:42.592 ERROR cinder.volume.manager
 [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Bad or unexpected
 response from the storage volume backend API: error connecting to ceph
 cluster
  2014-02-16 00:01:42.592 TRACE cinder.volume.manager Traceback (most
 recent call last):
  2014-02-16 00:01:42.592 TRACE cinder.volume.manager   File
 /opt/stack/cinder/cinder/volume/manager.py, line 190, in init_host
  2014-02-16 00:01:42.592 TRACE cinder.volume.manager
 self.driver.check_for_setup_error()
  2014-02-16 00:01:42.592 TRACE cinder.volume.manager   File
 /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 267, in
 check_for_setup_error
  2014-02-16 00:01:42.592 TRACE cinder.volume.manager raise
 exception.VolumeBackendAPIException(data=msg)
  2014-02-16 00:01:42.592 TRACE 

Re: [ceph-users] Block Devices and OpenStack

2014-02-17 Thread Sebastien Han
Hi,

If cinder-volume fails to connect and putting the admin keyring works it means 
that cinder is not configured properly.
Please also try to add the following:

[client.cinder]
keyring =  path-to-keyring

Same for Glance.

Btw: ceph.conf doesn't need to be owned by cinder; just make it mode +r and keep root 
as the owner.
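
With the keyring files from your /etc/ceph listing that would look something like:

[client.cinder]
keyring = /etc/ceph/ceph.client.cinder.keyring

[client.glance]
keyring = /etc/ceph/ceph.client.glance.keyring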

 
Sébastien Han 
Cloud Engineer 

Always give 100%. Unless you're giving blood.

Phone: +33 (0)1 49 70 99 72 
Mail: sebastien@enovance.com 
Address : 10, rue de la Victoire - 75009 Paris 
Web : www.enovance.com - Twitter : @enovance 

On 17 Feb 2014, at 14:48, Ashish Chandra mail.ashishchan...@gmail.com wrote:

 Hi Sebastian, Jean;
 
 This is my ceph.conf looks like. It was auto generated using ceph-deploy.
 
 [global]
 fsid = afa13fcd-f662-4778-8389-85047645d034
 mon_initial_members = ceph-node1
 mon_host = 10.0.1.11
 auth_cluster_required = cephx
 auth_service_required = cephx
 auth_client_required = cephx
 filestore_xattr_use_omap = true
 
 If I provide admin.keyring file to openstack node (in /etc/ceph) it works 
 fine and issue is gone .
 
 Thanks 
 
 Ashish
 
 
 On Mon, Feb 17, 2014 at 2:03 PM, Sebastien Han sebastien@enovance.com 
 wrote:
 Hi,
 
 Can I see your ceph.conf?
 I suspect that [client.cinder] and [client.glance] sections are missing.
 
 Cheers.
 
 Sébastien Han
 Cloud Engineer
 
 Always give 100%. Unless you're giving blood.
 
 Phone: +33 (0)1 49 70 99 72
 Mail: sebastien@enovance.com
 Address : 10, rue de la Victoire - 75009 Paris
 Web : www.enovance.com - Twitter : @enovance
 
 On 16 Feb 2014, at 06:55, Ashish Chandra mail.ashishchan...@gmail.com wrote:
 
  Hi Jean,
 
  Here is the output for ceph auth list for client.cinder
 
  client.cinder
  key: AQCKaP9ScNgiMBAAwWjFnyL69rBfMzQRSHOfoQ==
  caps: [mon] allow r
  caps: [osd] allow class-read object_prefix rbd_children, allow rwx 
  pool=volumes, allow rx pool=images
 
 
  Here is the output of ceph -s:
 
  ashish@ceph-client:~$ ceph -s
  cluster afa13fcd-f662-4778-8389-85047645d034
   health HEALTH_OK
   monmap e1: 1 mons at {ceph-node1=10.0.1.11:6789/0}, election epoch 1, 
  quorum 0 ceph-node1
   osdmap e37: 3 osds: 3 up, 3 in
pgmap v84: 576 pgs, 6 pools, 0 bytes data, 0 objects
  106 MB used, 9076 MB / 9182 MB avail
   576 active+clean
 
  I created all the keyrings and copied as suggested by the guide.
 
 
 
 
 
 
  On Sun, Feb 16, 2014 at 3:08 AM, Jean-Charles LOPEZ jc.lo...@inktank.com 
  wrote:
  Hi,
 
  what do you get when you run a 'ceph auth list' command for the user name 
  (client.cinder) you created for cinder? Are the caps and the key for this 
  user correct? No typo in the hostname in the cinder.conf file (host=) ? Did 
  you copy the keyring to the cinder running cinder (can’t really say from 
  your output and there is no ceph-s command to check the monitor names)?
 
  It could just be a typo in the ceph auth get-or-create command that’s 
  causing it.
 
  Rgds
  JC
 
 
 
  On Feb 15, 2014, at 10:35, Ashish Chandra mail.ashishchan...@gmail.com 
  wrote:
 
  Hi Cephers,
 
  I am trying to configure ceph rbd as backend for cinder and glance by 
  following the steps mentioned in:
 
  http://ceph.com/docs/master/rbd/rbd-openstack/
 
  Before I start all openstack services are running normally and ceph 
  cluster health shows HEALTH_OK
 
  But once I am done with all steps and restart openstack services, 
  cinder-volume fails to start and throws an error.
 
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Traceback (most 
  recent call last):
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
  /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 262, in 
  check_for_setup_error
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd with 
  RADOSClient(self):
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
  /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 234, in __init__
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd self.cluster, 
  self.ioctx = driver._connect_to_rados(pool)
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
  /opt/stack/cinder/cinder/volume/drivers/rbd.py, line 282, in 
  _connect_to_rados
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd 
  client.connect()
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd   File 
  /usr/lib/python2.7/dist-packages/rados.py, line 185, in connect
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd raise 
  make_ex(ret, error calling connect)
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd Error: error 
  calling connect: error code 95
  2014-02-16 00:01:42.582 TRACE cinder.volume.drivers.rbd
  2014-02-16 00:01:42.591 ERROR cinder.volume.manager 
  [req-8134a4d7-53f8-4ada-b4b5-4d96d7cad4bc None None] Error encountered 
  during initialization of driver: RBDDriver
  2014-02-16 00:01:42.592 ERROR cinder.volume.manager 
  

[ceph-users] ceph hadoop using ambari

2014-02-17 Thread Kesten Broughton
I posted this to ceph-devel-owner before seeing that this is the correct place 
to post.

My company is trying to evaluate virtualized hdfs clusters using ceph as a
drop-in replacement for staging and development
following http://ceph.com/docs/master/cephfs/hadoop/.  We deploy clusters
with ambari 1.3.2.

I spun up a 10 node cluster with 3 datanodes, name, secondary, 3
zookeepers, ambari master, and accumulo master.

Our process is (this was likely the cause of the shutdown errors):
1. Run ambari install
2. shut down all ambari services
3. push modified core-site.xml to datanodes, name, secondary
4. restart ambari services

I am getting errors
/usr/lib/hadoop/bin/hadoop-daemon.sh: Permission denied

in the ambari console error log from the command:
su - hdfs -c  'export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec 
/usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start
datanode'


I think this is an ambari issue, but I'm wondering
1.  Is there a detailed guide of using ambari with ceph-hadoop, or has
anyone tried it?
2.  Is there a script or list of log files useful for debugging ceph
issues in general?

thanks,

kesten


ps.
I have opened a gist via
https://gist.github.com/darKoram/9051450
and an issue on the horton forums at
http://hortonworks.com/community/forums/topic/ambari-restart-services-give-
bash-usrlibhadoopbinhadoop-daemon-sh-permiss/#post-48793

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph hadoop using ambari

2014-02-17 Thread Noah Watkins
Hi Kesten,

It's a little difficult to tell what the source of the problem is, but
looking at the gist you referenced, I don't see anything that would
indicate that Ceph is causing the issue. For instance,
hadoop-mapred-tasktracker-xxx-yyy-hdfs01.log looks like Hadoop daemons
are having problems connecting to each other. Finding out what command
in hadoop-daemon.sh is causing the permission errors might be
informative, but I don't have any experience with Ambari.
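
If it were me, I would start by checking the script's permissions and re-running it with tracing as the hdfs user, for example:

ls -l /usr/lib/hadoop/bin/hadoop-daemon.sh /usr/lib/hadoop/libexec
su - hdfs -c 'bash -x /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode'

That should at least show which command inside the script gets the permission error.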

On Mon, Feb 17, 2014 at 9:23 AM, Kesten Broughton kbrough...@21ct.com wrote:
 I posted this to ceph-devel-owner before seeing that this is the correct
 place to post.

 My company is trying to evaluate virtualized hdfs clusters using ceph as a
 drop-in replacement for staging and development
 following http://ceph.com/docs/master/cephfs/hadoop/.  We deploy clusters
 with ambari 1.3.2.

 I spun up a 10 node cluster with 3 datanodes, name, secondary, 3
 zookeepers, ambari master, and accumulo master.

 Our process is
 This was likely the cause of shutdown errors.  Should do
 1. Run ambari install
 2. shut down all ambari services
 3. push modified core-site.xml to datanodes, name, secondary
 4. restart ambari services

 I am getting errors
 /usr/lib/hadoop/bin/hadoop-daemon.sh: Permission denied

 in the ambari console error log from the command:
 su - hdfs -c  'export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec 
 /usr/lib/hadoop/bin/hadoop-daemon.sh --config /etc/hadoop/conf start
 datanode'


 I think this is an ambari issue, but I'm wondering
 1.  Is there a detailed guide of using ambari with ceph-hadoop, or has
 anyone tried it?
 2.  Is there a script or list of log files useful for debugging ceph
 issues in general?

 thanks,

 kesten


 ps.
 I have opened a gist via
 https://gist.github.com/darKoram/9051450
 and an issue on the horton forums at
 http://hortonworks.com/community/forums/topic/ambari-restart-services-give-
 bash-usrlibhadoopbinhadoop-daemon-sh-permiss/#post-48793


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD flapping during recovery

2014-02-17 Thread Craig Lewis
I had some issues with OSD flapping after 2 days of recovery.  It 
appears to be related to swapping, even though I have plenty of RAM for 
the number of OSDs I have.  The cluster was completely unusable, and I 
ended up rebooting all the nodes.  It's been great ever since, but I'm 
assuming it will happen again.


Details are below, but I'm wondering if anybody has any idea what happened?




I noticed some lumpy data distribution on my OSDs.  Following the advice 
on the mailling list, I increased the pg_num and pgp_num to the values 
from the formula.  .rgw.buckets is the only large pool, so I increased 
pg_num and pgp_num from 128 to 2048 on that one pool.  Cluster status 
changed to HEALTH_WARN, there were 1920 PGs with state 
active+remapped+wait_backfill, and 32% of the objects were degraded.
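
(For completeness, the change itself was along the lines of:

ceph osd pool set .rgw.buckets pg_num 2048
ceph osd pool set .rgw.buckets pgp_num 2048
)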


Recovery was slow, and we were having some performance issues.  I 
lowered osd_max_backfills from 10 to 2, and osd_recovery_op_priority 
from 10 to 2.  This didn't slow the recovery down much, but made my 
application much more responsive. My journals are on the OSD disks (no 
SSDs).  I believe the osd_max_backfills was the more important change, 
but it's much slower to test than the osd_recovery_op_priority change.  
Aside from those two, my notes say I changed and reverted 
osd_disk_threads, osd_op_threads, osd_recovery_threads.  All changes 
were pushed out using ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok 
config set osd_max_backfills 2
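
(The same settings can apparently also be pushed to all OSDs in one go with something like

ceph tell osd.* injectargs '--osd-max-backfills 2 --osd-recovery-op-priority 2'

but I stuck with the per-daemon admin sockets.)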



I watched the cluster on and off over the weekend.  Ceph was steadily 
recovering.  It was down to ~900 PGs in active+remapped+wait_backfill, 
with 17%  of objects degraded.  A few OSDs have been marked down and 
recovered, so a few tens of PGs are in state 
active+degraded+remapped+wait_backfill and 
active+degraded+remapped+backfilling.  I was poking around, and I 
noticed kswapd was using between 5% and 30% CPU on all nodes.  It was 
bursty, peaking at 30% CPU usage for about 5sec out of every 30sec. Swap 
usage wasn't increasing, and kswapd appeared to be doing a lot of 
nothing.  My machines have 8 OSDs, and 36GB of RAM.  top said that all 
machines were caching 30GB of data.  The 8 ceph-osd daemons are using 
0.5GB to 1.2GB of RAM.  I don't have the exact numbers, but I believe 
they were using about 5GB for all 8 ceph-osd daemons.



A few hours later, and the OSDs really started flapping.  They're being 
voted unresponsive and marked down faster than they can rejoin.  At one 
point, a third of the OSDs were marked down.  ceph -w is complaining 
about hundreds of slow requests greater than 900 seconds.  Most RGW 
accesses are failing with HTTP timeouts.  kswapd is using a consistent 
33% CPU on all nodes, with no variance that I can see.  To add insult, 
the cluster was running a scrub and a deep scrub.



I eventually rebooted all nodes in the cluster, one at a time.  Once 
quorum reestablished, recovery proceeded at the original speed.  The 
OSDs are responding, and all my RGW requests are returning in a 
reasonable amount of time.  There are no complaints of slow requests in 
ceph -w.  kswapd is using 0% of the CPU.



I'm running Ceph 0.72.2 on Ubuntu 12.04.4, with kernel 3.5.0-37-generic 
#58~precise1-Ubuntu SMP.


I monitor the running version as well as the installed version, so I 
know that all daemons were restarted after the 0.72.1 - 0.72.2 
upgrade.  That happened on Jan 22nd.




Any idea what happened?  I'm assuming it will happen again if recovery 
takes long enough.





--

*Craig Lewis*
Senior Systems Engineer
Office +1.714.602.1309
Email cle...@centraldesktop.com mailto:cle...@centraldesktop.com

*Central Desktop. Work together in ways you never thought possible.*
Connect with us Website http://www.centraldesktop.com/  | Twitter 
http://www.twitter.com/centraldesktop  | Facebook 
http://www.facebook.com/CentralDesktop  | LinkedIn 
http://www.linkedin.com/groups?gid=147417  | Blog 
http://cdblog.centraldesktop.com/


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD flapping during recovery

2014-02-17 Thread Christian Balzer
On Mon, 17 Feb 2014 11:24:42 -0800 Craig Lewis wrote:

[kswapd going bersek]
 
 Any idea what happened?  I'm assuming it will happen again if recovery 
 takes long enough.

You're running into a well known, but poorly rectified (if at all) kernel
problem, there is little Ceph has to do with it other than doing what is
supposed to, move large amounts of data around.

Check out:
https://bugzilla.redhat.com/show_bug.cgi?id=712019

and linked from there:
https://lkml.org/lkml/2013/3/17/50

Ubuntu has this bug logged as well:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/721896
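
The mitigations that usually come up in those threads (untested by me in this
context, so treat them as pointers only) are disabling transparent hugepages
and/or giving the VM subsystem more headroom, e.g.:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
sysctl -w vm.min_free_kbytes=262144   # value is only an example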

Regards,

Christian
-- 
Christian BalzerNetwork/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Maximum realistic latency between Mons, MDS, and OSDs

2014-02-17 Thread David Jericho
Hi all,

I've been playing with Ceph across high latency high speed links, with a range 
of results.

In general, Ceph MDS, monitors, and OSDs are solid across thousand kilometre 
network links. Jitter is low, latency is predictable, and capacity of the 
network is well beyond what the servers can push. I get the obvious expected 
slowdowns to do with quorum and other communications, the cluster remaining 
reliable and in sync.

What is the generally accepted maximum latency for a usable Ceph cluster 
between nodes? I realise it's quite an open-ended question, with quite a number 
of ifs and buts about it. I am however interested to hear what people have done 
in production and accepted, and what tweaks have been done.

For reference, I have an experimental cluster running across 12 ms between the 
nodes, and while IOPS is down, I'm able to write at a few hundred Mbps, which 
would cover many use cases. The latency was the obvious issue.

David Jericho
Senior Systems Administrator
AARNet Pty Ltd

t. +61 (0) 7 3317 9576 m. +61 423 027 185 e. david.jeri...@aarnet.edu.au w. 
www.aarnet.edu.au

street address: Ground Floor, 143 Coronation Drive, Milton, QLD, 4064, Australia


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com