[ceph-users] rados gateway

2017-03-21 Thread Garg, Pankaj
Hi,
I'm installing the Rados Gateway, using Jewel 10.2.5, and can't seem to find the
correct documentation.
I used ceph-deploy to start the gateway, but I can't seem to restart the process
correctly.

Can someone point me to the correct steps?
Also, how do I start my rados gateway back up?

This is what I was following :

http://docs.ceph.com/docs/jewel/install/install-ceph-gateway/

I'm on Ubuntu 16.04.
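
For reference, on Ubuntu 16.04 with systemd the gateway usually runs as a
ceph-radosgw@ instance; assuming the default instance name that ceph-deploy
creates (rgw.<hostname>), a restart would look roughly like:

sudo systemctl list-units 'ceph-radosgw@*'           # confirm the exact instance name first
sudo systemctl restart ceph-radosgw@rgw.<hostname>
sudo systemctl status ceph-radosgw@rgw.<hostname>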

Thanks
Pankaj


[ceph-users] Erasure Code Library Symbols

2017-03-06 Thread Garg, Pankaj
Hi,
I'm building Ceph 10.2.5 and doing some benchmarking with Erasure Coding.
However, I notice that perf can't find any symbols in the erasure coding libraries.
It seems those have been stripped, whereas most of the other binaries have their
symbols intact.
How can I build with symbols, or make sure they don't get stripped? I have not
been able to find a way to do this.
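
If you are using the upstream packages, installing the matching -dbg packages
(e.g. ceph-osd-dbg) may already give perf the symbols it needs; if you build your
own debs, telling the packaging step not to strip usually works (an assumption,
not verified on 10.2.5):

sudo apt-get install ceph-dbg ceph-osd-dbg                  # detached debug symbols for upstream packages
DEB_BUILD_OPTIONS="nostrip" dpkg-buildpackage -j$(nproc)    # locally built packages keep symbols in the binaries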

Thanks
Pankaj


[ceph-users] New Cluster OSD Issues

2016-09-30 Thread Garg, Pankaj
Hi,
I just created a new cluster with 0.94.8 and I'm getting this message:

2016-09-29 21:36:47.065642 mon.0 [INF] disallowing boot of OSD osd.35 
10.22.21.49:6844/9544 because the osdmap requires CEPH_FEATURE_SERVER_JEWEL but 
the osd lacks CEPH_FEATURE_SERVER_JEWEL

This is really bizarre. All the OSDs are down because of this. Can someone shed
any light?
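
A couple of hedged diagnostic commands (the usual explanation is either mixed
package versions, e.g. Jewel mons with Hammer OSDs, or a Jewel-only requirement
set on the osdmap; both are assumptions, not confirmed here):

ceph --version                  # on each node, to spot mixed package versions
ceph-osd --version              # on the OSD nodes specifically
ceph osd dump | grep -i flags   # shows any require_* flags set on the osdmap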

Thanks
Pankaj


[ceph-users] RocksDB compression

2016-07-28 Thread Garg, Pankaj
Hi,
Has anyone configured compression in RocksDB for BlueStore? Does it work?
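
(Presumably the relevant knob is the bluestore_rocksdb_options string, something
like the line below, but whether it actually works on this release is exactly the
open question; note that setting it replaces the whole default options string, so
the existing defaults would need to be carried along.)

[osd]
bluestore_rocksdb_options = compression=kSnappyCompression
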
Thanks
Pankaj


Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-14 Thread Garg, Pankaj
Disregard the last message. I'm still getting long periods of 0 IOPS.


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Thursday, July 14, 2016 10:05 AM
To: Somnath Roy; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Terrible RBD performance with Jewel

Something in this section is causing the 0 IOPS issue. I have not been able to
nail it down yet. (I did comment out the filestore_max_inline_xattr_size entries,
and the problem still exists.)
If I take out the whole [osd] section, I am able to get rid of IOPS staying at
0 for long periods of time. Performance is still not where I would expect it.
[osd]
osd_enable_op_tracker = false
osd_op_num_shards = 2
filestore_wbthrottle_enable = false
filestore_max_sync_interval = 1
filestore_odsync_write = true
#filestore_max_inline_xattr_size = 254
#filestore_max_inline_xattrs = 6
filestore_queue_committing_max_bytes = 1048576000
filestore_queue_committing_max_ops = 5000
filestore_queue_max_bytes = 1048576000
filestore_queue_max_ops = 500
journal_max_write_bytes = 1048576000
journal_max_write_entries = 1000
journal_queue_max_bytes = 1048576000
journal_queue_max_ops = 3000
filestore_fd_cache_shards = 32
filestore_fd_cache_size = 64

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 7:05 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

I am not sure whether you need to set the following. What's the point of
reducing the inline xattr settings? I forget the exact calculation, but lower
values could redirect your xattrs to omap. Better to comment those out.

filestore_max_inline_xattr_size = 254
filestore_max_inline_xattrs = 6

We could improve some of these params, but none of them seems responsible for
the behavior you are seeing.
Could you run iotop and see if any process (like xfsaild) is doing I/O on the
drives during that time?

Thanks & Regards
Somnath

From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 6:40 PM
To: Somnath Roy; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

I agree, but I'm dealing with something else here with this setup.
I just ran a test, and within 3 seconds my IOPS went to 0 and stayed there for
90 seconds, then started again and within seconds dropped back to 0.
This doesn't seem normal at all. Here is my ceph.conf:

[global]
fsid = xx
public_network = 
cluster_network = 
mon_initial_members = ceph1
mon_host = 
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_mkfs_options = -f -i size=2048 -n size=64k
osd_mount_options_xfs = inode64,noatime,logbsize=256k
filestore_merge_threshold = 40
filestore_split_multiple = 8
osd_op_threads = 12
osd_pool_default_size = 2
mon_pg_warn_max_object_skew = 10
mon_pg_warn_min_per_osd = 0
mon_pg_warn_max_per_osd = 32768
filestore_op_threads = 6

[osd]
osd_enable_op_tracker = false
osd_op_num_shards = 2
filestore_wbthrottle_enable = false
filestore_max_sync_interval = 1
filestore_odsync_write = true
filestore_max_inline_xattr_size = 254
filestore_max_inline_xattrs = 6
filestore_queue_committing_max_bytes = 1048576000
filestore_queue_committing_max_ops = 5000
filestore_queue_max_bytes = 1048576000
filestore_queue_max_ops = 500
journal_max_write_bytes = 1048576000
journal_max_write_entries = 1000
journal_queue_max_bytes = 1048576000
journal_queue_max_ops = 3000
filestore_fd_cache_shards = 32
filestore_fd_cache_size = 64


From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 6:06 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

You should do that first to get stable performance out of filestore.
A 1M sequential write over the entire image should be sufficient to precondition it.

From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 6:04 PM
To: Somnath Roy; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

No I have not.

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 6:00 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

In fact, I was wrong; I missed that you are running with 12 OSDs (considering
one OSD per SSD). In that case, it will take ~250 seconds to fill up the journals.
Have you preconditioned the entire image with a bigger block size, say 1M, before
doing any real test?

From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 5:55 PM
To: Somnath Roy; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-14 Thread Garg, Pankaj
Something in this section is causing the 0 IOPS issue. I have not been able to
nail it down yet. (I did comment out the filestore_max_inline_xattr_size entries,
and the problem still exists.)
If I take out the whole [osd] section, I am able to get rid of IOPS staying at
0 for long periods of time. Performance is still not where I would expect it.
[osd]
osd_enable_op_tracker = false
osd_op_num_shards = 2
filestore_wbthrottle_enable = false
filestore_max_sync_interval = 1
filestore_odsync_write = true
#filestore_max_inline_xattr_size = 254
#filestore_max_inline_xattrs = 6
filestore_queue_committing_max_bytes = 1048576000
filestore_queue_committing_max_ops = 5000
filestore_queue_max_bytes = 1048576000
filestore_queue_max_ops = 500
journal_max_write_bytes = 1048576000
journal_max_write_entries = 1000
journal_queue_max_bytes = 1048576000
journal_queue_max_ops = 3000
filestore_fd_cache_shards = 32
filestore_fd_cache_size = 64

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 7:05 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: RE: Terrible RBD performance with Jewel

I am not sure whether you need to set the following. What's the point of
reducing the inline xattr settings? I forget the exact calculation, but lower
values could redirect your xattrs to omap. Better to comment those out.

filestore_max_inline_xattr_size = 254
filestore_max_inline_xattrs = 6

We could improve some of these params, but none of them seems responsible for
the behavior you are seeing.
Could you run iotop and see if any process (like xfsaild) is doing I/O on the
drives during that time?

Thanks & Regards
Somnath

From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 6:40 PM
To: Somnath Roy; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

I agree, but I'm dealing with something else here with this setup.
I just ran a test, and within 3 seconds my IOPS went to 0 and stayed there for
90 seconds, then started again and within seconds dropped back to 0.
This doesn't seem normal at all. Here is my ceph.conf:

[global]
fsid = xx
public_network = 
cluster_network = 
mon_initial_members = ceph1
mon_host = 
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_mkfs_options = -f -i size=2048 -n size=64k
osd_mount_options_xfs = inode64,noatime,logbsize=256k
filestore_merge_threshold = 40
filestore_split_multiple = 8
osd_op_threads = 12
osd_pool_default_size = 2
mon_pg_warn_max_object_skew = 10
mon_pg_warn_min_per_osd = 0
mon_pg_warn_max_per_osd = 32768
filestore_op_threads = 6

[osd]
osd_enable_op_tracker = false
osd_op_num_shards = 2
filestore_wbthrottle_enable = false
filestore_max_sync_interval = 1
filestore_odsync_write = true
filestore_max_inline_xattr_size = 254
filestore_max_inline_xattrs = 6
filestore_queue_committing_max_bytes = 1048576000
filestore_queue_committing_max_ops = 5000
filestore_queue_max_bytes = 1048576000
filestore_queue_max_ops = 500
journal_max_write_bytes = 1048576000
journal_max_write_entries = 1000
journal_queue_max_bytes = 1048576000
journal_queue_max_ops = 3000
filestore_fd_cache_shards = 32
filestore_fd_cache_size = 64


From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 6:06 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

You should do that first to get stable performance out of filestore.
A 1M sequential write over the entire image should be sufficient to precondition it.

From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 6:04 PM
To: Somnath Roy; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

No I have not.

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 6:00 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

In fact, I was wrong; I missed that you are running with 12 OSDs (considering
one OSD per SSD). In that case, it will take ~250 seconds to fill up the journals.
Have you preconditioned the entire image with a bigger block size, say 1M, before
doing any real test?

From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 5:55 PM
To: Somnath Roy; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

Thanks Somnath. I will try all these, but I think there is something else going 
on too.
Firstly my test reaches 0 IOPS within 10 seconds sometimes.
Secondly, when I'm at 0 IOPS, I see NO disk activity on IOSTAT and no CPU 
activity either. This part is strange.

Thanks
Pankaj

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-13 Thread Garg, Pankaj
I agree, but I'm dealing with something else here with this setup.
I just ran a test, and within 3 seconds my IOPS went to 0 and stayed there for
90 seconds, then started again and within seconds dropped back to 0.
This doesn't seem normal at all. Here is my ceph.conf:

[global]
fsid = xx
public_network = 
cluster_network = 
mon_initial_members = ceph1
mon_host = 
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_mkfs_options = -f -i size=2048 -n size=64k
osd_mount_options_xfs = inode64,noatime,logbsize=256k
filestore_merge_threshold = 40
filestore_split_multiple = 8
osd_op_threads = 12
osd_pool_default_size = 2
mon_pg_warn_max_object_skew = 10
mon_pg_warn_min_per_osd = 0
mon_pg_warn_max_per_osd = 32768
filestore_op_threads = 6

[osd]
osd_enable_op_tracker = false
osd_op_num_shards = 2
filestore_wbthrottle_enable = false
filestore_max_sync_interval = 1
filestore_odsync_write = true
filestore_max_inline_xattr_size = 254
filestore_max_inline_xattrs = 6
filestore_queue_committing_max_bytes = 1048576000
filestore_queue_committing_max_ops = 5000
filestore_queue_max_bytes = 1048576000
filestore_queue_max_ops = 500
journal_max_write_bytes = 1048576000
journal_max_write_entries = 1000
journal_queue_max_bytes = 1048576000
journal_queue_max_ops = 3000
filestore_fd_cache_shards = 32
filestore_fd_cache_size = 64


From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 6:06 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: RE: Terrible RBD performance with Jewel

You should do that first to get stable performance out of filestore.
A 1M sequential write over the entire image should be sufficient to precondition it.

From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 6:04 PM
To: Somnath Roy; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

No I have not.

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 6:00 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

In fact, I was wrong; I missed that you are running with 12 OSDs (considering
one OSD per SSD). In that case, it will take ~250 seconds to fill up the journals.
Have you preconditioned the entire image with a bigger block size, say 1M, before
doing any real test?

From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 5:55 PM
To: Somnath Roy; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

Thanks Somnath. I will try all these, but I think there is something else going 
on too.
Firstly my test reaches 0 IOPS within 10 seconds sometimes.
Secondly, when I'm at 0 IOPS, I see NO disk activity on IOSTAT and no CPU 
activity either. This part is strange.

Thanks
Pankaj

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 5:49 PM
To: Somnath Roy; Garg, Pankaj; 
ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

Also increase the following..

filestore_op_threads

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Somnath Roy
Sent: Wednesday, July 13, 2016 5:47 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Terrible RBD performance with Jewel

Pankaj,

Could be related to the new throttle parameters introduced in Jewel. By default
these throttles are off; you need to tune them according to your setup.
What is your journal size and fio block size?
If it is the default 5GB, then at the rate you mentioned (assuming 4K random
writes) and considering 3X replication, it can fill up your journal and stall IO
within ~30 seconds or so.
If you think this is what is happening in your system, you need to turn this
throttle on (see
https://github.com/ceph/ceph/blob/jewel/src/doc/dynamic-throttle.txt) and also
lower filestore_max_sync_interval to ~1 (or even lower). Since you are testing
on SSDs, I would also recommend turning on the following parameter for stable
performance.


filestore_odsync_write = true

Thanks & Regards
Somnath
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Wednesday, July 13, 2016 4:57 PM
To: ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: [ceph-users] Terrible RBD performance with Jewel

Hi,
I just installed Jewel on a small cluster of 3 machines with 4 SSDs each. I
created 8 RBD images, and use a single client, with 8 threads, to do random
writes (using FIO with the RBD engine) on the images (1 thread per image).
The cluster has 3X replication and 10G cluster and client networks.

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-13 Thread Garg, Pankaj
No I have not.

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 6:00 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: RE: Terrible RBD performance with Jewel

In fact, I was wrong; I missed that you are running with 12 OSDs (considering
one OSD per SSD). In that case, it will take ~250 seconds to fill up the journals.
Have you preconditioned the entire image with a bigger block size, say 1M, before
doing any real test?
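
A rough back-of-the-envelope under those assumptions (the exact client rate is an
assumption; the thread mentions something in the 10-15K IOPS range):

  15,000 IOPS x 4 KB x 3 replicas            ~ 180 MB/s of journal writes
  one 5 GB journal:     5120 MB / 180 MB/s   ~ 28 s   (the "~30 seconds" case)
  12 x 5 GB journals:  61440 MB / 180 MB/s   ~ 340 s  (the same few-hundred-second range as the ~250 s estimate)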

From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 5:55 PM
To: Somnath Roy; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

Thanks Somnath. I will try all these, but I think there is something else going 
on too.
Firstly my test reaches 0 IOPS within 10 seconds sometimes.
Secondly, when I'm at 0 IOPS, I see NO disk activity on IOSTAT and no CPU 
activity either. This part is strange.

Thanks
Pankaj

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 5:49 PM
To: Somnath Roy; Garg, Pankaj; 
ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: RE: Terrible RBD performance with Jewel

Also increase the following..

filestore_op_threads

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Somnath Roy
Sent: Wednesday, July 13, 2016 5:47 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Terrible RBD performance with Jewel

Pankaj,

Could be related to the new throttle parameters introduced in Jewel. By default
these throttles are off; you need to tune them according to your setup.
What is your journal size and fio block size?
If it is the default 5GB, then at the rate you mentioned (assuming 4K random
writes) and considering 3X replication, it can fill up your journal and stall IO
within ~30 seconds or so.
If you think this is what is happening in your system, you need to turn this
throttle on (see
https://github.com/ceph/ceph/blob/jewel/src/doc/dynamic-throttle.txt) and also
lower filestore_max_sync_interval to ~1 (or even lower). Since you are testing
on SSDs, I would also recommend turning on the following parameter for stable
performance.


filestore_odsync_write = true

Thanks & Regards
Somnath
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Wednesday, July 13, 2016 4:57 PM
To: ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: [ceph-users] Terrible RBD performance with Jewel

Hi,
I just installed Jewel on a small cluster of 3 machines with 4 SSDs each. I
created 8 RBD images, and use a single client, with 8 threads, to do random
writes (using FIO with the RBD engine) on the images (1 thread per image).
The cluster has 3X replication and 10G cluster and client networks.
FIO prints the aggregate IOPS every second for the cluster. Before Jewel, I got
roughly 10K IOPS. It was up and down, but it still kept going.
Now I see IOPS that go to 13-15K, but then they drop, and eventually drop to
ZERO for several seconds, and then start back up again.

What am I missing?

Thanks
Pankaj


[ceph-users] Terrible RBD performance with Jewel

2016-07-13 Thread Garg, Pankaj
Hi,
I just installed Jewel on a small cluster of 3 machines with 4 SSDs each. I
created 8 RBD images, and use a single client, with 8 threads, to do random
writes (using FIO with the RBD engine) on the images (1 thread per image).
The cluster has 3X replication and 10G cluster and client networks.
FIO prints the aggregate IOPS every second for the cluster. Before Jewel, I got
roughly 10K IOPS. It was up and down, but it still kept going.
Now I see IOPS that go to 13-15K, but then they drop, and eventually drop to
ZERO for several seconds, and then start back up again.

What am I missing?

Thanks
Pankaj


Re: [ceph-users] OSD - Slow Requests

2016-05-05 Thread Garg, Pankaj
HI Christian,
Thanks for your response. But strangely enough, this is a new problem. I have 
used the same cluster and hardware for over a year. I have my drives in a new 
chassis now, and that is the only change.
My problem OSDs, change if I just reboot the system. Also, since this is 
benchmarking, when I reach my limit, it should throttle, and not have errors.
BTW, sometimes I'm able to run my whole benchmark writes, without any issues, 
and other times I see these errors.

Thanks
Pankaj

-Original Message-
From: Christian Balzer [mailto:ch...@gol.com] 
Sent: Wednesday, May 04, 2016 9:01 PM
To: ceph-users@lists.ceph.com
Cc: Garg, Pankaj
Subject: Re: [ceph-users] OSD - Slow Requests


Hello,

On Wed, 4 May 2016 21:08:02 + Garg, Pankaj wrote:

> Hi,
> 
> I am getting messages like the following from my Ceph systems. 
> Normally this would indicate issues with Drives. But when I restart my 
> system, different and randomly a couple OSDs again start spitting out 
> the same message. SO definitely it's not the same drives every time.
> 
> Any ideas on how to debug this. I don't see any drive related issues 
> in dmesg log either.
>

Drives having issues (as in being slow due to errors or firmware bugs) is a
possible reason, but it would not be at the top of my list.

You want to run atop, iostat or the likes and graph actual drive and various 
Ceph performance counters to see what is going on and if a particular drive is 
slower than the rest or if your whole system is just reaching the limit of its 
performance.

Looking at your ceph log output, the first thing that catches the eye is that
all the slow requests are for benchmark objects (rados bench), so you seem to be
stress testing the cluster and have found its limits...

In addition to that all the slow requests include osd.84, so you might give 
that one a closer look. 
But that could of course be a coincidence due to limited log samples.
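
For example, something along these lines (a sketch; adjust the device names and
the OSD id to your setup):

iostat -x 1                          # per-drive utilisation and await while the benchmark runs
sudo ceph daemon osd.84 perf dump    # on the host carrying osd.84, dump its performance counters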

Christian

> Thanks
> Pankaj
> 
> 
> 
> 2016-05-04 14:02:52.499115 osd.72 [WRN] slow request 30.429347 seconds 
> old, received at 2016-05-04 14:02:22.069658:
> osd_op(client.2859198.0:9559 benchmark_data_x86Ceph3_54385_object9558
> [write 0~131072] 309.17ee1e0e ack+ondisk+write+known_if_redirected
> e14815) currently waiting for subops from 84,104 2016-05-04
> 14:02:54.499453 osd.72 [WRN] 24 slow requests, 1 included below; 
> oldest blocked for > 52.866778 secs 2016-05-04 14:02:54.499467 osd.72 
> [WRN] slow request 30.660900 seconds old, received at 2016-05-04
> 14:02:23.838455: osd_op(client.2859198.0:9661
> benchmark_data_x86Ceph3_54385_object9660 [write 0~131072] 309.4054960e
> ack+ondisk+write+known_if_redirected e14815) currently waiting for
> subops from 84,104 2016-05-04 14:02:56.499822 osd.72 [WRN] 25 slow 
> requests, 1 included below; oldest blocked for > 54.867154 secs
> 2016-05-04 14:02:56.499835 osd.72 [WRN] slow request 30.940457 seconds 
> old, received at 2016-05-04 14:02:25.559273:
> osd_op(client.2859197.0:9796 benchmark_data_x86Ceph1_24943_object9795
> [write 0~131072] 308.7e0944a ack+ondisk+write+known_if_redirected
> e14815) currently waiting for subops from 84,97 2016-05-04
> 14:02:59.140562 osd.84 [WRN] 33 slow requests, 1 included below; 
> oldest blocked for > 58.267177 secs
> 
> 
> 


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Global OnLine Japan/Rakuten Communications
http://www.gol.com/


Re: [ceph-users] OSD Crashes

2016-04-29 Thread Garg, Pankaj
I think the issue is possibly coming from my journal drives after the upgrade to
Infernalis. I have 2 SSDs, which have 6 partitions each, for a total of 12
journals per server.

When I create OSDs, I pass the partition names as journals,
e.g.  ceph-deploy osd prepare x86Ceph7:/dev/sdd:/dev/sdb1

This works, but since the ownership on the journal partitions is not ceph:ceph,
everything fails until I run chown ceph:ceph /dev/sda4.
This change doesn't persist after reboots.

Any idea how to fix this?
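
(One approach that is often suggested for this, though not verified in this
thread, is to tag the journal partitions with the standard Ceph journal GPT type
code so the Ceph udev rules set ceph:ceph ownership automatically at boot:)

sudo sgdisk --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb   # partition 1 of /dev/sdb tagged as a Ceph journal
sudo partprobe /dev/sdb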

Thanks
Pankaj



-Original Message-
From: Somnath Roy [mailto:somnath@sandisk.com] 
Sent: Friday, April 29, 2016 9:03 AM
To: Garg, Pankaj; Samuel Just
Cc: ceph-users@lists.ceph.com
Subject: RE: [ceph-users] OSD Crashes

Check the system log and search for the corresponding drive. It should have
information on what is failing.
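
For instance (a hedged sketch, assuming the usual tools are installed; replace
sdX with the affected device):

dmesg | grep -iE 'xfs|i/o error'   # kernel-side errors for the affected drives
sudo smartctl -a /dev/sdX          # drive health
sudo xfs_repair -n /dev/sdX1       # read-only XFS check, with the OSD stopped and the fs unmounted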

Thanks & Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Friday, April 29, 2016 8:59 AM
To: Samuel Just
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] OSD Crashes

I can see that. I guess my question is: what would that be symptomatic of? How is
it doing that on 6 different systems and on multiple OSDs?

-Original Message-
From: Samuel Just [mailto:sj...@redhat.com]
Sent: Friday, April 29, 2016 8:57 AM
To: Garg, Pankaj
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] OSD Crashes

Your fs is throwing an EIO on open.
-Sam

On Fri, Apr 29, 2016 at 8:54 AM, Garg, Pankaj <pankaj.g...@caviumnetworks.com> 
wrote:
> Hi,
>
> I had a fully functional Ceph cluster with 3 x86 Nodes and 3 ARM64 
> nodes, each with 12 HDD Drives and 2SSD Drives. All these were 
> initially running Hammer, and then were successfully updated to Infernalis 
> (9.2.0).
>
> I recently deleted all my OSDs and swapped my drives with new ones on 
> the
> x86 Systems, and the ARM servers were swapped with different ones 
> (keeping drives same).
>
> I again provisioned the OSDs, keeping the same cluster and Ceph 
> versions as before. But now, every time I try to run RADOS bench, my 
> OSDs start crashing (on both ARM and x86 servers).
>
> I’m not sure why this is happening on all 6 systems. On the x86, it’s 
> the same Ceph bits as before, and the only thing different is the new drives.
>
> It’s the same stack (pasted below) on all the OSDs too.
>
> Can anyone provide any clues?
>
>
>
> Thanks
>
> Pankaj
>
>
>
>
>
>
>
>
>
>
>
>   -14> 2016-04-28 08:09:45.423950 7f1ef05b1700  1 --
> 192.168.240.117:6820/14377 <== osd.93 192.168.240.116:6811/47080 1236 
> 
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v
> 12284'26) v1  981+0+4759 (3923326827 0 3705383247) 0x5634cbabc400 
> con 0x5634c5168420
>
>-13> 2016-04-28 08:09:45.423981 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 2016-04-28 08:09:45.423882, event: header_read, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v
> 12284'26)
>
>-12> 2016-04-28 08:09:45.423991 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 2016-04-28 08:09:45.423884, event: throttled, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v
> 12284'26)
>
>-11> 2016-04-28 08:09:45.423996 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 2016-04-28 08:09:45.423942, event: all_read, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v
> 12284'26)
>
>-10> 2016-04-28 08:09:45.424001 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 0.00, event: dispatched, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v
> 12284'26)
>
> -9> 2016-04-28 08:09:45.424014 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 2016-04-28 08:09:45.424014, event: queued_for_pg, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v
> 12284'26)
>
> -8> 2016-04-28 08:09:45.561827 7f1f15799700  5 osd.102 12284 
> tick_without_osd_lock
>
> -7> 2016-04-28 08:09:45.973944 7f1f0801a700  1 --
> 192.168.240.117:6821/14377 <== osd.73 192.168.240.115:0/26572 1306 
>  osd_ping(ping e12284 stamp 2016-04-28 08:09:45.971751) v2 
> 47+0+0
> (846632602 0 0) 0x5634c8305c00 con 0x5634c58dd760
>
> -6> 2016-04-28 08:09:45.973995 7f1f0801a700  1 --
> 192.168.240.117:6821/14377 --> 192.168.240.115:0/26572 -- 
> osd_ping(ping_reply e12284 stamp 2016-04-28 08:09:45.971751) v2 -- ?+0
> 0x5634c7ba8000 con 0x5634c58dd760
>
>   

Re: [ceph-users] OSD Crashes

2016-04-29 Thread Garg, Pankaj
I can see that. I guess my question is: what would that be symptomatic of? How is
it doing that on 6 different systems and on multiple OSDs?

-Original Message-
From: Samuel Just [mailto:sj...@redhat.com] 
Sent: Friday, April 29, 2016 8:57 AM
To: Garg, Pankaj
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] OSD Crashes

Your fs is throwing an EIO on open.
-Sam

On Fri, Apr 29, 2016 at 8:54 AM, Garg, Pankaj <pankaj.g...@caviumnetworks.com> 
wrote:
> Hi,
>
> I had a fully functional Ceph cluster with 3 x86 Nodes and 3 ARM64 
> nodes, each with 12 HDD Drives and 2SSD Drives. All these were 
> initially running Hammer, and then were successfully updated to Infernalis 
> (9.2.0).
>
> I recently deleted all my OSDs and swapped my drives with new ones on 
> the
> x86 Systems, and the ARM servers were swapped with different ones 
> (keeping drives same).
>
> I again provisioned the OSDs, keeping the same cluster and Ceph 
> versions as before. But now, every time I try to run RADOS bench, my 
> OSDs start crashing (on both ARM and x86 servers).
>
> I’m not sure why this is happening on all 6 systems. On the x86, it’s 
> the same Ceph bits as before, and the only thing different is the new drives.
>
> It’s the same stack (pasted below) on all the OSDs too.
>
> Can anyone provide any clues?
>
>
>
> Thanks
>
> Pankaj
>
>
>
>
>
>
>
>
>
>
>
>   -14> 2016-04-28 08:09:45.423950 7f1ef05b1700  1 --
> 192.168.240.117:6820/14377 <== osd.93 192.168.240.116:6811/47080 1236 
> 
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 
> 12284'26) v1  981+0+4759 (3923326827 0 3705383247) 0x5634cbabc400 
> con 0x5634c5168420
>
>-13> 2016-04-28 08:09:45.423981 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 2016-04-28 08:09:45.423882, event: header_read, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 
> 12284'26)
>
>-12> 2016-04-28 08:09:45.423991 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 2016-04-28 08:09:45.423884, event: throttled, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 
> 12284'26)
>
>-11> 2016-04-28 08:09:45.423996 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 2016-04-28 08:09:45.423942, event: all_read, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 
> 12284'26)
>
>-10> 2016-04-28 08:09:45.424001 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 0.00, event: dispatched, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 
> 12284'26)
>
> -9> 2016-04-28 08:09:45.424014 7f1ef05b1700  5 -- op tracker -- seq:
> 29404, time: 2016-04-28 08:09:45.424014, event: queued_for_pg, op:
> osd_repop(client.2794263.0:37721 284.6d4 
> 284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 
> 12284'26)
>
> -8> 2016-04-28 08:09:45.561827 7f1f15799700  5 osd.102 12284 
> tick_without_osd_lock
>
> -7> 2016-04-28 08:09:45.973944 7f1f0801a700  1 --
> 192.168.240.117:6821/14377 <== osd.73 192.168.240.115:0/26572 1306 
>  osd_ping(ping e12284 stamp 2016-04-28 08:09:45.971751) v2  
> 47+0+0
> (846632602 0 0) 0x5634c8305c00 con 0x5634c58dd760
>
> -6> 2016-04-28 08:09:45.973995 7f1f0801a700  1 --
> 192.168.240.117:6821/14377 --> 192.168.240.115:0/26572 -- 
> osd_ping(ping_reply e12284 stamp 2016-04-28 08:09:45.971751) v2 -- ?+0
> 0x5634c7ba8000 con 0x5634c58dd760
>
> -5> 2016-04-28 08:09:45.974300 7f1f0981d700  1 --
> 10.18.240.117:6821/14377 <== osd.73 192.168.240.115:0/26572 1306  
> osd_ping(ping e12284 stamp 2016-04-28 08:09:45.971751) v2  47+0+0
> (846632602 0 0) 0x5634c8129400 con 0x5634c58dcf20
>
> -4> 2016-04-28 08:09:45.974337 7f1f0981d700  1 --
> 10.18.240.117:6821/14377 --> 192.168.240.115:0/26572 -- 
> osd_ping(ping_reply
> e12284 stamp 2016-04-28 08:09:45.971751) v2 -- ?+0 0x5634c617d600 con
> 0x5634c58dcf20
>
> -3> 2016-04-28 08:09:46.174079 7f1f11f92700  0
> filestore(/var/lib/ceph/osd/ceph-102) write couldn't open
> 287.6f9_head/287/ae33fef9/benchmark_data_ceph7_17591_object39895/head: 
> (117) Structure needs cleaning
>
> -2> 2016-04-28 08:09:46.174103 7f1f11f92700  0
> filestore(/var/lib/ceph/osd/ceph-102)  error (117) Structure needs 
> cleaning not handled on operation 0x5634c885df9e (16590.1.0, or op 0, 
> counting from
> 0)
>
> -1> 2016-04-28 08:09:46.174109 7f1f11f92700  0
> filestore(/var/lib/ceph/o

[ceph-users] OSD Crashes

2016-04-29 Thread Garg, Pankaj
Hi,
I had a fully functional Ceph cluster with 3 x86 nodes and 3 ARM64 nodes, each
with 12 HDD drives and 2 SSD drives. All these were initially running Hammer,
and were then successfully updated to Infernalis (9.2.0).
I recently deleted all my OSDs and swapped my drives with new ones on the x86 
Systems, and the ARM servers were swapped with different ones (keeping drives 
same).
I again provisioned the OSDs, keeping the same cluster and Ceph versions as 
before. But now, every time I try to run RADOS bench, my OSDs start crashing 
(on both ARM and x86 servers).
I'm not sure why this is happening on all 6 systems. On the x86, it's the same 
Ceph bits as before, and the only thing different is the new drives.
It's the same stack (pasted below) on all the OSDs too.
Can anyone provide any clues?

Thanks
Pankaj





  -14> 2016-04-28 08:09:45.423950 7f1ef05b1700  1 -- 192.168.240.117:6820/14377 
<== osd.93 192.168.240.116:6811/47080 1236  
osd_repop(client.2794263.0:37721 284.6d4 
284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 12284'26) v1 
 981+0+4759 (3923326827 0 3705383247) 0x5634cbabc400 con 0x5634c5168420
   -13> 2016-04-28 08:09:45.423981 7f1ef05b1700  5 -- op tracker -- seq: 29404, 
time: 2016-04-28 08:09:45.423882, event: header_read, op: 
osd_repop(client.2794263.0:37721 284.6d4 
284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 12284'26)
   -12> 2016-04-28 08:09:45.423991 7f1ef05b1700  5 -- op tracker -- seq: 29404, 
time: 2016-04-28 08:09:45.423884, event: throttled, op: 
osd_repop(client.2794263.0:37721 284.6d4 
284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 12284'26)
   -11> 2016-04-28 08:09:45.423996 7f1ef05b1700  5 -- op tracker -- seq: 29404, 
time: 2016-04-28 08:09:45.423942, event: all_read, op: 
osd_repop(client.2794263.0:37721 284.6d4 
284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 12284'26)
   -10> 2016-04-28 08:09:45.424001 7f1ef05b1700  5 -- op tracker -- seq: 29404, 
time: 0.00, event: dispatched, op: osd_repop(client.2794263.0:37721 284.6d4 
284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 12284'26)
-9> 2016-04-28 08:09:45.424014 7f1ef05b1700  5 -- op tracker -- seq: 29404, 
time: 2016-04-28 08:09:45.424014, event: queued_for_pg, op: 
osd_repop(client.2794263.0:37721 284.6d4 
284/afa8fed4/benchmark_data_x86Ceph1_147212_object37720/head v 12284'26)
-8> 2016-04-28 08:09:45.561827 7f1f15799700  5 osd.102 12284 
tick_without_osd_lock
-7> 2016-04-28 08:09:45.973944 7f1f0801a700  1 -- 
192.168.240.117:6821/14377 <== osd.73 192.168.240.115:0/26572 1306  
osd_ping(ping e12284 stamp 2016-04-28 08:09:45.971751) v2  47+0+0 
(846632602 0 0) 0x5634c8305c00 con 0x5634c58dd760
-6> 2016-04-28 08:09:45.973995 7f1f0801a700  1 -- 
192.168.240.117:6821/14377 --> 192.168.240.115:0/26572 -- osd_ping(ping_reply 
e12284 stamp 2016-04-28 08:09:45.971751) v2 -- ?+0 0x5634c7ba8000 con 
0x5634c58dd760
-5> 2016-04-28 08:09:45.974300 7f1f0981d700  1 -- 10.18.240.117:6821/14377 
<== osd.73 192.168.240.115:0/26572 1306  osd_ping(ping e12284 stamp 
2016-04-28 08:09:45.971751) v2  47+0+0 (846632602 0 0) 0x5634c8129400 con 
0x5634c58dcf20
-4> 2016-04-28 08:09:45.974337 7f1f0981d700  1 -- 10.18.240.117:6821/14377 
--> 192.168.240.115:0/26572 -- osd_ping(ping_reply e12284 stamp 2016-04-28 
08:09:45.971751) v2 -- ?+0 0x5634c617d600 con 0x5634c58dcf20
-3> 2016-04-28 08:09:46.174079 7f1f11f92700  0 
filestore(/var/lib/ceph/osd/ceph-102) write couldn't open 
287.6f9_head/287/ae33fef9/benchmark_data_ceph7_17591_object39895/head: (117) 
Structure needs cleaning
-2> 2016-04-28 08:09:46.174103 7f1f11f92700  0 
filestore(/var/lib/ceph/osd/ceph-102)  error (117) Structure needs cleaning not 
handled on operation 0x5634c885df9e (16590.1.0, or op 0, counting from 0)
-1> 2016-04-28 08:09:46.174109 7f1f11f92700  0 
filestore(/var/lib/ceph/osd/ceph-102) unexpected error code
 0> 2016-04-28 08:09:46.178707 7f1f11791700 -1 os/FileStore.cc: In function 
'int FileStore::lfn_open(coll_t, const ghobject_t&, bool, FDRef*, Index*)' 
thread 7f1f11791700 time 2016-04-28 08:09:46.173250
os/FileStore.cc: 335: FAILED assert(!m_filestore_fail_eio || r != -5)

ceph version 9.2.1 (752b6a3020c3de74e07d2a8b4c5e48dab5a6b6fd)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) 
[0x5634c02ec7eb]
2: (FileStore::lfn_open(coll_t, ghobject_t const&, bool, 
std::shared_ptr*, Index*)+0x1191) [0x5634bffb2d01]
3: (FileStore::_write(coll_t, ghobject_t const&, unsigned long, unsigned long, 
ceph::buffer::list const&, unsigned int)+0xf0) [0x5634bffbb7b0]
4: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, 
ThreadPool::TPHandle*)+0x2901) [0x5634bffc6f51]
5: (FileStore::_do_transactions(std::list >&, unsigned long, 
ThreadPool::TPHandle*)+0x64) [0x5634bffcc404]
6: 

Re: [ceph-users] INFARNALIS with 64K Kernel PAGES

2016-03-01 Thread Garg, Pankaj
The OSDs were created with a 64K page size, and mkfs was done with the same size.
After the upgrade, I have not changed anything on the machine (except applying
the ownership fix for files for user ceph:ceph).

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Tuesday, March 01, 2016 9:32 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: RE: INFARNALIS with 64K Kernel PAGES

Did you recreate the OSDs on this setup, meaning did you do mkfs with a 64K page
size?

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Tuesday, March 01, 2016 9:07 PM
To: ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: [ceph-users] INFARNALIS with 64K Kernel PAGES

Hi,
Is there a known issue with using 64K Kernel PAGE_SIZE?
I am using ARM64 systems, and I upgraded from 0.94.4 to 9.2.1 today. The system
that was on a 4K page size came up OK and its OSDs are all online.
Systems with a 64K page size all see the OSDs crash with the following stack:

--- begin dump of recent events ---
   -54> 2016-03-01 20:52:56.489752 97e38f10  5 asok(0xff6c) 
register_command perfcounters_dump hook 0xff63c030
   -53> 2016-03-01 20:52:56.489798 97e38f10  5 asok(0xff6c) 
register_command 1 hook 0xff63c030
   -52> 2016-03-01 20:52:56.489809 97e38f10  5 asok(0xff6c) 
register_command perf dump hook 0xff63c030
   -51> 2016-03-01 20:52:56.489819 97e38f10  5 asok(0xff6c) 
register_command perfcounters_schema hook 0xff63c030
   -50> 2016-03-01 20:52:56.489829 97e38f10  5 asok(0xff6c) 
register_command 2 hook 0xff63c030
   -49> 2016-03-01 20:52:56.489839 97e38f10  5 asok(0xff6c) 
register_command perf schema hook 0xff63c030
   -48> 2016-03-01 20:52:56.489849 97e38f10  5 asok(0xff6c) 
register_command perf reset hook 0xff63c030
   -47> 2016-03-01 20:52:56.489858 97e38f10  5 asok(0xff6c) 
register_command config show hook 0xff63c030
   -46> 2016-03-01 20:52:56.489868 97e38f10  5 asok(0xff6c) 
register_command config set hook 0xff63c030
   -45> 2016-03-01 20:52:56.489877 97e38f10  5 asok(0xff6c) 
register_command config get hook 0xff63c030
   -44> 2016-03-01 20:52:56.489886 97e38f10  5 asok(0xff6c) 
register_command config diff hook 0xff63c030
   -43> 2016-03-01 20:52:56.489896 97e38f10  5 asok(0xff6c) 
register_command log flush hook 0xff63c030
   -42> 2016-03-01 20:52:56.489905 97e38f10  5 asok(0xff6c) 
register_command log dump hook 0xff63c030
   -41> 2016-03-01 20:52:56.489914 97e38f10  5 asok(0xff6c) 
register_command log reopen hook 0xff63c030
   -40> 2016-03-01 20:52:56.497924 97e38f10  0 set uid:gid to 64045:64045
   -39> 2016-03-01 20:52:56.498074 97e38f10  0 ceph version 9.2.1 
(752b6a3020c3de74e07d2a8b4c5e48dab5a6b6fd), process ceph-osd, pid 17095
   -38> 2016-03-01 20:52:56.499547 97e38f10  1 -- 10.18.240.124:0/0 learned 
my addr 10.18.240.124:0/0
   -37> 2016-03-01 20:52:56.499572 97e38f10  1 accepter.accepter.bind 
my_inst.addr is 10.18.240.124:6802/17095 need_addr=0
   -36> 2016-03-01 20:52:56.499620 97e38f10  1 -- 192.168.240.124:0/0 
learned my addr 192.168.240.124:0/0
   -35> 2016-03-01 20:52:56.499638 97e38f10  1 accepter.accepter.bind 
my_inst.addr is 192.168.240.124:6802/17095 need_addr=0
   -34> 2016-03-01 20:52:56.499673 97e38f10  1 -- 192.168.240.124:0/0 
learned my addr 192.168.240.124:0/0
   -33> 2016-03-01 20:52:56.499690 97e38f10  1 accepter.accepter.bind 
my_inst.addr is 192.168.240.124:6803/17095 need_addr=0
   -32> 2016-03-01 20:52:56.499724 97e38f10  1 -- 10.18.240.124:0/0 learned 
my addr 10.18.240.124:0/0
   -31> 2016-03-01 20:52:56.499741 97e38f10  1 accepter.accepter.bind 
my_inst.addr is 10.18.240.124:6803/17095 need_addr=0
   -30> 2016-03-01 20:52:56.503307 97e38f10  5 asok(0xff6c) init 
/var/run/ceph/ceph-osd.100.asok
   -29> 2016-03-01 20:52:56.503329 97e38f10  5 asok(0xff6c) 
bind_and_listen /var/run/ceph/ceph-osd.100.asok
   -28> 2016-03-01 20:52:56.503460 97e38f10  5 asok(0xff6c) 
register_command 0 hook 0xff6380c0
   -27> 2016-03-01 20:52:56.503479 97e38f10  5 asok(0xff6c) 
register_command version hook 0xff6380c0
   -26> 2016-03-01 20:52:56.503490 97e38f10  5 asok(0xff6c) 
register_command git_version hook 0xff6380c0
   -25> 2016-03-01 20:52:56.503500 97e38f10  5 asok(0xff6c) 
register_command help hook 0xff63c1e0
   -24> 2016-03-01 20:52:56.503510 97e38f10  5 asok(0xff6c) 
register_command get_command_descriptions hook 0xff63c1f0
   -23> 2016-03-01 20:52:56.503566 9643f030  5 asok(0xff6c) entry 
start
   -22> 2016-03-01 20:52:56.503635 97e38f10 10 monclient(hunting): 
build_init

Re: [ceph-users] Upgrade to INFERNALIS

2016-03-01 Thread Garg, Pankaj
Thanks François. That was the issue. After changing Journal partition 
permissions, things look better now.

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Francois Lafont
Sent: Tuesday, March 01, 2016 4:06 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Upgrade to INFERNALIS

Hi,

On 02/03/2016 00:12, Garg, Pankaj wrote:

> I have upgraded my cluster from 0.94.4 as recommended to the just released 
> Infernalis (9.2.1) Update directly (skipped 9.2.0).
> I installed the packaged on each system, manually (.deb files that I built).
> 
> After that I followed the steps :
> 
> Stop ceph-all
> chown -R  ceph:ceph /var/lib/ceph
> start ceph-all

Ok, and the journals?

> I am still getting errors on starting OSDs.
> 
> 2016-03-01 22:44:45.991043 7fa185f000 -1 filestore(/var/lib/ceph/osd/ceph-69) 
> mount failed to open journal /var/lib/ceph/osd/ceph-69/journal: (13) 
> Permission denied

I suppose your journal is a symlink which points to a raw partition, correct?
In this case, the ceph Unix account currently seems unable to read and write to
this partition. If this partition is /dev/sdb2 (for instance), you have to set
the Unix rights on this "file" /dev/sdb2 (manually or via a udev rule).
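
As an illustration of the udev-rule route (the rule file name and the sdb2 match
are only examples):

# /etc/udev/rules.d/90-local-ceph-journal.rules
KERNEL=="sdb2", OWNER="ceph", GROUP="ceph", MODE="0660"

# then reload and re-trigger udev (or simply reboot):
sudo udevadm control --reload-rules && sudo udevadm trigger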

> 2016-03-01 22:44:46.001112 7fa185f000 -1 osd.69 0 OSD:init: unable to mount 
> object store
> 2016-03-01 22:44:46.001128 7fa185f000 -1  ** ERROR: osd init failed: (13) 
> Permission denied
> 
> 
> What am I missing?

I think you missed setting the Unix rights on the journal partitions. The ceph
account must be able to read/write in /var/lib/ceph/osd/$cluster-$id/ _and_ in
the journal partitions too.

Regards.

-- 
François Lafont


[ceph-users] INFARNALIS with 64K Kernel PAGES

2016-03-01 Thread Garg, Pankaj
Hi,
Is there a known issue with using 64K Kernel PAGE_SIZE?
I am using ARM64 systems, and I upgraded from 0.94.4 to 9.2.1 today. The system
that was on a 4K page size came up OK and its OSDs are all online.
Systems with a 64K page size all see the OSDs crash with the following stack:

--- begin dump of recent events ---
   -54> 2016-03-01 20:52:56.489752 97e38f10  5 asok(0xff6c) 
register_command perfcounters_dump hook 0xff63c030
   -53> 2016-03-01 20:52:56.489798 97e38f10  5 asok(0xff6c) 
register_command 1 hook 0xff63c030
   -52> 2016-03-01 20:52:56.489809 97e38f10  5 asok(0xff6c) 
register_command perf dump hook 0xff63c030
   -51> 2016-03-01 20:52:56.489819 97e38f10  5 asok(0xff6c) 
register_command perfcounters_schema hook 0xff63c030
   -50> 2016-03-01 20:52:56.489829 97e38f10  5 asok(0xff6c) 
register_command 2 hook 0xff63c030
   -49> 2016-03-01 20:52:56.489839 97e38f10  5 asok(0xff6c) 
register_command perf schema hook 0xff63c030
   -48> 2016-03-01 20:52:56.489849 97e38f10  5 asok(0xff6c) 
register_command perf reset hook 0xff63c030
   -47> 2016-03-01 20:52:56.489858 97e38f10  5 asok(0xff6c) 
register_command config show hook 0xff63c030
   -46> 2016-03-01 20:52:56.489868 97e38f10  5 asok(0xff6c) 
register_command config set hook 0xff63c030
   -45> 2016-03-01 20:52:56.489877 97e38f10  5 asok(0xff6c) 
register_command config get hook 0xff63c030
   -44> 2016-03-01 20:52:56.489886 97e38f10  5 asok(0xff6c) 
register_command config diff hook 0xff63c030
   -43> 2016-03-01 20:52:56.489896 97e38f10  5 asok(0xff6c) 
register_command log flush hook 0xff63c030
   -42> 2016-03-01 20:52:56.489905 97e38f10  5 asok(0xff6c) 
register_command log dump hook 0xff63c030
   -41> 2016-03-01 20:52:56.489914 97e38f10  5 asok(0xff6c) 
register_command log reopen hook 0xff63c030
   -40> 2016-03-01 20:52:56.497924 97e38f10  0 set uid:gid to 64045:64045
   -39> 2016-03-01 20:52:56.498074 97e38f10  0 ceph version 9.2.1 
(752b6a3020c3de74e07d2a8b4c5e48dab5a6b6fd), process ceph-osd, pid 17095
   -38> 2016-03-01 20:52:56.499547 97e38f10  1 -- 10.18.240.124:0/0 learned 
my addr 10.18.240.124:0/0
   -37> 2016-03-01 20:52:56.499572 97e38f10  1 accepter.accepter.bind 
my_inst.addr is 10.18.240.124:6802/17095 need_addr=0
   -36> 2016-03-01 20:52:56.499620 97e38f10  1 -- 192.168.240.124:0/0 
learned my addr 192.168.240.124:0/0
   -35> 2016-03-01 20:52:56.499638 97e38f10  1 accepter.accepter.bind 
my_inst.addr is 192.168.240.124:6802/17095 need_addr=0
   -34> 2016-03-01 20:52:56.499673 97e38f10  1 -- 192.168.240.124:0/0 
learned my addr 192.168.240.124:0/0
   -33> 2016-03-01 20:52:56.499690 97e38f10  1 accepter.accepter.bind 
my_inst.addr is 192.168.240.124:6803/17095 need_addr=0
   -32> 2016-03-01 20:52:56.499724 97e38f10  1 -- 10.18.240.124:0/0 learned 
my addr 10.18.240.124:0/0
   -31> 2016-03-01 20:52:56.499741 97e38f10  1 accepter.accepter.bind 
my_inst.addr is 10.18.240.124:6803/17095 need_addr=0
   -30> 2016-03-01 20:52:56.503307 97e38f10  5 asok(0xff6c) init 
/var/run/ceph/ceph-osd.100.asok
   -29> 2016-03-01 20:52:56.503329 97e38f10  5 asok(0xff6c) 
bind_and_listen /var/run/ceph/ceph-osd.100.asok
   -28> 2016-03-01 20:52:56.503460 97e38f10  5 asok(0xff6c) 
register_command 0 hook 0xff6380c0
   -27> 2016-03-01 20:52:56.503479 97e38f10  5 asok(0xff6c) 
register_command version hook 0xff6380c0
   -26> 2016-03-01 20:52:56.503490 97e38f10  5 asok(0xff6c) 
register_command git_version hook 0xff6380c0
   -25> 2016-03-01 20:52:56.503500 97e38f10  5 asok(0xff6c) 
register_command help hook 0xff63c1e0
   -24> 2016-03-01 20:52:56.503510 97e38f10  5 asok(0xff6c) 
register_command get_command_descriptions hook 0xff63c1f0
   -23> 2016-03-01 20:52:56.503566 9643f030  5 asok(0xff6c) entry 
start
   -22> 2016-03-01 20:52:56.503635 97e38f10 10 monclient(hunting): 
build_initial_monmap
   -21> 2016-03-01 20:52:56.520227 97e38f10  5 adding auth protocol: cephx
   -20> 2016-03-01 20:52:56.520244 97e38f10  5 adding auth protocol: cephx
   -19> 2016-03-01 20:52:56.520427 97e38f10  5 asok(0xff6c) 
register_command objecter_requests hook 0xff63c2b0
   -18> 2016-03-01 20:52:56.520538 97e38f10  1 -- 10.18.240.124:6802/17095 
messenger.start
   -17> 2016-03-01 20:52:56.520601 97e38f10  1 -- :/0 messenger.start
   -16> 2016-03-01 20:52:56.520655 97e38f10  1 -- 10.18.240.124:6803/17095 
messenger.start
   -15> 2016-03-01 20:52:56.520712 97e38f10  1 -- 
192.168.240.124:6803/17095 messenger.start
   -14> 2016-03-01 20:52:56.520768 97e38f10  1 -- 
192.168.240.124:6802/17095 messenger.start
   -13> 2016-03-01 20:52:56.520824 97e38f10  1 -- :/0 

[ceph-users] Upgrade to INFERNALIS

2016-03-01 Thread Garg, Pankaj
Hi,
I have upgraded my cluster from 0.94.4, as recommended, directly to the
just-released Infernalis (9.2.1) update (skipped 9.2.0).
I installed the packages on each system manually (.deb files that I built).

After that I followed these steps:

Stop ceph-all
chown -R  ceph:ceph /var/lib/ceph
start ceph-all


I am still getting errors on starting OSDs.

2016-03-01 22:44:45.991043 7fa185f000 -1 filestore(/var/lib/ceph/osd/ceph-69) 
mount failed to open journal /var/lib/ceph/osd/ceph-69/journal: (13) Permission 
denied
2016-03-01 22:44:46.001112 7fa185f000 -1 osd.69 0 OSD:init: unable to mount 
object store
2016-03-01 22:44:46.001128 7fa185f000 -1  ** ERROR: osd init failed: (13) 
Permission denied


What am I missing?

Thanks
Pankaj


[ceph-users] Ceph Read Performance Issues

2015-07-09 Thread Garg, Pankaj
Hi,
I'm experiencing READ performance issues in my cluster. I have 3 x86 servers,
each with 2 SSDs and 9 OSDs. The SSDs are being used for journaling.
I seem to get erratic READ performance numbers when using the rados bench read test.

I ran a test with just a single x86 server, with 2 SSDs and 9 OSDs. The pool had
a replication factor of 3.

Write Bandwidth : 527 MB/Sec (rados bench write with default options)

Read Bandwidth : Run 1 : 201 MB/Sec (rados bench read with the same pool as the write)

Read Bandwidth : Run 2 : 381 MB/Sec

Read Bandwidth : Run 3 : 482 MB/Sec

In Runs 2 and 3, I start off at 1100 MB/Sec (basically maxing out my 10G link),
but by the time the 60-second read test finishes, the bandwidth drops into the
400 MB/Sec range.

Any ideas as to what might be going wrong? Overall my writes are faster than my
reads. That doesn't seem correct.
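
For what it's worth, one way to make rados bench read runs comparable (a sketch,
assuming a pool named 'test') is to write the dataset once with --no-cleanup,
drop the page cache on the OSD nodes so reads are not served from RAM, and then
run the sequential read:

rados bench -p test 60 write --no-cleanup
echo 3 | sudo tee /proc/sys/vm/drop_caches    # on every OSD node
rados bench -p test 60 seq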


Thanks
Pankaj






[ceph-users] Block Size

2015-06-19 Thread Garg, Pankaj
Hi,

I have been formatting my OSD drives with XFS (using mkfs.xfs) with the default
options. Is it recommended for Ceph to choose a bigger block size?
I'd like to understand the impact of block size. Any recommendations?
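
For comparison, a non-default layout that appears in other threads here uses a
larger inode size and a 64k directory block size (whether it helps a given
workload is something to measure, not a given):

mkfs.xfs -f -i size=2048 -n size=64k /dev/sdX1
mount -o inode64,noatime,logbsize=256k /dev/sdX1 /var/lib/ceph/osd/ceph-N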

Thanks
Pankaj


[ceph-users] Erasure Coded Pools and PGs

2015-06-17 Thread Garg, Pankaj
Hi,

I have 5 OSD servers, with a total of 45 OSDs in my cluster. I am trying out
erasure coding with different K and M values.
I always seem to get warnings about degraded and undersized PGs whenever I
create a profile and create a pool based on that profile.
I have profiles with K and M value pairs of (2,1), (3,3) and (5,3).
What would be appropriate PG values? I have tried from as low as 12 up to 1024
and always get the degraded and undersized PGs. This is quite confusing.
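
A likely explanation (inferred, not stated in this thread): with the default
failure domain of 'host', each PG needs K+M distinct hosts, so (3,3) and (5,3)
cannot be satisfied on 5 servers and those PGs stay undersized regardless of the
PG count. A hedged sketch of working around it with an OSD-level failure domain
(Hammer-era syntax):

ceph osd erasure-code-profile set ec53 k=5 m=3 ruleset-failure-domain=osd
ceph osd pool create ecpool53 256 256 erasure ec53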

Thanks
Pankaj


Re: [ceph-users] RADOS Bench

2015-06-15 Thread Garg, Pankaj
Thanks Somnath. Do you mean that I should run rados bench in parallel on 2
different clients?
Is there a way to run rados bench from 2 clients so that they run in parallel,
other than launching them together manually?
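
One simple way (a sketch; host names and the pool are placeholders) is to launch
the benchmark over ssh on both clients at roughly the same time, giving each run
its own --run-name so the object names don't collide:

ssh client1 'rados bench -p test 60 write --run-name bench-c1 --no-cleanup' &
ssh client2 'rados bench -p test 60 write --run-name bench-c2 --no-cleanup' &
wait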

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Monday, June 15, 2015 1:01 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: RE: RADOS Bench

Pankaj,
It is the cumulative bandwidth of the Ceph cluster, but you will always be
limited by your single client's bandwidth.
To verify whether you are limited by the single client's 10Gb network, add
another client and see whether it scales.

Thanks & Regards
Somnath

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Monday, June 15, 2015 12:55 PM
To: ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: [ceph-users] RADOS Bench

Hi,
I have a few machines in my Ceph Cluster. I have another machine that I use to 
run RADOS Bench to get the performance.
I am now seeing numbers around 1100 MB/Sec, which is quite close to the
saturation point of the 10Gbps link.

I'd like to understand what the total bandwidth number represents after I run
the rados bench test. Is this the cumulative bandwidth of the Ceph cluster, or
does it represent the bandwidth to the client machine?

I'd like to understand whether I'm now being limited by my network.

Thanks
Pankaj





[ceph-users] RADOS Bench

2015-06-15 Thread Garg, Pankaj
Hi,
I have a few machines in my Ceph Cluster. I have another machine that I use to 
run RADOS Bench to get the performance.
I am now seeing numbers around 1100 MB/Sec, which is quite close to the
saturation point of the 10Gbps link.

I'd like to understand what the total bandwidth number represents after I run
the rados bench test. Is this the cumulative bandwidth of the Ceph cluster, or
does it represent the bandwidth to the client machine?

I'd like to understand whether I'm now being limited by my network.

Thanks
Pankaj


Re: [ceph-users] ceph-deploy for Hammer

2015-05-28 Thread Garg, Pankaj
Hi Travis,

These binaries are hosted on Canonical servers and are only for Ubuntu. Until
the latest Firefly patch release, 0.80.9, everything worked fine. I just tried
the Hammer binaries, and they seem to fail to load the erasure coding libraries.
I have now built my own binaries, and I was able to get the cluster up and
running using ceph-deploy.
You just have to skip the Ceph installation step in ceph-deploy and do a manual
install from the deb files instead. The rest worked fine.
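
Roughly, that flow looks like this (a sketch reconstructed from the description
above; package names and paths are illustrative):

# on every node: install the locally built packages instead of running 'ceph-deploy install'
sudo dpkg -i ceph_*.deb ceph-common_*.deb librados2_*.deb librbd1_*.deb
sudo apt-get -f install            # pull in any missing dependencies
# then continue with ceph-deploy as usual
ceph-deploy new ceph1
ceph-deploy mon create-initial
ceph-deploy osd prepare ceph1:/dev/sdc:/dev/sdb1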

Thanks
Pankaj

-Original Message-
From: Travis Rhoden [mailto:trho...@gmail.com] 
Sent: Thursday, May 28, 2015 8:02 AM
To: Garg, Pankaj
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-deploy for Hammer

Hi Pankaj,

While there have been times in the past where ARM binaries were hosted on 
ceph.com, there is not currently any ARM hardware for builds.  I don't think 
you will see any ARM binaries in 
http://ceph.com/debian-hammer/pool/main/c/ceph/, for example.

Combine that with the fact that ceph-deploy is not intended to work with 
locally compiled binaries (only packages, as it relies on paths, conventions, 
and service definitions from the packages), and it is a very tricky combo to 
use ceph-deploy and ARM together.

Your most recent error indicates that the ceph-mon service is not coming up successfully. When ceph-mon (the service, not the daemon) is started, it also calls ceph-create-keys, which waits for the monitor daemon to come up and then creates the keys that are necessary for any cluster running with cephx (the admin key and the bootstrap keys).
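If you want to poke at this by hand on the mon node, a rough sketch (the mon id "ceph1" comes from the log above; exact options can vary by version):

$ sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph1.asok mon_status
$ sudo ceph-create-keys --cluster ceph --id ceph1    # re-run key creation once the mon is in quorum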

 - Travis

On Wed, May 27, 2015 at 8:27 PM, Garg, Pankaj pankaj.g...@caviumnetworks.com 
wrote:
 Actually the ARM binaries do exist and I have been using for previous 
 releases. Somehow this library is the one that doesn’t load.

 Anyway I did compile my own Ceph for ARM, and now getting the 
 following
 issue:



 [ceph_deploy.gatherkeys][WARNIN] Unable to find 
 /etc/ceph/ceph.client.admin.keyring on ceph1

 [ceph_deploy][ERROR ] KeyNotFoundError: Could not find keyring file:
 /etc/ceph/ceph.client.admin.keyring on host ceph1





 From: Somnath Roy [mailto:somnath@sandisk.com]
 Sent: Wednesday, May 27, 2015 4:29 PM
 To: Garg, Pankaj


 Cc: ceph-users@lists.ceph.com
 Subject: RE: ceph-deploy for Hammer



 If you are trying to install the ceph repo hammer binaries, I don’t 
 think it is built for ARM. Both binary and the .so needs to be built 
 in ARM to make this work I guess.

 Try to build hammer code base in your ARM server and then retry.



 Thanks  Regards

 Somnath



 From: Pankaj Garg [mailto:pankaj.g...@caviumnetworks.com]
 Sent: Wednesday, May 27, 2015 4:17 PM
 To: Somnath Roy
 Cc: ceph-users@lists.ceph.com
 Subject: RE: ceph-deploy for Hammer



 Yes I am on ARM.

 -Pankaj

 On May 27, 2015 3:58 PM, Somnath Roy somnath@sandisk.com wrote:

 Are you running this on ARM ?

 If not, it should not go for loading this library.



 Thanks  Regards

 Somnath



 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf 
 Of Garg, Pankaj
 Sent: Wednesday, May 27, 2015 2:26 PM
 To: Garg, Pankaj; ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] ceph-deploy for Hammer



 I seem to be getting these errors in the Monitor Log :

 2015-05-27 21:17:41.908839 3ff907368e0 -1
 erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code):
 (5) Input/output error

 2015-05-27 21:17:41.978113 3ff969168e0  0 ceph version 0.94.1 
 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 
 16592

 2015-05-27 21:17:41.984383 3ff969168e0 -1 ErasureCodePluginSelectJerasure:
 load
 dlopen(/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so):
 /usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so: 
 cannot open shared object file: No such file or directory

 2015-05-27 21:17:41.98 3ff969168e0 -1
 erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code):
 (5) Input/output error

 2015-05-27 21:17:42.052415 3ff90cf68e0  0 ceph version 0.94.1 
 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 
 16604

 2015-05-27 21:17:42.058656 3ff90cf68e0 -1 ErasureCodePluginSelectJerasure:
 load
 dlopen(/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so):
 /usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so: 
 cannot open shared object file: No such file or directory

 2015-05-27 21:17:42.058715 3ff90cf68e0 -1
 erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code):
 (5) Input/output error

 2015-05-27 21:17:42.125279 3ffac4368e0  0 ceph version 0.94.1 
 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 
 16616

 2015-05-27 21:17:42.131666 3ffac4368e0 -1 ErasureCodePluginSelectJerasure:
 load
 dlopen(/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so):
 /usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so: 
 cannot open shared object file: No such file or directory

[ceph-users] TCP or UDP

2015-05-28 Thread Garg, Pankaj
Hi,
Does Ceph typically use TCP, UDP, or something else for the data path, both for connections to clients and for inter-OSD cluster traffic?

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy for Hammer

2015-05-27 Thread Garg, Pankaj
Hi,
Is there a particular version of ceph-deploy that should be used with the Hammer release? This is a brand new cluster.
I'm getting the following error when running the command: ceph-deploy mon create-initial

[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/cephuser/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.25): /usr/local/bin/ceph-deploy mon 
create-initial
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph1
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph1 ...
[ceph1][DEBUG ] connection detected need for sudo
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty
[ceph1][DEBUG ] determining if provided host has same hostname in remote
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] deploying mon to ceph1
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] remote hostname: ceph1
[ceph1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph1][DEBUG ] create the mon path if it does not exist
[ceph1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph1/done
[ceph1][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph1][DEBUG ] create the init path if it does not exist
[ceph1][DEBUG ] locating the `service` executable...
[ceph1][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph 
id=ceph1
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph1][WARNIN] monitor: mon.ceph1, might not be running yet
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph1][WARNIN] monitor ceph1 does not exist in monmap
[ceph_deploy.mon][INFO  ] processing monitor mon.ceph1
[ceph1][DEBUG ] connection detected need for sudo
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph1 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph1 monitor is not yet in quorum, tries left: 4
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph1 monitor is not yet in quorum, tries left: 3
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph1 monitor is not yet in quorum, tries left: 2
[ceph_deploy.mon][WARNIN] waiting 15 seconds before retrying
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph1 monitor is not yet in quorum, tries left: 1
[ceph_deploy.mon][WARNIN] waiting 20 seconds before retrying
[ceph_deploy.mon][ERROR ] Some monitors have still not reached quorum:
[ceph_deploy.mon][ERROR ] ceph1

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy for Hammer

2015-05-27 Thread Garg, Pankaj
I seem to be getting these errors in the Monitor Log :
2015-05-27 21:17:41.908839 3ff907368e0 -1 
erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code): (5) 
Input/output error
2015-05-27 21:17:41.978113 3ff969168e0  0 ceph version 0.94.1 
(e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 16592
2015-05-27 21:17:41.984383 3ff969168e0 -1 ErasureCodePluginSelectJerasure: load 
dlopen(/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so): 
/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so: cannot 
open shared object file: No such file or directory
2015-05-27 21:17:41.98 3ff969168e0 -1 
erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code): (5) 
Input/output error
2015-05-27 21:17:42.052415 3ff90cf68e0  0 ceph version 0.94.1 
(e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 16604
2015-05-27 21:17:42.058656 3ff90cf68e0 -1 ErasureCodePluginSelectJerasure: load 
dlopen(/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so): 
/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so: cannot 
open shared object file: No such file or directory
2015-05-27 21:17:42.058715 3ff90cf68e0 -1 
erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code): (5) 
Input/output error
2015-05-27 21:17:42.125279 3ffac4368e0  0 ceph version 0.94.1 
(e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 16616
2015-05-27 21:17:42.131666 3ffac4368e0 -1 ErasureCodePluginSelectJerasure: load 
dlopen(/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so): 
/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so: cannot 
open shared object file: No such file or directory
2015-05-27 21:17:42.131726 3ffac4368e0 -1 
erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code): (5) 
Input/output error


The lib file exists, so I'm not sure why this is happening. Any help is appreciated.

Thanks
Pankaj

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Wednesday, May 27, 2015 1:37 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] ceph-deploy for Hammer

Hi,
Is there a particular version of ceph-deploy that should be used with the Hammer release? This is a brand new cluster.
I'm getting the following error when running the command: ceph-deploy mon create-initial

[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/cephuser/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.25): /usr/local/bin/ceph-deploy mon 
create-initial
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph1
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph1 ...
[ceph1][DEBUG ] connection detected need for sudo
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty
[ceph1][DEBUG ] determining if provided host has same hostname in remote
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] deploying mon to ceph1
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] remote hostname: ceph1
[ceph1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph1][DEBUG ] create the mon path if it does not exist
[ceph1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph1/done
[ceph1][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph1][DEBUG ] create the init path if it does not exist
[ceph1][DEBUG ] locating the `service` executable...
[ceph1][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph 
id=ceph1
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph1][WARNIN] monitor: mon.ceph1, might not be running yet
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph1][WARNIN] monitor ceph1 does not exist in monmap
[ceph_deploy.mon][INFO  ] processing monitor mon.ceph1
[ceph1][DEBUG ] connection detected need for sudo
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph1 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] 
No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph1

Re: [ceph-users] ceph-deploy for Hammer

2015-05-27 Thread Garg, Pankaj
Actually the ARM binaries do exist and I have been using them for previous releases. Somehow this library is the one that doesn’t load.
Anyway, I did compile my own Ceph for ARM, and I'm now getting the following issue:

[ceph_deploy.gatherkeys][WARNIN] Unable to find 
/etc/ceph/ceph.client.admin.keyring on ceph1
[ceph_deploy][ERROR ] KeyNotFoundError: Could not find keyring file: 
/etc/ceph/ceph.client.admin.keyring on host ceph1


From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, May 27, 2015 4:29 PM
To: Garg, Pankaj
Cc: ceph-users@lists.ceph.com
Subject: RE: ceph-deploy for Hammer

If you are trying to install the ceph repo Hammer binaries, I don’t think they are built for ARM. Both the binary and the .so need to be built on ARM to make this work, I guess.
Try to build the Hammer code base on your ARM server and then retry.

Thanks & Regards
Somnath

From: Pankaj Garg [mailto:pankaj.g...@caviumnetworks.com]
Sent: Wednesday, May 27, 2015 4:17 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.commailto:ceph-users@lists.ceph.com
Subject: RE: ceph-deploy for Hammer


Yes I am on ARM.

-Pankaj
On May 27, 2015 3:58 PM, Somnath Roy somnath@sandisk.com wrote:

Are you running this on ARM ?

If not, it should not go for loading this library.



Thanks  Regards

Somnath



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Wednesday, May 27, 2015 2:26 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-deploy for Hammer



I seem to be getting these errors in the Monitor Log :

2015-05-27 21:17:41.908839 3ff907368e0 -1 
erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code): (5) 
Input/output error

2015-05-27 21:17:41.978113 3ff969168e0  0 ceph version 0.94.1 
(e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 16592

2015-05-27 21:17:41.984383 3ff969168e0 -1 ErasureCodePluginSelectJerasure: load 
dlopen(/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so): 
/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so: cannot 
open shared object file: No such file or directory

2015-05-27 21:17:41.98 3ff969168e0 -1 
erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code): (5) 
Input/output error

2015-05-27 21:17:42.052415 3ff90cf68e0  0 ceph version 0.94.1 
(e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 16604

2015-05-27 21:17:42.058656 3ff90cf68e0 -1 ErasureCodePluginSelectJerasure: load 
dlopen(/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so): 
/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so: cannot 
open shared object file: No such file or directory

2015-05-27 21:17:42.058715 3ff90cf68e0 -1 
erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code): (5) 
Input/output error

2015-05-27 21:17:42.125279 3ffac4368e0  0 ceph version 0.94.1 
(e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process ceph-mon, pid 16616

2015-05-27 21:17:42.131666 3ffac4368e0 -1 ErasureCodePluginSelectJerasure: load 
dlopen(/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so): 
/usr/lib/aarch64-linux-gnu/ceph/erasure-code/libec_jerasure_neon.so: cannot 
open shared object file: No such file or directory

2015-05-27 21:17:42.131726 3ffac4368e0 -1 
erasure_code_init(jerasure,/usr/lib/aarch64-linux-gnu/ceph/erasure-code): (5) 
Input/output error





The lib file exists, so not sure why this is happening. Any help appreciated.



Thanks

Pankaj



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Wednesday, May 27, 2015 1:37 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] ceph-deploy for Hammer



Hi,

Is there a particular verion of Ceph-Deploy that should be used with Hammer 
release? This is a brand new cluster.

I’m getting the following error when running command : ceph-deploy mon 
create-initial



[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/cephuser/.cephdeploy.conf

[ceph_deploy.cli][INFO  ] Invoked (1.5.25): /usr/local/bin/ceph-deploy mon 
create-initial

[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph1

[ceph_deploy.mon][DEBUG ] detecting platform for host ceph1 ...

[ceph1][DEBUG ] connection detected need for sudo

[ceph1][DEBUG ] connected to host: ceph1

[ceph1][DEBUG ] detect platform information from remote host

[ceph1][DEBUG ] detect machine type

[ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty

[ceph1][DEBUG ] determining if provided host has same hostname in remote

[ceph1][DEBUG ] get remote short hostname

[ceph1][DEBUG ] deploying mon to ceph1

[ceph1][DEBUG ] get remote short hostname

[ceph1][DEBUG ] remote hostname: ceph1

[ceph1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf

[ceph1][DEBUG ] create the mon path if it does not exist

[ceph1][DEBUG ] checking for done path: /var

[ceph-users] Block Size

2015-05-26 Thread Garg, Pankaj
Hi,

What block size does Ceph use, and what is the optimal size? I'm assuming it uses whatever block size the file system has been formatted with.

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Firefly to Hammer

2015-04-24 Thread Garg, Pankaj
Hi,

Can I simply do apt-get upgrade on my Firefly cluster and move to Hammer? I'm assuming the monitor nodes should be done first.
Is there a particular sequence or any other procedure that I need to follow? Any information is appreciated.
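A rough sketch of the usual order, assuming Ubuntu 14.04 with the upstart scripts (monitors first, then OSD hosts, one host at a time):

$ sudo apt-get update && sudo apt-get upgrade    # on each monitor host
$ sudo restart ceph-mon-all                      # restart the mons on the new version
# once all monitors are done, repeat on each OSD host:
$ sudo apt-get update && sudo apt-get upgrade
$ sudo restart ceph-osd-all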

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Erasure Coding : gf-Complete

2015-04-23 Thread Garg, Pankaj
Hi,

I would like to use the gf-complete library for erasure coding since it has some ARMv8-based optimizations. I see that the code is part of my tree, but I'm not sure whether these libraries are included in the final build.
I only see the libec_jerasure*.so files in my libs folder after installation.
Are the gf-complete based optimizations already part of these, or do I build them separately and then install them?
I am using the latest Firefly release (0.80.9).

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Erasure Coding : gf-Complete

2015-04-23 Thread Garg, Pankaj
Thanks Loic. I was just looking at the source trees for gf-complete and saw that the v2-ceph tag has the optimizations and is associated with Hammer.

One more question: on Hammer, will the optimizations kick in automatically on ARM? Do all of the different techniques have ARM optimizations, or do I have to select a particular one to take advantage of them?

-Pankaj

-Original Message-
From: Loic Dachary [mailto:l...@dachary.org] 
Sent: Thursday, April 23, 2015 2:47 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Erasure Coding : gf-Complete

Hi,

The ARMv8 optimizations for gf-complete are in Hammer, not in Firefly. The 
libec_jerasure*.so plugin contains gf-complete.
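As an illustration, on Hammer the plugin and technique are chosen per erasure-code profile, so you can inspect what a pool will use with something like this (the profile name and k/m values are just examples):

$ ceph osd erasure-code-profile set myprofile k=2 m=1 plugin=jerasure technique=reed_sol_van
$ ceph osd erasure-code-profile get myprofile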

Cheers

On 23/04/2015 23:29, Garg, Pankaj wrote:
 Hi,
 
  
 
 I would like to use the gf-complete library for Erasure coding since it has 
 some ARM v8 based optimizations. I see that the code is part of my tree, but 
 not sure if these libraries are included in the final build.
 
 I only see the libec_jerasure*.so in my libs folder after installation.
 
 Are the gf-complete based optimizations part of this already? Or do I build 
 them separately and then install them.
 
 I am using the latest Firefly release (0.80.9).
 
  
 
 Thanks
 
 Pankaj
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 

-- 
Loïc Dachary, Artisan Logiciel Libre

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Upgrade from Firefly to Hammer

2015-04-14 Thread Garg, Pankaj
Hi,

I have a small cluster of 7 machines. Can I just individually upgrade each of them (using apt-get upgrade) from Firefly to the Hammer release, or is there more to it than that?

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Building Ceph

2015-04-02 Thread Garg, Pankaj
Hi,
I am building Ceph Debian packages off of 0.80.9 (the latest Firefly) and on top of that I am applying an optimization patch.
I am following the standard instructions from the README file and effectively running commands in this order:

$ ./autogen.sh
$ ./configure
$ make
$ dpkg-buildpackage

This builds deb packages for me with version 0.80.9-1, which is what I want.
However, when I install this build, I end up with the Ceph binaries that were in the 0.80.9 build that came off the Ubuntu repo.
Also, the ceph --version command shows 0.80.9 (I would expect it to show 0.80.9-1).

Do I have to change some versioning files, or do some other steps, in order for the Debian packages to work properly?
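One approach (a sketch, not the official packaging flow) is to bump debian/changelog before building, so the package version is clearly distinguishable from the Ubuntu repo build; note that ceph --version reports the upstream version compiled into the binaries, so it will typically still print 0.80.9 rather than the Debian revision:

$ dch -v 0.80.9-1local1 "local build with optimization patch"   # dch comes from the devscripts package
$ dpkg-buildpackage -us -uc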


Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD Journaling

2015-03-31 Thread Garg, Pankaj
Hi Mark,

Yes, my reads are consistently slower. I have tested both random and sequential, and various block sizes.

Thanks
Pankaj

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark 
Nelson
Sent: Monday, March 30, 2015 1:07 PM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] SSD Journaling

On 03/30/2015 03:01 PM, Garg, Pankaj wrote:
 Hi,

 I'm benchmarking my small cluster with HDDs vs HDDs with SSD Journaling.
 I am using both RADOS bench and Block device (using fio) for testing.

 I am seeing significant Write performance improvements, as expected. I 
 am however seeing the Reads coming out a bit slower on the SSD 
 Journaling side. They are not terribly different, but sometimes 10% slower.

 Is that something other folks have also seen, or do I need some 
 settings to be tuned properly? I'm wondering if accessing 2 drives for 
 reads, adds latency and hence the throughput suffers.

Hi,

What kind of reads are you seeing the degradation with?  Is it consistent with 
different sizes and random/seq?  Any interesting spikes or valleys during the 
tests?


 Thanks

 Pankaj



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] SSD Journaling

2015-03-30 Thread Garg, Pankaj
Hi,
I'm benchmarking my small cluster with HDDs vs. HDDs with SSD journaling. I am using both RADOS bench and the block device (using fio) for testing.
I am seeing significant write performance improvements, as expected. I am, however, seeing reads come out a bit slower on the SSD journaling side. They are not terribly different, but sometimes 10% slower.
Is that something other folks have also seen, or do I need to tune some settings properly? I'm wondering if accessing 2 drives for reads adds latency and hence the throughput suffers.
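For reference, the sort of fio read job used for this kind of comparison, assuming a mapped RBD device at /dev/rbd0 (the device path and sizes are illustrative):

$ sudo fio --name=randread --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
       --rw=randread --bs=4k --iodepth=32 --runtime=60 --time_based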

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Server Specific Pools

2015-03-19 Thread Garg, Pankaj
Hi,

I have a Ceph cluster with both ARM and x86 based servers in it. Is there a way for me to define pools, or some logical separation, that would allow me to use only one set of machines for a particular test?
That way it would be easy for me to run tests on either x86 or ARM and do some comparison testing.
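One way to get that separation is with a dedicated CRUSH root and rule per hardware type, so a pool only maps to one set of hosts; a rough sketch, where the bucket, host, and pool names are placeholders:

$ ceph osd crush add-bucket arm-root root
$ ceph osd crush move armhost1 root=arm-root             # repeat for each ARM host
$ ceph osd crush rule create-simple arm-rule arm-root host
$ ceph osd pool create armpool 128 128
$ ceph osd pool set armpool crush_ruleset 1              # ruleset id from 'ceph osd crush rule dump arm-rule'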

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Cluster Address

2015-03-03 Thread Garg, Pankaj
Hi,
I have a Ceph cluster that is contained within a rack (1 monitor and 5 OSD nodes). I kept the same public and private address in the configuration.
I do have 2 NICs and 2 valid IP addresses (one internal-only and one external) for each machine.

Is it possible to change the public network address now, after the cluster is up and running?
I used ceph-deploy for the cluster. If I change the public network address in ceph.conf, do I need to propagate it to all the machines in the cluster, or is updating just the monitor node enough?
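For what it's worth, the setting lives under [global] and, once edited in the admin node's working directory, can be pushed out to every host; the subnets below are placeholders. Note that the monitor addresses themselves are recorded in the monmap, so moving the monitors to new IPs takes more than a ceph.conf change.

[global]
public network  = 10.18.240.0/24
cluster network = 192.168.240.0/24

$ ceph-deploy --overwrite-conf config push ceph1 ceph2 ceph3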

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Calamari Reconfiguration

2015-03-02 Thread Garg, Pankaj
Hi,
I had a cluster that was working correctly with Calamari, and I was able to see and manage it from the Dashboard.
I had to reinstall the cluster and change IP addresses etc., so I built my cluster back up with the same name but with mainly network changes.
When I went to Calamari, it showed stale information about the old cluster.
I cleaned the server side with the calamari-ctl clear and then calamari-ctl initialize commands. I also deleted all salt keys, and restarted salt and diamond on all client machines.
I accepted the new keys on the server and thought it would clean everything up.

It did, but now I basically get a message that "This appears to be the first time you have started Calamari and there are no clusters currently configured."
I have rebooted the server many times and restarted services. The server says that x number of clients are connected to it, but no cluster has been created.

What do I need to clean or reinstall for it to see my cluster information again? Clearly they are talking to each other, but somehow the server doesn't pick up the new cluster info.
Any help is appreciated.
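For reference, the cleanup described above was roughly the following; the exact service names can vary by install:

# on the Calamari server
$ sudo calamari-ctl clear
$ sudo calamari-ctl initialize
$ sudo salt-key -L         # list minion keys
$ sudo salt-key -D         # delete them all
# on each Ceph node
$ sudo service salt-minion restart
$ sudo service diamond restart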

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph-deploy issues

2015-02-25 Thread Garg, Pankaj
Hi,
I had a successful Ceph cluster that I am rebuilding. I have completely uninstalled Ceph along with any remnants, directories, and config files.
While setting up the new cluster, I followed the ceph-deploy documentation as before. I now seem to get an error (tried many times):

The ceph-deploy mon create-initial command fails in the gather keys step. This never happened before, and I'm not sure why it's failing now.



cephuser@ceph1:~/my-cluster$ ceph-deploy mon create-initial
[ceph_deploy.cli][INFO  ] Invoked (1.4.0): /usr/bin/ceph-deploy mon 
create-initial
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph1
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph1 ...
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty
[ceph1][DEBUG ] determining if provided host has same hostname in remote
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] deploying mon to ceph1
[ceph1][DEBUG ] get remote short hostname
[ceph1][DEBUG ] remote hostname: ceph1
[ceph1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph1][DEBUG ] create the mon path if it does not exist
[ceph1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph1/done
[ceph1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph1/done
[ceph1][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph1.mon.keyring
[ceph1][DEBUG ] create the monitor keyring file
[ceph1][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph1 
--keyring /var/lib/ceph/tmp/ceph-ceph1.mon.keyring
[ceph1][DEBUG ] ceph-mon: set fsid to 099013d5-126d-45b4-a98e-5f0c386805a4
[ceph1][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-ceph1 for 
mon.ceph1
[ceph1][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph1.mon.keyring
[ceph1][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph1][DEBUG ] create the init path if it does not exist
[ceph1][DEBUG ] locating the `service` executable...
[ceph1][INFO  ] Running command: sudo initctl emit ceph-mon cluster=ceph 
id=ceph1
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph1][DEBUG ] 

[ceph1][DEBUG ] status for monitor: mon.ceph1
[ceph1][DEBUG ] {
[ceph1][DEBUG ]   election_epoch: 2,
[ceph1][DEBUG ]   extra_probe_peers: [
[ceph1][DEBUG ] 192.168.240.101:6789/0
[ceph1][DEBUG ]   ],
[ceph1][DEBUG ]   monmap: {
[ceph1][DEBUG ] created: 0.00,
[ceph1][DEBUG ] epoch: 1,
[ceph1][DEBUG ] fsid: 099013d5-126d-45b4-a98e-5f0c386805a4,
[ceph1][DEBUG ] modified: 0.00,
[ceph1][DEBUG ] mons: [
[ceph1][DEBUG ]   {
[ceph1][DEBUG ] addr: 10.18.240.101:6789/0,
[ceph1][DEBUG ] name: ceph1,
[ceph1][DEBUG ] rank: 0
[ceph1][DEBUG ]   }
[ceph1][DEBUG ] ]
[ceph1][DEBUG ]   },
[ceph1][DEBUG ]   name: ceph1,
[ceph1][DEBUG ]   outside_quorum: [],
[ceph1][DEBUG ]   quorum: [
[ceph1][DEBUG ] 0
[ceph1][DEBUG ]   ],
[ceph1][DEBUG ]   rank: 0,
[ceph1][DEBUG ]   state: leader,
[ceph1][DEBUG ]   sync_provider: []
[ceph1][DEBUG ] }
[ceph1][DEBUG ] 

[ceph1][INFO  ] monitor: mon.ceph1 is running
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph_deploy.mon][INFO  ] processing monitor mon.ceph1
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon 
/var/run/ceph/ceph-mon.ceph1.asok mon_status
[ceph_deploy.mon][INFO  ] mon.ceph1 monitor has reached quorum!
[ceph_deploy.mon][INFO  ] all initial monitors are running and have formed 
quorum
[ceph_deploy.mon][INFO  ] Running gatherkeys...
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph1 for 
/etc/ceph/ceph.client.admin.keyring
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find 
/etc/ceph/ceph.client.admin.keyring on ['ceph1']
[ceph_deploy.gatherkeys][DEBUG ] Have ceph.mon.keyring
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph1 for 
/var/lib/ceph/bootstrap-osd/ceph.keyring
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] fetch remote file
[ceph_deploy.gatherkeys][WARNIN] Unable to find 
/var/lib/ceph/bootstrap-osd/ceph.keyring on ['ceph1']
[ceph_deploy.gatherkeys][DEBUG ] Checking ceph1 for 
/var/lib/ceph/bootstrap-mds/ceph.keyring
[ceph1][DEBUG ] connected to host: ceph1
[ceph1][DEBUG ] detect platform information from remote host
[ceph1][DEBUG ] detect machine type
[ceph1][DEBUG ] 

Re: [ceph-users] Ceph-deploy issues

2015-02-25 Thread Garg, Pankaj
I figured it out... at least the first hurdle.
I have 2 networks, 10.18.240.x and 192.168.240.xx.
I was specifying different public and cluster addresses; somehow it doesn’t like that.
Maybe the issue really is that the ceph-deploy is old. I am on ARM64 and this is the latest I have for Ubuntu.

After I got past the first hurdle, now I get this message :

2015-02-26 00:03:31.642166 3ff94c7f1f0 -1 monclient(hunting): ERROR: missing 
keyring, cannot use cephx for authentication
2015-02-26 00:03:31.642390 3ff94c7f1f0  0 librados: client.admin initialization 
error (2) No such file or directory
Error connecting to cluster: ObjectNotFound


Thanks
Pankaj

-Original Message-
From: Travis Rhoden [mailto:trho...@gmail.com] 
Sent: Wednesday, February 25, 2015 3:55 PM
To: Garg, Pankaj
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph-deploy issues

Hi Pankaj,

I can't say that it will fix the issue, but the first thing I would encourage 
is to use the latest ceph-deploy.

you are using 1.4.0, which is quite old.  The latest is 1.5.21.

 - Travis

On Wed, Feb 25, 2015 at 3:38 PM, Garg, Pankaj pankaj.g...@caviumnetworks.com 
wrote:
 Hi,

 I had a successful ceph cluster that I am rebuilding. I have 
 completely uninstalled ceph and any remnants and directories and config files.

 While setting up the new cluster, I follow the Ceph-deploy 
 documentation as described before. I seem to get an error now (tried many 
 times) :



 ceph-deploy mon create-initial command fails in gather keys step. This 
 never happened before, and I’m not sure why its failing now.







 cephuser@ceph1:~/my-cluster$ ceph-deploy mon create-initial

 [ceph_deploy.cli][INFO  ] Invoked (1.4.0): /usr/bin/ceph-deploy mon 
 create-initial

 [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph1

 [ceph_deploy.mon][DEBUG ] detecting platform for host ceph1 ...

 [ceph1][DEBUG ] connected to host: ceph1

 [ceph1][DEBUG ] detect platform information from remote host

 [ceph1][DEBUG ] detect machine type

 [ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty

 [ceph1][DEBUG ] determining if provided host has same hostname in 
 remote

 [ceph1][DEBUG ] get remote short hostname

 [ceph1][DEBUG ] deploying mon to ceph1

 [ceph1][DEBUG ] get remote short hostname

 [ceph1][DEBUG ] remote hostname: ceph1

 [ceph1][DEBUG ] write cluster configuration to 
 /etc/ceph/{cluster}.conf

 [ceph1][DEBUG ] create the mon path if it does not exist

 [ceph1][DEBUG ] checking for done path: 
 /var/lib/ceph/mon/ceph-ceph1/done

 [ceph1][DEBUG ] done path does not exist: 
 /var/lib/ceph/mon/ceph-ceph1/done

 [ceph1][INFO  ] creating keyring file:
 /var/lib/ceph/tmp/ceph-ceph1.mon.keyring

 [ceph1][DEBUG ] create the monitor keyring file

 [ceph1][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs 
 -i
 ceph1 --keyring /var/lib/ceph/tmp/ceph-ceph1.mon.keyring

 [ceph1][DEBUG ] ceph-mon: set fsid to 
 099013d5-126d-45b4-a98e-5f0c386805a4

 [ceph1][DEBUG ] ceph-mon: created monfs at 
 /var/lib/ceph/mon/ceph-ceph1 for
 mon.ceph1

 [ceph1][INFO  ] unlinking keyring file 
 /var/lib/ceph/tmp/ceph-ceph1.mon.keyring

 [ceph1][DEBUG ] create a done file to avoid re-doing the mon 
 deployment

 [ceph1][DEBUG ] create the init path if it does not exist

 [ceph1][DEBUG ] locating the `service` executable...

 [ceph1][INFO  ] Running command: sudo initctl emit ceph-mon 
 cluster=ceph
 id=ceph1

 [ceph1][INFO  ] Running command: sudo ceph --cluster=ceph 
 --admin-daemon /var/run/ceph/ceph-mon.ceph1.asok mon_status

 [ceph1][DEBUG ]
 **
 **

 [ceph1][DEBUG ] status for monitor: mon.ceph1

 [ceph1][DEBUG ] {

 [ceph1][DEBUG ]   election_epoch: 2,

 [ceph1][DEBUG ]   extra_probe_peers: [

 [ceph1][DEBUG ] 192.168.240.101:6789/0

 [ceph1][DEBUG ]   ],

 [ceph1][DEBUG ]   monmap: {

 [ceph1][DEBUG ] created: 0.00,

 [ceph1][DEBUG ] epoch: 1,

 [ceph1][DEBUG ] fsid: 099013d5-126d-45b4-a98e-5f0c386805a4,

 [ceph1][DEBUG ] modified: 0.00,

 [ceph1][DEBUG ] mons: [

 [ceph1][DEBUG ]   {

 [ceph1][DEBUG ] addr: 10.18.240.101:6789/0,

 [ceph1][DEBUG ] name: ceph1,

 [ceph1][DEBUG ] rank: 0

 [ceph1][DEBUG ]   }

 [ceph1][DEBUG ] ]

 [ceph1][DEBUG ]   },

 [ceph1][DEBUG ]   name: ceph1,

 [ceph1][DEBUG ]   outside_quorum: [],

 [ceph1][DEBUG ]   quorum: [

 [ceph1][DEBUG ] 0

 [ceph1][DEBUG ]   ],

 [ceph1][DEBUG ]   rank: 0,

 [ceph1][DEBUG ]   state: leader,

 [ceph1][DEBUG ]   sync_provider: []

 [ceph1][DEBUG ] }

 [ceph1][DEBUG ]
 **
 **

 [ceph1][INFO  ] monitor: mon.ceph1 is running

 [ceph1][INFO  ] Running command: sudo ceph --cluster=ceph 
 --admin-daemon /var/run/ceph/ceph-mon.ceph1.asok mon_status

 [ceph_deploy.mon][INFO  ] processing monitor mon.ceph1

 [ceph1][DEBUG ] connected to host: ceph1

 [ceph1][INFO  ] Running

Re: [ceph-users] Ceph-deploy issues

2015-02-25 Thread Garg, Pankaj
Hi Alan,
Thanks. Worked like magic.
Why did this happen, though? I have deployed on the same machine using the same ceph-deploy and it was fine.
I'm not sure if anything is different this time, except my network, which shouldn’t affect this.

Thanks
Pankaj

-Original Message-
From: Alan Johnson [mailto:al...@supermicro.com] 
Sent: Wednesday, February 25, 2015 4:24 PM
To: Garg, Pankaj; Travis Rhoden
Cc: ceph-users@lists.ceph.com
Subject: RE: [ceph-users] Ceph-deploy issues

Try sudo chmod +r /etc/ceph/ceph.client.admin.keyring for the error below?

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Wednesday, February 25, 2015 4:04 PM
To: Travis Rhoden
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph-deploy issues

I figured it out.at least first hurdle.
I have 2 networks, 10.18.240.x. and 192.168.240.xx.
I was specifying different public and cluster addresses. Somehow it doesn’t 
like it.
Maybe the issue really is the ceph-deploy is old. I am on ARM64 and this is the 
latest I have for Ubuntu.

After I got past the first hurdle, now I get this message :

2015-02-26 00:03:31.642166 3ff94c7f1f0 -1 monclient(hunting): ERROR: missing 
keyring, cannot use cephx for authentication
2015-02-26 00:03:31.642390 3ff94c7f1f0  0 librados: client.admin initialization 
error (2) No such file or directory Error connecting to cluster: ObjectNotFound


Thanks
Pankaj

-Original Message-
From: Travis Rhoden [mailto:trho...@gmail.com]
Sent: Wednesday, February 25, 2015 3:55 PM
To: Garg, Pankaj
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph-deploy issues

Hi Pankaj,

I can't say that it will fix the issue, but the first thing I would encourage 
is to use the latest ceph-deploy.

you are using 1.4.0, which is quite old.  The latest is 1.5.21.

 - Travis

On Wed, Feb 25, 2015 at 3:38 PM, Garg, Pankaj pankaj.g...@caviumnetworks.com 
wrote:
 Hi,

 I had a successful ceph cluster that I am rebuilding. I have 
 completely uninstalled ceph and any remnants and directories and config files.

 While setting up the new cluster, I follow the Ceph-deploy 
 documentation as described before. I seem to get an error now (tried many 
 times) :



 ceph-deploy mon create-initial command fails in gather keys step. This 
 never happened before, and I’m not sure why its failing now.







 cephuser@ceph1:~/my-cluster$ ceph-deploy mon create-initial

 [ceph_deploy.cli][INFO  ] Invoked (1.4.0): /usr/bin/ceph-deploy mon 
 create-initial

 [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph1

 [ceph_deploy.mon][DEBUG ] detecting platform for host ceph1 ...

 [ceph1][DEBUG ] connected to host: ceph1

 [ceph1][DEBUG ] detect platform information from remote host

 [ceph1][DEBUG ] detect machine type

 [ceph_deploy.mon][INFO  ] distro info: Ubuntu 14.04 trusty

 [ceph1][DEBUG ] determining if provided host has same hostname in 
 remote

 [ceph1][DEBUG ] get remote short hostname

 [ceph1][DEBUG ] deploying mon to ceph1

 [ceph1][DEBUG ] get remote short hostname

 [ceph1][DEBUG ] remote hostname: ceph1

 [ceph1][DEBUG ] write cluster configuration to 
 /etc/ceph/{cluster}.conf

 [ceph1][DEBUG ] create the mon path if it does not exist

 [ceph1][DEBUG ] checking for done path: 
 /var/lib/ceph/mon/ceph-ceph1/done

 [ceph1][DEBUG ] done path does not exist: 
 /var/lib/ceph/mon/ceph-ceph1/done

 [ceph1][INFO  ] creating keyring file:
 /var/lib/ceph/tmp/ceph-ceph1.mon.keyring

 [ceph1][DEBUG ] create the monitor keyring file

 [ceph1][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs 
 -i
 ceph1 --keyring /var/lib/ceph/tmp/ceph-ceph1.mon.keyring

 [ceph1][DEBUG ] ceph-mon: set fsid to
 099013d5-126d-45b4-a98e-5f0c386805a4

 [ceph1][DEBUG ] ceph-mon: created monfs at
 /var/lib/ceph/mon/ceph-ceph1 for
 mon.ceph1

 [ceph1][INFO  ] unlinking keyring file 
 /var/lib/ceph/tmp/ceph-ceph1.mon.keyring

 [ceph1][DEBUG ] create a done file to avoid re-doing the mon 
 deployment

 [ceph1][DEBUG ] create the init path if it does not exist

 [ceph1][DEBUG ] locating the `service` executable...

 [ceph1][INFO  ] Running command: sudo initctl emit ceph-mon 
 cluster=ceph
 id=ceph1

 [ceph1][INFO  ] Running command: sudo ceph --cluster=ceph 
 --admin-daemon /var/run/ceph/ceph-mon.ceph1.asok mon_status

 [ceph1][DEBUG ]
 **
 **

 [ceph1][DEBUG ] status for monitor: mon.ceph1

 [ceph1][DEBUG ] {

 [ceph1][DEBUG ]   election_epoch: 2,

 [ceph1][DEBUG ]   extra_probe_peers: [

 [ceph1][DEBUG ] 192.168.240.101:6789/0

 [ceph1][DEBUG ]   ],

 [ceph1][DEBUG ]   monmap: {

 [ceph1][DEBUG ] created: 0.00,

 [ceph1][DEBUG ] epoch: 1,

 [ceph1][DEBUG ] fsid: 099013d5-126d-45b4-a98e-5f0c386805a4,

 [ceph1][DEBUG ] modified: 0.00,

 [ceph1][DEBUG ] mons: [

 [ceph1][DEBUG ]   {

 [ceph1][DEBUG ] addr: 10.18.240.101:6789/0,

 [ceph1][DEBUG

[ceph-users] Ceph Block Device

2015-02-17 Thread Garg, Pankaj
Hi,
I have a Ceph cluster and I am trying to create a block device. I execute the 
following command, and get errors:


sudo rbd map cephblockimage --pool rbd -k /etc/ceph/ceph.client.admin.keyring
libkmod: ERROR ../libkmod/libkmod.c:556 kmod_search_moddep: could not open 
moddep file '/lib/modules/3.18.0-02094-gab62ac9/modules.dep.bin'
modinfo: ERROR: Module alias rbd not found.
modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open 
moddep file '/lib/modules/3.18.0-02094-gab62ac9/modules.dep.bin'
rbd: modprobe rbd failed! (256)


I need help figuring out what is wrong. I installed the Ceph package on the machine where I execute the command. This is on ARM, BTW. Is there something I am missing?
I am able to run object storage and rados bench just fine on the cluster.
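A rough first set of checks on the client, assuming the custom kernel actually includes the rbd module:

$ lsmod | grep rbd         # is the module already loaded?
$ sudo depmod -a           # regenerate modules.dep/modules.dep.bin for the running kernel
$ sudo modprobe rbd
$ sudo rbd map cephblockimage --pool rbd
# if modprobe still fails, the custom 3.18 kernel was likely built without the rbd module (CONFIG_BLK_DEV_RBD)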


Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Block Device

2015-02-17 Thread Garg, Pankaj
Hi Brad,

This is Ubuntu 14.04, running on ARM.
/lib/modules/3.18.0-02094-gab62ac9/modules.dep.bin doesn't exist.
The rmmod rbd command says: rmmod: ERROR: Module rbd is not currently loaded.

Running as root doesn't make any difference. I was running with sudo anyway.

Thanks
Pankaj

-Original Message-
From: Brad Hubbard [mailto:bhubb...@redhat.com] 
Sent: Tuesday, February 17, 2015 5:06 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph Block Device

On 02/18/2015 09:56 AM, Garg, Pankaj wrote:
 Hi,

 I have a Ceph cluster and I am trying to create a block device. I execute the 
 following command, and get errors:

sudo rbd map cephblockimage --pool rbd -k /etc/ceph/ceph.client.admin.keyring

 libkmod: ERROR ../libkmod/libkmod.c:556 kmod_search_moddep: could not open 
 moddep file '/lib/modules/3.18.0-02094-gab62ac9/modules.dep.bin'

 modinfo: ERROR: Module alias rbd not found.

 modprobe: ERROR: ../libkmod/libkmod.c:556 kmod_search_moddep() could not open 
 moddep file '/lib/modules/3.18.0-02094-gab62ac9/modules.dep.bin'

 rbd: modprobe rbd failed! (256)

What distro/release is this?

Does /lib/modules/3.18.0-02094-gab62ac9/modules.dep.bin exist?

Can you run the command as root?


 Need help with what is wrong. I installed the Ceph package on the machine 
 where I execute the command. This is on ARM BTW.  Is there something I am 
 missing?

 I am able to run Object storage and rados bench just fine on the cluster.

 Thanks

 Pankaj



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 


Kindest Regards,

Brad Hubbard
Senior Software Maintenance Engineer
Red Hat Global Support Services
Asia Pacific Region
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Minimum Cluster Install (ARM)

2015-01-07 Thread Garg, Pankaj
Hi,
I am trying to get a very minimal Ceph cluster up and running (on ARM), and I'm wondering what the smallest setup is that I can run rados bench on.
The documentation at http://ceph.com/docs/next/start/quick-ceph-deploy/ seems to refer to 4 different nodes: an admin node, a monitor node, and 2 OSD-only nodes.

Can the admin node be an x86 machine even if the deployment is ARM based?

Or can the admin node and monitor node co-exist?

Finally, I'm assuming I can get by with only 1 independent OSD node.

If that's possible, I can get by with only 2 ARM systems. Can someone please shed some light on whether this will work?
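For what it's worth, a two-node (or even single-OSD-node) test setup usually also needs a couple of ceph.conf overrides so placement doesn't insist on separate hosts; a sketch, with values to adjust to taste:

[global]
osd pool default size = 2        # or 1 for a throwaway single-OSD setup
osd crush chooseleaf type = 0    # allow replicas on the same host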

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Building Ceph

2015-01-05 Thread Garg, Pankaj
Hi Ken,
Spot-on. After much googling I just figured out the name, and yes, it is very unintuitively named: keyutils-libs-devel.
And yes, the name on Debian etc. is libkeyutils-dev.

I'm not a Linux expert and this stuff does drive me crazy.

Thanks
Pankaj

-Original Message-
From: Ken Dreyer [mailto:kdre...@redhat.com] 
Sent: Monday, January 05, 2015 11:23 AM
To: Garg, Pankaj
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Building Ceph

On 01/05/2015 11:26 AM, Garg, Pankaj wrote:
 I'm trying to build Ceph on my RHEL (Scientific Linux 7 - Nitrogen), 
 with 3.10.0.
 
 I am using the configure script and I am now stuck on libkeyutils not 
 found.
 
 I can't seem to find the right library for this. What Is the right yum 
 update name for this library?


The package name is not exactly intuitive: it's keyutils-libs-devel

Some of the autoconf AC_CHECK_LIB functions fail with messages that are slightly more helpful when you're trying to figure this stuff out. I've altered the libkeyutils check to do the same:

https://github.com/ceph/ceph/pull/3293

(I'm not a Debian/Ubuntu expert yet, but I'm guessing the package name is just 
libkeyutils-dev, right? That's what's in ./debian/control,
anyway.)
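For reference, the corresponding install commands (package names as given above):

$ sudo yum install keyutils-libs-devel       # RHEL / CentOS / Scientific Linux
$ sudo apt-get install libkeyutils-dev       # Debian / Ubuntu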

- Ken
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Building Ceph

2015-01-05 Thread Garg, Pankaj
Hi,
I'm trying to build Ceph on RHEL (Scientific Linux 7 - Nitrogen), with kernel 3.10.0.
I am using the configure script and I am now stuck on libkeyutils not found.
I can't seem to find the right library for this. What is the right yum package name for this library?
Any help appreciated.
Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ARM v8

2014-12-22 Thread Garg, Pankaj
Hi,
Where can I find ARMv8 binaries for Ceph Firefly for either RHEL or Ubuntu? Or 
do we just have to compile from source files to produce an installable package.

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com