Re: [ceph-users] How to do quiesced rbd snapshot in libvirt?

2016-01-13 Thread Василий Ангапов
Hello again!

Unfortunately I have to raise the problem again. I have constantly
hanging snapshots on several images.
My Ceph version is now 0.94.5.
RBD CLI always giving me this:
root@slpeah001:[~]:# rbd snap create
volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd --snap test
2016-01-13 12:04:39.107166 7fb70e4c2880 -1 librbd::ImageWatcher:
0x427a710 no lock owners detected
2016-01-13 12:04:44.108783 7fb70e4c2880 -1 librbd::ImageWatcher:
0x427a710 no lock owners detected
2016-01-13 12:04:49.110321 7fb70e4c2880 -1 librbd::ImageWatcher:
0x427a710 no lock owners detected
2016-01-13 12:04:54.112373 7fb70e4c2880 -1 librbd::ImageWatcher:
0x427a710 no lock owners detected

I turned "debug rbd = 20" and found this records only on one of OSDs
(on the same host as RBD client):
2016-01-13 11:44:46.076780 7fb5f05d8700  0 --
192.168.252.11:6804/407141 >> 192.168.252.11:6800/407122
pipe(0x392d2000 sd=257 :6804 s=2 pgs=17 cs=1 l=0 c=0x383b4160).fault
with nothing to send, going to standby
2016-01-13 11:58:26.261460 7fb5efbce700  0 --
192.168.252.11:6804/407141 >> 192.168.252.11:6802/407124
pipe(0x39e45000 sd=156 :6804 s=2 pgs=17 cs=1 l=0 c=0x386fbb20).fault
with nothing to send, going to standby
2016-01-13 12:04:23.948931 7fb5fede2700  0 --
192.168.254.11:6804/407141 submit_message watch-notify(notify_complete
(2) cookie 44850800 notify 99720550678667 ret -110) v3 remote,
192.168.254.11:0/1468572, failed lossy con, dropping message
0x3ab76fc0
2016-01-13 12:09:04.254329 7fb5fede2700  0 --
192.168.254.11:6804/407141 submit_message watch-notify(notify_complete
(2) cookie 69846112 notify 99720550678721 ret -110) v3 remote,
192.168.254.11:0/1509673, failed lossy con, dropping message
0x3830cb40

Here are the image properties:
root@slpeah001:[~]:# rbd info
volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd
rbd image 'volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd':
size 200 GB in 51200 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.2f2a81562fea59
format: 2
features: layering, striping, exclusive, object map
flags:
stripe unit: 4096 kB
stripe count: 1
root@slpeah001:[~]:# rbd status
volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd
Watchers:
watcher=192.168.254.17:0/2088291 client.3424561 cookie=93888518795008
root@slpeah001:[~]:# rbd lock list
volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd
There is 1 exclusive lock on this image.
Locker ID  Address
client.3424561 auto 93888518795008 192.168.254.17:0/2088291

Taking RBD snapshots from the Python API is also hanging...
This image is being used by libvirt.

Any suggestions?
Thanks!

Regards, Vasily.


2016-01-06 1:11 GMT+08:00 Мистер Сёма :
> Well, I believe the problem is no more valid.
> My code before was:
> virsh qemu-agent-command $INSTANCE '{"execute":"guest-fsfreeze-freeze"}'
> rbd snap create $RBD_ID --snap `date +%F-%T`
>
> and then snapshot creation was hanging forever. I inserted a 2 second sleep.
>
> My code after
> virsh qemu-agent-command $INSTANCE '{"execute":"guest-fsfreeze-freeze"}'
> sleep 2
> rbd snap create $RBD_ID --snap `date +%F-%T`
>
> And now it works perfectly. Again, I have no idea, how it solved the problem.
> Thanks :)
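
For reference, a minimal sketch of a freeze/snapshot/thaw wrapper in Python that
uses libvirt's fsFreeze()/fsThaw() calls (which return only after the guest agent
replies) and always thaws in a finally block, instead of relying on a fixed sleep.
This assumes libvirt-python 1.2.5+ and the python-rbd bindings; the domain, pool
and image names below are placeholders, not the poster's setup:

# Illustrative sketch only: freeze the guest via the QEMU agent, take the
# RBD snapshot, and always thaw again - rather than relying on "sleep 2".
import time
import libvirt
import rados
import rbd

DOMAIN = "instance-0001"   # hypothetical libvirt domain name
POOL = "volumes"           # pool holding the RBD image
IMAGE = "volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd"
SNAP = time.strftime("%Y-%m-%d-%H:%M:%S")   # mirrors `date +%F-%T`

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName(DOMAIN)

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx(POOL)
image = rbd.Image(ioctx, IMAGE)

try:
    dom.fsFreeze()             # quiesce all guest filesystems via the agent
    image.create_snap(SNAP)    # snapshot while the guest is frozen
finally:
    dom.fsThaw()               # always thaw, even if the snapshot failed
    image.close()
    ioctx.close()
    cluster.shutdown()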
>
> 2016-01-06 0:49 GMT+08:00 Мистер Сёма :
>> I am very sorry, but I am not able to increase log verbosity because
>> it's a production cluster with very limited space for logs. Sounds
>> crazy, but that's it.
>> I have found out that the RBD snapshot process hangs forever only when
>> QEMU fsfreeze was issued just before the snapshot. If the guest is not
>> frozen - the snapshot is taken with no problem... I have absolutely no
>> idea how these two things could be related to each other... And again,
>> this issue occurs only when there is an exclusive lock on the image and
>> the exclusive-lock feature is also enabled on it.
>>
>> Does anybody else have this problem?
>>
>> 2016-01-05 2:55 GMT+08:00 Jason Dillaman :
>>> I am surprised by the error you are seeing with exclusive lock enabled.  
>>> The rbd CLI should be able to send the 'snap create' request to QEMU 
>>> without an error.  Are you able to provide "debug rbd = 20" logs from 
>>> shortly before and after your snapshot attempt?
>>>
>>> --
>>>
>>> Jason Dillaman
>>>
>>>
>>> - Original Message -
 From: "Мистер Сёма" 
 To: "ceph-users" 
 Sent: Monday, January 4, 2016 12:37:07 PM
 Subject: [ceph-users] How to do quiesced rbd snapshot in libvirt?

 Hello,

 Can anyone please tell me what is the right way to do quiesced RBD
 snapshots in libvirt (OpenStack)?
 My Ceph version is 0.94.3.

 I found two possible ways, none of them is working for me. Wonder if
 I'm doing something wrong:
 1) Do VM fsFreeze through QEMU guest agent, perform RBD snapshot, do
 fsThaw. Looks good but the bad thing here is that 

Re: [ceph-users] Ceph cluster + Ceph client upgrade path for production environment

2016-01-13 Thread Vickey Singh
Hello Guys

Need help with this, thanks.

- vickey -

On Tue, Jan 12, 2016 at 12:10 PM, Vickey Singh 
wrote:

> Hello Community , wishing you a great new year :)
>
> This is the recommended upgrade path
> http://docs.ceph.com/docs/master/install/upgrading-ceph/
>
> Ceph Deploy
> Ceph Monitors
> Ceph OSD Daemons
> Ceph Metadata Servers
> Ceph Object Gateways
>
> How about upgrading Ceph clients (in my case OpenStack compute and
> controller nodes)? Should I upgrade my Ceph clients after upgrading the entire
> Ceph cluster?
>
> Currently my Ceph cluster and Ceph client version is 0.80.8 and I am
> planning to upgrade it to 0.94.5.
>
> How should I plan my Ceph client upgrade?
>
> Many Thanks in advance
>
> - vickey -
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to check the block device space usage

2016-01-13 Thread WD_Hwang
Thanks Josh for sharing the idea.
I have tried the command for calculating the block usage: 'sudo rbd diff 
PoolName/ImageName | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }''. It 
seems to take quite a long time, up to 1-2 minutes.
I think it's not suitable for a production environment.

WD

-Original Message-
From: Josh Durgin [mailto:jdur...@redhat.com] 
Sent: Wednesday, January 13, 2016 2:44 PM
To: Wido den Hollander; WD Hwang/WHQ/Wistron; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] How to check the block device space usage

On 01/12/2016 10:34 PM, Wido den Hollander wrote:
> On 01/13/2016 07:27 AM, wd_hw...@wistron.com wrote:
>> Thanks Wido.
>> So it seems there is no way to do this under Hammer.
>>
>
> Not very easily no. You'd have to count and stat all objects for a RBD 
> image to figure this out.

For hammer you'd need another loop and sum around this:
http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/3684

Josh
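
To make the "loop and sum" idea concrete, here is a rough Python sketch using the
python-rbd diff_iterate() call (pool name and connection details are illustrative;
on hammer, without fast-diff, this still walks every object, so it is no faster
than the `rbd diff | awk` approach):

# Rough sketch: sum the allocated extents of every image in a pool.
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("volumes")        # hypothetical pool name

def used_bytes(image):
    # diff_iterate() with no starting snapshot reports every allocated extent
    total = [0]
    def cb(offset, length, exists):
        if exists:
            total[0] += length
    image.diff_iterate(0, image.size(), None, cb)
    return total[0]

for name in rbd.RBD().list(ioctx):
    img = rbd.Image(ioctx, name, read_only=True)
    try:
        used = used_bytes(img)
        print("%s: %.1f MB used of %.1f MB" %
              (name, used / 1048576.0, img.size() / 1048576.0))
    finally:
        img.close()

ioctx.close()
cluster.shutdown()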

>> WD
>>
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf 
>> Of Wido den Hollander
>> Sent: Wednesday, January 13, 2016 2:19 PM
>> To: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] How to check the block device space usage
>>
>> On 01/13/2016 06:48 AM, wd_hw...@wistron.com wrote:
>>> Hi,
>>>
>>>Is there any way to check the block device space usage under the 
>>> specified pool? I need to know the capacity usage. If the block 
>>> device is used over 80%, I will send an alert to user.
>>>
>>
>> This can be done in Infernalis / Jewel, but it requires new RBD features.
>>
>> It needs the fast-diffv2 feature iirc.
>>
>>>
>>>
>>>Thanks a lot!
>>>
>>>
>>>
>>> Best Regards,
>>>
>>> WD
>>>
>>>
>>>
>>> *--------------------------------------------------------------------*
>>>
>>> *This email contains confidential or legally privileged information 
>>> and is for the sole use of its intended recipient. *
>>>
>>> *Any unauthorized review, use, copying or distribution of this email 
>>> or the content of this email is strictly prohibited.*
>>>
>>> *If you are not the intended recipient, you may reply to the sender 
>>> and should delete this e-mail immediately.*
>>>
>>> *--------------------------------------------------------------------*
>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>> --
>> Wido den Hollander
>> 42on B.V.
>> Ceph trainer and consultant
>>
>> Phone: +31 (0)20 700 9902
>> Skype: contact42on
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CentOS 7.2, Infernalis, preparing osd's and partprobe issues.

2016-01-13 Thread Wade Holler
Hey All,

Not trying to hijack this thread, but since you are running CentOS 7.2 and
Infernalis - if you get wrong results from "ceph osd tree" or from a
downloaded/decompiled crushmap, could you please let me know? I have asked
for help on this in other threads but nothing helpful has come back yet.
wade.hol...@gmail.com

Best Regards,
Wade


On Wed, Jan 13, 2016 at 1:38 AM Goncalo Borges 
wrote:

> Hi again...
>
> Regarding this issue, I just tried to use partx instead of partprobe. I
> hit a different problem...
>
> My layout is 4 partitions in an SSD device to serve as journals for 4
> differents OSDs. Something like
>
> /dev/sdb1 (journal of /dev/sdd1)
> /dev/sdb2 (journal of /dev/sde1)
> /dev/sdb3 (journal of /dev/sdf1)
> /dev/sdb4 (journal of /dev/sdg1)
>
> The 1st run of ceph-disk prepare went fine.  The OSD directory was mounted
> and the OSD started. However, the remaining runs did not go OK
> because, although the remaining journal partitions were created, they were
> not listed under /dev/sdb[2,3,4].
>
> Once I ran partprobe a posteriori, the journal partitions appeared in
> /dev/sdb[2,3,4] and I was able to start all OSDs just by running systemctl
> start ceph.target
>
> Cheers
> Goncalo
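
The workaround described above (re-running partprobe until the kernel notices the
new partitions) can be wrapped in a small retry loop; this is only an illustrative
sketch, not the actual ceph-disk fix, and the device names are examples:

# Poke the kernel until the expected partition node appears, instead of
# assuming a single partprobe run is enough.
import os
import subprocess
import sys
import time

def wait_for_partition(disk, partition, attempts=5, delay=10):
    # e.g. wait_for_partition("/dev/sdb", "/dev/sdb2")
    for _ in range(attempts):
        subprocess.call(["partprobe", disk])
        if os.path.exists(partition):
            return True
        time.sleep(delay)
    return False

if __name__ == "__main__":
    if not wait_for_partition("/dev/sdb", "/dev/sdb2"):
        sys.exit("partition /dev/sdb2 never appeared")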
>
>
> --
> *From:* Wade Holler [wade.hol...@gmail.com]
> *Sent:* 08 January 2016 01:15
> *To:* Goncalo Borges; Loic Dachary
> *Cc:* ceph-users@lists.ceph.com
> *Subject:* Re: [ceph-users] CentOS 7.2, Infernalis, preparing osd's and
> partprobe issues.
>
> I commented out partprobe and everything seems to work just fine.
>
> *If someone has experience with why this is very bad please advise.
>
> Make sure you know about http://tracker.ceph.com/issues/13833 also.
> *ps we are running btrfs in the test jig and had to add the "-f" to the
> btrfs_args for ceph-disk as well.
>
> Best Regards,
> Wade
>
>
>
> On Thu, Jan 7, 2016 at 12:13 AM Goncalo Borges <
> goncalo.bor...@sydney.edu.au> wrote:
>
>> Hi All...
>>
>> If I can step in on this issue, I just would like to report that I am
>> experiencing the same problem.
>>
>> 1./ I am installing my infernalis OSDs in a Centos 7.2.1511, and 'ceph
>> disk prepare' fails with the following message
>>
>> # ceph-disk prepare --cluster ceph --cluster-uuid
>> a9431bc6-3ee1-4b0a-8d21-0ad883a4d2ed --fs-type xfs /dev/sdd /dev/sdb
>> WARNING:ceph-disk:OSD will not be hot-swappable if journal is not the
>> same device as the osd data
>>
>>
>> The operation has completed successfully.
>>
>> Error: Error informing the kernel about modifications to partition
>> /dev/sdb1 -- Device or resource busy.  This means Linux won't know about
>> any changes you made to /dev/sdb1 until you reboot -- so you shouldn't
>> mount it or use it in any way before rebooting.
>> Error: Failed to add partition 1 (Device or resource busy)
>> ceph-disk: Error: Command '['/usr/sbin/partprobe', '/dev/sdb']' returned
>> non-zero exit status 1
>>
>>
>> 2./ I've then followed the discussion in
>> 
>> http://tracker.ceph.com/issues/14080 , and tried the last ceph-disk
>> suggestion by Loic [1]. Sometimes it succeeds and sometimes it doesn't. But
>> it is taking a lot more time than before, since there is now a 5-iteration loop
>> with a 60-second sleep per iteration to wait for partprobe to succeed. Besides the
>> time it is taking, when it fails, I then have to zap the partitions
>> manually because sometimes the journal partition is ok but the data
>> partition is the one where partprobe is timing out.
>>
>>
>> 3./ In the cases that ceph-disk succeeds, the partition was not mounted
>> nor the daemon started. This was because python-setuptools was not
>> installed and ceph-disk depends on it. It would be worthwhile to make an
>> explicit RPM dependency for it.
>>
>>
>> I am not sure why this behavior is showing up much more on the new
>> servers I am configuring. Some weeks ago, the same exercise with other
>> servers (but using a different storage controller) succeeded without
>> problems.
>>
>> Is there a clear idea of how to improve this behavior?
>>
>>
>> Cheers
>> Goncalo
>>
>>
>>
>>
>>
>>
>> On 12/17/2015 10:02 AM, Matt Taylor wrote:
>>
>> Hi Loic,
>>
>> No problems, I'll add my my report on your bug report.
>>
>> I also tried adding the sleep prior to invoking partprobe, but it didn't
>> work (same error).
>>
>> See pastebin for complete output:
>>
>> http://pastebin.com/Q26CeUge
>>
>> Cheers,
>> Matt.
>>
>>
>> On 16/12/2015 19:57, Loic Dachary wrote:
>>
>> Hi Matt,
>>
>> Could you please add your report to http://tracker.ceph.com/issues/14080
>> ? I think what you're seeing is a partprobe timeout because things get too
>> long to complete (that's also why adding a sleep as mentioned in the mail
>> thread sometimes helps). There is a variant of that problem where udevadm
>> settle also times out (but it is less common on real hardware). I'm testing
>> a fix to make this more robust.
>>
>> Cheers
>>
>> On 16/12/2015 07:17, Matt Taylor 

Re: [ceph-users] Ceph cluster + Ceph client upgrade path for production environment

2016-01-13 Thread Kostis Fardelas
Hi Vickey,
under "Upgrade procedures", you will see that it is recommended to
upgrade clients after having upgraded your cluster [1]
[1] http://docs.ceph.com/docs/master/install/upgrading-ceph/#upgrading-a-client

Regards

On 13 January 2016 at 12:44, Vickey Singh  wrote:
> Hello Guys
>
> Need help with this , thanks
>
> - vickey -
>
> On Tue, Jan 12, 2016 at 12:10 PM, Vickey Singh 
> wrote:
>>
>> Hello Community , wishing you a great new year :)
>>
>> This is the recommended upgrade path
>> http://docs.ceph.com/docs/master/install/upgrading-ceph/
>>
>> Ceph Deploy
>> Ceph Monitors
>> Ceph OSD Daemons
>> Ceph Metadata Servers
>> Ceph Object Gateways
>>
>> How about upgrading Ceph clients ( in my case openstack compute and
>> controller nodes). Should i upgrade my ceph clients after upgrading entire
>> ceph cluster ??
>>
>> Currently my Ceph cluster and Ceph client version is 0.80.8 and i am
>> planning to upgrade it to  0.94.5
>>
>> How should i plan my Ceph client upgrade.
>>
>> Many Thanks in advance
>>
>> - vickey -
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] lost OSD due to failing disk

2016-01-13 Thread Magnus Hagdorn

Hi there,
we recently had a problem with two OSDs failing because of I/O errors of 
the underlying disks. We run a small ceph cluster with 3 nodes and 18 
OSDs in total. All 3 nodes are dell poweredge r515 servers with PERC 
H700 (MegaRAID SAS 2108) RAID controllers. All disks are configured as 
single disk RAID 0 arrays. A disk on two separate nodes started showing 
I/O errors reported by SMART, with one of the disks reporting pre 
failure SMART error. The node with the failing disk also reported XFS 
I/O errors. In both cases the OSD daemons kept running although ceph 
reported that they were slow to respond.  When we started to look into 
this we first tried restarting the OSDs. They then failed straight away. 
We ended up with data loss. We are running ceph 0.80.5 on Scientific 
Linux 6.6 with a replication level of 2. We had hoped that losing disks 
due to hardware failure would be recoverable.


Is this a known issue with the RAID controllers or with this version of Ceph?

Regards
magnus


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] lost OSD due to failing disk

2016-01-13 Thread Mihai Gheorghe
So let me get this straight!
You have 3 hosts with 6 drives each in RAID 0. So you have set 3 OSDs in the
crushmap, right?
You said the replication level is 2, so you have 2 copies of the original data!
So the pool size is 3, right?
You said 2 out of 3 OSDs are down. So you are left with only one copy of the
data. As far as I know, Ceph locks access to the remaining data to prevent
changes to it (if 2 out of 3 OSDs were up, then you should have had access to
your data).
You can try setting the pool's min_size to 1 and see if you can access it.
Although you should bring the lost OSDs back up.

And I don't think running RAID 0 (striping) is a good idea. When a drive in
the array goes down it takes the whole array down with it, as opposed to
having every single drive be an OSD and grouping them by host in the crushmap.
Or set up 3 RAID 0 arrays on every host. I might be mistaken though!

Anyway, someone with more experience than me should have the right
answer for you.

Hope I understood correctly!

2016-01-13 14:26 GMT+02:00 Magnus Hagdorn :

> Hi there,
> we recently had a problem with two OSDs failing because of I/O errors of
> the underlying disks. We run a small ceph cluster with 3 nodes and 18 OSDs
> in total. All 3 nodes are dell poweredge r515 servers with PERC H700
> (MegaRAID SAS 2108) RAID controllers. All disks are configured as single
> disk RAID 0 arrays. A disk on two separate nodes started showing I/O errors
> reported by SMART, with one of the disks reporting pre failure SMART error.
> The node with the failing disk also reported XFS I/O errors. In both cases
> the OSD daemons kept running although ceph reported that they were slow to
> respond.  When we started to look into this we first tried restarted the
> OSDs. They then failed straight away. We ended up with data loss. We are
> running ceph 0.80.5 on Scientific Linux 6.6 with a replication level of 2.
> We had hoped that loosing disks due to hardware failure would be
> recoverable.
>
> Is this a known issue with the RAID controllers, version of ceph?
>
> Regards
> magnus
>
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph cache tier and rbd volumes/SSD primary, HDD replica crush rule!

2016-01-13 Thread Nick Fisk
Check this blog post

 

http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

 

Intel-wise, you really want DC S3700s, or S3500s if you won't be write-heavy.

 

IOPS tend to be more important generally, but flat-out write bandwidth can be 
important if you are shifting large amounts of data around. That said, most 
SSDs that can do high bandwidth properly as a Ceph journal will also have good 
IOPS. Be suspicious of an SSD which does high bandwidth but low random IO, or 
does both but is still cheap.
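
As a crude stand-in for the fio test in the blog post above, the thing that
matters for a journal (small synchronous writes) can be approximated in a few
lines of Python. This is only a rough illustration, assuming a scratch device or
file you can overwrite; it is not a replacement for the fio methodology:

# Very rough O_DSYNC 4k sequential write test. WARNING: destructive,
# point TARGET at a scratch device/partition or a throwaway file.
import os
import time

TARGET = "/dev/sdX"        # hypothetical scratch SSD (or a test file)
BLOCK = b"\0" * 4096
SECONDS = 10

fd = os.open(TARGET, os.O_WRONLY | os.O_DSYNC)
writes = 0
start = time.time()
while time.time() - start < SECONDS:
    os.write(fd, BLOCK)    # each write is forced to stable storage
    writes += 1
os.close(fd)

elapsed = time.time() - start
print("%d sync writes in %.1fs -> ~%d IOPS" % (writes, elapsed, writes / elapsed))

An SSD that holds up well under this kind of load is far more likely to behave
as a Ceph journal than one that only shines in cached, non-sync benchmarks.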

 

If there is one recurring theme on this mailing list, people that have tried to 
cut corners on the journals have nearly  always ended up replacing them at some 
point, either due to failure or poor performance.

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mihai 
Gheorghe
Sent: 13 January 2016 11:48
To: Robert LeBlanc 
Cc: ceph-users@lists.ceph.com; Nick Fisk 
Subject: Re: [ceph-users] Ceph cache tier and rbd volumes/SSD primary, HDD 
replica crush rule!

 

What are the recommended specs of an SSD for journaling? It's a little bit 
tricky now to move the journals for the spinners onto them, because I have data 
on them. 

 

I now have all HDD journals on separate SSDs. The problem is that when I first 
made the cluster I assigned one journal SSD to 8x 4TB HDDs. Now I see there are 
too many spinners for one SSD. 

 

So I am planning to assign a journal SSD to 4 OSDs, so I have some extra 
redundancy (if one journal crashes it only takes 4 OSDs with it, not 8). 

 

Do read/write specs matter, or do the IOPS matter more? The journal SSDs I have 
now are, I believe, Intel 520s (240GB, not that great write speeds but high 
IOPS). And I have a couple of spares that I can use for journaling (same type).

 

Also, what size should the journal partition be for one 4TB OSD? I now have 
them set at 5GB (the default that ceph-deploy creates).
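
On the journal-size question, the rule of thumb in the Ceph docs is that the
journal should hold at least twice what the OSD can write during one filestore
sync interval. A back-of-the-envelope calculation under assumed numbers (a 4TB
spinner doing roughly 120 MB/s, the default filestore max sync interval of 5
seconds):

# journal size >= 2 * expected throughput * filestore max sync interval
# (use the slower of the disk and its share of the network as the throughput)
disk_throughput_mb_s = 120           # assumed sequential speed of a 4TB spinner
filestore_max_sync_interval_s = 5    # Ceph default

journal_mb = 2 * disk_throughput_mb_s * filestore_max_sync_interval_s
print("suggested minimum journal size: %d MB" % journal_mb)   # -> 1200 MB

By that rule the 5GB partitions ceph-deploy creates already leave plenty of
headroom for a single spinner.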

 

 

 

2016-01-12 21:43 GMT+02:00 Robert LeBlanc  >:


We are using cache tiering in two production clusters at the moment.
One cluster is running in forward mode at the moment due to the
excessive promotion/demotion. I've got Nick's patch backported to
Hammer and am going through the test suite at the moment. Once it
passes, I'll create a PR so it hopefully makes into the next Hammer
version.

In response to your question about journals. Once we introduced the
SSD cache tier, we moved or spindle journals off of the SSDs and onto
the spindles. We found that the load on the spindles were a fraction
of what it was before the cache tier. When we started up a host (five
spindle journals on the same SSD as the cache pool) we would have very
long start up times for the OSDs because the SSD was a bottleneck on
recovery of so many OSDs. We are also finding that even though the
Micron M600 drives perform "good enough" under steady state, there
isn't as much headroom as there is on the Intel S3610s that we are
also evaluating (6-8x less io time for the same IOPs on the S3610s
compared to the M600s). Being on the limits of the M600 may also
contribute to the inability of our busiest production clusters to run
in writeback mode permanently.

If your spindle pool is completely fronted by an SSD pool (or your
busy pools, we don't front our glance pool for instance), I'd say
leave the configuration simpler and co-locate the journal on the
spindle.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1



On Tue, Jan 12, 2016 at 10:27 AM, Mihai Gheorghe  > wrote:
> One more question. Seeing that the cache tier holds data on it until it reaches
> the % ratio, I suppose I must set replication to 2 or higher on the cache pool
> so as not to lose hot data not yet written to the cold storage in case of a drive
> failure, right?
>
> Also, will there be any performance penalty if I set the OSD journal on the
> same SSD as the OSD? I now have one SSD 

Re: [ceph-users] pg is stuck stale (osd.21 still removed)

2016-01-13 Thread Daniel Schwager
Hi ceph-users,

any idea how to fix my cluster? OSD.21 was removed, but some (stale) PGs are still
pointing to OSD.21...

I don't know how to proceed... Help is very welcome!

Best regards
Daniel


> -Original Message-
> From: Daniel Schwager
> Sent: Friday, January 08, 2016 3:10 PM
> To: 'ceph-us...@ceph.com'
> Subject: pg is stuck stale (osd.21 still removed)
> 
> Hi,
> 
> we had a HW-problem with OSD.21 today. The OSD daemon was down and "smartctl" 
> told me about some
> hardware errors.
> 
> I decided to remove the HDD:
> 
>   ceph osd out 21
>   ceph osd crush remove osd.21
>   ceph auth del osd.21
>   ceph osd rm osd.21
> 
> But afterwards I saw that I have some stucked pg's for osd.21:
> 
>   root@ceph-admin:~# ceph -w
>   cluster c7b12656-15a6-41b0-963f-4f47c62497dc
>health HEALTH_WARN
> 50 pgs stale
>   50 pgs stuck stale
>monmap e4: 3 mons at 
> {ceph-mon1=192.168.135.31:6789/0,ceph-mon2=192.168.135.32:6789/0,ceph-
> mon3=192.168.135.33:6789/0}
> election epoch 404, quorum 0,1,2 
> ceph-mon1,ceph-mon2,ceph-mon3
>mdsmap e136: 1/1/1 up {0=ceph-mon1=up:active}
>osdmap e18259: 23 osds: 23 up, 23 in
> pgmap v47879105: 6656 pgs, 10 pools, 23481 GB data, 6072 kobjects
> 54974 GB used, 30596 GB / 85571 GB avail
>   6605 active+clean
>   50 stale+active+clean
>  1 active+clean+scrubbing+deep
> 
>   root@ceph-admin:~# ceph health
>   HEALTH_WARN 50 pgs stale; 50 pgs stuck stale
> 
>   root@ceph-admin:~# ceph health detail
>   HEALTH_WARN 50 pgs stale; 50 pgs stuck stale; noout flag(s) set
>   pg 34.225 is stuck stale for 98780.399254, current state 
> stale+active+clean, last acting [21]
>   pg 34.186 is stuck stale for 98780.399195, current state 
> stale+active+clean, last acting [21]
>   ...
> 
>   root@ceph-admin:~# ceph pg 34.225   query
>   Error ENOENT: i don't have pgid 34.225
> 
>   root@ceph-admin:~# ceph pg 34.225  list_missing
>   Error ENOENT: i don't have pgid 34.225
> 
>   root@ceph-admin:~# ceph osd lost 21  --yes-i-really-mean-it
>   osd.21 is not down or doesn't exist
> 
>   # checking the crushmap
>   ceph osd getcrushmap -o crush.map
>   crushtool -d crush.map  -o crush.txt
>   root@ceph-admin:~# grep 21 crush.txt
>   -> nothing here
> 
> 
> Of course, I cannot start OSD.21, because it's not available anymore - I 
> removed it.
> 
> Is there a way to remap the stuck PGs to OSDs other than osd.21?



> 
> One more - I tried to recreate the pg but now this pg this "stuck inactive":
> 
>   root@ceph-admin:~# ceph pg force_create_pg 34.225
>   pg 34.225 now creating, ok
> 
>   root@ceph-admin:~# ceph health detail
>   HEALTH_WARN 49 pgs stale; 1 pgs stuck inactive; 49 pgs stuck stale; 1 
> pgs stuck unclean
>   pg 34.225 is stuck inactive since forever, current state creating, last 
> acting []
>   pg 34.225 is stuck unclean since forever, current state creating, last 
> acting []
>   pg 34.186 is stuck stale for 118481.013632, current state 
> stale+active+clean, last acting [21]
>   ...
> 
> Maybe somebody has an idea how to fix this situation?


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] lost OSD due to failing disk

2016-01-13 Thread Andy Allan
On 13 January 2016 at 12:26, Magnus Hagdorn  wrote:
> Hi there,
> we recently had a problem with two OSDs failing because of I/O errors of the
> underlying disks. We run a small ceph cluster with 3 nodes and 18 OSDs in
> total. All 3 nodes are dell poweredge r515 servers with PERC H700 (MegaRAID
> SAS 2108) RAID controllers. All disks are configured as single disk RAID 0
> arrays. A disk on two separate nodes started showing I/O errors reported by
> SMART, with one of the disks reporting pre failure SMART error. The node
> with the failing disk also reported XFS I/O errors. In both cases the OSD
> daemons kept running although ceph reported that they were slow to respond.
> When we started to look into this we first tried restarted the OSDs. They
> then failed straight away. We ended up with data loss. We are running ceph
> 0.80.5 on Scientific Linux 6.6 with a replication level of 2. We had hoped
> that loosing disks due to hardware failure would be recoverable.
>
> Is this a known issue with the RAID controllers, version of ceph?

If you have a replication level of 2, and lose 2 disks from different
nodes simultaneously, you're going to get data loss. Some portion of
your data will have its primary copy on disk A (in node 1) and the
backup copy on disk B (in node 2) (and some more data will have the
primary copy on B and backup on A) - if you lose A and B at the same
time then there's no other copies for those bits of data.

If you only lost one disk (e.g. A) then ceph would shuffle things
around and duplicate the data from the backup copy, so that (after
recovery) you have two copies again. Ceph also makes sure that the
copies are on different nodes, in case you lose an entire node - but
in this case, you've lost two disks on separate nodes.

If you want to tolerate two simultaneous disk failures across your
cluster, then you need to have 3 copies of your data (or an
appropriate erasure-coding setup).

Thanks,
Andy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-13 Thread Gregory Farnum
On Wed, Jan 13, 2016 at 11:24 AM, Mike Carlson  wrote:
> Hello.
>
> Since we upgraded to Infernalis last, we have noticed a severe problem with
> cephfs when we have it shared over Samba and NFS
>
> Directory listings are showing an inconsistent view of the files:
>
>
> $ ls /lts-mon/BD/xmlExport/ | wc -l
>  100
> $ sudo umount /lts-mon
> $ sudo mount /lts-mon
> $ ls /lts-mon/BD/xmlExport/ | wc -l
> 3507
>
>
> The only work around I have found is un-mounting and re-mounting the nfs
> share, that seems to clear it up
> Same with samba, I'd post it here but its thousands of lines. I can add
> additional details on request.
>
> This happened after our upgrade to infernalis. Is it possible the MDS is in
> an inconsistent state?

So this didn't happen to you until after you upgraded? Are you seeing
missing files when looking at cephfs directly, or only over the
NFS/Samba re-exports? Are you also sharing Samba by re-exporting the
kernel cephfs mount?

Zheng, any ideas about kernel issues which might cause this or be more
visible under infernalis?
-Greg

>
> We have cephfs mounted on a server using the built in cephfs kernel module:
>
> lts-mon:6789:/ /ceph ceph
> name=admin,secretfile=/etc/ceph/admin.secret,noauto,_netdev
>
>
> We are running all of our ceph nodes on ubuntu 14.04 LTS. Samba is up to
> date, 4.1.6, and we export nfsv3 to linux and freebsd systems. All seem to
> exhibit the same behavior.
>
> system info:
>
> # uname -a
> Linux lts-osd1 3.13.0-63-generic #103-Ubuntu SMP Fri Aug 14 21:42:59 UTC
> 2015 x86_64 x86_64 x86_64 GNU/Linux
> root@lts-osd1:~# lsb
> lsblklsb_release
> root@lts-osd1:~# lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description: Ubuntu 14.04.3 LTS
> Release: 14.04
> Codename: trusty
>
>
> package info:
>
>  # dpkg -l|grep ceph
> ii  ceph 9.2.0-1trusty
> amd64distributed storage and file system
> ii  ceph-common  9.2.0-1trusty
> amd64common utilities to mount and interact with a ceph storage
> cluster
> ii  ceph-fs-common   9.2.0-1trusty
> amd64common utilities to mount and interact with a ceph file system
> ii  ceph-mds 9.2.0-1trusty
> amd64metadata server for the ceph distributed file system
> ii  libcephfs1   9.2.0-1trusty
> amd64Ceph distributed file system client library
> ii  python-ceph  9.2.0-1trusty
> amd64Meta-package for python libraries for the Ceph libraries
> ii  python-cephfs9.2.0-1trusty
> amd64Python libraries for the Ceph libcephfs library
>
>
> What is interesting, is a directory or file will not show up in a listing,
> however, if we directly access the file, it shows up in that instance:
>
>
> # ls -al |grep SCHOOL
> # ls -alnd SCHOOL667055
> drwxrwsr-x  1 21695  21183  2962751438 Jan 13 09:33 SCHOOL667055
>
>
> Any tips are appreciated!
>
> Thanks,
> Mike C
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD export format for start and end snapshots

2016-01-13 Thread Alex Gorbachev
On Tue, Jan 12, 2016 at 12:09 PM, Josh Durgin  wrote:

> On 01/12/2016 06:10 AM, Alex Gorbachev wrote:
>
>> Good day!  I am working on a robust backup script for RBD and ran into a
>> need to reliably determine start and end snapshots for differential
>> exports (done with rbd export-diff).
>>
>> I can clearly see these if dumping the ASCII header of the export file,
>> e.g.:
>>
>> iss@lab2-b1:/data/volume1$ strings
>> exp-tst1-spin1-sctst1-0111-174938-2016-cons-thin.scexp|head -3
>> rbd diff v1
>> auto-0111-083856-2016-tst1t
>> auto-0111-174856-2016-tst1s
>>
>> It appears that "auto-0111-083856-2016-tst1" is the start snapshot
>> (followed by t) and "auto-0111-174856-2016-tst1" is the end snapshot
>> (followed by s).
>>
>> Is this the best way to determine snapshots and are letters "s" and "t"
>> going to stay the same?
>>
>
> The format won't change in an incompatible way, so we won't use those
> fields for other purposes, but 'strings | head -3' might not always
> work if we add fields or have longer strings later.
>
> The format is documented here:
>
> http://docs.ceph.com/docs/master/dev/rbd-diff/
>
> It'd be more reliable if you decoded it by unpacking the diff with
> your language of choice (e.g. using
> https://docs.python.org/2/library/struct.html for python.)
>
> Josh
>

Thanks Josh.  I implemented in Perl as:

my $buffer = 4096; # read first 4KB
open(FILE,'<',"$file");
sysread(FILE,$data,$buffer);
@str = unpack("A12 a1 L/A* a1 L/A* a1 Q",$data);

This assumes that the first header line "rbd diff v1\n" is always 12
bytes long.
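
For anyone doing the same in Python, here is a hedged equivalent of the Perl
above, following the record layout in the rbd-diff spec Josh linked (header
string, then tagged records: 'f' = from snap, 't' = to snap, 's' = image size).
Treat it as a sketch rather than a reference parser:

# Pull the from/to snapshot names out of an 'rbd export-diff' file.
import struct
import sys

HEADER = b"rbd diff v1\n"

def read_string(f):
    (length,) = struct.unpack("<I", f.read(4))   # le32 length prefix
    return f.read(length).decode()

def parse_header(path):
    from_snap = to_snap = None
    with open(path, "rb") as f:
        if f.read(len(HEADER)) != HEADER:
            raise ValueError("not an rbd diff v1 file")
        while True:
            tag = f.read(1)
            if tag == b"f":                        # starting snapshot
                from_snap = read_string(f)
            elif tag == b"t":                      # ending snapshot
                to_snap = read_string(f)
            elif tag == b"s":                      # image size (le64), ignored here
                struct.unpack("<Q", f.read(8))
                break                              # data records follow
            else:
                break
    return from_snap, to_snap

if __name__ == "__main__":
    print(parse_header(sys.argv[1]))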

Best regards,
Alex
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-13 Thread Mike Carlson
Hey Greg,

The inconsistent view is only over nfs/smb on top of our /ceph mount.

When I look directly on the /ceph mount (which is using the cephfs kernel
module), everything looks fine

It is possible that this issue just went unnoticed, and it only being a
infernalis problem is just a red herring. With that, it is oddly
coincidental that we just started seeing issues.

On Wed, Jan 13, 2016 at 11:30 AM, Gregory Farnum  wrote:

> On Wed, Jan 13, 2016 at 11:24 AM, Mike Carlson  wrote:
> > Hello.
> >
> > Since we upgraded to Infernalis last, we have noticed a severe problem
> with
> > cephfs when we have it shared over Samba and NFS
> >
> > Directory listings are showing an inconsistent view of the files:
> >
> >
> > $ ls /lts-mon/BD/xmlExport/ | wc -l
> >  100
> > $ sudo umount /lts-mon
> > $ sudo mount /lts-mon
> > $ ls /lts-mon/BD/xmlExport/ | wc -l
> > 3507
> >
> >
> > The only work around I have found is un-mounting and re-mounting the nfs
> > share, that seems to clear it up
> > Same with samba, I'd post it here but its thousands of lines. I can add
> > additional details on request.
> >
> > This happened after our upgrade to infernalis. Is it possible the MDS is
> in
> > an inconsistent state?
>
> So this didn't happen to you until after you upgraded? Are you seeing
> missing files when looking at cephfs directly, or only over the
> NFS/Samba re-exports? Are you also sharing Samba by re-exporting the
> kernel cephfs mount?
>
> Zheng, any ideas about kernel issues which might cause this or be more
> visible under infernalis?
> -Greg
>
> >
> > We have cephfs mounted on a server using the built in cephfs kernel
> module:
> >
> > lts-mon:6789:/ /ceph ceph
> > name=admin,secretfile=/etc/ceph/admin.secret,noauto,_netdev
> >
> >
> > We are running all of our ceph nodes on ubuntu 14.04 LTS. Samba is up to
> > date, 4.1.6, and we export nfsv3 to linux and freebsd systems. All seem
> to
> > exhibit the same behavior.
> >
> > system info:
> >
> > # uname -a
> > Linux lts-osd1 3.13.0-63-generic #103-Ubuntu SMP Fri Aug 14 21:42:59 UTC
> > 2015 x86_64 x86_64 x86_64 GNU/Linux
> > root@lts-osd1:~# lsb
> > lsblklsb_release
> > root@lts-osd1:~# lsb_release -a
> > No LSB modules are available.
> > Distributor ID: Ubuntu
> > Description: Ubuntu 14.04.3 LTS
> > Release: 14.04
> > Codename: trusty
> >
> >
> > package info:
> >
> >  # dpkg -l|grep ceph
> > ii  ceph 9.2.0-1trusty
> > amd64distributed storage and file system
> > ii  ceph-common  9.2.0-1trusty
> > amd64common utilities to mount and interact with a ceph storage
> > cluster
> > ii  ceph-fs-common   9.2.0-1trusty
> > amd64common utilities to mount and interact with a ceph file
> system
> > ii  ceph-mds 9.2.0-1trusty
> > amd64metadata server for the ceph distributed file system
> > ii  libcephfs1   9.2.0-1trusty
> > amd64Ceph distributed file system client library
> > ii  python-ceph  9.2.0-1trusty
> > amd64Meta-package for python libraries for the Ceph libraries
> > ii  python-cephfs9.2.0-1trusty
> > amd64Python libraries for the Ceph libcephfs library
> >
> >
> > What is interesting, is a directory or file will not show up in a
> listing,
> > however, if we directly access the file, it shows up in that instance:
> >
> >
> > # ls -al |grep SCHOOL
> > # ls -alnd SCHOOL667055
> > drwxrwsr-x  1 21695  21183  2962751438 Jan 13 09:33 SCHOOL667055
> >
> >
> > Any tips are appreciated!
> >
> > Thanks,
> > Mike C
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-13 Thread Mike Carlson
Hello.

Since we upgraded to Infernalis last, we have noticed a severe problem with
cephfs when we have it shared over Samba and NFS

Directory listings are showing an inconsistent view of the files:


$ ls /lts-mon/BD/xmlExport/ | wc -l
 100
$ sudo umount /lts-mon
$ sudo mount /lts-mon
$ ls /lts-mon/BD/xmlExport/ | wc -l
3507


The only workaround I have found is un-mounting and re-mounting the NFS
share; that seems to clear it up.
Same with Samba; I'd post it here but it's thousands of lines. I can add
additional details on request.

This happened after our upgrade to infernalis. Is it possible the MDS is in
an inconsistent state?

We have cephfs mounted on a server using the built in cephfs kernel module:

lts-mon:6789:/ /ceph ceph
name=admin,secretfile=/etc/ceph/admin.secret,noauto,_netdev


We are running all of our ceph nodes on ubuntu 14.04 LTS. Samba is up to
date, 4.1.6, and we export nfsv3 to linux and freebsd systems. All seem to
exhibit the same behavior.

system info:

# uname -a
Linux lts-osd1 3.13.0-63-generic #103-Ubuntu SMP Fri Aug 14 21:42:59 UTC
2015 x86_64 x86_64 x86_64 GNU/Linux
root@lts-osd1:~# lsb
lsblklsb_release
root@lts-osd1:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty


package info:

 # dpkg -l|grep ceph
ii  ceph 9.2.0-1trusty
 amd64distributed storage and file system
ii  ceph-common  9.2.0-1trusty
 amd64common utilities to mount and interact with a ceph storage
cluster
ii  ceph-fs-common   9.2.0-1trusty
 amd64common utilities to mount and interact with a ceph file system
ii  ceph-mds 9.2.0-1trusty
 amd64metadata server for the ceph distributed file system
ii  libcephfs1   9.2.0-1trusty
 amd64Ceph distributed file system client library
ii  python-ceph  9.2.0-1trusty
 amd64Meta-package for python libraries for the Ceph libraries
ii  python-cephfs9.2.0-1trusty
 amd64Python libraries for the Ceph libcephfs library


What is interesting, is a directory or file will not show up in a listing,
however, if we directly access the file, it shows up in that instance:


# ls -al |grep SCHOOL
# ls -alnd SCHOOL667055
drwxrwsr-x  1 21695  21183  2962751438 Jan 13 09:33 SCHOOL667055


Any tips are appreciated!

Thanks,
Mike C
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Securing/Mitigating brute force attacks, Rados Gateway + Keystone

2016-01-13 Thread Jerico Revote
Hello Ceph Users,

We've recently deployed an RGW service (0.94.3).
We've also integrated this RGW instance with an external OpenStack Keystone
identity service.
The RGW + Keystone integration is working well.
At a high level, our RGW service looks like:

---

Clients
  |  S3, Swift (HTTPS)
  v
DNS round robin
  |
  v
HAProxy + Keepalived pairs, SSL termination
  (RGW1: HA1 + HA2 | RGW2: HA1 + HA2)
  |
  v
civetweb radosgw instances (RGW1, RGW2, RGW3)  <-->  Keystone
  |
  v
Ceph

---

Now, we're interested to learn how other RGW (+ Keystone) users
are preventing/mitigating brute-force attacks on their RGWs.
OpenStack Keystone itself doesn't implement rate limiting or auto-blocking;
HAProxy can be configured to do some auto-blocking/mitigation, though.

Regards,

Jerico
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to do quiesced rbd snapshot in libvirt?

2016-01-13 Thread Jason Dillaman
Definitely would like to see the "debug rbd = 20" logs from 192.168.254.17 when 
this occurs.  If you are co-locating your OSDs, MONs, and qemu-kvm processes, 
make sure your ceph.conf has "log file = " defined in the 
[global] or [client] section.

-- 

Jason Dillaman 


- Original Message -
> From: "Василий Ангапов" 
> To: "Jason Dillaman" , "ceph-users" 
> 
> Sent: Wednesday, January 13, 2016 4:22:02 AM
> Subject: Re: [ceph-users] How to do quiesced rbd snapshot in libvirt?
> 
> Hello again!
> 
> Unfortunately I have to raise the problem again. I have constantly
> hanging snapshots on several images.
> My Ceph version is now 0.94.5.
> RBD CLI always giving me this:
> root@slpeah001:[~]:# rbd snap create
> volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd --snap test
> 2016-01-13 12:04:39.107166 7fb70e4c2880 -1 librbd::ImageWatcher:
> 0x427a710 no lock owners detected
> 2016-01-13 12:04:44.108783 7fb70e4c2880 -1 librbd::ImageWatcher:
> 0x427a710 no lock owners detected
> 2016-01-13 12:04:49.110321 7fb70e4c2880 -1 librbd::ImageWatcher:
> 0x427a710 no lock owners detected
> 2016-01-13 12:04:54.112373 7fb70e4c2880 -1 librbd::ImageWatcher:
> 0x427a710 no lock owners detected
> 
> I turned "debug rbd = 20" and found this records only on one of OSDs
> (on the same host as RBD client):
> 2016-01-13 11:44:46.076780 7fb5f05d8700  0 --
> 192.168.252.11:6804/407141 >> 192.168.252.11:6800/407122
> pipe(0x392d2000 sd=257 :6804 s=2 pgs=17 cs=1 l=0 c=0x383b4160).fault
> with nothing to send, going to standby
> 2016-01-13 11:58:26.261460 7fb5efbce700  0 --
> 192.168.252.11:6804/407141 >> 192.168.252.11:6802/407124
> pipe(0x39e45000 sd=156 :6804 s=2 pgs=17 cs=1 l=0 c=0x386fbb20).fault
> with nothing to send, going to standby
> 2016-01-13 12:04:23.948931 7fb5fede2700  0 --
> 192.168.254.11:6804/407141 submit_message watch-notify(notify_complete
> (2) cookie 44850800 notify 99720550678667 ret -110) v3 remote,
> 192.168.254.11:0/1468572, failed lossy con, dropping message
> 0x3ab76fc0
> 2016-01-13 12:09:04.254329 7fb5fede2700  0 --
> 192.168.254.11:6804/407141 submit_message watch-notify(notify_complete
> (2) cookie 69846112 notify 99720550678721 ret -110) v3 remote,
> 192.168.254.11:0/1509673, failed lossy con, dropping message
> 0x3830cb40
> 
> Here is the image properties
> root@slpeah001:[~]:# rbd info
> volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd
> rbd image 'volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd':
> size 200 GB in 51200 objects
> order 22 (4096 kB objects)
> block_name_prefix: rbd_data.2f2a81562fea59
> format: 2
> features: layering, striping, exclusive, object map
> flags:
> stripe unit: 4096 kB
> stripe count: 1
> root@slpeah001:[~]:# rbd status
> volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd
> Watchers:
> watcher=192.168.254.17:0/2088291 client.3424561 cookie=93888518795008
> root@slpeah001:[~]:# rbd lock list
> volumes/volume-26c89a0a-be4d-45d4-85a6-e0dc134941fd
> There is 1 exclusive lock on this image.
> Locker ID  Address
> client.3424561 auto 93888518795008 192.168.254.17:0/2088291
> 
> Also taking RBD snapshots from python API also is hanging...
> This image is being used by libvirt.
> 
> Any suggestions?
> Thanks!
> 
> Regards, Vasily.
> 
> 
> 2016-01-06 1:11 GMT+08:00 Мистер Сёма :
> > Well, I believe the problem is no more valid.
> > My code before was:
> > virsh qemu-agent-command $INSTANCE '{"execute":"guest-fsfreeze-freeze"}'
> > rbd snap create $RBD_ID --snap `date +%F-%T`
> >
> > and then snapshot creation was hanging forever. I inserted a 2 second
> > sleep.
> >
> > My code after
> > virsh qemu-agent-command $INSTANCE '{"execute":"guest-fsfreeze-freeze"}'
> > sleep 2
> > rbd snap create $RBD_ID --snap `date +%F-%T`
> >
> > And now it works perfectly. Again, I have no idea, how it solved the
> > problem.
> > Thanks :)
> >
> > 2016-01-06 0:49 GMT+08:00 Мистер Сёма :
> >> I am very sorry, but I am not able to increase log verbosity because
> >> it's a production cluster with very limited space for logs. Sounds
> >> crazy, but that's it.
> >> I have found out that the RBD snapshot process hangs forever only when
> >> QEMU fsfreeze was issued just before the snapshot. If the guest is not
> >> frozen - snapshot is taken with no problem... I have absolutely no
> >> idea how these two things could be related to each other... And again
> >> this issue occurs only when there is an exclusive lock on image and
> >> exclusive lock feature is enabled also on it.
> >>
> >> Do somebody else have such a problem?
> >>
> >> 2016-01-05 2:55 GMT+08:00 Jason Dillaman :
> >>> I am surprised by the error you are seeing with exclusive lock enabled.
> >>> The rbd CLI should be able to send the 'snap create' request to QEMU
> >>> without an error.  Are you able to provide 

Re: [ceph-users] How to do quiesced rbd snapshot in libvirt?

2016-01-13 Thread Василий Ангапов
Thanks, Jason, I forgot about this trick!

These are the qemu rbd logs (last 200 lines). These lines are
endlessly repeating when snapshot taking hangs:
2016-01-14 04:56:34.469568 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::finalize() copied resulting 4096 bytes to
0x55bcc86c6000
2016-01-14 04:56:34.469576 7ff80e93e700 20 librbd::AsyncOperation:
0x55bccafd3eb0 finish_op
2016-01-14 04:56:34.469719 7ff810942700 20 librbdwriteback: aio_cb completing
2016-01-14 04:56:34.469732 7ff810942700 20 librbdwriteback: aio_cb finished
2016-01-14 04:56:34.469739 7ff80e93e700 20 librbd::AioRequest:
should_complete 0x55bcc41a13c0
rbd_data.2f31e252fa88e4.0130 1634304~36864 r = 36864
2016-01-14 04:56:34.469745 7ff80e93e700 20 librbd::AioRequest:
should_complete 0x55bcc41a13c0 READ_FLAT
2016-01-14 04:56:34.469747 7ff80e93e700 20 librbd::AioRequest:
complete 0x55bcc41a13c0
2016-01-14 04:56:34.469748 7ff80e93e700 10 librbd::AioCompletion:
C_AioRead::finish() 0x55bcd00c0700 r = 36864
2016-01-14 04:56:34.469750 7ff80e93e700 10 librbd::AioCompletion:  got
{} for [0,36864] bl 36864
2016-01-14 04:56:34.469769 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::complete_request() 0x55bccafd3000
complete_cb=0x55bcbee4f440 pending 1
2016-01-14 04:56:34.469772 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::finalize() 0x55bccafd3000 rval 36864 read_buf
0x55bcc4f8a000 read_bl 0
2016-01-14 04:56:34.469787 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::finalize() copied resulting 36864 bytes to
0x55bcc4f8a000
2016-01-14 04:56:34.469789 7ff80e93e700 20 librbd::AsyncOperation:
0x55bccafd3130 finish_op
2016-01-14 04:56:34.469847 7ff810942700 20 librbdwriteback: aio_cb completing
2016-01-14 04:56:34.469865 7ff810942700 20 librbdwriteback: aio_cb finished
2016-01-14 04:56:34.469869 7ff80e93e700 20 librbd::AioRequest:
should_complete 0x55bcc41a00a0
rbd_data.2f31e252fa88e4.0130 1888256~4096 r = 4096
2016-01-14 04:56:34.469874 7ff80e93e700 20 librbd::AioRequest:
should_complete 0x55bcc41a00a0 READ_FLAT
2016-01-14 04:56:34.469876 7ff80e93e700 20 librbd::AioRequest:
complete 0x55bcc41a00a0
2016-01-14 04:56:34.469877 7ff80e93e700 10 librbd::AioCompletion:
C_AioRead::finish() 0x55bcd00c2aa0 r = 4096
2016-01-14 04:56:34.469880 7ff80e93e700 10 librbd::AioCompletion:  got
{} for [0,4096] bl 4096
2016-01-14 04:56:34.469884 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::complete_request() 0x55bccafd0d80
complete_cb=0x55bcbee4f440 pending 1
2016-01-14 04:56:34.469886 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::finalize() 0x55bccafd0d80 rval 4096 read_buf
0x55bcc45c8000 read_bl 0
2016-01-14 04:56:34.469890 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::finalize() copied resulting 4096 bytes to
0x55bcc45c8000
2016-01-14 04:56:34.469892 7ff80e93e700 20 librbd::AsyncOperation:
0x55bccafd0eb0 finish_op
2016-01-14 04:56:34.470023 7ff810942700 20 librbdwriteback: aio_cb completing
2016-01-14 04:56:34.470032 7ff810942700 20 librbdwriteback: aio_cb finished
2016-01-14 04:56:34.470038 7ff80e93e700 20 librbd::AioRequest:
should_complete 0x55bcc419f320
rbd_data.2f31e252fa88e4.0130 1900544~20480 r = 20480
2016-01-14 04:56:34.470044 7ff80e93e700 20 librbd::AioRequest:
should_complete 0x55bcc419f320 READ_FLAT
2016-01-14 04:56:34.470045 7ff80e93e700 20 librbd::AioRequest:
complete 0x55bcc419f320
2016-01-14 04:56:34.470046 7ff80e93e700 10 librbd::AioCompletion:
C_AioRead::finish() 0x55bcd00c2bc0 r = 20480
2016-01-14 04:56:34.470047 7ff80e93e700 10 librbd::AioCompletion:  got
{} for [0,20480] bl 20480
2016-01-14 04:56:34.470051 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::complete_request() 0x55bccafd0900
complete_cb=0x55bcbee4f440 pending 1
2016-01-14 04:56:34.470052 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::finalize() 0x55bccafd0900 rval 20480 read_buf
0x55bcc6741000 read_bl 0
2016-01-14 04:56:34.470062 7ff80e93e700 20 librbd::AioCompletion:
AioCompletion::finalize() copied resulting 20480 bytes to
0x55bcc6741000
2016-01-14 04:56:34.470064 7ff80e93e700 20 librbd::AsyncOperation:
0x55bccafd0a30 finish_op
2016-01-14 04:56:34.470176 7ff810942700 20 librbdwriteback: aio_cb completing
2016-01-14 04:56:34.470191 7ff810942700 20 librbdwriteback: aio_cb finished
2016-01-14 04:56:34.470193 7ff810942700 20 librbdwriteback: aio_cb completing
2016-01-14 04:56:34.470197 7ff80e93e700 20 librbd::AioRequest:
should_complete 0x55bcc419f8c0
rbd_data.2f31e252fa88e4.0130 2502656~4096 r = 4096
2016-01-14 04:56:34.470201 7ff80e93e700 20 librbd::AioRequest:
should_complete 0x55bcc419f8c0 READ_FLAT
2016-01-14 04:56:34.470202 7ff810942700 20 librbdwriteback: aio_cb finished
2016-01-14 04:56:34.470203 7ff80e93e700 20 librbd::AioRequest:
complete 0x55bcc419f8c0
2016-01-14 04:56:34.470205 7ff80e93e700 10 librbd::AioCompletion:
C_AioRead::finish() 0x55bcd00c03e0 r = 4096
2016-01-14 04:56:34.470208 7ff80e93e700 10 librbd::AioCompletion:  got
{} for [0,4096] bl 4096
2016-01-14 

Re: [ceph-users] pg is stuck stale (osd.21 still removed)

2016-01-13 Thread Alex Gorbachev
Hi Daniel,

On Friday, January 8, 2016, Daniel Schwager 
wrote:

> One more - I tried to recreate the pg but now this pg this "stuck
> inactive":
>
> root@ceph-admin:~# ceph pg force_create_pg 34.225
> pg 34.225 now creating, ok
>
> root@ceph-admin:~# ceph health detail
> HEALTH_WARN 49 pgs stale; 1 pgs stuck inactive; 49 pgs stuck
> stale; 1 pgs stuck unclean
> pg 34.225 is stuck inactive since forever, current state creating,
> last acting []
> pg 34.225 is stuck unclean since forever, current state creating,
> last acting []
> pg 34.186 is stuck stale for 118481.013632, current state
> stale+active+clean, last acting [21]
> ...
>
> Maybe somebody has an idea how to fix this situation?


I don't unfortunately have the answers, but maybe the following links will
help you make some progress:

http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/

https://www.mail-archive.com/ceph-users@lists.ceph.com/msg17820.html

https://ceph.com/community/incomplete-pgs-oh-my/

Good luck,
Alex


>
> regards
> Danny
>
>
>

-- 
--
Alex Gorbachev
Storcium
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs - inconsistent nfs and samba directory listings

2016-01-13 Thread Yan, Zheng
On Thu, Jan 14, 2016 at 3:37 AM, Mike Carlson  wrote:
> Hey Greg,
>
> The inconsistent view is only over nfs/smb on top of our /ceph mount.
>
> When I look directly on the /ceph mount (which is using the cephfs kernel
> module), everything looks fine
>
> It is possible that this issue just went unnoticed, and it only being a
> infernalis problem is just a red herring. With that, it is oddly
> coincidental that we just started seeing issues.

This seems like a seekdir bug in the kernel client; could you try a 4.0+ kernel?

Besides, do you have "mds bal frag" enabled for ceph-mds?


Regards
Yan, Zheng



>
> On Wed, Jan 13, 2016 at 11:30 AM, Gregory Farnum  wrote:
>>
>> On Wed, Jan 13, 2016 at 11:24 AM, Mike Carlson  wrote:
>> > Hello.
>> >
>> > Since we upgraded to Infernalis last, we have noticed a severe problem
>> > with
>> > cephfs when we have it shared over Samba and NFS
>> >
>> > Directory listings are showing an inconsistent view of the files:
>> >
>> >
>> > $ ls /lts-mon/BD/xmlExport/ | wc -l
>> >  100
>> > $ sudo umount /lts-mon
>> > $ sudo mount /lts-mon
>> > $ ls /lts-mon/BD/xmlExport/ | wc -l
>> > 3507
>> >
>> >
>> > The only work around I have found is un-mounting and re-mounting the nfs
>> > share, that seems to clear it up
>> > Same with samba, I'd post it here but its thousands of lines. I can add
>> > additional details on request.
>> >
>> > This happened after our upgrade to infernalis. Is it possible the MDS is
>> > in
>> > an inconsistent state?
>>
>> So this didn't happen to you until after you upgraded? Are you seeing
>> missing files when looking at cephfs directly, or only over the
>> NFS/Samba re-exports? Are you also sharing Samba by re-exporting the
>> kernel cephfs mount?
>>
>> Zheng, any ideas about kernel issues which might cause this or be more
>> visible under infernalis?
>> -Greg
>>
>> >
>> > We have cephfs mounted on a server using the built in cephfs kernel
>> > module:
>> >
>> > lts-mon:6789:/ /ceph ceph
>> > name=admin,secretfile=/etc/ceph/admin.secret,noauto,_netdev
>> >
>> >
>> > We are running all of our ceph nodes on ubuntu 14.04 LTS. Samba is up to
>> > date, 4.1.6, and we export nfsv3 to linux and freebsd systems. All seem
>> > to
>> > exhibit the same behavior.
>> >
>> > system info:
>> >
>> > # uname -a
>> > Linux lts-osd1 3.13.0-63-generic #103-Ubuntu SMP Fri Aug 14 21:42:59 UTC
>> > 2015 x86_64 x86_64 x86_64 GNU/Linux
>> > root@lts-osd1:~# lsb
>> > lsblklsb_release
>> > root@lts-osd1:~# lsb_release -a
>> > No LSB modules are available.
>> > Distributor ID: Ubuntu
>> > Description: Ubuntu 14.04.3 LTS
>> > Release: 14.04
>> > Codename: trusty
>> >
>> >
>> > package info:
>> >
>> >  # dpkg -l|grep ceph
>> > ii  ceph 9.2.0-1trusty
>> > amd64distributed storage and file system
>> > ii  ceph-common  9.2.0-1trusty
>> > amd64common utilities to mount and interact with a ceph storage
>> > cluster
>> > ii  ceph-fs-common   9.2.0-1trusty
>> > amd64common utilities to mount and interact with a ceph file
>> > system
>> > ii  ceph-mds 9.2.0-1trusty
>> > amd64metadata server for the ceph distributed file system
>> > ii  libcephfs1   9.2.0-1trusty
>> > amd64Ceph distributed file system client library
>> > ii  python-ceph  9.2.0-1trusty
>> > amd64Meta-package for python libraries for the Ceph libraries
>> > ii  python-cephfs9.2.0-1trusty
>> > amd64Python libraries for the Ceph libcephfs library
>> >
>> >
>> > What is interesting, is a directory or file will not show up in a
>> > listing,
>> > however, if we directly access the file, it shows up in that instance:
>> >
>> >
>> > # ls -al |grep SCHOOL
>> > # ls -alnd SCHOOL667055
>> > drwxrwsr-x  1 21695  21183  2962751438 Jan 13 09:33 SCHOOL667055
>> >
>> >
>> > Any tips are appreciated!
>> >
>> > Thanks,
>> > Mike C
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to do quiesced rbd snapshot in libvirt?

2016-01-13 Thread Василий Ангапов
And here is my ceph.conf:

[global]
fsid = 78eef61a-3e9c-447c-a3ec-ce84c617d728
mon initial members = slpeah001,slpeah002,slpeah007
mon host = 192.168.254.11:6780,192.168.254.12:6780,192.168.254.17
public network = 192.168.254.0/23
cluster network = 192.168.252.0/23
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 5000
osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 128
osd pool default pgp num = 128
mon_pg_warn_max_per_osd = 0
mon_osd_down_out_subtree_limit = host
log_to_syslog = false
log_to_stderr = false
mon_cluster_log_to_syslog = false
osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_op_priority = 1
osd_recovery_max_single_start = 1
rbd default format = 2
rbd default features = 15
debug lockdep = 0/0
debug context = 0/0
debug buffer = 0/0
debug timer = 0/0
debug osd = 0/0
debug optracker = 0/0
debug objclass = 0/0
debug filestore = 0/0
debug ms = 0/0
debug monc = 0/0
debug throttle = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug_rgw = 0/0
debug civetweb = 0/0
debug javaclient = 0/0
debug rbd = 20
mon data avail warn = 10
mon data avail crit = 5

[client.glance]
keyring = /etc/ceph/ceph.client.glance.keyring

[client.cinder-backup]
keyring = /etc/ceph/ceph.client.cinder-backup.keyring

[client.cinder]
rbd default format = 2
rbd default features = 15
rbd cache = true
rbd cache writethrough until flush = true
keyring = /etc/ceph/ceph.client.cinder.keyring
#admin socket = /var/run/ceph/$id.$pid.$cctid.asok
log file = /var/log/qemu/qemu-guest-$pid.log
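
(An aside on the config above: "rbd default features = 15" is the sum of the
RBD feature bits layering (1) + striping (2) + exclusive-lock (4) +
object-map (8), so exclusive-lock and object-map are enabled on newly
created images by default. And, assuming the commented-out "admin socket"
line were re-enabled, a hung librbd client can usually be inspected in
place with something like:

    ceph --admin-daemon /var/run/ceph/<id>.<pid>.<cctid>.asok objecter_requests

which dumps any OSD requests the client is still waiting on.)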

2016-01-14 10:00 GMT+08:00 Василий Ангапов :
> Thanks, Jason, I forgot about this trick!
>
> These are the qemu rbd logs (last 200 lines). These lines keep repeating
> endlessly while the snapshot creation hangs:
> 2016-01-14 04:56:34.469568 7ff80e93e700 20 librbd::AioCompletion:
> AioCompletion::finalize() copied resulting 4096 bytes to
> 0x55bcc86c6000
> 2016-01-14 04:56:34.469576 7ff80e93e700 20 librbd::AsyncOperation:
> 0x55bccafd3eb0 finish_op
> 2016-01-14 04:56:34.469719 7ff810942700 20 librbdwriteback: aio_cb completing
> 2016-01-14 04:56:34.469732 7ff810942700 20 librbdwriteback: aio_cb finished
> 2016-01-14 04:56:34.469739 7ff80e93e700 20 librbd::AioRequest:
> should_complete 0x55bcc41a13c0
> rbd_data.2f31e252fa88e4.0130 1634304~36864 r = 36864
> 2016-01-14 04:56:34.469745 7ff80e93e700 20 librbd::AioRequest:
> should_complete 0x55bcc41a13c0 READ_FLAT
> 2016-01-14 04:56:34.469747 7ff80e93e700 20 librbd::AioRequest:
> complete 0x55bcc41a13c0
> 2016-01-14 04:56:34.469748 7ff80e93e700 10 librbd::AioCompletion:
> C_AioRead::finish() 0x55bcd00c0700 r = 36864
> 2016-01-14 04:56:34.469750 7ff80e93e700 10 librbd::AioCompletion:  got
> {} for [0,36864] bl 36864
> 2016-01-14 04:56:34.469769 7ff80e93e700 20 librbd::AioCompletion:
> AioCompletion::complete_request() 0x55bccafd3000
> complete_cb=0x55bcbee4f440 pending 1
> 2016-01-14 04:56:34.469772 7ff80e93e700 20 librbd::AioCompletion:
> AioCompletion::finalize() 0x55bccafd3000 rval 36864 read_buf
> 0x55bcc4f8a000 read_bl 0
> 2016-01-14 04:56:34.469787 7ff80e93e700 20 librbd::AioCompletion:
> AioCompletion::finalize() copied resulting 36864 bytes to
> 0x55bcc4f8a000
> 2016-01-14 04:56:34.469789 7ff80e93e700 20 librbd::AsyncOperation:
> 0x55bccafd3130 finish_op
> 2016-01-14 04:56:34.469847 7ff810942700 20 librbdwriteback: aio_cb completing
> 2016-01-14 04:56:34.469865 7ff810942700 20 librbdwriteback: aio_cb finished
> 2016-01-14 04:56:34.469869 7ff80e93e700 20 librbd::AioRequest:
> should_complete 0x55bcc41a00a0
> rbd_data.2f31e252fa88e4.0130 1888256~4096 r = 4096
> 2016-01-14 04:56:34.469874 7ff80e93e700 20 librbd::AioRequest:
> should_complete 0x55bcc41a00a0 READ_FLAT
> 2016-01-14 04:56:34.469876 7ff80e93e700 20 librbd::AioRequest:
> complete 0x55bcc41a00a0
> 2016-01-14 04:56:34.469877 7ff80e93e700 10 librbd::AioCompletion:
> C_AioRead::finish() 0x55bcd00c2aa0 r = 4096
> 2016-01-14 04:56:34.469880 7ff80e93e700 10 librbd::AioCompletion:  got
> {} for [0,4096] bl 4096
> 2016-01-14 04:56:34.469884 7ff80e93e700 20 librbd::AioCompletion:
> AioCompletion::complete_request() 0x55bccafd0d80
> complete_cb=0x55bcbee4f440 pending 1
> 2016-01-14 04:56:34.469886 7ff80e93e700 20 librbd::AioCompletion:
> AioCompletion::finalize() 0x55bccafd0d80 rval 4096 read_buf
> 0x55bcc45c8000 read_bl 0
> 2016-01-14 04:56:34.469890 7ff80e93e700 20 librbd::AioCompletion:
> AioCompletion::finalize() copied resulting 4096 bytes to
> 0x55bcc45c8000
> 2016-01-14 04:56:34.469892 7ff80e93e700 20 librbd::AsyncOperation:
> 0x55bccafd0eb0 finish_op
> 2016-01-14 04:56:34.470023 7ff810942700 20 librbdwriteback: aio_cb completing
> 2016-01-14 04:56:34.470032 7ff810942700 20 librbdwriteback: aio_cb finished
> 2016-01-14 04:56:34.470038 7ff80e93e700 20 librbd::AioRequest:
> should_complete 0x55bcc419f320
> rbd_data.2f31e252fa88e4.0130 1900544~20480 r = 20480
> 2016-01-14 04:56:34.470044 

[ceph-users] Ceph node stats back to calamari

2016-01-13 Thread Daniel Rolfe
I have Calamari set up and running, but I'm only getting node stats from the
node running Calamari and Ceph together (docker.test.com).

The other nodes show the error below:


ceph2.test.com:
'ceph.get_heartbeats' is not available.
ceph3.test.com:
'ceph.get_heartbeats' is not available.



The salt-minion and diamond services are running on the other two nodes:

root@docker:~# salt '*' test.ping
docker.test.com:
True
ceph3.test.com:
True
ceph2.test.com:
True
root@docker:~#

root@docker:~# salt '*' test.ping; salt '*' ceph.get_heartbeats


- boot_time:
    1452682403
- ceph_version:
    0.80.10-0ubuntu1.14.04.3
- services:
    ceph-mon.docker:
        cluster: ceph
        fsid: b1c46f90-f3f7-4ee4-a479-7ba88c1b126a
        id: docker
        status:
            election_epoch: 1
            extra_probe_peers:
            monmap:
                created: 0.00
                epoch: 1
                fsid: b1c46f90-f3f7-4ee4-a479-7ba88c1b126a
                modified: 0.00
                mons:
                    - addr: X:6789/0
                    - name: docker
                    - rank: 0
            name: docker
            outside_quorum:
            quorum:
                - 0
            rank: 0
            state: leader
            sync_provider:
        type: mon
        version: 0.80.10
    ceph-osd.0:
        cluster: ceph
        fsid: b1c46f90-f3f7-4ee4-a479-7ba88c1b126a
        id: 0
        status: None
        type: osd
        version: 0.80.10
- b1c46f90-f3f7-4ee4-a479-7ba88c1b126a:
    fsid: b1c46f90-f3f7-4ee4-a479-7ba88c1b126a
    name: ceph
    versions:
        config: addc7da6d4b387975091b1263c400e4a
        health: b7a77ba414aa19a9a6af7604b50dfd01
        mds_map: 1
        mon_map: 1
        mon_status: 1
        osd_map: 22
        pg_summary: 7844b359975693b1754df331a478774a
ceph2.test.com:
    'ceph.get_heartbeats' is not available.
ceph3.test.com:
    'ceph.get_heartbeats' is not available.
root@docker:~#


Any help would be great


root@docker:~# dpkg-query -l | egrep -i "diamond|salt|calamari|ceph" | awk '{print $2 "\t" $3}'
calamari-clients  1.3-rc-12-g7d36e29
calamari-server   1.3.0.1-11-g9fb65ae
ceph              0.80.10-0ubuntu1.14.04.3
ceph-common       0.80.10-0ubuntu1.14.04.3
ceph-deploy       1.5.22trusty
ceph-fs-common    0.80.10-0ubuntu1.14.04.3
ceph-mds          0.80.10-0ubuntu1.14.04.3
diamond           3.4.67
libcephfs1        0.80.10-0ubuntu1.14.04.3
python-ceph       0.80.10-0ubuntu1.14.04.3
salt-common       0.17.5+ds-1
salt-master       0.17.5+ds-1
salt-minion       0.17.5+ds-1
root@docker:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:Ubuntu 14.04.3 LTS
Release:14.04
Codename:   trusty
root@docker:~#
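
(In case it is relevant, and this is only a guess from the
"'ceph.get_heartbeats' is not available" errors: the Calamari salt module
may not have been synced to, or may not be importable on, ceph2 and ceph3.
Something along these lines should show whether the module is present:

    salt '*' sys.doc ceph | head
    salt '*' saltutil.sync_modules

followed by a restart of salt-minion on the two nodes if the module only
shows up after the sync.)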
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pg is stuck stale (osd.21 still removed) - SOLVED.

2016-01-13 Thread Daniel Schwager
Well, ok - I found the solution:

ceph health detail
HEALTH_WARN 50 pgs stale; 50 pgs stuck stale
pg 34.225 is stuck inactive since forever, current state creating, last acting []
pg 34.225 is stuck unclean since forever, current state creating, last acting []
pg 34.226 is stuck stale for 77328.923060, current state stale+active+clean, last acting [21]
pg 34.3cb is stuck stale for 77328.923213, current state stale+active+clean, last acting [21]


root@ceph-admin:~# ceph pg map 34.225
osdmap e18263 pg 34.225 (34.225) -> up [16] acting [16]

After restarting osd.16, pg 34.225 is fine.
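
(Depending on the init system, restarting a single OSD is something like
"restart ceph-osd id=16" on Ubuntu/Upstart or "systemctl restart
ceph-osd@16" on systemd; adjust to whatever your nodes use.)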

So I recreated all the broken PGs:

for pg in `ceph health detail | grep stale | cut -d' ' -f2`; do ceph pg force_create_pg $pg; done

and restarted all (or just the necessary) OSDs.
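
(To find "the necessary" OSDs instead of restarting everything, the same
loop can be pointed at "ceph pg map", e.g.:

    for pg in `ceph health detail | grep stale | cut -d' ' -f2`; do ceph pg map $pg; done

and then restart only the OSDs that appear in the "acting" sets.)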

Now, the cluster is HEALTH_OK again.
root@ceph-admin:~# ceph  health
HEALTH_OK

Best regards
Danny


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com