Re: [ceph-users] Guests not getting an IP

2016-08-08 Thread Asanka Gunasekara
I am sorry for the inconvenience; I cross-sent two mails in the middle of the night. Best Regards *Asanka Gunasekara* *Asst. Engineering Manager - Systems Infra Services* *Global IT Infrastructure Services Division* | Informatics International Limited | 89/57 | Jampettah Lane | Colombo 13 |

Re: [ceph-users] Guests not getting an IP

2016-08-08 Thread Asanka Gunasekara
Hi, I am sorry, please ignore this! Best Regards *Asanka Gunasekara* *Asst. Engineering Manager - Systems Infra Services* *Global IT Infrastructure Services Division* | Informatics International Limited | 89/57 | Jampettah Lane | Colombo 13 | Sri Lanka | | T: +94-115-794-942 (Dir)| F:

Re: [ceph-users] how to debug pg inconsistent state - no ioerrors seen

2016-08-08 Thread Goncalo Borges
Hi Kenneth... The previous default behavior of 'ceph pg repair' was to copy the PG objects from the primary OSD to the others. Not sure if that is still the case in Jewel. For this reason, once we get this kind of error in a data pool, the best practice is to compare the md5 checksums of the
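
A minimal sketch of that comparison, assuming filestore OSDs and a hypothetical PG 20.3f and object name; the actual paths depend on your own layout:

  ceph pg map 20.3f                                 # find the OSDs holding the PG
  # then, on each of those OSDs:
  cd /var/lib/ceph/osd/ceph-3/current/20.3f_head
  find . -name '*rb.0.1234*' -exec md5sum {} \;     # the copy whose checksum differs is the bad one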

Re: [ceph-users] Fast Ceph a Cluster with PB storage

2016-08-08 Thread Christian Balzer
Hello, On Mon, 08 Aug 2016 17:39:07 +0300 Александр Пивушков wrote: > > Hello dear community! > I'm new to Ceph and only recently took up the topic of building clusters, > so your opinion is very important to me. > It is necessary to create a cluster with 1.2 PB of storage and very

Re: [ceph-users] Advice on migrating from legacy tunables to Jewel tunables.

2016-08-08 Thread Gregory Farnum
On Mon, Aug 8, 2016 at 5:14 PM, Goncalo Borges wrote: > Thanks for replying Greg. > > I am trying to figure out which parameters I should tune to mitigate the > impact of the data movement. For now, I've set > > osd max backfills = 1 > > Are there others you think

Re: [ceph-users] Advice on migrating from legacy tunables to Jewel tunables.

2016-08-08 Thread Goncalo Borges
Thanks for replying Greg. I am trying to figure out which parameters I should tune to mitigate the impact of the data movement. For now, I've set osd max backfills = 1. Are there others you think we should set? What do you reckon? Cheers Goncalo On 08/09/2016 09:26 AM, Gregory Farnum
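
For reference, a hedged sketch of the usual throttling knobs around such a data movement; the values are illustrative, not a recommendation for this particular cluster:

  # in ceph.conf on the OSD nodes
  [osd]
  osd max backfills = 1
  osd recovery max active = 1
  osd recovery op priority = 1

  # or injected at runtime without restarting the OSDs
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'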

Re: [ceph-users] Advice on migrating from legacy tunables to Jewel tunables.

2016-08-08 Thread Gregory Farnum
On Thu, Aug 4, 2016 at 8:57 PM, Goncalo Borges wrote: > Dear cephers... > > I am looking for some advice on migrating from legacy tunables to Jewel > tunables. > > What would be the best strategy? > > 1) A step by step approach? > - starting with the transition

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Jason Dillaman
On Mon, Aug 8, 2016 at 5:39 PM, Jason Dillaman wrote: > Unfortunately, for v2 RBD images, this image name to image id mapping > is stored in the LevelDB database within the OSDs and I don't know, > offhand, how to attempt to recover deleted values from there. Actually, to

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Jason Dillaman
All RBD images use a backing RADOS object to facilitate mapping between the external image name and the internal image id. For v1 images this object would be named "<image name>.rbd" and for v2 images this object would be named "rbd_id.<image name>". You would need to find this deleted object first in order to start
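
To illustrate that mapping on a still-existing v2 image (the pool "volumes" and image "vol-1234" here are hypothetical; for a deleted image these objects are gone and would have to be recovered first):

  rados -p volumes ls | grep '^rbd_id\.'         # one rbd_id.<image name> object per v2 image
  rados -p volumes get rbd_id.vol-1234 /tmp/id   # its payload is the internal image id
  strings /tmp/id                                # e.g. prints something like 5e8f2ae8944a
  rados -p volumes ls | grep rbd_header.         # header and data objects are named after that id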

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread George Mihaiescu
Look in the cinder db, in the volumes table, to find the UUID of the deleted volume. If you go through your OSDs and look for the directories for PG index 20, you might find some fragments from the deleted volume, but it's a long shot... > On Aug 8, 2016, at 4:39 PM, Georgios Dimitrakakis
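
A hedged example of that lookup; the table and column names follow the usual cinder schema, but verify them against your OpenStack release (the volume name is made up):

  mysql cinder -e "SELECT id, size, status, deleted_at FROM volumes WHERE deleted = 1 AND display_name = 'my-volume';"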

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Georgios Dimitrakakis
Dear David (and all), the data are considered very critical, hence all this effort to recover them. Although the cluster hasn't been fully stopped, all user actions have been. I mean the services are running but users are not able to read/write/delete. The deleted image was the exact same size

Re: [ceph-users] Guests not getting an IP

2016-08-08 Thread Oliver Dzombic
Hi, I don't see how this is Ceph-related. You should ask your question on a CloudStack mailing list/forum/website. -- Mit freundlichen Gruessen / Best regards Oliver Dzombic IP-Interactive mailto:i...@ip-interactive.de Address: IP Interactive UG ( haftungsbeschraenkt ) Zum Sonnenberg 1-3

[ceph-users] Guests not getting an IP

2016-08-08 Thread Asanka Gunasekara
Hi, Hope someone can help me on this. Below is my env: a) CloudStack 4.9 on CentOS 7.2, b) running on 2 compute nodes and one controller node, c) NFS as shared storage, d) Basic networking, e) I am able to ping hosts and all guests; system VMs and hosts are in the same

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Georgios Dimitrakakis
Hi, On 08.08.2016 10:50, Georgios Dimitrakakis wrote: Hi, On 08.08.2016 09:58, Georgios Dimitrakakis wrote: Dear all, I would like your help with an emergency issue but first let me describe our environment. Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON nodes

Re: [ceph-users] Recovering full OSD

2016-08-08 Thread Eric Eastman
Under Jewel 10.2.2 I have also had to delete PG directories to get very full OSDs to restart. I first use "du -sh *" under the "current" directory to find which OSD directories are the fullest on the full OSD disk, and pick 1 of the fullest. I then look at the PG map and verify the PG is
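
Roughly what that looks like in practice (the OSD id and PG id below are hypothetical; double-check the PG is active+clean on its other replicas before touching anything):

  cd /var/lib/ceph/osd/ceph-12/current
  du -sh * | sort -h | tail -20     # largest PG directories on the full OSD
  ceph pg map 20.3f                 # confirm which OSDs hold this PG and that it is clean elsewhere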

Re: [ceph-users] rbd cache influence data's consistency?

2016-08-08 Thread Jason Dillaman
librbd / QEMU advertise to the guest OS that the disk has writeback cache enabled so that the guest OS will send any necessary flush requests to inject write barriers and ensure data consistency. As a safety precaution, librbd will treat the cache as writethrough until it receives the first flush
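
The behaviour described is controlled by these client-side options (shown with what I believe are the defaults; verify against your release):

  [client]
  rbd cache = true
  rbd cache writethrough until flush = true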

[ceph-users] Best practices for extending a ceph cluster with minimal client impact data movement

2016-08-08 Thread Martin Palma
Hi all, we are in the process of expanding our cluster and I would like to know if there are some best practices for doing so. Our current cluster is composed as follows: - 195 OSDs (14 Storage Nodes) - 3 Monitors - Total capacity 620 TB - Used 360 TB We will expand the cluster by another 14
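
One common, hedged pattern for limiting client impact is to bring the new OSDs in at low CRUSH weight and raise it in small steps while throttling backfill; the OSD id, hostname and numbers below are purely illustrative:

  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  ceph osd crush add osd.195 0.0 host=newnode01   # add with zero weight, no data moves yet
  ceph osd crush reweight osd.195 0.5             # then raise the weight in small increments
  # repeat the reweight step, waiting for HEALTH_OK between increments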

[ceph-users] Fast Ceph a Cluster with PB storage

2016-08-08 Thread Александр Пивушков
Hello dear community! I'm new to Ceph and only recently took up the topic of building clusters, so your opinion is very important to me. It is necessary to create a cluster with 1.2 PB of storage and very rapid access to data. Earlier, disks of "Intel® SSD DC P3608 Series 1.6TB NVMe PCIe

Re: [ceph-users] Recovering full OSD

2016-08-08 Thread Gerd Jakobovitsch
I got into this situation several times, due to a strange behavior of the xfs filesystem - I initially ran on Debian, and afterwards reinstalled the nodes with CentOS 7, kernel 3.10.0-229.14.1.el7.x86_64, package xfsprogs-3.2.1-6.el7.x86_64. At around 75-80% usage shown by df, the disk is already

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Georgios Dimitrakakis
Hi, On 08.08.2016 10:50, Georgios Dimitrakakis wrote: Hi, On 08.08.2016 09:58, Georgios Dimitrakakis wrote: Dear all, I would like your help with an emergency issue but first let me describe our environment. Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON nodes

Re: [ceph-users] MDS in read-only mode

2016-08-08 Thread Dmitriy Lysenko
On 08.08.2016 13:51, Wido den Hollander wrote: > >> On 8 August 2016 at 12:49, John Spray wrote: >> >> >> On Mon, Aug 8, 2016 at 9:26 AM, Dmitriy Lysenko wrote: >>> Good day. >>> >>> My CephFS switched to read only >>> This problem was previously on Hammer,

Re: [ceph-users] Giant to Jewel poor read performance with Rados bench

2016-08-08 Thread Mark Nelson
Hi David, We haven't done any direct giant to jewel comparisons, but I wouldn't expect a drop that big, even for cached tests. How long are you running the test for, and how large are the IOs? Did you upgrade anything else at the same time Ceph was updated? Mark On 08/06/2016 03:38 PM,
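
For comparison purposes it helps to quote the exact invocation; a typical pattern (pool name, runtime and block size are illustrative) looks like:

  rados bench -p testpool 60 write -b 4194304 --no-cleanup   # 60 s of 4 MB writes, keep the objects
  rados bench -p testpool 60 seq                             # sequential reads of those objects
  rados -p testpool cleanup                                  # remove the benchmark objects afterwards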

Re: [ceph-users] [Scst-devel] Thin Provisioning and Ceph RBD's

2016-08-08 Thread Ilya Dryomov
On Sun, Aug 7, 2016 at 7:57 PM, Alex Gorbachev wrote: >> I'm confused. How can a 4M discard not free anything? It's either >> going to hit an entire object or two adjacent objects, truncating the >> tail of one and zeroing the head of another. Using rbd diff: >> >> $
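
The kind of check being referred to (pool and image names are hypothetical) is along these lines; it sums the allocated extents that rbd diff reports, so a successful discard should make the number go down:

  rbd diff volumes/vol-1234 | awk '{ sum += $2 } END { print sum/1024/1024 " MB allocated" }'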

[ceph-users] how to debug pg inconsistent state - no ioerrors seen

2016-08-08 Thread Kenneth Waegeman
Hi all, Since last week, some PGs have been going into the inconsistent state after a scrub error. Last week we had 4 PGs in that state. They were on different OSDs, but all in the metadata pool. I did a pg repair on them, and all were healthy again. But now one PG is inconsistent again, with
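
A sketch of the usual inspection steps before repairing (the PG id 20.3f is hypothetical, and rados list-inconsistent-obj needs Jewel or newer):

  ceph health detail | grep inconsistent                    # lists the inconsistent PGs
  rados list-inconsistent-obj 20.3f --format=json-pretty    # shows which replica failed and why
  ceph pg repair 20.3f                                      # only after confirming which copy is good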

Re: [ceph-users] Recovering full OSD

2016-08-08 Thread Shinobu Kinjo
So I am wondering ``was`` is the recommended way to fix this issue for the cluster running Jewel release (10.2.2)? So I am wondering ``what`` is the recommended way to fix this issue for the cluster running Jewel release (10.2.2)? typo? On Mon, Aug 8, 2016 at 8:19 PM, Mykola Dvornik

Re: [ceph-users] Recovering full OSD

2016-08-08 Thread Mykola Dvornik
@Shinobu According to http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/ "If you cannot start an OSD because it is full, you may delete some data by deleting some placement group directories in the full OSD." On 8 August 2016 at 13:16, Shinobu Kinjo

Re: [ceph-users] Recovering full OSD

2016-08-08 Thread Shinobu Kinjo
On Mon, Aug 8, 2016 at 8:01 PM, Mykola Dvornik wrote: > Dear ceph community, > > One of the OSDs in my cluster cannot start due to the > > ERROR: osd init failed: (28) No space left on device > > A while ago it was recommended to manually delete PGs on the OSD to let it

[ceph-users] Recovering full OSD

2016-08-08 Thread Mykola Dvornik
Dear ceph community, One of the OSDs in my cluster cannot start due to the *ERROR: osd init failed: (28) No space left on device* A while ago it was recommended to manually delete PGs on the OSD to let it start. So I am wondering was is the recommended way to fix this issue for the cluster

Re: [ceph-users] MDS in read-only mode

2016-08-08 Thread Wido den Hollander
> On 8 August 2016 at 12:49, John Spray wrote: > > > On Mon, Aug 8, 2016 at 9:26 AM, Dmitriy Lysenko wrote: > > Good day. > > > > My CephFS switched to read only. > > This problem was previously on Hammer, but I recreated cephfs, upgraded to > > Jewel

Re: [ceph-users] MDS in read-only mode

2016-08-08 Thread John Spray
On Mon, Aug 8, 2016 at 9:26 AM, Dmitriy Lysenko wrote: > Good day. > > My CephFS switched to read only. > This problem was previously on Hammer, but I recreated cephfs, upgraded to > Jewel and the problem was solved, but it appeared again after some time. > > ceph.log > 2016-08-07

Re: [ceph-users] OSDs going down when we bring down some OSD nodes Or cut-off the cluster network link between OSD nodes

2016-08-08 Thread Venkata Manojawa Paritala
Hi Christian, Thank you very much for the reply. Please find my comments in-line. Thanks & Regards, Manoj On Sun, Aug 7, 2016 at 3:26 PM, Christian Balzer wrote: > > [Reduced to ceph-users, this isn't community related] > > Hello, > > On Sat, 6 Aug 2016 20:23:41 +0530 Venkata

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Burkhard Linke
Hi, On 08.08.2016 10:50, Georgios Dimitrakakis wrote: Hi, On 08.08.2016 09:58, Georgios Dimitrakakis wrote: Dear all, I would like your help with an emergency issue but first let me describe our environment. Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON nodes

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread David
That will be down to the pool the RBD was in; the CRUSH rule for that pool will dictate which OSDs store its objects. In a standard config that RBD will likely have objects on every OSD in your cluster. On 8 Aug 2016 9:51 a.m., "Georgios Dimitrakakis" wrote: > Hi, >> >> >> On
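
By way of illustration (the pool and object names are hypothetical), the placement of a single object can be checked with:

  ceph osd map volumes rbd_id.vol-1234   # prints the PG and the acting set of OSDs for that object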

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Georgios Dimitrakakis
Hi, On 08.08.2016 09:58, Georgios Dimitrakakis wrote: Dear all, I would like your help with an emergency issue but first let me describe our environment. Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON nodes (2 of them are the OSD nodes as well) all with ceph

[ceph-users] rbd cache influence data's consistency?

2016-08-08 Thread Ops Cloud
Hello, I read in a blog article that having the rbd cache enabled can affect data consistency. Is this true? For better consistency, should we disable the rbd cache? Thanks. -- Ops Cloud o...@19cloud.net

Re: [ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Burkhard Linke
Hi, On 08.08.2016 09:58, Georgios Dimitrakakis wrote: Dear all, I would like your help with an emergency issue but first let me describe our environment. Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON nodes (2 of them are the OSD nodes as well) all with ceph

[ceph-users] MDS in read-only mode

2016-08-08 Thread Dmitriy Lysenko
Good day. My CephFS switched to read only. This problem was previously on Hammer, but I recreated cephfs, upgraded to Jewel and the problem was solved, but it appeared again after some time. ceph.log 2016-08-07 18:11:31.226960 mon.0 192.168.13.100:6789/0 148601 : cluster [INF] HEALTH_WARN; mds0: MDS in

[ceph-users] Recover Data from Deleted RBD Volume

2016-08-08 Thread Georgios Dimitrakakis
Dear all, I would like your help with an emergency issue, but first let me describe our environment. Our environment consists of 2 OSD nodes with 10x 2TB HDDs each and 3 MON nodes (2 of them are the OSD nodes as well), all with ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)

[ceph-users] Better late than never, some XFS versus EXT4 test results

2016-08-08 Thread Christian Balzer
Hello, Re-cap of my new test and staging cluster: 4 nodes running latest Hammer under Debian Jessie (with sysvinit, kernel 4.6) and manually created OSDs. Infiniband (IPoIB) QDR (40Gb/s, about 30Gb/s effective) between all nodes. 2 HDD OSD nodes with 32GB RAM, fast enough CPU (E5-2620 v3), 2x