>>Regarding rbd cache, it is something I will try (I was thinking about it
>>today), but I have not tried it yet because I don't want to reduce write speed.
Note that rbd_cache only works for sequential writes, so it doesn't help with
random writes.
Also, internally, qemu forces the use of aio=threads with
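For reference, the cache is controlled from the client side, typically via
ceph.conf; a minimal sketch (the values shown are the usual defaults, as an
illustration, not a tuning recommendation):

[client]
rbd cache = true                   # enable client-side caching
rbd cache size = 33554432          # 32 MiB cache (default)
rbd cache max dirty = 25165824     # > 0 means writeback; 0 forces write-through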
Hello,
I am trying to add a new OSD to my Ceph cluster. I am running Proxmox, so I
attempted to add it via the GUI as normal; however, I received an error output
at the following command:
ceph-disk prepare --zap-disk --fs-type xfs --cluster ceph --cluster-uuid
51c1b5c5-e510-4ed3-8b09-417214edb3f4
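For context, a complete ceph-disk prepare invocation normally ends with the
data device as a positional argument; presumably something like this (the
device path here is illustrative):

ceph-disk prepare --zap-disk --fs-type xfs --cluster ceph \
    --cluster-uuid 51c1b5c5-e510-4ed3-8b09-417214edb3f4 /dev/sdX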
On Sat, Mar 11, 2017 at 10:29 AM, PR PR wrote:
> Thanks for the quick reply. I tried it with master as well. Followed
> instructions on this link - https://community.mellanox.com/docs/DOC-2721
>
> Ceph mon fails to start with error "unrecognized ms_type 'async+rdma'"
it
Thanks for the quick reply. I tried it with master as well. Followed
instructions on this link - https://community.mellanox.com/docs/DOC-2721
Ceph mon fails to start with error "unrecognized ms_type 'async+rdma'"
Appreciate any pointers.
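For anyone following along: the Mellanox doc has you set the messenger type in
ceph.conf, presumably something along these lines (the device name is
illustrative); an "unrecognized ms_type" error usually means the binaries were
built without RDMA support.

[global]
ms_type = async+rdma
ms_async_rdma_device_name = mlx5_0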
On Thu, Mar 9, 2017 at 5:56 PM, Haomai Wang
So this is why it happened I guess.
pool 3 'volumes' replicated size 3 min_size 1
min_size = 1 is a recipe for disasters like this and there are plenty
of ML threads about not setting it below 2.
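For anyone hitting this, raising it is a one-liner (pool name taken from the
dump above):

$ ceph osd pool set volumes min_size 2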
The past intervals in the pg query show several intervals where a
single OSD may have gone rw.
How
> As long as you don't nuke the OSDs or the journals, you should be OK.
This. Most HBA failures I've experienced don't corrupt data on the drives, but
it can happen.
Assuming the data is okay, you should be able to just install the OS, install
the *same version* of Ceph packages, reboot, and
Hi Shain,
As long as you don’t nuke the OSDs or the journals, you should be OK. I think
the keyring and such are typically stored on the OSD itself. If you have lost
track of what physical device maps to what OSD, you can always mount the OSDs
in a temporary spot and cat the “whoami” file.
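A minimal sketch of that, with an illustrative device path:

$ sudo mount /dev/sdb1 /mnt
$ cat /mnt/whoami     # prints the OSD id, e.g. 12
$ sudo umount /mnt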
Hi Shain,
Not talking from experience, but as far as I know (from how ceph works), I
guess it is enough to reinstall the system, install ceph again, and add
ceph.conf and the keys; udev will do the rest. Maybe you'll need to restart the
server after you've done everything, but ceph should find the
Hi Jason,
Just to add more information:
- The issue doesn't seem to be fio or glibc (guest) related, as it is working
properly on other environments using the same software versions. Also I've
tried using Ubuntu 14.04 and 16.04 and I'm getting really similar results, but
I'll run more tests
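For reference, the libaio vs POSIX AIO comparison boils down to two fio
invocations that differ only in ioengine; a sketch (the device path is
illustrative; point it at a scratch disk, as randwrite is destructive):

$ fio --name=aio-test --ioengine=libaio --direct=1 --bs=4k --rw=randwrite \
      --iodepth=32 --runtime=60 --time_based --filename=/dev/vdb
$ fio --name=paio-test --ioengine=posixaio --direct=1 --bs=4k --rw=randwrite \
      --iodepth=32 --runtime=60 --time_based --filename=/dev/vdb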
On Thu, Mar 9, 2017 at 7:20 PM 许雪寒 wrote:
> Thanks for your reply.
>
> As the log shows, in our test, a READ that came after a WRITE finished
> before that WRITE.
This is where you've gone astray. Any storage system is perfectly free to
reorder simultaneous requests --
On Tue, Mar 7, 2017 at 10:18 AM Alejandro Comisario
wrote:
> Gregory, thanks for the response; what you've said is by far the most
> enlightening thing I've heard about ceph in a long time.
>
> What raises even greater doubt is that this "non-functional" pool was
> only
librbd doesn't know that you are using libaio vs POSIX AIO. Therefore,
the best bet is that the issue is in fio or glibc. As a first step, I
would recommend using blktrace (or similar) within your VM to
determine if there is a delta between libaio and POSIX AIO at the
block level.
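A minimal sketch of that inside the guest (the device name is illustrative):

$ sudo blktrace -d /dev/vdb -o trace -w 30   # capture 30s of block-layer events
$ blkparse -i trace | less                   # compare queue/dispatch/complete times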
On Fri, Mar 10,
Hi Alexandre,
Debugging is disabled in client and osds.
Regarding rbd cache, it is something I will try (I was thinking about it
today), but I have not tried it yet because I don't want to reduce write speed.
I also tried iothreads, but saw no benefit.
I tried virtio-blk and virtio-scsi as well,
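For reference, the iothread variant corresponds to qemu options roughly like
this (a sketch; the object/drive ids and pool/image names are illustrative):

-object iothread,id=iothread0 \
-drive file=rbd:rbd/vm-disk,format=raw,if=none,id=drive0,cache=none \
-device virtio-blk-pci,drive=drive0,iothread=iothread0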
Hi,
Am 08.03.17 um 13:11 schrieb Abhishek L:
This point release fixes several important bugs in RBD mirroring, RGW
multi-site, CephFS, and RADOS.
We recommend that all v10.2.x users upgrade.
For more detailed information, see the complete changelog[1] and the release
notes[2]
I hope you
The OSDs are all there.
$ sudo ceph osd stat
osdmap e60609: 72 osds: 72 up, 72 in
and I have attached the results of the ceph osd tree and ceph osd dump commands.
I got some extra info about the network problem. A faulty network device
flooded the network, eating up all the bandwidth, so the
Any thoughts ?
On Tue, Mar 7, 2017 at 3:17 PM, Alejandro Comisario
wrote:
> Gregory, thanks for the response; what you've said is by far the most
> enlightening thing I've heard about ceph in a long time.
>
> What raises even greater doubt is that this "non-functional"
On Fri, Mar 10, 2017 at 9:11 AM, Eneko Lacunza wrote:
> Hi Martin,
>
> Take a look at
> http://ceph.com/pgcalc/
As a rough guide, use the "RBD" example to work out how many PGs your
CephFS data pool should have.
The metadata pool can almost certainly have far fewer, maybe
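Plugging Martin's numbers into the pgcalc rule of thumb (4 nodes x 10 disks =
40 OSDs, assuming size 3 and the usual target of ~100 PGs per OSD):

(40 OSDs x 100) / 3 replicas ~= 1333  ->  nearest power of two: 1024 PGs

So, as an illustrative sketch (pool names and the metadata PG count are
assumptions, not from the thread):

$ ceph osd pool create cephfs_data 1024 1024
$ ceph osd pool create cephfs_metadata 128 128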
To me it looks like someone may have done an "rm" on these OSDs but
not removed them from the crushmap. This does not happen
automatically.
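For reference, a full removal normally takes all three steps (the osd id is
illustrative); skipping the crush step leaves exactly these stale entries:

$ ceph osd crush remove osd.12
$ ceph auth del osd.12
$ ceph osd rm osd.12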
Do these OSDs show up in "ceph osd tree" and "ceph osd dump" ? If so,
paste the output.
Without knowing what exactly happened here it may be difficult to
Hi Martin,
Take a look at
http://ceph.com/pgcalc/
Cheers
Eneko
On 10/03/17 at 09:54, Martin Wittwer wrote:
Hi List
I am creating a POC cluster with CephFS as a backend for our backup
infrastructure. The backups are rsyncs of whole servers.
I have 4 OSD nodes with 10 4TB disks and 2
Hi List
I am creating a POC cluster with CephFS as a backend for our backup
infrastructure. The backups are rsyncs of whole servers.
I have 4 OSD nodes with 10 4TB disks and 2 SSDs for journaling per node.
My question now is: how do I calculate the PG count for that scenario? Is
there a way to
Hello,
I was informed that due to a networking issue the ceph cluster network was
affected. There was a huge packet loss, and network interfaces were flipping.
That's all I got.
The outage lasted for quite a long time, so I assume that some OSDs may have
been considered dead and the
Hello,
Last night, same effect, a new freeze... BUT, this morning I maybe found out
why!
A stupid boy added "vm.vfs_cache_pressure=1" for tuning and forgot to remove
it afterwards on the first OSD node... bad boy :)
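If anyone wants to check for the same mistake (100 is the kernel default):

$ sysctl vm.vfs_cache_pressure               # show the current value
$ sudo sysctl -w vm.vfs_cache_pressure=100   # restore the default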
There is always an explanation. It could not be otherwise.
This was maybe working fine before