Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?
On Sunday, 23 September 2018 at 20:28 +0200, mj wrote:
> XFS has *always* treated us nicely, and we have been using it for a VERY
> long time, ever since the pre-2000 suse 5.2 days on pretty much all our
> machines.
>
> We have seen only very few corruptions on xfs, and the few times we
> tried btrfs, (almost) always 'something' happened. (same for the few
> times we tried reiserfs, btw)
>
> So, while my story may be very anecdotical (and you will probably find
> many others here claiming the opposite) our own conclusion is very
> clear: we love xfs, and do not like btrfs very much.

Thanks for your anecdote ;-)

Could it be that I stack too many things (XFS in LVM in md-RAID in the SSD's FTL)?

--
Nicolas Huillard

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?
On Sunday, 23 September 2018 at 17:49 -0700, solarflow99 wrote:
> ya, sadly it looks like btrfs will never materialize as the next
> filesystem of the future. Redhat as an example even dropped it from its
> future, as others probably will and have too.

Too bad, since this FS has a lot of very promising features. I view it as
the single-host-ceph-like FS, and do not see any equivalent (apart from
ZFS, which will also never be included in the kernel).

--
Nicolas Huillard
[ceph-users] Filter out RGW keep-alive HTTP and usage log
Hi list,

We have two Luminous RGWs running behind an F5 load balancer. Every couple
of seconds the F5 sends a keep-alive request to the RGWs and saturates the
Civetweb log with HTTP entries, making it very difficult to troubleshoot
user connections. Example:

172.16.212.86 - - [24/Sep/2018:11:58:52 +1000] "GET / HTTP/1.1" 200 0 - -
172.16.212.85 - - [24/Sep/2018:11:58:55 +1000] "GET / HTTP/1.1" 200 0 - -
172.16.212.86 - - [24/Sep/2018:11:58:57 +1000] "GET / HTTP/1.1" 200 0 - -
172.16.212.85 - - [24/Sep/2018:11:59:00 +1000] "GET / HTTP/1.1" 200 0 - -
172.16.212.86 - - [24/Sep/2018:11:59:03 +1000] "GET / HTTP/1.1" 200 0 - -

Is there a way to stop RGW from logging entries from these specific
monitoring IPs?

In addition, with the usage log enabled, `radosgw-admin usage show` also
records the `anonymous` user:

$ radosgw-admin usage show
{
    "entries": [
        {
            "user": "anonymous",
            "buckets": [
                {
                    "bucket": "",
                    "time": "2018-09-24 00:00:00.00Z",
                    "epoch": 1537747200,
                    "owner": "anonymous",
                    "categories": [
                        {
                            "category": "list_buckets",
                            "bytes_sent": 716686,
                            "bytes_received": 0,
                            "ops": 3349,
                            "successful_ops": 3349
                        }
                    ]
                }
            ]
        }
    ],
    "summary": [
        {
            "user": "anonymous",
            "categories": [
                {
                    "category": "list_buckets",
                    "bytes_sent": 716686,
                    "bytes_received": 0,
                    "ops": 3349,
                    "successful_ops": 3349
                }
            ],
            "total": {
                "bytes_sent": 716686,
                "bytes_received": 0,
                "ops": 3349,
                "successful_ops": 3349
            }
        }
    ]
}

Is there a config setting I can specify to disable usage stats for
`anonymous`? I also couldn't use `usage trim --uid=anonymous` because that
user does not exist in the system.

Best regards,
Nhat Ngo | DevOps Engineer
University of Melbourne, VIC
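I'm not aware of a Civetweb option to exclude specific client IPs, so until one exists, a stopgap is to post-filter the access log by source IP. A minimal sketch (the two addresses are the F5 self-IPs from the example above; adapt to your own):

```python
# Load-balancer self-IPs whose keep-alive probes we want to drop
# (taken from the example log above).
HEALTHCHECK_IPS = frozenset({"172.16.212.85", "172.16.212.86"})

def drop_healthcheck_lines(lines):
    """Yield only civetweb access-log lines whose source IP (the first
    whitespace-separated field) is not a health-check address."""
    for line in lines:
        if line.split(" ", 1)[0] not in HEALTHCHECK_IPS:
            yield line

sample = [
    '172.16.212.86 - - [24/Sep/2018:11:58:52 +1000] "GET / HTTP/1.1" 200 0 - -',
    '192.0.2.10 - - [24/Sep/2018:11:58:53 +1000] "GET /mybucket HTTP/1.1" 200 512 - -',
]
kept = list(drop_healthcheck_lines(sample))
print(kept)
```

The same filter could run as a pipe in front of whatever ships the logs, so the keep-alive noise never reaches the troubleshooting view.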
Re: [ceph-users] data-pool option for qemu-img / ec pool
> Is it possible to set data-pool for ec-pools on qemu-img?
> For repl-pools I used "qemu-img convert" to convert from e.g. vmdk to raw
> and write to rbd/ceph directly.
>
> The rbd utility is able to do this for raw or empty images but without
> convert (converting 800G and writing it again would now take at least
> twice the time).
>
> Do I miss a parameter for qemu-kvm?

Just in case: you don't need to set up ceph.conf for the data-pool to work
if your rbd image is already created. A Luminous+ client (librbd) will find
the data-pool automatically when it connects to the rbd image and fetches
its headers.

This qemu-img command works with a pre-created rbd image, i.e. the image
must already exist in the pool; only then will qemu-img start the convert:

[vdsm@rhev-host ~]# qemu-img convert -m 16 -W -p -n -f raw -O raw rbd:/volume-:id=cinder:key=:mon_host=172.16.16.2,172.16.16.3,172.16.16.4
Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?
ya, sadly it looks like btrfs will never materialize as the next filesystem
of the future. Redhat as an example even dropped it from its future, as
others probably will and have too.

On Sun, Sep 23, 2018 at 11:28 AM mj wrote:
> Hi,
>
> Just a very quick and simple reply:
>
> XFS has *always* treated us nicely, and we have been using it for a VERY
> long time, ever since the pre-2000 suse 5.2 days on pretty much all our
> machines.
>
> We have seen only very few corruptions on xfs, and the few times we
> tried btrfs, (almost) always 'something' happened. (same for the few
> times we tried reiserfs, btw)
>
> So, while my story may be very anecdotical (and you will probably find
> many others here claiming the opposite) our own conclusion is very
> clear: we love xfs, and do not like btrfs very much.
>
> MJ
>
> On 09/22/2018 10:58 AM, Nicolas Huillard wrote:
> > Hi all,
> >
> > I don't have a good track record with XFS since I got rid of ReiserFS a
> > long time ago. I decided XFS was a good idea on servers, while I tested
> > BTRFS on various less important devices.
> > So far, XFS betrayed me far more often (a few times) than BTRFS (never).
> > Last time was yesterday, on a root filesystem with "Block out of range:
> > block 0x17b9814b0, EOFS 0x12a000" "I/O Error Detected. Shutting down
> > filesystem" (shutting down the root filesystem is pretty hard).
> >
> > Some threads on this ML discuss a similar problem, related to
> > partitioning and logical sectors located just after the end of the
> > partition. The problem here does not seem to be the same, as the
> > requested block is very far out of bound (2 orders of magnitude too
> > far), and I use a recent Debian stock kernel with every security patch.
> >
> > My question is: should I trust XFS for small root filesystems (/, /tmp,
> > /var on LVM sitting within an md-RAID1 smallish partition), or is BTRFS
> > finally trusty enough for a general-purpose cluster (still root et al.
> > filesystems), or do you guys just use the distro-recommended setup
> > (typically Ext4 on plain disks)?
> >
> > Debian stretch with 4.9.110-3+deb9u4 kernel.
> > Ceph 12.2.8 on bluestore (not related to the question).
> >
> > Partial output of lsblk /dev/sdc /dev/nvme0n1:
> > NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> > sdc                         8:32   0 447,1G  0 disk
> > ├─sdc1                      8:33   0  55,9G  0 part
> > │ └─md0                     9:0    0  55,9G  0 raid1
> > │   ├─oxygene_system-root 253:4    0   9,3G  0 lvm   /
> > │   ├─oxygene_system-tmp  253:5    0   9,3G  0 lvm   /tmp
> > │   └─oxygene_system-var  253:6    0   4,7G  0 lvm   /var
> > └─sdc2                      8:34   0  29,8G  0 part  [SWAP]
> > nvme0n1                   259:0    0   477G  0 disk
> > ├─nvme0n1p1               259:1    0  55,9G  0 part
> > │ └─md0                     9:0    0  55,9G  0 raid1
> > │   ├─oxygene_system-root 253:4    0   9,3G  0 lvm   /
> > │   ├─oxygene_system-tmp  253:5    0   9,3G  0 lvm   /tmp
> > │   └─oxygene_system-var  253:6    0   4,7G  0 lvm   /var
> > ├─nvme0n1p2               259:2    0  29,8G  0 part  [SWAP]
> >
> > TIA !
Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?
Hi,

Just a very quick and simple reply:

XFS has *always* treated us nicely, and we have been using it for a VERY
long time, ever since the pre-2000 suse 5.2 days on pretty much all our
machines.

We have seen only very few corruptions on xfs, and the few times we tried
btrfs, (almost) always 'something' happened. (same for the few times we
tried reiserfs, btw)

So, while my story may be very anecdotal (and you will probably find many
others here claiming the opposite) our own conclusion is very clear: we
love xfs, and do not like btrfs very much.

MJ

On 09/22/2018 10:58 AM, Nicolas Huillard wrote:
> Hi all,
>
> I don't have a good track record with XFS since I got rid of ReiserFS a
> long time ago. I decided XFS was a good idea on servers, while I tested
> BTRFS on various less important devices.
> So far, XFS betrayed me far more often (a few times) than BTRFS (never).
> Last time was yesterday, on a root filesystem with "Block out of range:
> block 0x17b9814b0, EOFS 0x12a000" "I/O Error Detected. Shutting down
> filesystem" (shutting down the root filesystem is pretty hard).
>
> Some threads on this ML discuss a similar problem, related to
> partitioning and logical sectors located just after the end of the
> partition. The problem here does not seem to be the same, as the
> requested block is very far out of bound (2 orders of magnitude too
> far), and I use a recent Debian stock kernel with every security patch.
>
> My question is: should I trust XFS for small root filesystems (/, /tmp,
> /var on LVM sitting within an md-RAID1 smallish partition), or is BTRFS
> finally trusty enough for a general-purpose cluster (still root et al.
> filesystems), or do you guys just use the distro-recommended setup
> (typically Ext4 on plain disks)?
>
> Debian stretch with 4.9.110-3+deb9u4 kernel.
> Ceph 12.2.8 on bluestore (not related to the question).
>
> Partial output of lsblk /dev/sdc /dev/nvme0n1:
> NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> sdc                         8:32   0 447,1G  0 disk
> ├─sdc1                      8:33   0  55,9G  0 part
> │ └─md0                     9:0    0  55,9G  0 raid1
> │   ├─oxygene_system-root 253:4    0   9,3G  0 lvm   /
> │   ├─oxygene_system-tmp  253:5    0   9,3G  0 lvm   /tmp
> │   └─oxygene_system-var  253:6    0   4,7G  0 lvm   /var
> └─sdc2                      8:34   0  29,8G  0 part  [SWAP]
> nvme0n1                   259:0    0   477G  0 disk
> ├─nvme0n1p1               259:1    0  55,9G  0 part
> │ └─md0                     9:0    0  55,9G  0 raid1
> │   ├─oxygene_system-root 253:4    0   9,3G  0 lvm   /
> │   ├─oxygene_system-tmp  253:5    0   9,3G  0 lvm   /tmp
> │   └─oxygene_system-var  253:6    0   4,7G  0 lvm   /var
> ├─nvme0n1p2               259:2    0  29,8G  0 part  [SWAP]
>
> TIA !
Re: [ceph-users] radosgw rest API to retrive rgw log entries
On Fri, Sep 21, 2018 at 04:17:35PM -0400, Jin Mao wrote:
> I am looking for an API equivalent of 'radosgw-admin log list' and
> 'radosgw-admin log show'. The existing /usage API only reports
> bucket-level numbers like 'radosgw-admin usage show' does. Does anyone
> know if this is possible from the REST API?

/admin/log is the endpoint you want.

Params:
REQUIRED: type=(metadata|bucket-index|data)

The API is a little inconsistent. metadata & data default to a global info
operation, and need an 'id' argument for listing (also, if both 'info' &
'id' are passed, you get ShardInfo). bucket-index defaults to listing, but
responds to the 'info' argument with an info response. All types support
the 'status' argument as well.

The complete list of /admin/ resources as of Luminous:
/admin/usage
/admin/user
/admin/bucket
/admin/metadata
/admin/log
/admin/opstat
/admin/replica_log
/admin/config
/admin/realm

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail: robb...@gentoo.org
GnuPG FP: 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP: 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
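For scripting against these endpoints: the admin API expects S3-style (AWS v2) signed requests from an RGW user holding the matching admin caps. A rough sketch of just the signature computation, with a made-up secret key; note that a plain query parameter like ?type=data is not part of the v2 string-to-sign, only the resource path is:

```python
import base64
import hashlib
import hmac
from email.utils import formatdate

def sign_v2(secret_key: str, method: str, resource: str, date: str) -> str:
    """AWS v2 signature: HMAC-SHA1 over method, (empty) Content-MD5,
    (empty) Content-Type, Date, and the canonicalized resource."""
    string_to_sign = f"{method}\n\n\n{date}\n{resource}"
    mac = hmac.new(secret_key.encode(), string_to_sign.encode(), hashlib.sha1)
    return base64.b64encode(mac.digest()).decode()

date = formatdate(usegmt=True)  # RFC 1123 date, e.g. "Fri, 21 Sep 2018 ... GMT"
signature = sign_v2("EXAMPLE_SECRET_KEY", "GET", "/admin/log", date)

# The request itself would then carry:
#   GET /admin/log?type=data HTTP/1.1
#   Date: <date>
#   Authorization: AWS <access_key>:<signature>
print(signature)
```

In practice a ready-made signing library is less error-prone than hand-rolling this, but the sketch shows what goes over the wire.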
Re: [ceph-users] data-pool option for qemu-img / ec pool
Hi Paul,

thanks for the hint, I just checked and it works perfectly.

I found this guide:
https://www.reddit.com/r/ceph/comments/72yc9m/ceph_openstack_with_ec/

This works well with a single meta/data setup, but not with multiple
(like device-class based pools). The link above uses client auth; is there
a better way?

Kevin

On Sun, 23 Sep 2018 at 18:08, Paul Emmerich wrote:
>
> The usual trick for clients not supporting this natively is the option
> "rbd_default_data_pool" in ceph.conf, which should also work here.
>
> Paul
>
> On Sun, 23 Sep 2018 at 18:03, Kevin Olbrich wrote:
> >
> > Hi!
> >
> > Is it possible to set data-pool for ec-pools on qemu-img?
> > For repl-pools I used "qemu-img convert" to convert from e.g. vmdk to
> > raw and write to rbd/ceph directly.
> >
> > The rbd utility is able to do this for raw or empty images but without
> > convert (converting 800G and writing it again would now take at least
> > twice the time).
> >
> > Do I miss a parameter for qemu-kvm?
> >
> > Kind regards
> > Kevin
Re: [ceph-users] data-pool option for qemu-img / ec pool
The usual trick for clients not supporting this natively is the option
"rbd_default_data_pool" in ceph.conf, which should also work here.

Paul

On Sun, 23 Sep 2018 at 18:03, Kevin Olbrich wrote:
>
> Hi!
>
> Is it possible to set data-pool for ec-pools on qemu-img?
> For repl-pools I used "qemu-img convert" to convert from e.g. vmdk to
> raw and write to rbd/ceph directly.
>
> The rbd utility is able to do this for raw or empty images but without
> convert (converting 800G and writing it again would now take at least
> twice the time).
>
> Do I miss a parameter for qemu-kvm?
>
> Kind regards
> Kevin

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
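For reference, the option goes in the ceph.conf used by the client running qemu-img; a sketch assuming an erasure-coded pool named `ecpool` (the pool name is an example):

```ini
[client]
# New RBD images place their data objects in this erasure-coded pool;
# the image header and other metadata stay in the replicated base pool
# named in the rbd: URL.
rbd default data pool = ecpool
```

With this set, images created by qemu-img (or any librbd client that doesn't expose a --data-pool flag itself) should pick up the data pool automatically.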
[ceph-users] data-pool option for qemu-img / ec pool
Hi!

Is it possible to set data-pool for ec-pools on qemu-img?
For repl-pools I used "qemu-img convert" to convert from e.g. vmdk to raw
and write to rbd/ceph directly.

The rbd utility is able to do this for raw or empty images but without
convert (converting 800G and writing it again would now take at least
twice the time).

Do I miss a parameter for qemu-kvm?

Kind regards
Kevin
Re: [ceph-users] BlueStore checksums all data written to disk! so, can we use two copies in the production?
Short answer: no and no.

Long answer:

1. Having size = 2 is safe *if you also keep min_size at 2*. But that's
not highly available, so you usually don't want this. min_size = 1 (or
reducing min_size on an EC pool) is basically a guarantee to lose at least
some data/writes in the long run.

2. Yes, it's no longer as important as it used to be. We typically
increase the interval to a month instead of the default week with
bluestore. But a properly tuned scrubbing configuration has a negligible
overhead, and it can give you some confidence about the integrity of data
that is rarely accessed.

Paul

On Sun, 23 Sep 2018 at 08:49, jython.li wrote:
>
> "when using BlueStore, Ceph can ensure data integrity by conducting a
> cyclical redundancy check (CRC) on write operations; then, store the CRC
> value in the block database. On read operations, Ceph can retrieve the
> CRC value from the block database and compare it with the generated CRC
> of the retrieved data to ensure data integrity instantly"
> from
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/architecture_guide/#ensuring_data_integrity
>
> As mentioned above, BlueStore calculates, stores, and verifies checksums
> for all data and metadata it stores. My questions are:
>
> 1. In BlueStore, is it safe enough to use two copies in the production
> environment?
>
> In filestore, there is no CRC, so with two copies you can get a split
> brain: when Ceph finds that the data between the two copies is
> inconsistent, it cannot know which copy's data is correct. But in
> BlueStore, all written data has a CRC, so even if the two copies are
> inconsistent, Ceph will know which copy is correct (using the CRC value
> saved in the block database).
>
> 2. Similarly, can we turn off deep-scrub at this time?

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
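The checksum-on-read idea behind question 1 can be shown with a toy model: when each stored copy carries its own checksum, the intact copy of an inconsistent pair is identifiable. (BlueStore actually uses crc32c per block; plain zlib crc32 here is just a stand-in.)

```python
import zlib

def write_replica(data: bytes) -> dict:
    """Store data together with its checksum, the way BlueStore
    records a per-block CRC at write time."""
    return {"data": data, "csum": zlib.crc32(data)}

def pick_correct(replicas: list) -> list:
    """Return only the replicas whose data still matches its stored
    checksum - what a checksumming store can do that filestore cannot."""
    return [r for r in replicas if zlib.crc32(r["data"]) == r["csum"]]

a = write_replica(b"hello world")
b = write_replica(b"hello world")
b["data"] = b"hello w0rld"      # simulate silent corruption of one copy

good = pick_correct([a, b])
print(len(good))                # only the intact replica passes the check
```

This is why the split-brain argument against size = 2 is weaker on BlueStore; it says nothing, of course, about the availability problem in point 1.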