Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?

2018-09-23 Thread Nicolas Huillard
On Sunday, 23 September 2018 at 20:28 +0200, mj wrote:
> XFS has *always* treated us nicely, and we have been using it for a
> VERY long time, ever since the pre-2000 suse 5.2 days on pretty much
> all our machines.
> 
> We have seen only very few corruptions on xfs, and the few times we
> tried btrfs, (almost) always 'something' happened. (same for the few
> times we tried reiserfs, btw)
> 
> So, while my story may be very anecdotal (and you will probably find
> many others here claiming the opposite), our own conclusion is very
> clear: we love xfs, and do not like btrfs very much.

Thanks for your anecdote ;-)
Could it be that I stack too many things (XFS in LVM in md-RAID in an
SSD's FTL)?

-- 
Nicolas Huillard


Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?

2018-09-23 Thread Nicolas Huillard
On Sunday, 23 September 2018 at 17:49 -0700, solarflow99 wrote:
> Yeah, sadly it looks like btrfs will never materialize as the
> filesystem of the future. Red Hat, for example, has even dropped it
> from its roadmap, as others probably will (and have) too.

Too bad, since this FS has a lot of very promising features. I view it
as the single-host Ceph-like FS, and do not see any equivalent (apart
from ZFS, which will also never be included in the kernel).

-- 
Nicolas Huillard


[ceph-users] Filter out RGW keep-alive HTTP and usage log

2018-09-23 Thread Nhat Ngo
Hi list,


We have two Luminous RGWs running behind an F5 load balancer. Every couple of
seconds the F5 sends a keep-alive request to the RGWs and saturates the
Civetweb log with HTTP entries, making it very difficult to troubleshoot user
connections. Example:

172.16.212.86 - - [24/Sep/2018:11:58:52 +1000] "GET / HTTP/1.1" 200 0 - -
172.16.212.85 - - [24/Sep/2018:11:58:55 +1000] "GET / HTTP/1.1" 200 0 - -
172.16.212.86 - - [24/Sep/2018:11:58:57 +1000] "GET / HTTP/1.1" 200 0 - -
172.16.212.85 - - [24/Sep/2018:11:59:00 +1000] "GET / HTTP/1.1" 200 0 - -
172.16.212.86 - - [24/Sep/2018:11:59:03 +1000] "GET / HTTP/1.1" 200 0 - -

Is there a way to stop RGW from logging entries coming from these specific
monitoring IPs?

In addition, when usage logging is enabled, the stats from `usage show` also
record the `anonymous` user:

$ radosgw-admin usage show
{
    "entries": [
        {
            "user": "anonymous",
            "buckets": [
                {
                    "bucket": "",
                    "time": "2018-09-24 00:00:00.00Z",
                    "epoch": 1537747200,
                    "owner": "anonymous",
                    "categories": [
                        {
                            "category": "list_buckets",
                            "bytes_sent": 716686,
                            "bytes_received": 0,
                            "ops": 3349,
                            "successful_ops": 3349
                        }
                    ]
                }
            ]
        }
    ],
    "summary": [
        {
            "user": "anonymous",
            "categories": [
                {
                    "category": "list_buckets",
                    "bytes_sent": 716686,
                    "bytes_received": 0,
                    "ops": 3349,
                    "successful_ops": 3349
                }
            ],
            "total": {
                "bytes_sent": 716686,
                "bytes_received": 0,
                "ops": 3349,
                "successful_ops": 3349
            }
        }
    ]
}

Is there a config setting I can specify to disable usage stats for
`anonymous`? I also couldn't use `usage trim --uid=anonymous` because that
user does not exist in the system.
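
A minimal sketch (not from the original mail, and assuming the JSON layout
shown above) of post-processing the `radosgw-admin usage show` output to drop
the `anonymous` entries before reporting:

import json
import subprocess

# Run the command shown above and parse its JSON output.
raw = subprocess.check_output(["radosgw-admin", "usage", "show"])
usage = json.loads(raw)

# Drop the synthetic "anonymous" user from both sections of the report.
usage["entries"] = [e for e in usage.get("entries", [])
                    if e.get("user") != "anonymous"]
usage["summary"] = [s for s in usage.get("summary", [])
                    if s.get("user") != "anonymous"]

print(json.dumps(usage, indent=4))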


Best regards,

Nhat Ngo | DevOps Engineer

University of Melbourne, VIC


Re: [ceph-users] data-pool option for qemu-img / ec pool

2018-09-23 Thread Konstantin Shalygin

> Is it possible to set data-pool for ec-pools on qemu-img?
> For repl-pools I used "qemu-img convert" to convert from e.g. vmdk to raw
> and write to rbd/ceph directly.
>
> The rbd utility is able to do this for raw or empty images, but not with
> conversion (converting 800G and writing it again would take at least twice
> the time).
>
> Am I missing a parameter for qemu-kvm?


Just in case:
You don't need to set anything in ceph.conf for the data pool if your rbd
image is already created.
A Luminous+ client (librbd) will find the data pool automatically when it
connects to the rbd image and fetches its headers.
The qemu-img command below works with a pre-created rbd image, i.e. the image
must already exist in the pool; only in that case will qemu-img start the
conversion.


[vdsm@rhev-host ~]# qemu-img convert -m 16 -W -p -n -f raw -O raw 
 
rbd:/volume-:id=cinder:key=:mon_host=172.16.16.2,172.16.16.3,172.16.16.4
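
For completeness, a minimal sketch (not from the original mail) of
pre-creating such an image from Python, assuming the python-rbd bindings'
data_pool keyword (Luminous+); 'rbd', 'ecpool' and 'volume-test' are
placeholder pool/image names:

import rados
import rbd

# Connect using the local ceph.conf and open the replicated pool that will
# hold the image header/metadata.
cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("rbd")  # placeholder replicated pool name

# Create an 800 GiB image whose data objects land in the EC pool, so a later
# `qemu-img convert -n ...` against this image can write into it.
rbd.RBD().create(ioctx, "volume-test", 800 * 1024 ** 3, data_pool="ecpool")

ioctx.close()
cluster.shutdown()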



Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?

2018-09-23 Thread solarflow99
Yeah, sadly it looks like btrfs will never materialize as the filesystem
of the future. Red Hat, for example, has even dropped it from its roadmap,
as others probably will (and have) too.


On Sun, Sep 23, 2018 at 11:28 AM mj  wrote:

> Hi,
>
> Just a very quick and simple reply:
>
> XFS has *always* treated us nicely, and we have been using it for a VERY
> long time, ever since the pre-2000 suse 5.2 days on pretty much all our
> machines.
>
> We have seen only very few corruptions on xfs, and the few times we
> tried btrfs, (almost) always 'something' happened. (same for the few
> times we tried reiserfs, btw)
>
> So, while my story may be very anecdotal (and you will probably find
> many others here claiming the opposite), our own conclusion is very
> clear: we love xfs, and do not like btrfs very much.
>
> MJ
>
> On 09/22/2018 10:58 AM, Nicolas Huillard wrote:
> > Hi all,
> >
> > I don't have a good track record with XFS since I got rid of ReiserFS a
> > long time ago. I decided XFS was a good idea on servers, while I tested
> > BTRFS on various less important devices.
> > So far, XFS betrayed me far more often (a few times) than BTRFS
> > (never).
> > Last time was yesterday, on a root filesystem with "Block out of range:
> > block 0x17b9814b0, EOFS 0x12a000" "I/O Error Detected. Shutting down
> > filesystem" (shutting down the root filesystem is pretty hard).
> >
> > Some threads on this ML discuss a similar problem, related to
> > partitioning and logical sectors located just after the end of the
> > partition. The problem here does not seem to be the same, as the
> > requested block is very far out of bound (2 orders of magnitude too
> > far), and I use a recent Debian stock kernel with every security patch.
> >
> > My question is: should I trust XFS for small root filesystems (/,
> > /tmp, /var on LVM sitting within a smallish md-RAID1 partition), or is
> > BTRFS finally trusty enough for a general-purpose cluster (still root
> > et al. filesystems), or do you guys just use the distro-recommended
> > setup (typically Ext4 on plain disks)?
> >
> > Debian stretch with 4.9.110-3+deb9u4 kernel.
> > Ceph 12.2.8 on bluestore (not related to the question).
> >
> > Partial output of lsblk /dev/sdc /dev/nvme0n1:
> > NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> > sdc                         8:32   0 447,1G  0 disk
> > ├─sdc1                      8:33   0  55,9G  0 part
> > │ └─md0                     9:0    0  55,9G  0 raid1
> > │   ├─oxygene_system-root 253:4    0   9,3G  0 lvm   /
> > │   ├─oxygene_system-tmp  253:5    0   9,3G  0 lvm   /tmp
> > │   └─oxygene_system-var  253:6    0   4,7G  0 lvm   /var
> > └─sdc2                      8:34   0  29,8G  0 part  [SWAP]
> > nvme0n1                   259:0    0   477G  0 disk
> > ├─nvme0n1p1               259:1    0  55,9G  0 part
> > │ └─md0                     9:0    0  55,9G  0 raid1
> > │   ├─oxygene_system-root 253:4    0   9,3G  0 lvm   /
> > │   ├─oxygene_system-tmp  253:5    0   9,3G  0 lvm   /tmp
> > │   └─oxygene_system-var  253:6    0   4,7G  0 lvm   /var
> > ├─nvme0n1p2               259:2    0  29,8G  0 part  [SWAP]
> >
> > TIA !
> >


Re: [ceph-users] [slightly OT] XFS vs. BTRFS vs. others as root/usr/var/tmp filesystems ?

2018-09-23 Thread mj

Hi,

Just a very quick and simple reply:

XFS has *always* treated us nicely, and we have been using it for a VERY 
long time, ever since the pre-2000 suse 5.2 days on pretty much all our 
machines.


We have seen only very few corruptions on xfs, and the few times we 
tried btrfs, (almost) always 'something' happened. (same for the few 
times we tried reiserfs, btw)


So, while my story may be very anecdotal (and you will probably find
many others here claiming the opposite), our own conclusion is very
clear: we love xfs, and do not like btrfs very much.


MJ

On 09/22/2018 10:58 AM, Nicolas Huillard wrote:

Hi all,

I don't have a good track record with XFS since I got rid of ReiserFS a
long time ago. I decided XFS was a good idea on servers, while I tested
BTRFS on various less important devices.
So far, XFS betrayed me far more often (a few times) than BTRFS
(never).
Last time was yesterday, on a root filesystem with "Block out of range:
block 0x17b9814b0, EOFS 0x12a000" "I/O Error Detected. Shutting down
filesystem" (shutting down the root filesystem is pretty hard).

Some threads on this ML discuss a similar problem, related to
partitioning and logical sectors located just after the end of the
partition. The problem here does not seem to be the same, as the
requested block is very far out of bound (2 orders of magnitude too
far), and I use a recent Debian stock kernel with every security patch.

My question is: should I trust XFS for small root filesystems (/,
/tmp, /var on LVM sitting within a smallish md-RAID1 partition), or is
BTRFS finally trusty enough for a general-purpose cluster (still root
et al. filesystems), or do you guys just use the distro-recommended
setup (typically Ext4 on plain disks)?

Debian stretch with 4.9.110-3+deb9u4 kernel.
Ceph 12.2.8 on bluestore (not related to the question).

Partial output of lsblk /dev/sdc /dev/nvme0n1:
NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sdc                         8:32   0 447,1G  0 disk
├─sdc1                      8:33   0  55,9G  0 part
│ └─md0                     9:0    0  55,9G  0 raid1
│   ├─oxygene_system-root 253:4    0   9,3G  0 lvm   /
│   ├─oxygene_system-tmp  253:5    0   9,3G  0 lvm   /tmp
│   └─oxygene_system-var  253:6    0   4,7G  0 lvm   /var
└─sdc2                      8:34   0  29,8G  0 part  [SWAP]
nvme0n1                   259:0    0   477G  0 disk
├─nvme0n1p1               259:1    0  55,9G  0 part
│ └─md0                     9:0    0  55,9G  0 raid1
│   ├─oxygene_system-root 253:4    0   9,3G  0 lvm   /
│   ├─oxygene_system-tmp  253:5    0   9,3G  0 lvm   /tmp
│   └─oxygene_system-var  253:6    0   4,7G  0 lvm   /var
├─nvme0n1p2               259:2    0  29,8G  0 part  [SWAP]

TIA !




Re: [ceph-users] radosgw rest API to retrive rgw log entries

2018-09-23 Thread Robin H. Johnson
On Fri, Sep 21, 2018 at 04:17:35PM -0400, Jin Mao wrote:
> I am looking for an API equivalent of 'radosgw-admin log list' and
> 'radosgw-admin log show'. Existing /usage API only reports bucket level
> numbers like 'radosgw-admin usage show' does. Does anyone know if this is
> possible from rest API?
/admin/log is the endpoint you want.
params:
REQUIRED: type=(metadata|bucket-index|data)

The API is a little inconsistent:
metadata & data default to a global info operation, and need an 'id'
argument for listing (also, if both 'info' & 'id' are passed, you get
ShardInfo).
bucket-index defaults to listing, but responds to the 'info' argument
with an info response.

All types support the status argument as well.

The complete list of /admin/ resources as of Luminous:
/admin/usage
/admin/user
/admin/bucket
/admin/metadata
/admin/log
/admin/opstat
/admin/replica_log
/admin/config
/admin/realm
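
A minimal sketch (not from the original mail) of hitting that endpoint from
Python, assuming an admin user whose S3 keys carry the required admin caps,
the requests and requests-aws4auth packages, and a gateway that accepts
SigV4-signed admin requests; the endpoint and keys are placeholders:

import requests
from requests_aws4auth import AWS4Auth

RGW = "http://rgw.example.com:7480"  # placeholder endpoint
auth = AWS4Auth("ADMIN_ACCESS_KEY", "ADMIN_SECRET_KEY", "us-east-1", "s3")

# Global info for the data log (no 'id' passed -> info operation, as above).
r = requests.get(RGW + "/admin/log",
                 params={"type": "data", "format": "json"}, auth=auth)
print(r.status_code, r.json())

# Listing a single shard of the data log by passing an 'id'.
r = requests.get(RGW + "/admin/log",
                 params={"type": "data", "id": 0, "format": "json"}, auth=auth)
print(r.status_code, r.json())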

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136




Re: [ceph-users] data-pool option for qemu-img / ec pool

2018-09-23 Thread Kevin Olbrich
Hi Paul,

thanks for the hint, I just checked and it works perfectly.

I found this guide:
https://www.reddit.com/r/ceph/comments/72yc9m/ceph_openstack_with_ec/

This works well with one meta/data setup, but not with multiple (like
device-class-based pools).

The link above uses client auth; is there a better way?

Kevin

On Sun, 23 Sep 2018 at 18:08, Paul Emmerich wrote:
>
> The usual trick for clients not supporting this natively is the option
> "rbd_default_data_pool" in ceph.conf which should also work here.
>
>
>   Paul
> On Sun, 23 Sep 2018 at 18:03, Kevin Olbrich wrote:
> >
> > Hi!
> >
> > Is it possible to set data-pool for ec-pools on qemu-img?
> > For repl-pools I used "qemu-img convert" to convert from e.g. vmdk to raw 
> > and write to rbd/ceph directly.
> >
> > The rbd utility is able to do this for raw or empty images, but not with
> > conversion (converting 800G and writing it again would take at least twice
> > the time).
> >
> > Am I missing a parameter for qemu-kvm?
> >
> > Kind regards
> > Kevin
>
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90


Re: [ceph-users] data-pool option for qemu-img / ec pool

2018-09-23 Thread Paul Emmerich
The usual trick for clients not supporting this natively is the option
"rbd_default_data_pool" in ceph.conf which should also work here.


  Paul
On Sun, 23 Sep 2018 at 18:03, Kevin Olbrich wrote:
>
> Hi!
>
> Is it possible to set data-pool for ec-pools on qemu-img?
> For repl-pools I used "qemu-img convert" to convert from e.g. vmdk to raw and 
> write to rbd/ceph directly.
>
> The rbd utility is able to do this for raw or empty images, but not with
> conversion (converting 800G and writing it again would take at least twice
> the time).
>
> Am I missing a parameter for qemu-kvm?
>
> Kind regards
> Kevin



-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


[ceph-users] data-pool option for qemu-img / ec pool

2018-09-23 Thread Kevin Olbrich
Hi!

Is it possible to set data-pool for ec-pools on qemu-img?
For repl-pools I used "qemu-img convert" to convert from e.g. vmdk to raw
and write to rbd/ceph directly.

The rbd utility is able to do this for raw or empty images, but not with
conversion (converting 800G and writing it again would take at least twice
the time).

Am I missing a parameter for qemu-kvm?

Kind regards
Kevin


Re: [ceph-users] BlueStore checksums all data written to disk! so, can we use two copies in the production?

2018-09-23 Thread Paul Emmerich
Short answer: no and no.

Long:

1. Having size = 2 is safe *if you also keep min_size at 2*. But
that's not highly available, so you usually don't want this. min_size =
1 (or reducing min_size on an EC pool) is basically a guarantee of
losing at least some data/writes in the long run.

2. It's no longer as important as it used to be, yes. We typically
increase the interval to a month instead of the default week with
bluestore.
But a properly tuned scrubbing configuration has a negligible overhead,
and it can give you some confidence about the integrity of data that
is rarely accessed.
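
For illustration, a minimal ceph.conf sketch of stretching the deep-scrub
interval to roughly a month (the value is in seconds; the default is the one
week mentioned above):

[osd]
# ~30 days between deep scrubs instead of the default 7 days
osd deep scrub interval = 2592000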


Paul

On Sun, 23 Sep 2018 at 08:49, jython.li wrote:
>
> "when using BlueStore, Ceph can ensure data integrity by conducting a 
> cyclical redundancy check (CRC) on write operations; then, store the CRC 
> value in the block database. On read operations, Ceph can retrieve the CRC 
> value from the block database and compare it with the generated CRC of the 
> retrieved data to ensure data integrity instantly"
> from 
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/architecture_guide/#ensuring_data_integrity
>
> As mentioned above, BlueStore calculates, stores, and verifies checksums for
> all data and metadata it stores. My questions are:
> 1. With bluestore, is it safe enough to use two copies in a production
> environment?
>
> With filestore there is no CRC, so if you use two copies there can be a
> split-brain situation. For example, when ceph detects that the data in the
> two copies is inconsistent, it does not know which copy's data is correct.
> But with BlueStore all written data has a CRC, so even if there is an
> inconsistency between the two copies, ceph will know which copy is correct
> (using the CRC value saved in the block database).
>
>
> 2. Similarly, can we turn off deep-scrub at this time?
>
>
>
>



-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90