Re: [ceph-users] cephfs compression?

2018-06-29 Thread Youzhong Yang
Thanks Richard. Yes, it seems working by perf dump:

osd.6
"bluestore_compressed":   62622444,
"bluestore_compressed_allocated": 186777600,
"bluestore_compressed_original":373555200,

It's very interesting that bluestore_compressed_allocated is approximately
50% of bluestore_compressed_original across all OSDs. Just curious: why?
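That ratio can be checked directly from the two counters; a shell sketch using the osd.6 values quoted above:

```shell
# Effective compression ratio = original bytes / allocated bytes
allocated=186777600
original=373555200
ratio=$(awk -v a="$allocated" -v o="$original" 'BEGIN { printf "%.2f", o / a }')
echo "ratio=$ratio"   # ratio=2.00, i.e. allocated is ~50% of original
```

One plausible (unverified) explanation for the suspiciously uniform 2:1: BlueStore rounds compressed blob allocations up to its allocation unit, which quantizes the allocated counter and can pin the apparent ratio at a round fraction regardless of how well the data actually compresses.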

On Fri, Jun 29, 2018 at 1:15 AM, Richard Bade  wrote:

> Oh, also because the compression is at the osd level you don't see it
> in ceph df. You just see that your RAW is not increasing as much as
> you'd expect. E.g.
> $ sudo ceph df
> GLOBAL:
> SIZE AVAIL RAW USED %RAW USED
> 785T  300T 485T 61.73
> POOLS:
> NAMEID USED %USED MAX AVAIL OBJECTS
> cephfs-metadata 11  185M     0    68692G       178
> cephfs-data     12  408T 75.26      134T 132641159
>
> You can see that we've used 408TB in the pool but only 485TB RAW,
> rather than the ~600TB RAW I'd expect for my k=4, m=2 pool settings.
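The ~600TB figure follows from the erasure-code overhead, roughly raw = used * (k + m) / k; a quick sketch with the numbers above:

```shell
# Expected RAW usage for a k=4, m=2 EC pool, before compression
used_tb=408
k=4
m=2
expected_raw_tb=$(( used_tb * (k + m) / k ))
echo "expected_raw_tb=$expected_raw_tb"   # expected_raw_tb=612
```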
> On Fri, 29 Jun 2018 at 17:08, Richard Bade  wrote:
> >
> > I'm using compression on a cephfs-data pool in luminous. I didn't do
> > anything special
> >
> > $ sudo ceph osd pool get cephfs-data all | grep ^compression
> > compression_mode: aggressive
> > compression_algorithm: zlib
> >
> > You can check how much compression you're getting on the OSDs:
> > $ for osd in `seq 0 11`; do echo osd.$osd; sudo ceph daemon osd.$osd
> > perf dump | grep 'bluestore_compressed'; done
> > osd.0
> > "bluestore_compressed": 686487948225,
> > "bluestore_compressed_allocated": 788659830784,
> > "bluestore_compressed_original": 1660064620544,
> > 
> > osd.11
> > "bluestore_compressed": 700999601387,
> > "bluestore_compressed_allocated": 808854355968,
> > "bluestore_compressed_original": 1752045551616,
> >
> > I can't say for mimic, but definitely for luminous v12.2.5 compression
> > is working well with mostly default options.
> >
> > -Rich
> >
> > > For RGW, compression works very well. We use rgw to store crash dumps;
> > > in most cases, the compression ratio is about 2.0 ~ 4.0.
> >
> > > I tried to enable compression for cephfs data pool:
> >
> > > # ceph osd pool get cephfs_data all | grep ^compression
> > > compression_mode: force
> > > compression_algorithm: lz4
> > > compression_required_ratio: 0.95
> > > compression_max_blob_size: 4194304
> > > compression_min_blob_size: 4096
> >
> > > (we built ceph packages and enabled lz4.)
> >
> > > It doesn't seem to work. I copied an 8.7GB folder to cephfs, and
> > > ceph df says it used 8.7GB:
> >
> > > root at ceph-admin:~# ceph df
> > > GLOBAL:
> > > SIZE   AVAIL  RAW USED %RAW USED
> > > 16 TiB 16 TiB  111 GiB  0.69
> > > POOLS:
> > > NAME            ID USED    %USED MAX AVAIL OBJECTS
> > > cephfs_data     1  8.7 GiB  0.17   5.0 TiB  360545
> > > cephfs_metadata 2  221 MiB     0   5.0 TiB   77707
> >
> > > I know this folder can be compressed to ~4.0GB under zfs lz4 compression.
> >
> > > Am I missing anything? How do I make cephfs compression work? Is there
> > > any trick?
> >
> > > By the way, I am evaluating ceph mimic v13.2.0.
> >
> > > Thanks in advance,
> > > --Youzhong
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs compression?

2018-06-28 Thread Youzhong Yang
For RGW, compression works very well. We use rgw to store crash dumps, in
most cases, the compression ratio is about 2.0 ~ 4.0.

I tried to enable compression for cephfs data pool:

# ceph osd pool get cephfs_data all | grep ^compression
compression_mode: force
compression_algorithm: lz4
compression_required_ratio: 0.95
compression_max_blob_size: 4194304
compression_min_blob_size: 4096

(we built ceph packages and enabled lz4.)
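For comparison with the setup in the reply thread above, per-pool compression is controlled by plain pool settings; a sketch (pool name from this thread, run with an admin keyring) switching to the mode and algorithm reported to work on Luminous:

```shell
# Match the settings Richard reports working
ceph osd pool set cephfs_data compression_mode aggressive
ceph osd pool set cephfs_data compression_algorithm zlib
ceph osd pool get cephfs_data all | grep ^compression
```

Also note from that reply: savings never appear in the pool's USED column of `ceph df`, only as lower RAW usage and in the per-OSD `bluestore_compressed_*` perf counters, so the 8.7GB figure below does not by itself prove compression is off.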

It doesn't seem to work. I copied a 8.7GB folder to cephfs, ceph df says it
used 8.7GB:

root@ceph-admin:~# ceph df
GLOBAL:
SIZE   AVAIL  RAW USED %RAW USED
16 TiB 16 TiB  111 GiB  0.69
POOLS:
NAMEID USED%USED MAX AVAIL OBJECTS
cephfs_data 1  8.7 GiB  0.17   5.0 TiB  360545
cephfs_metadata 2  221 MiB 0   5.0 TiB   77707

I know this folder can be compressed to ~4.0GB under zfs lz4 compression.

Am I missing anything? how to make cephfs compression work? is there any
trick?

By the way, I am evaluating ceph mimic v13.2.0.

Thanks in advance,
--Youzhong


Re: [ceph-users] How to make nfs v3 work? nfs-ganesha for cephfs

2018-06-27 Thread Youzhong Yang
Thank you Paul. mount_path_pseudo does the trick.

Now NFS v3 works on Linux, but on macOS the mount succeeds and the folder
comes up empty:

bat8485maci:~ root# mount -t nfs -o vers=3 ceph-admin:/ceph /mnt/ceph
bat8485maci:~ root# ls -l /mnt/ceph/
bat8485maci:~ root#

This is how it looks on Linux:

root@yyang-deb9-64:~# mount -t nfs ceph-admin:/ceph /mnt/ceph
mount.nfs: requested NFS version or transport protocol is not supported
root@yyang-deb9-64:~# mount -t nfs -o vers=3 ceph-admin:/ceph /mnt/ceph
root@yyang-deb9-64:~# ls -l /mnt/ceph/
total 2
drwxr-xr-x 3 root root  8566720930 Jun 27 08:32 Btnas1_stage
drwxr-xr-x 3 root root  2776623737 Jun 27 08:37 Btnas2_stage
drwxr-xr-x 6 batserve users 456266 Jun 25 21:30 include
drwxrwxrwx 2 root root   0 Jun 27 08:32 scratch

Is this a macOS NFS bug, or an nfs-ganesha issue? By the way,
my /etc/ganesha/ganesha.conf is as below:

EXPORT_DEFAULTS
{
SecType = sys;
Protocols = 3;
}
NFS_CORE_PARAM
{
Enable_NLM = false;
Enable_RQUOTA = false;
Protocols = 3;
mount_path_pseudo = true;
}
CACHEINODE {
Dir_Chunk = 0;
NParts = 1;
Cache_Size = 1;
}
EXPORT
{
Export_ID=100;
Protocols = 3;
Transports = TCP,UDP;
Path = /;
Pseudo = /ceph;
Tag = ceph;
Access_Type = RW;
Attr_Expiration_Time = 0;
Squash = None;
FSAL {
Name = CEPH;
}
}
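One way to narrow down the macOS empty-directory problem is to compare what both clients see at the protocol level; a hedged sketch of the usual checks (hostname as in the mount commands above):

```shell
# NFSv3 needs mountd and nfs registered with rpcbind on the server
rpcinfo -p ceph-admin | grep -E 'mountd|nfs'
# What the server claims to export to this client
showmount -e ceph-admin
```

If both look right, comparing a packet capture of the macOS MOUNT/READDIR calls against the Linux ones would show whether ganesha returns the entries and macOS drops them, or never receives them.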

Thanks,

--Youzhong

On Wed, Jun 27, 2018 at 2:55 AM, Paul Emmerich 
wrote:

> NFS3 does not use pseudo paths usually. You can enable
> the Mount_Path_Pseudo option in NFS_CORE_PARAM to
> enable usage of pseudo fsal for NFS3 clients. (Note that
> the NFS3 clients cannot mount the pseudo root itself, but
> only subdirectories due to limitations in the inode size)
>
>
> Paul
>
> 2018-06-26 18:13 GMT+02:00 Youzhong Yang :
>
>> NFS v4 works like a charm, no issue for Linux clients, but when trying to
>> mount on MAC OS X client, it doesn't work - likely due to 'mountd' not
>> registered in rpc by ganesha when it comes to v4.
>>
>> So I tried to set up v3, no luck:
>>
>> # mount -t nfs -o rw,noatime,vers=3 ceph-dev:/ceph /mnt/ceph
>> mount.nfs: access denied by server while mounting ceph-dev:/ceph
>>
>> /var/log/ganesha/ganesha.log says:
>>
>> mnt_Mnt :NFS3 :INFO :MOUNT: Export entry / does not support NFS v3 for
>> client :::172.21.24.38
>>
>> My /etc/ganesha/ganesha.conf looks like this:
>>
>> EXPORT
>> {
>> Export_ID=100;
>> Protocols = 3;
>> Transports = TCP;
>> Path = /ceph;
>> Tag = ceph;
>> Pseudo = /ceph;
>> Access_Type = RW;
>> Squash = None;
>> FSAL {
>> Name = CEPH;
>> }
>> }
>>
>> Any way to make it work? Thanks in advance.
>>
>> --Youzhong
>>
>>
>>
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>


[ceph-users] How to make nfs v3 work? nfs-ganesha for cephfs

2018-06-26 Thread Youzhong Yang
NFS v4 works like a charm, no issue for Linux clients, but when trying to
mount on MAC OS X client, it doesn't work - likely due to 'mountd' not
registered in rpc by ganesha when it comes to v4.

So I tried to set up v3, no luck:

# mount -t nfs -o rw,noatime,vers=3 ceph-dev:/ceph /mnt/ceph
mount.nfs: access denied by server while mounting ceph-dev:/ceph

/var/log/ganesha/ganesha.log says:

mnt_Mnt :NFS3 :INFO :MOUNT: Export entry / does not support NFS v3 for
client :::172.21.24.38

My /etc/ganesha/ganesha.conf looks like this:

EXPORT
{
Export_ID=100;
Protocols = 3;
Transports = TCP;
Path = /ceph;
Tag = ceph;
Pseudo = /ceph;
Access_Type = RW;
Squash = None;
FSAL {
Name = CEPH;
}
}

Any way to make it work? Thanks in advance.
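The follow-up thread earlier in this archive resolved this: v3 has to be enabled in NFS_CORE_PARAM, with pseudo paths mapped for v3 clients via mount_path_pseudo (and v3 clients can then only mount subdirectories of the pseudo root, not the root itself). The relevant fragment of the working config:

```
NFS_CORE_PARAM
{
Protocols = 3;
mount_path_pseudo = true;
}
```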

--Youzhong


Re: [ceph-users] fstrim issue in VM for cloned rbd image with fast-diff feature

2018-05-09 Thread Youzhong Yang
Thanks Jason.

Yes, my concern is that fstrim increases clones' disk usage. The VM didn't
use any additional space, but fstrim caused its disk usage (in ceph) to go
up significantly. With hundreds of VMs, this would soon become a space
problem.

If this is expected behavior, does it mean it's better to disable fast-diff
on the rbd image? I am fine with that.
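If the inflated accounting is the only problem, the feature can be dropped per image; a sketch using the image from the test below (`rbd du` then falls back to a slower exact scan):

```shell
# fast-diff can be disabled on its own; it depends on object-map,
# not the other way around
rbd feature disable vms/debian93.dsk fast-diff
rbd du vms/debian93.dsk   # exact but slower, as the warning below notes
```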

There is an ugly side to this discard/fstrim feature: at one point, while I
was running dd + rm + fstrim repeatedly, it corrupted my VM's root file
system. Sadly I couldn't reproduce it.

Thanks.

On Wed, May 9, 2018 at 11:52 AM, Jason Dillaman <jdill...@redhat.com> wrote:

> On Wed, May 9, 2018 at 11:39 AM, Youzhong Yang <youzh...@gmail.com> wrote:
> > This is what I did:
> >
> > # rbd import /var/tmp/debian93-raw.img images/debian93
> > # rbd info images/debian93
> > rbd image 'debian93':
> >  size 81920 MB in 20480 objects
> >  order 22 (4096 kB objects)
> >  block_name_prefix: rbd_data.384b74b0dc51
> >  format: 2
> >  features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
> >  flags:
> >  create_timestamp: Wed May  9 09:31:24 2018
> > # rbd snap create images/debian93@snap
> > # rbd snap protect images/debian93@snap
> > # rbd clone images/debian93@snap vms/debian93.dsk
> > # rbd du vms/debian93.dsk
> > NAME PROVISIONED USED
> > debian93.dsk  81920M 336M
> >
> > --- Inside the VM ---
> > # df -h /
> > Filesystem  Size  Used Avail Use% Mounted on
> > /dev/sda179G   10G   66G  14% /
> > # fstrim -v /
> > /: 36.6 GiB (39311650816 bytes) trimmed
> >
> > --- then rbd du reports ---
> > # rbd du vms/debian93.dsk
> > NAME PROVISIONED   USED
> > debian93.dsk  81920M 76028M
> >
> > === If I disable fast-diff feature from images/debian93: ===
> > # fstrim -v /
> > /: 41 GiB (44059172864 bytes) trimmed
> >
> > # rbd du vms/debian93.dsk
> > warning: fast-diff map is not enabled for debian93.dsk. operation may be
> > slow.
> > NAME PROVISIONED  USED
> > debian93.dsk  81920M 8612M
> >
> > === or just flatten vms/debian93.dsk without disabling fast-diff ===
> > # rbd du vms/debian93.dsk
> > NAME PROVISIONED   USED
> > debian93.dsk  81920M 11992M
> >
> > # fstrim -v /
> > /: 68.7 GiB (73710755840 bytes) trimmed
> >
> > # rbd du vms/debian93.dsk
> > NAME PROVISIONED   USED
> > debian93.dsk  81920M 12000M
> >
> > Testing environment:
> > Ceph: v12.2.5
> > OS: Ubuntu 18.04
> > QEMU: 2.11
> > libvirt: 4.0.0
> >
> > Is this a known issue? or is the above behavior expected?
>
> What's your concern? I didn't see you state any potential problem. Are
> you just concerned that "fstrim" appears to increase your clone's disk
> usage? If that's the case, it's expected since "fast-diff" only tracks
> the existence of objects (not the per-object usage) and since it's a
> cloned image, a discard op results in the creation of a zero-byte
> object to "hide" the associated extent within the parent image.
>
> > Thanks,
> >
> > --Youzhong
> >
> >
>
>
>
> --
> Jason
>


Re: [ceph-users] Ubuntu 17.10 or Debian 9.3 + Luminous = random OS hang ?

2018-01-21 Thread Youzhong Yang
Thanks. I applied the workaround to .vmx and rebooted all VMs. No more
freeze!
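For the archive, the per-VM workaround in that KB is, as far as I recall, a single .vmx option that turns off the vmxnet3 rev-3 features; treat the exact key as an assumption and verify it against KB 2151480 before use:

```
vmxnet3.rev.30 = "FALSE"
```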

On Sun, Jan 21, 2018 at 3:43 PM, Nick Fisk <n...@fisk.me.uk> wrote:

> How up to date is your VM environment? We saw something very similar last
> year with Linux VMs running newish kernels. It turns out newer kernels
> supported a new feature of the vmxnet3 adapters which had a bug in ESXi.
> The fix was released some time last year in ESXi 6.5 U1, or a workaround
> was to set an option in the VM config.
>
>
>
> https://kb.vmware.com/s/article/2151480
>
>
>
>
>
>
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of Youzhong Yang
> Sent: 21 January 2018 19:50
> To: Brad Hubbard <bhubb...@redhat.com>
> Cc: ceph-users <ceph-users@lists.ceph.com>
> Subject: Re: [ceph-users] Ubuntu 17.10 or Debian 9.3 + Luminous =
> random OS hang ?
>
>
>
> As someone suggested, I installed linux-generic-hwe-16.04 package on
> Ubuntu 16.04 to get kernel of 17.10, and then rebooted all VMs, here is
> what I observed:
>
> - ceph monitor node froze upon reboot, in another case froze after a few
> minutes
>
> - ceph OSD hosts easily froze
>
> - ceph admin node (which runs no ceph service but ceph-deploy) never
> freezes
>
> - ceph rgw nodes and ceph mgr so far so good
>
>
>
> Here are two images I captured:
>
>
>
> https://drive.google.com/file/d/11hMJwhCF6Tj8LD3nlpokG0CB_oZqI506/view?usp=sharing
>
> https://drive.google.com/file/d/1tzDQ3DYTnfDHh_hTQb0ISZZ4WZdRxHLv/view?usp=sharing
>
>
>
> Thanks.
>
>
>
> On Sat, Jan 20, 2018 at 7:03 PM, Brad Hubbard <bhubb...@redhat.com> wrote:
>
> On Fri, Jan 19, 2018 at 11:54 PM, Youzhong Yang <youzh...@gmail.com>
> wrote:
> > I don't think it's hardware issue. All the hosts are VMs. By the way,
> using
> > the same set of VMWare hypervisors, I switched back to Ubuntu 16.04 last
> > night, so far so good, no freeze.
>
> Too little information to make any sort of assessment I'm afraid but,
> at this stage, this doesn't sound like a ceph issue.
>
>
> >
> > On Fri, Jan 19, 2018 at 8:50 AM, Daniel Baumann <daniel.baum...@bfh.ch>
> > wrote:
> >>
> >> Hi,
> >>
> >> On 01/19/18 14:46, Youzhong Yang wrote:
> >> > Just wondering if anyone has seen the same issue, or it's just me.
> >>
> >> we're using debian with our own backported kernels and ceph, works rock
> >> solid.
> >>
> >> what you're describing sounds more like hardware issues to me. if you
> >> don't fully "trust"/have confidence in your hardware (and your logs
> >> don't reveal anything), I'd recommend running some burn-in tests
> >> (memtest, cpuburn, etc.) on them for 24 hours/machine to rule out
> >> cpu/ram/etc. issues.
> >>
> >> Regards,
> >> Daniel
> >
> >
> >
> >
>
>
> --
> Cheers,
> Brad
>
>
>


Re: [ceph-users] Ubuntu 17.10 or Debian 9.3 + Luminous = random OS hang ?

2018-01-21 Thread Youzhong Yang
As someone suggested, I installed linux-generic-hwe-16.04 package on Ubuntu
16.04 to get kernel of 17.10, and then rebooted all VMs, here is what I
observed:
- ceph monitor node froze upon reboot, in another case froze after a few
minutes
- ceph OSD hosts easily froze
- ceph admin node (which runs no ceph service but ceph-deploy) never freezes
- ceph rgw nodes and ceph mgr so far so good

Here are two images I captured:

https://drive.google.com/file/d/11hMJwhCF6Tj8LD3nlpokG0CB_oZqI506/view?usp=sharing
https://drive.google.com/file/d/1tzDQ3DYTnfDHh_hTQb0ISZZ4WZdRxHLv/view?usp=sharing

Thanks.

On Sat, Jan 20, 2018 at 7:03 PM, Brad Hubbard <bhubb...@redhat.com> wrote:

> On Fri, Jan 19, 2018 at 11:54 PM, Youzhong Yang <youzh...@gmail.com>
> wrote:
> > I don't think it's hardware issue. All the hosts are VMs. By the way,
> using
> > the same set of VMWare hypervisors, I switched back to Ubuntu 16.04 last
> > night, so far so good, no freeze.
>
> Too little information to make any sort of assessment I'm afraid but,
> at this stage, this doesn't sound like a ceph issue.
>
> >
> > On Fri, Jan 19, 2018 at 8:50 AM, Daniel Baumann <daniel.baum...@bfh.ch>
> > wrote:
> >>
> >> Hi,
> >>
> >> On 01/19/18 14:46, Youzhong Yang wrote:
> >> > Just wondering if anyone has seen the same issue, or it's just me.
> >>
> >> we're using debian with our own backported kernels and ceph, works rock
> >> solid.
> >>
> >> what you're describing sounds more like hardware issues to me. if you
> >> don't fully "trust"/have confidence in your hardware (and your logs
> >> don't reveal anything), I'd recommend running some burn-in tests
> >> (memtest, cpuburn, etc.) on them for 24 hours/machine to rule out
> >> cpu/ram/etc. issues.
> >>
> >> Regards,
> >> Daniel
> >
> >
> >
> >
>
>
>
> --
> Cheers,
> Brad
>


[ceph-users] RGW compression causing issue for ElasticSearch

2018-01-20 Thread Youzhong Yang
I enabled compression by a command like this:

radosgw-admin zone placement modify --rgw-zone=coredumps
--placement-id=default-placement --compression=zlib
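For completeness, a zone change like this normally only takes effect after the period is committed (when the zone belongs to a realm) and the gateways are restarted; a hedged sketch:

```shell
radosgw-admin period update --commit      # only needed for realm/multisite setups
sudo systemctl restart ceph-radosgw@rgw.$(hostname -s)
```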

Then once the object was uploaded, elasticsearch kept dumping the following
messages:

[2018-01-20T23:13:43,587][DEBUG][o.e.a.b.TransportShardBulkAction]
[uSX41lj] [rgw-us-natick-04b97c18][6] failed to execute bulk item (index)
BulkShardRequest [[rgw-us-natick-04b97c18][6]] containing [index
{[rgw-us-natick-04b97c18][object][880837de-f383-4d5c-aa6f-8080518ca8f0.264524.1:top123:null],
source[{"bucket":"buck","name":"top123","instance":"null","versioned_epoch":0,"owner":{"id":"yyang","display_name":"yyang"},"permissions":["yyang"],"meta":{"size":109064,"mtime":"2018-01-21T04:07:55.248Z","compression":"\u0001\u00012\u\u\u\u0004\u\u\uzlib\u0008�\u0001\u\u\u\u\u\u0001\u\u\u\u0001\u0001\u0018\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\u\uH�\u\u\u\u\u","content_type":"binary/octet-stream","etag":"d08b14ff8f3c6326191d934c5668b1a0","tail_tag":"880837de-f383-4d5c-aa6f-8080518ca8f0.264519.204"}}]}]
org.elasticsearch.index.mapper.MapperParsingException: failed to parse
[meta.compression]
at
org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:298)
~[elasticsearch-5.6.6.jar:5.6.6]

It seems the 'compression' field contains binary junk data, so
elasticsearch fails to parse the document.

Is there a quick fix for this issue? Thanks.


Re: [ceph-users] Ubuntu 17.10 or Debian 9.3 + Luminous = random OS hang ?

2018-01-19 Thread Youzhong Yang
I don't think it's hardware issue. All the hosts are VMs. By the way, using
the same set of VMWare hypervisors, I switched back to Ubuntu 16.04 last
night, so far so good, no freeze.

On Fri, Jan 19, 2018 at 8:50 AM, Daniel Baumann <daniel.baum...@bfh.ch>
wrote:

> Hi,
>
> On 01/19/18 14:46, Youzhong Yang wrote:
> > Just wondering if anyone has seen the same issue, or it's just me.
>
> we're using debian with our own backported kernels and ceph, works rock
> solid.
>
> what you're describing sounds more like hardware issues to me. if you
> don't fully "trust"/have confidence in your hardware (and your logs
> don't reveal anything), I'd recommend running some burn-in tests
> (memtest, cpuburn, etc.) on them for 24 hours/machine to rule out
> cpu/ram/etc. issues.
>
> Regards,
> Daniel
>


[ceph-users] Ubuntu 17.10 or Debian 9.3 + Luminous = random OS hang ?

2018-01-19 Thread Youzhong Yang
One month ago, when I first started evaluating ceph, I chose Debian 9.3 as
the operating system. I saw random OS hangs, so I gave up and switched to
Ubuntu 16.04. Everything works well on Ubuntu 16.04.

Yesterday I tried Ubuntu 17.10, and again I saw random OS hangs, no matter
whether the node runs mon, mgr, osd, or rgw. When a host hangs, the console
won't respond to keyboard input and the host is unreachable from the
network.

This is the OS vs kernel version list:
Ubuntu 16.04 -> kernel 4.4
Debian 9.3   -> kernel 4.9
Ubuntu 17.10 -> kernel 4.13

Just wondering if anyone has seen the same issue, or it's just me.


Re: [ceph-users] Luminous RGW Metadata Search

2018-01-16 Thread Youzhong Yang
My bad ... Once I sent config request to us-east-1 (the master zone), it
works, and 'obo mdsearch' against "us-east-es" zone works like a charm.

May I suggest that the following page be modified to reflect this
requirement so that someone else won't run into the same issue? I
understand it may sound obvious to experienced users ...

http://ceph.com/rgw/new-luminous-rgw-metadata-search/

Thanks a lot.


On Tue, Jan 16, 2018 at 3:59 PM, Yehuda Sadeh-Weinraub <yeh...@redhat.com>
wrote:

> On Tue, Jan 16, 2018 at 12:20 PM, Youzhong Yang <youzh...@gmail.com>
> wrote:
> > Hi Yehuda,
> >
> > I can use your tool obo to create a bucket, and upload a file to the
> object
> > store, but when I tried to run the following command, it failed:
> >
> > # obo mdsearch buck --config='x-amz-meta-foo; string, x-amz-meta-bar;
> > integer'
> > ERROR: {"status": 405, "resource": null, "message": "", "error_code":
> > "MethodNotAllowed", "reason": "Method Not Allowed"}
> >
> > How to make the method 'Allowed'?
>
>
> Which rgw are you sending this request to?
>
> >
> > Thanks in advance.
> >
> > On Fri, Jan 12, 2018 at 7:25 PM, Yehuda Sadeh-Weinraub <
> yeh...@redhat.com>
> > wrote:
> >>
> >> The errors you're seeing there don't look like related to
> >> elasticsearch. It's a generic radosgw related error that says that it
> >> failed to reach the rados (ceph) backend. You can try bumping up the
> >> messenger log (debug ms =1) and see if there's any hint in there.
> >>
> >> Yehuda
> >>
> >> On Fri, Jan 12, 2018 at 12:54 PM, Youzhong Yang <youzh...@gmail.com>
> >> wrote:
> >> > So I did the exact same thing using Kraken and the same set of VMs, no
> >> > issue. What is the magic to make it work in Luminous? Anyone lucky
> >> > enough to
> >> > have this RGW ElasticSearch working using Luminous?
> >> >
> >> > On Mon, Jan 8, 2018 at 10:26 AM, Youzhong Yang <youzh...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hi Yehuda,
> >> >>
> >> >> Thanks for replying.
> >> >>
> >> >> >radosgw failed to connect to your ceph cluster. Does the rados
> command
> >> >> >with the same connection params work?
> >> >>
> >> >> I am not quite sure what to do by running rados command to test.
> >> >>
> >> >> So I tried again, could you please take a look and check what could
> >> >> have
> >> >> gone wrong?
> >> >>
> >> >> Here are what I did:
> >> >>
> >> >>  On ceph admin node, I removed installation on ceph-rgw1 and
> >> >> ceph-rgw2, reinstalled rgw on ceph-rgw1, stoped rgw service, removed
> >> >> all rgw
> >> >> pools. Elasticsearch is running on ceph-rgw2 node on port 9200.
> >> >>
> >> >> ceph-deploy purge ceph-rgw1
> >> >> ceph-deploy purge ceph-rgw2
> >> >> ceph-deploy purgedata ceph-rgw2
> >> >> ceph-deploy purgedata ceph-rgw1
> >> >> ceph-deploy install --release luminous ceph-rgw1
> >> >> ceph-deploy admin ceph-rgw1
> >> >> ceph-deploy rgw create ceph-rgw1
> >> >> ssh ceph-rgw1 sudo systemctl stop ceph-rado...@rgw.ceph-rgw1
> >> >> rados rmpool default.rgw.log default.rgw.log
> >> >> --yes-i-really-really-mean-it
> >> >> rados rmpool default.rgw.meta default.rgw.meta
> >> >> --yes-i-really-really-mean-it
> >> >> rados rmpool default.rgw.control default.rgw.control
> >> >> --yes-i-really-really-mean-it
> >> >> rados rmpool .rgw.root .rgw.root --yes-i-really-really-mean-it
> >> >>
> >> >>  On ceph-rgw1 node:
> >> >>
> >> >> export RGWHOST="ceph-rgw1"
> >> >> export ELASTICHOST="ceph-rgw2"
> >> >> export REALM="demo"
> >> >> export ZONEGRP="zone1"
> >> >> export ZONE1="zone1-a"
> >> >> export ZONE2="zone1-b"
> >> >> export SYNC_AKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w
> 20
> >> >> |
> >> >> head -n 1 )"
> >> >> export SYNC_SKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w
> 40

Re: [ceph-users] Luminous RGW Metadata Search

2018-01-16 Thread Youzhong Yang
Hi Yehuda,

I can use your tool obo to create a bucket, and upload a file to the object
store, but when I tried to run the following command, it failed:

# obo mdsearch buck --config='x-amz-meta-foo; string, x-amz-meta-bar;
integer'
ERROR: {"status": 405, "resource": null, "message": "", "error_code":
"MethodNotAllowed", "reason": "Method Not Allowed"}

How to make the method 'Allowed'?

Thanks in advance.

On Fri, Jan 12, 2018 at 7:25 PM, Yehuda Sadeh-Weinraub <yeh...@redhat.com>
wrote:

> The errors you're seeing there don't look like related to
> elasticsearch. It's a generic radosgw related error that says that it
> failed to reach the rados (ceph) backend. You can try bumping up the
> messenger log (debug ms =1) and see if there's any hint in there.
>
> Yehuda
>
> On Fri, Jan 12, 2018 at 12:54 PM, Youzhong Yang <youzh...@gmail.com>
> wrote:
> > So I did the exact same thing using Kraken and the same set of VMs, no
> > issue. What is the magic to make it work in Luminous? Anyone lucky
> enough to
> > have this RGW ElasticSearch working using Luminous?
> >
> > On Mon, Jan 8, 2018 at 10:26 AM, Youzhong Yang <youzh...@gmail.com>
> wrote:
> >>
> >> Hi Yehuda,
> >>
> >> Thanks for replying.
> >>
> >> >radosgw failed to connect to your ceph cluster. Does the rados command
> >> >with the same connection params work?
> >>
> >> I am not quite sure what to do by running rados command to test.
> >>
> >> So I tried again, could you please take a look and check what could have
> >> gone wrong?
> >>
> >> Here are what I did:
> >>
> >>  On ceph admin node, I removed installation on ceph-rgw1 and
> >> ceph-rgw2, reinstalled rgw on ceph-rgw1, stoped rgw service, removed
> all rgw
> >> pools. Elasticsearch is running on ceph-rgw2 node on port 9200.
> >>
> >> ceph-deploy purge ceph-rgw1
> >> ceph-deploy purge ceph-rgw2
> >> ceph-deploy purgedata ceph-rgw2
> >> ceph-deploy purgedata ceph-rgw1
> >> ceph-deploy install --release luminous ceph-rgw1
> >> ceph-deploy admin ceph-rgw1
> >> ceph-deploy rgw create ceph-rgw1
> >> ssh ceph-rgw1 sudo systemctl stop ceph-rado...@rgw.ceph-rgw1
> >> rados rmpool default.rgw.log default.rgw.log
> --yes-i-really-really-mean-it
> >> rados rmpool default.rgw.meta default.rgw.meta
> >> --yes-i-really-really-mean-it
> >> rados rmpool default.rgw.control default.rgw.control
> >> --yes-i-really-really-mean-it
> >> rados rmpool .rgw.root .rgw.root --yes-i-really-really-mean-it
> >>
> >>  On ceph-rgw1 node:
> >>
> >> export RGWHOST="ceph-rgw1"
> >> export ELASTICHOST="ceph-rgw2"
> >> export REALM="demo"
> >> export ZONEGRP="zone1"
> >> export ZONE1="zone1-a"
> >> export ZONE2="zone1-b"
> >> export SYNC_AKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 20
> |
> >> head -n 1 )"
> >> export SYNC_SKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 40
> |
> >> head -n 1 )"
> >>
> >> radosgw-admin realm create --rgw-realm=${REALM} --default
> >> radosgw-admin zonegroup create --rgw-realm=${REALM}
> >> --rgw-zonegroup=${ZONEGRP} --endpoints=http://${RGWHOST}:8000 --master
> >> --default
> >> radosgw-admin zone create --rgw-realm=${REALM}
> --rgw-zonegroup=${ZONEGRP}
> >> --rgw-zone=${ZONE1} --endpoints=http://${RGWHOST}:8000
> >> --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --master --default
> >> radosgw-admin user create --uid=sync --display-name="zone sync"
> >> --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --system
> >> radosgw-admin period update --commit
> >> sudo systemctl start ceph-radosgw@rgw.${RGWHOST}
> >>
> >> radosgw-admin zone create --rgw-realm=${REALM}
> --rgw-zonegroup=${ZONEGRP}
> >> --rgw-zone=${ZONE2} --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY}
> >> --endpoints=http://${RGWHOST}:8002
> >> radosgw-admin zone modify --rgw-realm=${REALM}
> --rgw-zonegroup=${ZONEGRP}
> >> --rgw-zone=${ZONE2} --tier-type=elasticsearch
> >> --tier-config=endpoint=http://${ELASTICHOST}:9200,num_
> replicas=1,num_shards=10
> >> radosgw-admin period update --commit
> >>
> >> sudo systemctl restart ceph-radosgw@rgw.${RGWHOST}
> >> sudo radosgw --keyring /etc/ceph/ceph.client.admin.keyring -f
&

Re: [ceph-users] Luminous RGW Metadata Search

2018-01-15 Thread Youzhong Yang
Finally, the issue that has haunted me for quite some time turned out to be
a ceph.conf issue:

I had
osd_pool_default_pg_num = 100
osd_pool_default_pgp_num = 100

once I changed to
osd_pool_default_pg_num = 32
osd_pool_default_pgp_num = 32

then no issue to start the second rgw process.

No idea why 32 works but 100 doesn't. The debug output is useless, and so
are the log files. Just insane.
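One plausible explanation (an assumption, not verified here): Luminous added a mon_max_pg_per_osd safety limit (default 200), and a multisite rgw setup creates several pools per zone, so at pg_num=100 the second zone's pool creation can exceed the per-OSD PG budget and fail, while 32 stays under it. A back-of-the-envelope check with illustrative numbers:

```shell
# All four figures are illustrative assumptions, not taken from this cluster
pools=8      # rgw multisite creates several pools per zone
pg_num=100
size=3       # replica count
osds=6
pgs_per_osd=$(( pools * pg_num * size / osds ))
echo "pgs_per_osd=$pgs_per_osd"   # pgs_per_osd=400, well over the 200 limit
```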

Anyway, thanks.


On Fri, Jan 12, 2018 at 7:25 PM, Yehuda Sadeh-Weinraub <yeh...@redhat.com>
wrote:

> The errors you're seeing there don't look like related to
> elasticsearch. It's a generic radosgw related error that says that it
> failed to reach the rados (ceph) backend. You can try bumping up the
> messenger log (debug ms =1) and see if there's any hint in there.
>
> Yehuda
>
> On Fri, Jan 12, 2018 at 12:54 PM, Youzhong Yang <youzh...@gmail.com>
> wrote:
Re: [ceph-users] Luminous RGW Metadata Search

2018-01-12 Thread Youzhong Yang
So I did the exact same thing using Kraken on the same set of VMs, with no
issue. What is the magic to make it work in Luminous? Has anyone been lucky
enough to get RGW Elasticsearch working on Luminous?


Re: [ceph-users] Luminous RGW Metadata Search

2018-01-08 Thread Youzhong Yang
Hi Yehuda,

Thanks for replying.

>radosgw failed to connect to your ceph cluster. Does the rados command
>with the same connection params work?

I am not quite sure what to test by running the rados command.
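
For what it's worth, a minimal connectivity check along those lines might look
like this (a sketch, assuming the default cluster name and /etc/ceph/ceph.conf
on the host; if this fails too, the problem is cluster access rather than rgw):

```shell
# Sketch: exercise plain RADOS connectivity with the same cluster config
# radosgw would use (assumes default /etc/ceph/ceph.conf and admin keyring).
if rados --cluster ceph lspools 2>/dev/null; then
    RADOS_OK=1
    echo "rados connectivity OK"
else
    RADOS_OK=0
    echo "rados cannot reach the cluster (check mon_host, cephx keys, network)"
fi
```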

So I tried again; could you please take a look and see what could have
gone wrong?

Here is what I did:

On the ceph admin node, I removed the installation on ceph-rgw1 and ceph-rgw2,
reinstalled rgw on ceph-rgw1, stopped the rgw service, and removed all rgw
pools. Elasticsearch is running on the ceph-rgw2 node on port 9200.

ceph-deploy purge ceph-rgw1
ceph-deploy purge ceph-rgw2
ceph-deploy purgedata ceph-rgw2
ceph-deploy purgedata ceph-rgw1
ceph-deploy install --release luminous ceph-rgw1
ceph-deploy admin ceph-rgw1
ceph-deploy rgw create ceph-rgw1
ssh ceph-rgw1 sudo systemctl stop ceph-radosgw@rgw.ceph-rgw1
rados rmpool default.rgw.log default.rgw.log --yes-i-really-really-mean-it
rados rmpool default.rgw.meta default.rgw.meta --yes-i-really-really-mean-it
rados rmpool default.rgw.control default.rgw.control --yes-i-really-really-mean-it
rados rmpool .rgw.root .rgw.root --yes-i-really-really-mean-it

On the ceph-rgw1 node:

export RGWHOST="ceph-rgw1"
export ELASTICHOST="ceph-rgw2"
export REALM="demo"
export ZONEGRP="zone1"
export ZONE1="zone1-a"
export ZONE2="zone1-b"
export SYNC_AKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 20 | head -n 1 )"
export SYNC_SKEY="$( cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 40 | head -n 1 )"
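
As an aside, the two key exports can be written without the extra cat, and a
quick check confirms the generated keys have the intended 20/40-character
alphanumeric shape:

```shell
# Equivalent key generation without the extra cat; same 20/40-character
# alphanumeric keys as the SYNC_AKEY/SYNC_SKEY exports above.
SYNC_AKEY="$(tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 20 | head -n 1)"
SYNC_SKEY="$(tr -dc 'a-zA-Z0-9' < /dev/urandom | fold -w 40 | head -n 1)"
echo "${#SYNC_AKEY} ${#SYNC_SKEY}"  # prints: 20 40
```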


radosgw-admin realm create --rgw-realm=${REALM} --default
radosgw-admin zonegroup create --rgw-realm=${REALM}
--rgw-zonegroup=${ZONEGRP} --endpoints=http://${RGWHOST}:8000 --master
--default
radosgw-admin zone create --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP}
--rgw-zone=${ZONE1} --endpoints=http://${RGWHOST}:8000
--access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --master --default
radosgw-admin user create --uid=sync --display-name="zone sync"
--access-key=${SYNC_AKEY} --secret=${SYNC_SKEY} --system
radosgw-admin period update --commit

sudo systemctl start ceph-radosgw@rgw.${RGWHOST}

radosgw-admin zone create --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP}
--rgw-zone=${ZONE2} --access-key=${SYNC_AKEY} --secret=${SYNC_SKEY}
--endpoints=http://${RGWHOST}:8002
radosgw-admin zone modify --rgw-realm=${REALM} --rgw-zonegroup=${ZONEGRP}
--rgw-zone=${ZONE2} --tier-type=elasticsearch
--tier-config=endpoint=http://${ELASTICHOST}:9200,num_replicas=1,num_shards=10
radosgw-admin period update --commit

sudo systemctl restart ceph-radosgw@rgw.${RGWHOST}

sudo radosgw --keyring /etc/ceph/ceph.client.admin.keyring -f
--rgw-zone=${ZONE2} --rgw-frontends="civetweb port=8002"
2018-01-08 00:21:54.389432 7f0fe9cd2e80 -1 Couldn't init storage provider
(RADOS)

As you can see, starting rgw on port 8002 failed, but rgw on port 8000
started successfully.
Here is some more info which may be useful for diagnosis:

$ cat /etc/ceph/ceph.conf
[global]
fsid = 3e5a32d4-e45e-48dd-a3c5-f6f28fef8edf
mon_initial_members = ceph-mon1, ceph-osd1, ceph-osd2, ceph-osd3
mon_host = 172.30.212.226,172.30.212.227,172.30.212.228,172.30.212.250
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd_pool_default_size = 2
osd_pool_default_min_size = 2
osd_pool_default_pg_num = 100
osd_pool_default_pgp_num = 100
bluestore_compression_algorithm = zlib
bluestore_compression_mode = force
rgw_max_put_size = 21474836480
[osd]
osd_max_object_size = 1073741824
[mon]
mon_allow_pool_delete = true
[client.rgw.ceph-rgw1]
host = ceph-rgw1
rgw frontends = civetweb port=8000
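
(Not related to the failure, but for readers of the config above: the two
size limits are plain byte counts. Just arithmetic on the values shown, not a
statement about ceph behavior:)

```shell
# The limits from the ceph.conf above, converted to GiB:
# rgw_max_put_size = 21474836480 bytes, osd_max_object_size = 1073741824 bytes
echo "rgw_max_put_size: $(( 21474836480 / (1024 * 1024 * 1024) )) GiB"    # 20 GiB
echo "osd_max_object_size: $(( 1073741824 / (1024 * 1024 * 1024) )) GiB"  # 1 GiB
```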

$ wget -O - -q http://ceph-rgw2:9200/
{
  "name" : "Hippolyta",
  "cluster_name" : "elasticsearch",
  "version" : {
"number" : "2.3.1",
"build_hash" : "bd980929010aef404e7cb0843e61d0665269fc39",
"build_timestamp" : "2016-04-04T12:25:05Z",
"build_snapshot" : false,
"lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}

$ ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
719G  705G   14473M  1.96
POOLS:
NAME                ID USED %USED MAX AVAIL OBJECTS
.rgw.root           17 6035     0      333G      19
zone1-a.rgw.control 18    0     0      333G       8
zone1-a.rgw.meta    19  350     0      333G       2
zone1-a.rgw.log     20   50     0      333G     176
zone1-b.rgw.control 21    0     0      333G       8
zone1-b.rgw.meta    22    0     0      333G       0

$ rados df
POOL_NAME           USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD   WR_OPS WR
.rgw.root           6035      19      0     38                  0       0        0    817 553k     55 37888
zone1-a.rgw.control    0       8      0     16                  0       0        0      0    0      0     0
zone1-a.rgw.log

[ceph-users] Luminous RGW Metadata Search

2017-12-22 Thread Youzhong Yang
I followed the exact steps of the following page:

http://ceph.com/rgw/new-luminous-rgw-metadata-search/

"us-east-1" zone is serviced by host "ceph-rgw1" on port 8000, no issue,
the service runs successfully.

"us-east-es" zone is serviced by host "ceph-rgw2" on port 8002, the service
was unable to start:

# /usr/bin/radosgw -f --cluster ceph --name client.rgw.ceph-rgw2 --setuser
ceph --setgroup ceph   2017-12-22 16:35:48.513912 7fc54e98ee80 -1
Couldn't init storage provider (RADOS)

It's this mysterious error message, "Couldn't init storage provider
(RADOS)"; there's no clue about what is wrong or what is misconfigured.

Yes, I have elasticsearch installed and running on host 'ceph-rgw2'. Is
there any additional configuration required for ElasticSearch?

Did I miss anything? what is the magic to make this basic stuff work?

Thanks,

--Youzhong
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] radosgw: Couldn't init storage provider (RADOS)

2017-12-18 Thread Youzhong Yang
Hello,

I tried to install Ceph 12.2.2 (Luminous) on Ubuntu 16.04.3 LTS (kernel
4.4.0-104-generic), but I am having trouble starting radosgw service:

# systemctl status ceph-radosgw@rgw.ceph-rgw1
● ceph-radosgw@rgw.ceph-rgw1.service - Ceph rados gateway
   Loaded: loaded (/lib/systemd/system/ceph-radosgw@.service; enabled;
vendor preset: enabled)
   Active: inactive (dead) (Result: exit-code) since Mon 2017-12-18
16:10:18 EST; 15min ago
  Process: 4571 ExecStart=/usr/bin/radosgw -f --cluster ${CLUSTER} --name
client.%i --setuser ceph --setgroup ceph (code=exited, status=5)
 Main PID: 4571 (code=exited, status=5)

Dec 18 16:10:17 ceph-rgw1 systemd[1]: ceph-radosgw@rgw.ceph-rgw1.service:
Unit entered failed state.
Dec 18 16:10:17 ceph-rgw1 systemd[1]: ceph-radosgw@rgw.ceph-rgw1.service:
Failed with result 'exit-code'.
Dec 18 16:10:18 ceph-rgw1 systemd[1]: ceph-radosgw@rgw.ceph-rgw1.service:
Service hold-off time over, scheduling restart.
Dec 18 16:10:18 ceph-rgw1 systemd[1]: Stopped Ceph rados gateway.
Dec 18 16:10:18 ceph-rgw1 systemd[1]: ceph-radosgw@rgw.ceph-rgw1.service:
Start request repeated too quickly.
Dec 18 16:10:18 ceph-rgw1 systemd[1]: Failed to start Ceph rados gateway.

If I ran the following command directly, it failed immediately:

# /usr/bin/radosgw -f --cluster ceph --name client.rgw.ceph-rgw1 --setuser
ceph --setgroup ceph
2017-12-18 16:26:56.413135 7ff11b00fe80 -1 Couldn't init storage provider
(RADOS)
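
One way to get past the terse "Couldn't init storage provider (RADOS)" is to
re-run the daemon in the foreground with verbose logging. A sketch using the
standard ceph debug options (guarded so it only prints a message on a host
without radosgw installed):

```shell
# Sketch: re-run radosgw in the foreground with verbose rgw/messenger
# logging; --debug-rgw and --debug-ms are standard ceph debug options.
if command -v radosgw >/dev/null 2>&1; then
    HAVE_RADOSGW=1
    radosgw -f --cluster ceph --name client.rgw.ceph-rgw1 \
        --setuser ceph --setgroup ceph --debug-rgw=20 --debug-ms=1
else
    HAVE_RADOSGW=0
    echo "radosgw not installed on this host"
fi
```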

There's no issue when I installed Kraken (version 11.2.1). Did I miss
anything?

Your help would be very much appreciated.

Thanks,

--Youzhong